Here's an lxml snippet that extracts an attribute as well as element text (your question was a little ambiguous about which one you needed, so I'm including both):
from lxml import etree
doc = etree.parse(filename)
memoryElem = doc.find('memory')
print(memoryElem.text)         # element text
print(memoryElem.get('unit'))  # attribute
You asked (in a comment on Ali Afshar's answer) whether minidom (2.x, 3.x) is a good alternative. Here's the equivalent code using minidom; judge for yourself which is nicer:
import xml.dom.minidom as minidom
doc = minidom.parse(filename)
memoryElem = doc.getElementsByTagName('memory')[0]
print(''.join(node.data for node in memoryElem.childNodes))  # element text
print(memoryElem.getAttribute('unit'))  # attribute
lxml seems like the winner to me.
Answer from ron rothman on Stack Overflow
What should I use for XML parsing in python3?
If you are just doing basic XML parsing, I would recommend using ElementTree. If you are using namespaces, I would recommend switching to lxml.
BeautifulSoup is also good.
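As a rough sketch of the ElementTree suggestion, here is how the memory example from the first answer might look with the standard library alone. The sample document and its `unit="KiB"` value are made up for illustration, since the original XML is not shown, and the example parses a string instead of a file so that it runs on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the XML from the original question is not shown.
SAMPLE = '<domain><memory unit="KiB">1048576</memory></domain>'

root = ET.fromstring(SAMPLE)
memory_elem = root.find('memory')      # first direct child named 'memory'
memory_text = memory_elem.text         # element text
memory_unit = memory_elem.get('unit')  # attribute value, or None if absent

print(memory_text)
print(memory_unit)
```

The calls mirror the lxml version (`find`, `.text`, `.get`), so moving to lxml later for namespace-heavy documents should mostly be a matter of changing the import.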
XML
<data>
    <items>
        <item name="item1">item1</item>
        <item name="item2">item2</item>
        <item name="item3">item3</item>
        <item name="item4">item4</item>
    </items>
</data>
Python:
from xml.dom import minidom
xmldoc = minidom.parse('items.xml')
itemlist = xmldoc.getElementsByTagName('item')
print("Len : ", len(itemlist))
# First item only
print("Attribute Name : ", itemlist[0].attributes['name'].value)
print("Text : ", itemlist[0].firstChild.nodeValue)
# Every item
for s in itemlist:
    print("Attribute Name : ", s.attributes['name'].value)
    print("Text : ", s.firstChild.nodeValue)
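For comparison, the same document can be read with ElementTree. This is only a sketch: it inlines the XML above as a string instead of reading items.xml so that it runs on its own, and the variable names are my own.

```python
import xml.etree.ElementTree as ET

# The same document as above, inlined so the example is self-contained.
ITEMS_XML = '''<data>
<items>
<item name="item1">item1</item>
<item name="item2">item2</item>
<item name="item3">item3</item>
<item name="item4">item4</item>
</items>
</data>'''

root = ET.fromstring(ITEMS_XML)

# iter() walks the whole tree, much like getElementsByTagName in minidom.
items = list(root.iter('item'))
print("Len : ", len(items))

for item in items:
    print("Attribute Name : ", item.get('name'))  # attribute value
    print("Text : ", item.text)                   # element text
```

Here `item.get('name')` and `item.text` take the place of `attributes['name'].value` and `firstChild.nodeValue`.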
Use ElementTree:
import xml.etree.ElementTree as ET
tree = ET.parse('Config.xml')
root = tree.getroot()
print(root.findall('.//Log'))
Output:
pawel@pawel-XPS-15-9570:~/test$ python parse_xml.py
[<Element 'Log' at 0x7fb3f2eee9f
Alternatively, parse the XML from a string and print the text of each Log element:
import xml.etree.ElementTree as ET
# Raw string so the backslashes in the paths are not read as escape sequences
xml = r'''<?xml version="1.0" encoding="UTF-8"?>
<Automation_Config>
<Path>
<Log>.\SERVER.log</Log>
<Flag_Path>.\Flag</Flag_Path>
<files>.\PO</files>
</Path>
</Automation_Config>'''
root = ET.fromstring(xml)
for idx, log_element in enumerate(root.findall('.//Log')):
    print('{}) Log value: {}'.format(idx, log_element.text))
Output:
0) Log value: .\SERVER.log