You can parse the text as a string, which creates an Element, and create an ElementTree using that Element.
import xml.etree.ElementTree as ET
tree = ET.ElementTree(ET.fromstring(xmlstring))
I just came across this issue and the documentation, while complete, is not very straightforward on the difference in usage between the parse() and fromstring() methods.
You can parse the text as a string, which creates an Element, and create an ElementTree using that Element.
import xml.etree.ElementTree as ET
tree = ET.ElementTree(ET.fromstring(xmlstring))
I just came across this issue and the documentation, while complete, is not very straightforward on the difference in usage between the parse() and fromstring() methods.
If you're using xml.etree.ElementTree.parse to parse from a file, then you can use xml.etree.ElementTree.fromstring to get the root Element of the document. Often you don't actually need an ElementTree.
See xml.etree.ElementTree
xml - Convert Python ElementTree to string - Stack Overflow
Python XML: how to treat a node content as a string? - Stack Overflow
parsing XML with Python and the RESTful API
How to process / parse XML with Python
Videos
This should work:-
xmlstr = ET.tostring(root, encoding='utf8', method='xml')
How do I convert ElementTree.Element to a String?
For Python 3:
xml_str = ElementTree.tostring(xml, encoding='unicode')
For Python 2:
xml_str = ElementTree.tostring(xml, encoding='utf-8')
For compatibility with both Python 2 & 3:
xml_str = ElementTree.tostring(xml).decode()
Example usage
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
xml_str = ElementTree.tostring(xml).decode()
print(xml_str)
Output:
<Person Name="John" />
Explanation
Despite what the name implies, ElementTree.tostring() returns a bytestring by default in Python 2 & 3. This is an issue in Python 3, which uses Unicode for strings.
In Python 2 you could use the
strtype for both text and binary data. Unfortunately this confluence of two different concepts could lead to brittle code which sometimes worked for either kind of data, sometimes not. [...]To make the distinction between text and binary data clearer and more pronounced, [Python 3] made text and binary data distinct types that cannot blindly be mixed together.
Source: Porting Python 2 Code to Python 3
If we know what version of Python is being used, we can specify the encoding as unicode or utf-8. Otherwise, if we need compatibility with both Python 2 & 3, we can use decode() to convert into the correct type.
For reference, I've included a comparison of .tostring() results between Python 2 and Python 3.
ElementTree.tostring(xml)
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml, encoding='unicode')
# Python 3: <Person Name="John" />
# Python 2: LookupError: unknown encoding: unicode
ElementTree.tostring(xml, encoding='utf-8')
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml).decode()
# Python 3: <Person Name="John" />
# Python 2: <Person Name="John" />
Thanks to Martijn Peters for pointing out that the str datatype changed between Python 2 and 3.
Why not use str()?
In most scenarios, using str() would be the "cannonical" way to convert an object to a string. Unfortunately, using this with Element returns the object's location in memory as a hexstring, rather than a string representation of the object's data.
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
print(str(xml)) # <Element 'Person' at 0x00497A80>
Element objects have no .getroot() method. Drop that call, and the .tostring() call works:
xmlstr = ElementTree.tostring(et, encoding='utf8')
You only need to use .getroot() if you have an ElementTree instance.
Other notes:
This produces a bytestring, which in Python 3 is the
bytestype.
If you must have astrobject, you have two options:Decode the resulting bytes value, from UTF-8:
xmlstr.decode("utf8")Use
encoding='unicode'; this avoids an encode / decode cycle:xmlstr = ElementTree.tostring(et, encoding='unicode')
If you wanted the UTF-8 encoded bytestring value or are using Python 2, take into account that ElementTree doesn't properly detect
utf8as the standard XML encoding, so it'll add a<?xml version='1.0' encoding='utf8'?>declaration. Useutf-8orUTF-8(with a dash) if you want to prevent this. When usingencoding="unicode"no declaration header is added.
How do I convert ElementTree.Element to a String?
For Python 3:
xml_str = ElementTree.tostring(xml, encoding='unicode')
For Python 2:
xml_str = ElementTree.tostring(xml, encoding='utf-8')
Example usage (Python 3)
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
xml_str = ElementTree.tostring(xml, encoding='unicode')
print(xml_str)
Output:
<Person Name="John" />
Explanation
ElementTree.tostring() returns a bytestring by default in Python 2 & 3. This is an issue because Python 3 switched to using Unicode for strings.
In Python 2 you could use the
strtype for both text and binary data. Unfortunately this confluence of two different concepts could lead to brittle code which sometimes worked for either kind of data, sometimes not. [...]To make the distinction between text and binary data clearer and more pronounced, [Python 3] made text and binary data distinct types that cannot blindly be mixed together.
Source: Porting Python 2 Code to Python 3
If you know what version of Python is being used, you should specify the encoding as unicode or utf-8. For reference, I've included a comparison of .tostring() results between Python 2 and Python 3.
ElementTree.tostring(xml)
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml, encoding='unicode')
# Python 3: <Person Name="John" />
# Python 2: LookupError: unknown encoding: unicode
ElementTree.tostring(xml, encoding='utf-8')
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml).decode()
# Python 3: <Person Name="John" />
# Python 2: <Person Name="John" />
Note: While xml_str = ElementTree.tostring().decode() is compatible with both Python 2 & 3, Christopher Rucinski pointed out that this method fails when dealing with non-Latin characters).
Thanks to Martijn Peters for pointing out that the str datatype changed between Python 2 and 3.
Why not use str()?
In most scenarios, using str() would be the "canonical" way to convert an object to a string. However, using str() with Element returns the object's location in memory as a hexstring, rather than a string representation of the object's data.
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
print(str(xml)) # <Element 'Person' at 0x00497A80>