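This has nothing to do with the XML file format; it is about which encoding your file uses. When you open a file in text mode without an encoding argument, Python 3 decodes it with a default encoding (UTF-8 on many systems), which fails if the file was saved in a different encoding. If you are on Windows, your file is probably in windows-1252. You should use: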
f = open("text.txt", "r", encoding="cp1252")
Answer from syntonym on Stack Overflow
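To see why the encoding argument matters, here is a minimal sketch. The sample text and the temporary file are made up for illustration; it assumes the file really was saved as windows-1252.

```python
import os
import tempfile

# Hypothetical sample content containing a non-ASCII character.
text = "<note>café</note>"

# Write the text to a temporary file encoded as Windows-1252.
with tempfile.NamedTemporaryFile("wb", suffix=".xml", delete=False) as tmp:
    tmp.write(text.encode("cp1252"))
    path = tmp.name

try:
    # Decoding with the wrong codec fails on the byte for "é".
    try:
        with open(path, "r", encoding="utf-8") as f:
            f.read()
        utf8_failed = False
    except UnicodeDecodeError:
        utf8_failed = True

    # Decoding with the codec the file was written in succeeds.
    with open(path, "r", encoding="cp1252") as f:
        decoded = f.read()
finally:
    os.remove(path)

print(utf8_failed, decoded)
```

If you do not know the encoding for sure, treat cp1252 as a guess and check the decoded text for garbled characters.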
This should do the job:
# Read the whole file into a string; the with block closes the file automatically
with open('reboot.xml', 'r') as f:
    a = f.read()
print(a)
This should work:
import xml.etree.ElementTree as ET
xmlstr = ET.tostring(root, encoding='utf8', method='xml')
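A small self-contained sketch of that call, in case it helps. The `root` element here is a made-up example; in Python 3 the result is a bytes object, so it is decoded before printing.

```python
import xml.etree.ElementTree as ET

# Build a tiny example tree (hypothetical data).
root = ET.Element("root")
ET.SubElement(root, "child").text = "hello"

# With encoding='utf8', tostring() returns bytes, including an XML declaration.
xml_bytes = ET.tostring(root, encoding="utf8", method="xml")

# Decode the bytes to get a regular text string.
xml_text = xml_bytes.decode("utf8")
print(xml_text)
```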
How do I convert ElementTree.Element to a String?
For Python 3:
xml_str = ElementTree.tostring(xml, encoding='unicode')
For Python 2:
xml_str = ElementTree.tostring(xml, encoding='utf-8')
For compatibility with both Python 2 & 3:
xml_str = ElementTree.tostring(xml).decode()
Example usage
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
xml_str = ElementTree.tostring(xml).decode()
print(xml_str)
Output:
<Person Name="John" />
Explanation
Despite what the name implies, ElementTree.tostring() returns a bytestring by default in Python 2 & 3. This is an issue in Python 3, which uses Unicode for strings.
In Python 2 you could use the str type for both text and binary data. Unfortunately this confluence of two different concepts could lead to brittle code which sometimes worked for either kind of data, sometimes not. [...] To make the distinction between text and binary data clearer and more pronounced, [Python 3] made text and binary data distinct types that cannot blindly be mixed together.
Source: Porting Python 2 Code to Python 3
If we know what version of Python is being used, we can specify the encoding as unicode or utf-8. Otherwise, if we need compatibility with both Python 2 & 3, we can use decode() to convert into the correct type.
For reference, I've included a comparison of .tostring() results between Python 2 and Python 3.
ElementTree.tostring(xml)
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml, encoding='unicode')
# Python 3: <Person Name="John" />
# Python 2: LookupError: unknown encoding: unicode
ElementTree.tostring(xml, encoding='utf-8')
# Python 3: b'<Person Name="John" />'
# Python 2: <Person Name="John" />
ElementTree.tostring(xml).decode()
# Python 3: <Person Name="John" />
# Python 2: <Person Name="John" />
Thanks to Martijn Pieters for pointing out that the str datatype changed between Python 2 and 3.
Why not use str()?
In most scenarios, using str() would be the "canonical" way to convert an object to a string. Unfortunately, using this with Element returns the object's type and memory address as a hex string, rather than a string representation of the object's data.
from xml.etree import ElementTree
xml = ElementTree.Element("Person", Name="John")
print(str(xml)) # <Element 'Person' at 0x00497A80>
From a file, you could normally do it as
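import os
from xml.dom import minidom
# minidom.parse() does not expand "~", so expand it to the home directory first
xmldoc = minidom.parse(os.path.expanduser('~/diveintopython/common/py/kgp/binary.xml'))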
For a string, you can change it to
from xml.dom import minidom
xmldoc = minidom.parseString(xml_string)  # xml_string holds your XML text
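As a quick illustration of parsing from a string, here is a minimal sketch. The XML content and the variable names are invented for the example.

```python
from xml.dom import minidom

# Hypothetical XML held in a string rather than a file.
xml_string = '<people><person name="John"/><person name="Jane"/></people>'

# Parse the string directly into a DOM document.
xmldoc = minidom.parseString(xml_string)

# Collect the "name" attribute of every <person> element.
names = [node.getAttribute("name") for node in xmldoc.getElementsByTagName("person")]
print(names)
```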
You could use: xml.dom.minidom.parseString(text)
This method creates a StringIO object for the string and passes that on to parse().
You could also use the same technique of using StringIO for any other XML parser that expects a file-like object.
from io import StringIO  # Python 3; in Python 2 this was the StringIO module
your_favourite_xml_parser.parse(StringIO('<xml>...</xml>'))
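For a concrete version of the same idea, this sketch uses xml.etree.ElementTree as the parser, since its parse() accepts a file-like object. The XML text is a made-up sample.

```python
from io import StringIO
from xml.etree import ElementTree

# Hypothetical XML text that would otherwise live in a file.
xml_text = '<root><item id="1">first</item><item id="2">second</item></root>'

# Wrap the string in a file-like object and hand it to parse().
tree = ElementTree.parse(StringIO(xml_text))

# Map each item's id attribute to its text content.
items = {item.get("id"): item.text for item in tree.getroot().iter("item")}
print(items)
```

With ElementTree specifically, ElementTree.fromstring(xml_text) is a shorter route to the root element; the StringIO wrapper is mainly useful for parsers that only accept files.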