I simply solved it with the indent() function:
xml.etree.ElementTree.indent(tree, space="  ", level=0)
Appends whitespace to the subtree to indent the tree visually. This can be used to generate pretty-printed XML output. tree can be an Element or ElementTree. space is the whitespace string that will be inserted for each indentation level, two space characters by default. For indenting partial subtrees inside of an already indented tree, pass the initial indentation level as level.
import xml.etree.ElementTree as ET
# root is the Element you built; file_name is the output path
tree = ET.ElementTree(root)
ET.indent(tree, space="\t", level=0)
tree.write(file_name, encoding="utf-8")
Note, the indent() function was added in Python 3.9.
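As a minimal sketch of the same call without a file, assuming a small made-up tree (the tag names `catalog`, `item` and `name` are placeholders, not from the original question): indent() changes the tree in place and returns nothing, so you serialize afterwards to see the result.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample tree; the tag names are placeholders for illustration.
root = ET.Element("catalog")
item = ET.SubElement(root, "item")
ET.SubElement(item, "name").text = "widget"

# indent() modifies the tree in place and returns None (Python 3.9+).
ET.indent(root, space="\t", level=0)

# Serialize to a str so the inserted whitespace is visible.
pretty = ET.tostring(root, encoding="unicode")
print(pretty)
```

Each nesting level is prefixed with one more tab, because space="\t" was passed.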
Whatever your XML string is, you can write it to the file of your choice by opening a file for writing and writing the string to the file.
from xml.dom import minidom
xmlstr = minidom.parseString(ET.tostring(root)).toprettyxml(indent=" ")
with open("New_Database.xml", "w") as f:
    f.write(xmlstr)
There is one possible complication, especially in Python 2, which is both less strict and less sophisticated about Unicode characters in strings. If your toprettyxml method hands back a Unicode string (u"something"), you may want to encode it with a suitable file encoding, such as UTF-8. In Python 2, for example, replace the one write line with the line below. (In Python 3, a file opened in text mode does not accept bytes, so pass encoding="utf-8" to open() instead.)
f.write(xmlstr.encode('utf-8'))
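For Python 3, a hedged sketch of the same idea is to leave the string as it is and let open() do the encoding. The `database` and `entry` tags and the temporary output directory below are assumptions made only so the example runs on its own.

```python
import os
import tempfile
import xml.etree.ElementTree as ET
from xml.dom import minidom

# Hypothetical tree with a non-ASCII character to exercise the encoding.
root = ET.Element("database")
ET.SubElement(root, "entry").text = "caf\u00e9"

# toprettyxml() without an encoding argument returns a str in Python 3.
xmlstr = minidom.parseString(ET.tostring(root)).toprettyxml(indent="  ")

# Write to a temporary directory here; use your own path in practice.
out_path = os.path.join(tempfile.mkdtemp(), "New_Database.xml")
with open(out_path, "w", encoding="utf-8") as f:
    f.write(xmlstr)
```

Passing the encoding to open() keeps the write call unchanged and avoids depending on the platform's default encoding.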
Proper indentation for XML files
I'm working on a Python script that reads data from an Excel file and generates individual XML files based on each valid row. The script extracts various fields, handles date formatting, and builds a structured XML tree for each record. For certain entries, I also include duplicate tags with additional details (e.g., a second <Description> tag with a formatted date).
Now, I want the XML output to be properly indented for readability. I came across xml.etree.ElementTree.indent(tree, space=' ', level=0) as a possible way to format the XML. Is this the correct and recommended method to add indentation to the XML files I'm creating? If so, where exactly in my code should I use it for best results? Also, I'm pretty new to Python; this would be my first time doing something in Python apart from very basic code in the past. If anyone knows some resources that they think could help, I would really appreciate that too. Any help is appreciated :)
import xml.dom.minidom
dom = xml.dom.minidom.parse(xml_fname) # or xml.dom.minidom.parseString(xml_string)
pretty_xml_as_string = dom.toprettyxml()
lxml is recent, updated, and includes a pretty print function
import lxml.etree as etree
x = etree.parse("filename")
print(etree.tostring(x, pretty_print=True, encoding="unicode"))
Check out the lxml tutorial: https://lxml.de/tutorial.html
UPDATE 2022 - python 3.9 and later versions
For python 3.9 and later versions the standard library includes xml.etree.ElementTree.indent:
Example:
import xml.etree.ElementTree as ET
root = ET.fromstring("""<fruits><fruit>banana</fruit><fruit>apple</fruit></fruits>""")
tree = ET.ElementTree(root)
ET.indent(tree, ' ')
# writing xml
tree.write("example.xml", encoding="utf-8", xml_declaration=True)
Thanks Michał Krzywański for this update!
BEFORE python 3.9
I found a new way to avoid new libraries and reparsing the xml. You just need to pass your root element to this function (see below explanation):
def indent(elem, level=0):
    i = "\n" + level*"  "
    if len(elem):
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for elem in elem:
            indent(elem, level+1)
        # elem is now the last child: dedent after it
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i
There is an attribute named "tail" on xml.etree.ElementTree.Element instances. This attribute holds the string that follows a node's closing tag:
"<a>text</a>tail"
I found a page from 2004 describing Element Library Functions that use this "tail" attribute to indent an element.
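To make the "tail" idea concrete, here is a small sketch, assuming a made-up document with tags `root`, `a` and `b`. It reads the text and tail of one element, then sets the tail by hand, which is what the indent function does for every element.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: "text" is inside <a>, "tail" follows its closing tag.
root = ET.fromstring("<root><a>text</a>tail<b/></root>")
a = root.find("a")

original_text = a.text   # the string inside <a>...</a>
original_tail = a.tail   # the string between </a> and the next tag
print(repr(original_text), repr(original_tail))

# Replacing the tail with a newline plus spaces puts <b> on its own indented line.
a.tail = "\n  "
result = ET.tostring(root, encoding="unicode")
print(result)
```

Note that the indent function only overwrites a tail that is empty or whitespace, so real trailing text like "tail" here would be left alone.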
Example:
# added triple quotes
root = ET.fromstring("""<fruits><fruit>banana</fruit><fruit>apple</fruit></fruits>""")
tree = ET.ElementTree(root)
indent(root)
# writing xml
tree.write("example.xml", encoding="utf-8", xml_declaration=True)
Result on "example.xml":
<?xml version='1.0' encoding='utf-8'?>
<fruits>
  <fruit>banana</fruit>
  <fruit>apple</fruit>
</fruits>
The easiest solution I think is switching to the lxml library. In most circumstances you can just change your import from import xml.etree.ElementTree as etree to from lxml import etree or similar.
You can then use the pretty_print option when serializing:
tree.write(filename, pretty_print=True)
(also available on etree.tostring)
As of Python 3.9, a new function named indent() has been added to xml.etree.ElementTree, which can be used to generate pretty-printed XML output without using another module or library.
To fix your example in Python 3.9+, just add the following line right before writing the tree to the file:
ET.indent(tree, ' ')
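For a script that writes one XML file per record, a rough sketch of where the call goes could look like this. The `rows` list, the `build_tree` helper, the `Record` and `Description` tags, and the file names are all hypothetical stand-ins, since the original script is not shown. The point is only the order: build the whole tree, indent once, then write.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Hypothetical stand-in for rows read from the spreadsheet.
rows = [
    {"id": "1", "description": "first record"},
    {"id": "2", "description": "second record"},
]

def build_tree(row):
    """Build the complete tree for one row (field names are assumptions)."""
    root = ET.Element("Record", id=row["id"])
    ET.SubElement(root, "Description").text = row["description"]
    return ET.ElementTree(root)

out_dir = tempfile.mkdtemp()  # temporary folder so the sketch is self-contained
written = []
for row in rows:
    tree = build_tree(row)
    # Indent once, after the tree is complete and right before writing.
    ET.indent(tree, space="  ")
    path = os.path.join(out_dir, f"record_{row['id']}.xml")
    tree.write(path, encoding="utf-8", xml_declaration=True)
    written.append(path)
```

Calling indent() before the tree is finished would leave any elements added later unindented, which is why it sits directly before tree.write().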
Format output.
from xml.dom import minidom
dom = minidom.parse('country_data.xml')
with open('dom_write.xml', 'w', encoding='UTF-8') as fh:
    dom.writexml(fh, indent='', addindent='\t', newl='', encoding='UTF-8')