You can use the xml package from the Python standard library to convert XML to a pandas.DataFrame. Here's what I would do (when reading from a file, replace xml_data with the name of your file or file object):

import pandas as pd
import xml.etree.ElementTree as ET
import io

def iter_docs(author):
    author_attr = author.attrib
    for doc in author.iter('document'):
        doc_dict = author_attr.copy()
        doc_dict.update(doc.attrib)
        doc_dict['data'] = doc.text
        yield doc_dict

xml_data = io.StringIO(u'''YOUR XML STRING HERE''')

etree = ET.parse(xml_data)  # create an ElementTree object
doc_df = pd.DataFrame(list(iter_docs(etree.getroot())))

If there are multiple authors in your original document or the root of your XML is not an author, then I would add the following generator:

def iter_author(etree):
    for author in etree.iter('author'):
        for row in iter_docs(author):
            yield row

and change doc_df = pd.DataFrame(list(iter_docs(etree.getroot()))) to doc_df = pd.DataFrame(list(iter_author(etree)))
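To see the two generators above working end to end, here is a minimal sketch run on a small made-up XML string. The sample document, its attribute names (name, country, id, year) and the authors root element are assumptions for illustration only; they are not from the original question.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical sample document: two authors, three documents in total.
SAMPLE_XML = """<authors>
  <author name="Ann" country="NO">
    <document id="1" year="2001">first</document>
    <document id="2" year="2003">second</document>
  </author>
  <author name="Bob" country="SE">
    <document id="3" year="2010">third</document>
  </author>
</authors>"""


def iter_docs(author):
    """Yield one dict per <document>, merged with the author's attributes."""
    author_attr = author.attrib
    for doc in author.iter('document'):
        doc_dict = author_attr.copy()
        doc_dict.update(doc.attrib)
        doc_dict['data'] = doc.text
        yield doc_dict


def iter_author(etree):
    """Walk every <author> in the tree and yield its document rows."""
    for author in etree.iter('author'):
        yield from iter_docs(author)


tree = ET.parse(io.StringIO(SAMPLE_XML))
rows = list(iter_author(tree))
```

Each entry in rows is a flat dict, so pd.DataFrame(rows) should give one row per document with the author attributes repeated on each row.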

Have a look at the ElementTree tutorial provided in the xml library documentation.

Answer from JaminSore on Stack Overflow
Pandas
pandas.pydata.org › docs › reference › api › pandas.read_xml.html
pandas.read_xml - pandas 3.0.1 documentation - PyData
The XPath to parse required set of nodes for migration to DataFrame. XPath should return a collection of elements and not a single element. Note: The etree parser supports limited XPath expressions. For more complex XPath, use lxml which requires installation. ... The namespaces defined in XML document as dicts with key being namespace prefix and value the URI.
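As a rough illustration of the xpath parameter described above, the sketch below reads a small made-up document with pandas.read_xml. The library/book sample is an assumption, and parser="etree" is chosen only so that lxml is not required; the namespaces parameter is not shown.

```python
import io

import pandas as pd

# Hypothetical sample: each <book> becomes one DataFrame row.
XML = """<library>
  <book id="1"><title>Dune</title><year>1965</year></book>
  <book id="2"><title>Emma</title><year>1815</year></book>
</library>"""

# The XPath selects a collection of elements (all <book> nodes), not a single one.
# Attributes and child elements of each selected node become columns.
df = pd.read_xml(io.StringIO(XML), xpath=".//book", parser="etree")
```

The XML text is wrapped in io.StringIO rather than passed as a literal string, since recent pandas versions deprecate passing literal XML strings directly.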
Discussions

python - How to convert a Pandas dataframe to XML? - Stack Overflow
Is there a simple way to take a Pandas dataframe table: field_1 field_2 field_3 field_4 cat 15,263 2.52 00:03:00 dog 1,652 3.71 00:03:47 test 312 3.27 00:03:41 book 3... More on stackoverflow.com
๐ŸŒ stackoverflow.com
python - How to read XML file into Pandas Dataframe - Stack Overflow
I have a xml file: 'product.xml' that I want to read using pandas, here is an example of the sample file: 32... More on stackoverflow.com
๐ŸŒ stackoverflow.com
Parsing XML into a Pandas dataframe
To parse an XML file into a Pandas DataFrame, first use the ElementTree module to parse the file and extract the relevant data, then pass the result to the DataFrame constructor. Here is an example of how this can be done:

import xml.etree.ElementTree as ET
import pandas as pd

# Parse the XML file using ElementTree
tree = ET.parse('my_file.xml')
root = tree.getroot()

# Extract the column names from the 'columns' element
columns = [col.attrib['friendlyName'] for col in root.find('columns')]

# Create an empty list to store the data for each row
data = []

# Iterate over the 'row' elements and extract the data for each one
for row in root.find('rows'):
    row_data = {}
    for col in row:
        # Add the data for each column to the dictionary
        row_data[col.attrib['name']] = col.text
    # Add the dictionary for this row to the list
    data.append(row_data)

# Create a DataFrame using the column names and data.
# DataFrame.from_dict rejects columns= with its default orient,
# so the plain constructor is used for the list of dicts.
df = pd.DataFrame(data, columns=columns)

This code parses the XML file and stores the data for each row in a dictionary. The list of dictionaries is then used to create a DataFrame that has the column names as its columns and each row of data as a row. More on reddit.com
r/learnpython
December 9, 2022
Pandas dataframe to nested xml
xml is part of the standard library. You have a nice column naming convention, and we could be smarter and use the dot to work out the parent automatically, but it is best to put it together manually per row using the apply function:

import xml.etree.ElementTree as ET
import pandas as pd

def build_item_xml(row):
    item1 = ET.SubElement(items, 'Item')
    descriptors = ET.SubElement(item1, 'Descriptors')
    barcode = ET.SubElement(descriptors, 'Barcode')
    barcode.text = row["Descriptors.Barcode"]
    pricing = ET.SubElement(item1, 'Pricing')
    packetcost = ET.SubElement(pricing, 'PackCost')
    # cast to str, otherwise: cannot serialize 0.5625 (type float)
    packetcost.text = str(row["Pricing.PackCost"])
    # etc
    # add other attributes here
    # always return a result
    return row

# mock dataframe with 2 rows based on columns supplied
df = pd.DataFrame({
    "Descriptors.Barcode": ["9770307017919", "9770307017920"],
    "Descriptors.SupplierCode": ["030701791", "030701792"],
    "Descriptors.Description": ["Daily Express (Mon)", "Daily Express (Tues)"],
    "Descriptors.CommodityGroup": [1, 2],
    "Pricing.PackCost": [0.5625, 0.5626],
    "Pricing.CostPricePerUnit": [0.5625, 0.5626],
    "Pricing.RetailPrice": [0.75, 0.75],
    "Pricing.ValidFrom": [44193, 44194],
    "Sizing.Packsize": [1, 2],
})

# https://docs.python.org/3/library/xml.etree.elementtree.html#building-xml-documents
items = ET.Element('Items')
df = df.apply(build_item_xml, axis=1)  # this calls build_item_xml per row
ET.dump(items)

More on reddit.com
r/learnpython
January 3, 2021
Reddit
reddit.com › r/learnpython › parsing xml into a pandas dataframe
r/learnpython on Reddit: Parsing XML into a Pandas dataframe
December 9, 2022

I am trying to parse an XML file into a Pandas DataFrame. It's a nicely formatted file that's not very deep, but whenever I work with XML it's like my brain goes blank and I never can remember all the goofy intricacies of dealing with it.

The file looks roughly like this

<?xml version="1.0" encoding="utf-8"?>
<diagnosticsLog type="db-profile" startDate="11/14/2022 23:31:12">
  <!--Build 18.0.1.69-->
  <columns>
    <column friendlyName="time" name="time" />
    <column friendlyName="Direction" name="Direction" />
    <column friendlyName="SQL" name="SQL" />
    <column friendlyName="ProcessID" name="ProcessID" />
    <column friendlyName="ThreadID" name="ThreadID" />
    <column friendlyName="TimeSpan" name="TimeSpan" />
    <column friendlyName="User" name="User" />
    <column friendlyName="HTTPSessionID" name="HTTPSessionID" />
    <column friendlyName="HTTPForward" name="HTTPForward" />
    <column friendlyName="SessionID" name="SessionID" />
    <column friendlyName="SessionGUID" name="SessionGUID" />
    <column friendlyName="Datasource" name="Datasource" />
    <column friendlyName="Sequence" name="Sequence" />
    <column friendlyName="LocalSequence" name="LocalSequence" />
    <column friendlyName="Message" name="Message" />
    <column friendlyName="AppPoolName" name="AppPoolName" />
  </columns>
  <rows>
    <row>
      <col name="time">11/14/2022 23:31:12</col>
      <col name="TimeSpan">0 ms</col>
      <col name="ThreadID">0x00000025</col>
      <col name="User">USERNAME</col>
      <col name="HTTPSessionID"></col>
      <col name="HTTPForward">20.186.0.0</col>
      <col name="SessionGUID">e4e51b-a64d-4b7b-9bfe-9612dd22b6cc</col>
      <col name="SessionID">6096783</col>
      <col name="Datasource">datasourceName</col>
      <col name="AppPoolName">C 1801AppServer Ext</col>
      <col name="Direction">Out</col>
      <col name="sql">UPDATE SET </col>
      <col name="Sequence">236419</col>
      <col name="LocalSequence">103825</col>
    </row>
    <row>
      <col name="time">11/14/2022 23:31:12</col>
      <col name="TimeSpan">N/A</col>
      <col name="ThreadID">0x00000025</col>
      <col name="User">USERNAME</col>
      <col name="HTTPSessionID"></col>
      <col name="HTTPForward">20.186.0.0</col>
      <col name="SessionGUID">e491b-a64d-4b7b-9bfe-9612dd22b6cc</col>
      <col name="SessionID">6096783</col>
      <col name="Datasource">datasourceName</col>
      <col name="AppPoolName">C 1801AppServer Ext</col>
      <col name="Direction">In</col>
      <col name="sql">UPDATE SET</col>
      <col name="Sequence">236420</col>
      <col name="LocalSequence">103826</col>
    </row>
  </rows>
</diagnosticsLog>

I want to convert that to the column names being the columns and each row being a row. I'm at a loss on how to do this.
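One possible way to get there with only the standard library is sketched below on a trimmed-down, made-up sample with the same layout as the log above. The function name log_to_rows is hypothetical, and matching column names case-insensitively is an assumption made because the sample rows use name="sql" while the columns section declares "SQL".

```python
import xml.etree.ElementTree as ET

# Trimmed-down, made-up sample with the same layout as the diagnostics log.
SAMPLE = """<?xml version="1.0" encoding="utf-8"?>
<diagnosticsLog type="db-profile">
  <columns>
    <column friendlyName="time" name="time" />
    <column friendlyName="SQL" name="SQL" />
    <column friendlyName="Message" name="Message" />
  </columns>
  <rows>
    <row>
      <col name="time">11/14/2022 23:31:12</col>
      <col name="sql">UPDATE SET</col>
    </row>
    <row>
      <col name="time">11/14/2022 23:31:13</col>
      <col name="Message">done</col>
    </row>
  </rows>
</diagnosticsLog>"""


def log_to_rows(root):
    """Return (column names, list of row dicts) for a diagnosticsLog root."""
    declared = list(root.iterfind("columns/column"))
    columns = [c.get("friendlyName") for c in declared]
    # Assumption: match <col name="..."> to declared columns ignoring case.
    canonical = {c.get("name").lower(): c.get("friendlyName") for c in declared}
    rows = []
    for row in root.iterfind("rows/row"):
        record = dict.fromkeys(columns)  # every declared column, None by default
        for col in row.iterfind("col"):
            key = col.get("name", "")
            record[canonical.get(key.lower(), key)] = col.text
        rows.append(record)
    return columns, rows


# Parsing from bytes lets the parser honor the encoding declaration.
root = ET.fromstring(SAMPLE.encode("utf-8"))
columns, rows = log_to_rows(root)
```

From there, pd.DataFrame(rows, columns=columns) should give one DataFrame row per row element, with missing cells left empty.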

Medium
medium.com › @robertopreste › from-xml-to-pandas-dataframes-9292980b1c1c
From XML to Pandas dataframes. How to parse XML files to obtain proper… | by Roberto Preste | Medium
August 25, 2019 - In order to get the name attribute, we use the attrib.get() function, while the text content of each child element can be retrieved using the find() or findtext() function of nodes. Each iteration returns a set of data that can be thought of as an observation in a pandas DataFrame; we can build this procedure as follows:

import pandas as pd
import xml.etree.ElementTree as et

xtree = et.parse("students.xml")
xroot = xtree.getroot()

df_cols = ["name", "email", "grade", "age"]
rows = []

for node in xroot:
    s_name = node.attrib.get("name")
    # findtext() returns None when the child element is missing
    s_mail = node.findtext("email")
    s_grade = node.findtext("grade")
    s_age = node.findtext("age")
    rows.append({"name": s_name, "email": s_mail, "grade": s_grade, "age": s_age})

out_df = pd.DataFrame(rows, columns=df_cols)
Saturn Cloud
saturncloud.io › blog › converting-xml-to-python-dataframe-a-comprehensive-guide
Converting XML to Python DataFrame: A Guide | Saturn Cloud Blog
November 15, 2023 - Converting XML to a Python DataFrame can be a bit tricky, but with the right approach, it becomes a straightforward task. This guide has shown you how to parse an XML file, extract the necessary data, and convert it into a DataFrame using pandas.
Pandas
pandas.pydata.org › pandas-docs › stable › reference › api › pandas.DataFrame.to_xml.html
pandas.DataFrame.to_xml - pandas 3.0.1 documentation
The name of row element in XML document. ... Missing data representation. ... List of columns to write as attributes in row element. Hierarchical columns will be flattened with underscore delimiting the different levels. ... List of columns to write as children in row element.
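A small sketch of how the parameters listed above fit together, assuming a made-up three-column frame. The element names items and item are arbitrary choices, and parser="etree" is used only to avoid the lxml dependency of the default parser.

```python
import xml.etree.ElementTree as ET

import pandas as pd

# Hypothetical data: "id" is written as an attribute, the rest as child elements.
df = pd.DataFrame({"id": [1, 2], "name": ["cat", "dog"], "price": [2.52, 3.71]})

xml_text = df.to_xml(
    index=False,                   # do not write the DataFrame index
    root_name="items",             # name of the root element
    row_name="item",               # name of each row element
    attr_cols=["id"],              # columns written as attributes of the row element
    elem_cols=["name", "price"],   # columns written as children of the row element
    parser="etree",                # standard-library parser instead of lxml
    xml_declaration=False,
)

# Parse the result back to inspect its structure.
root = ET.fromstring(xml_text)
```

With no path given, to_xml returns the document as a string, which is what gets parsed back here.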
Pandas
pandas.pydata.org › pandas-docs › version › 1.4 › reference › api › pandas.read_xml.html
pandas.read_xml - pandas 1.4.4 documentation
String, path object (implementing os.PathLike[str]), or file-like object implementing a read() function. The string can be any valid XML string or a path. The string can further be a URL. Valid URL schemes include http, ftp, s3, and file. ... The XPath to parse required set of nodes for migration to DataFrame.
YouTube
youtube.com › watch
Transforming Nested XML to Pandas DataFrame - YouTube
Hello and welcome to this tutorial. In this tutorial, you will learn how to transform XML documents to pandas data frames using Python and the element tree l...
Published September 7, 2025
GeeksforGeeks
geeksforgeeks.org › python › how-to-create-pandas-dataframe-from-nested-xml
How to create Pandas DataFrame from nested XML? - GeeksforGeeks
July 23, 2025 - Collect each food item together with its specifications as one unit and append it to a list (here, the all_items list). Convert the list into a DataFrame using the pandas.DataFrame() function, passing the column names as quoted strings separated by commas.
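The steps described above might look roughly like the following standard-library sketch. The menu document, the COLUMNS list and the flatten helper are all assumptions made for illustration; the article's own XML and code are not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested document: <calories> sits one level deeper than the rest.
MENU = """<menu>
  <food>
    <name>Waffles</name>
    <price>5.95</price>
    <nutrition><calories>650</calories></nutrition>
  </food>
  <food>
    <name>Toast</name>
    <price>4.50</price>
    <nutrition><calories>600</calories></nutrition>
  </food>
</menu>"""

COLUMNS = ["name", "price", "calories"]


def flatten(elem):
    """Map the tag of every leaf element under elem to its text."""
    return {leaf.tag: leaf.text for leaf in elem.iter() if len(leaf) == 0}


all_items = []
for food in ET.fromstring(MENU).iter("food"):
    flat = flatten(food)
    # One list per food item, ordered like COLUMNS; missing leaves become None.
    all_items.append([flat.get(col) for col in COLUMNS])
```

Passing the result to pd.DataFrame(all_items, columns=COLUMNS) should then give one row per food item. Note that this flattening assumes leaf tag names are unique within each item.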
Pandas
pandas.pydata.org › docs › reference › api › pandas.DataFrame.to_xml.html
pandas.DataFrame.to_xml - pandas documentation - PyData
The name of row element in XML document. ... Missing data representation. ... List of columns to write as attributes in row element. Hierarchical columns will be flattened with underscore delimiting the different levels. ... List of columns to write as children in row element.
Top answer
1 of 4
36

You can create a function that creates the item node from a row in your DataFrame:

def func(row):
    xml = ['<item>']
    for field in row.index:
        xml.append('  <field name="{0}">{1}</field>'.format(field, row[field]))
    xml.append('</item>')
    return '\n'.join(xml)

And then apply the function along the axis=1.

>>> print('\n'.join(df.apply(func, axis=1)))
<item>
  <field name="field_1">cat</field>
  <field name="field_2">15,263</field>
  <field name="field_3">2.52</field>
  <field name="field_4">00:03:00</field>
</item>
<item>
  <field name="field_1">dog</field>
  <field name="field_2">1,652</field>
  <field name="field_3">3.71</field>
  <field name="field_4">00:03:47</field>
</item>
...
2 of 4
25

To expand on Viktor's excellent answer (and tweaking it slightly to work with duplicate columns), you could set this up as a to_xml DataFrame method:

def to_xml(df, filename=None, mode='w'):
    def row_to_xml(row):
        xml = ['<item>']
        for i, col_name in enumerate(row.index):
            xml.append('  <field name="{0}">{1}</field>'.format(col_name, row.iloc[i]))
        xml.append('</item>')
        return '\n'.join(xml)
    res = '\n'.join(df.apply(row_to_xml, axis=1))

    if filename is None:
        return res
    with open(filename, mode) as f:
        f.write(res)

pd.DataFrame.to_xml = to_xml

Then you can print the xml:

In [21]: print(df.to_xml())
<item>
  <field name="field_1">cat</field>
  <field name="field_2">15,263</field>
  <field name="field_3">2.52</field>
  <field name="field_4">00:03:00</field>
</item>
<item>
...

or save it to a file:

In [22]: df.to_xml('foo.xml')

Obviously this example should be tweaked to fit your xml standard.
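One tweak worth considering: the string-formatting approach above does not escape characters such as & or < in the data. A minimal sketch of the same item/field layout built with ElementTree, which escapes text on serialization, is shown below. The function name rows_to_xml and its parameters are hypothetical, and the input is a plain list of dicts standing in for df.to_dict("records").

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows, root_tag="items", row_tag="item"):
    """Build <item><field name="...">value</field>...</item> for each row dict."""
    root = ET.Element(root_tag)
    for row in rows:
        item = ET.SubElement(root, row_tag)
        for name, value in row.items():
            field = ET.SubElement(item, "field", name=str(name))
            field.text = str(value)  # ElementTree escapes &, < and > when serializing
    ET.indent(root)  # pretty-print in place; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


# Stand-in for df.to_dict("records"); the second row needs escaping.
rows = [
    {"field_1": "cat", "field_2": "15,263"},
    {"field_1": "R&D <test>", "field_2": "312"},
]
xml_text = rows_to_xml(rows)
```

Unlike the snippets above, this wraps the items in a single root element, so the output is a well-formed document that can be parsed back.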

Stack Abuse
stackabuse.com › reading-and-writing-xml-files-in-python-with-pandas
Reading and Writing XML Files in Python with Pandas
August 21, 2024 - To get the root element, we will use getroot() on the parsed XML data. Now we can loop through the children elements of the root node and write them into a Python list. Like before, we'll create a DataFrame using the data list, and transpose the DataFrame. Let's look at the code to create a Pandas DataFrame using lxml:
PyPI
pypi.org › project › pandas-read-xml
pandas-read-xml
Pandas
pandas.pydata.org › pandas-docs › stable › reference › api › pandas.read_xml.html
pandas.read_xml - pandas 2.2.2 documentation - PyData
April 28, 2021 - Deprecated since version 2.1.0: Passing xml literal strings is deprecated. Wrap literal xml input in io.StringIO or io.BytesIO instead. ... The XPath to parse required set of nodes for migration to DataFrame. XPath should return a collection of elements and not a single element.
pandas
pandas.pydata.org › pandas-docs › dev › reference › api › pandas.read_xml.html
pandas.read_xml - pandas 3.0.0rc1+103.gaf9e3f0ca6 documentation
3 weeks ago - The XPath to parse required set of nodes for migration to DataFrame. XPath should return a collection of elements and not a single element. Note: The etree parser supports limited XPath expressions. For more complex XPath, use lxml which requires installation. ... The namespaces defined in XML document as dicts with key being namespace prefix and value the URI.
Blogger
timhomelab.blogspot.com › 2014 › 01 › how-to-read-xml-file-into-dataframe.html
lab notebook: How to read XML file into pandas dataframe using lxml
January 22, 2014 - The loop adds one Series per record (the name of the Series object serves as the index label when the object is added to the DataFrame). And here is our fresh dataframe: ... The complete listing:

from lxml import objectify
import pandas as pd

path = 'file_path'
xml = objectify.parse(open(path))
root = xml.getroot()

df = pd.DataFrame(columns=('id', 'name'))

for i in range(0, 4):
    obj = root.getchildren()[i].getchildren()
    row = dict(zip(['id', 'name'], [obj[0].text, obj[1].text]))
    row_s = pd.Series(row)
    row_s.name = i
    # DataFrame.append was removed in pandas 2.0; concatenate a one-row frame instead
    df = pd.concat([df, row_s.to_frame().T])
Saturn Cloud
saturncloud.io › blog › converting-complex-xml-files-to-pandas-dataframecsv-in-python
Converting Complex XML Files to Pandas DataFrame/CSV in Python | Saturn Cloud Blog
December 28, 2023 - By understanding these common errors and implementing appropriate safeguards, you can enhance the robustness and reliability of your XML to DataFrame conversion process. With this script, you can easily convert any complex XML file into a Pandas DataFrame or CSV file.
Table Convert
tableconvert.com › home › convert xml to pandas dataframe online
Convert XML to Pandas DataFrame Online - Table Convert
January 21, 2025 - Convert XML to PandasDataFrame online with our free online table converter. XML to PandasDataFrame converter: convert XML to PandasDataFrame in seconds: paste, edit, and download PandasDataFrame. Need to convert XML to PandasDataFrame for an API, spreadsheet, or documentation?