convert large xml to json python

stackoverflow.com › questions › 191536 › converting-xml-to-json-using-python

xmltodict (full disclosure: I wrote it) can help you convert your XML to a dict+list+string structure, following this "standard". It is Expat-based, so it's very fast and doesn't need to load the whole XML tree in memory.

Once you have that data structure, you can serialize it to JSON:

import xmltodict, json

o = xmltodict.parse('<e> <a>text</a> <a>text</a> </e>')
json.dumps(o) # '{"e": {"a": ["text", "text"]}}'

Answer from Martin Blech on Stack Overflow

Stack Overflow

stackoverflow.com › questions › 191536 › converting-xml-to-json-using-python

Converting XML to JSON using Python? - Stack Overflow

Top answer

1 of 16

400

Once you have that data structure, you can serialize it to JSON:

import xmltodict, json

o = xmltodict.parse('<e> <a>text</a> <a>text</a> </e>')
json.dumps(o) # '{"e": {"a": ["text", "text"]}}'

2 of 16

There is no "one-to-one" mapping between XML and JSON, so converting one to the other necessarily requires some understanding of what you want to do with the results.

That being said, Python's standard library has several modules for parsing XML (including DOM, SAX, and ElementTree). As of Python 2.6, support for converting Python data structures to and from JSON is included in the json module.

So the infrastructure is there.

Stack Overflow

stackoverflow.com › questions › 19286118 › python-convert-very-large-6-4gb-xml-files-to-json

Python - Convert Very Large (6.4GB) XML files to JSON - Stack Overflow

Top answer

1 of 3

Many Python XML libraries support parsing XML sub elements incrementally, e.g. xml.etree.ElementTree.iterparse and xml.sax.parse in the standard library. These functions are usually called "XML Stream Parser".

The xmltodict library you used also has a streaming mode. I think it may solve your problem

https://github.com/martinblech/xmltodict#streaming-mode

2 of 3

Instead of trying to read the file in one go and then process it, you want to read it in chunks and process each chunk as it's loaded. This is a fairly common situation when processing large XML files and is covered by the Simple API for XML (SAX) standard, which specifies a callback API for parsing XML streams - it's part of the Python standard library under xml.sax.parse and xml.etree.ETree as mentioned above.

Here's a quick XML to JSON converter:

from collections import defaultdict
import json
import xml.etree.ElementTree as ET

def parse_xml(file_name):
    events = ("start", "end")
    context = ET.iterparse(file_name, events=events)

    return pt(context)

def pt(context, cur_elem=None):
    items = defaultdict(list)

    if cur_elem:
        items.update(cur_elem.attrib)

    text = ""

    for action, elem in context:
        # print("{0:>6} : {1:20} {2:20} '{3}'".format(action, elem.tag, elem.attrib, str(elem.text).strip()))

        if action == "start":
            items[elem.tag].append(pt(context, elem))
        elif action == "end":
            text = elem.text.strip() if elem.text else ""
            elem.clear()
            break

    if len(items) == 0:
        return text

    return { k: v[0] if len(v) == 1 else v for k, v in items.items() }

if __name__ == "__main__":
    json_data = parse_xml("large.xml")
    print(json.dumps(json_data, indent=2))

If you're looking at a lot of XML processing check out the lxml library, it's got a ton of useful stuff over and above the standard modules, while also being much easier to use.

http://lxml.de/tutorial.html

Discussions

Convert JSON to XML in Python - Stack Overflow

I see a number of questions on SO asking about ways to convert XML to JSON, but I'm interested in going the other way. Is there a python library for converting JSON to XML? Edit: Nothing came back ... More on stackoverflow.com

stackoverflow.com

How to convert XML to JSON in Python? - Stack Overflow

Possible Duplicate: Converting XML to JSON using Python? I'm doing some work on App Engine and I need to convert an XML document being retrieved from a remote server into an equivalent JSON obj... More on stackoverflow.com

stackoverflow.com

It's 2025. What's your favorite module or method for converting xml to json and vice versa?

I hate working with XML. I recently had to parse some responses from namecheap and just used ElementTree (xml.etree.ElementTree). I didn't find any recent XML to JSON modules, and my need was minimal, so just stuck to ET. I would hate to have to work with XML regularly! More on reddit.com

r/learnpython

March 18, 2025

Convert XML to Json

if you can at least use the standard python libraries you can parse the xml with the builtin ElementTree , it's not as good as lxml but it should work More on reddit.com

r/sysadmin

September 6, 2024

Videos