xmltodict (full disclosure: I wrote it) can help you convert your XML to a dict+list+string structure, following this "standard". It is Expat-based, so it's very fast and doesn't need to load the whole XML tree in memory.
Once you have that data structure, you can serialize it to JSON:
import xmltodict, json
o = xmltodict.parse('<e> <a>text</a> <a>text</a> </e>')
json.dumps(o) # '{"e": {"a": ["text", "text"]}}'
Answer from Martin Blech on Stack Overflowxmltodict (full disclosure: I wrote it) can help you convert your XML to a dict+list+string structure, following this "standard". It is Expat-based, so it's very fast and doesn't need to load the whole XML tree in memory.
Once you have that data structure, you can serialize it to JSON:
import xmltodict, json
o = xmltodict.parse('<e> <a>text</a> <a>text</a> </e>')
json.dumps(o) # '{"e": {"a": ["text", "text"]}}'
There is no "one-to-one" mapping between XML and JSON, so converting one to the other necessarily requires some understanding of what you want to do with the results.
That being said, Python's standard library has several modules for parsing XML (including DOM, SAX, and ElementTree). As of Python 2.6, support for converting Python data structures to and from JSON is included in the json module.
So the infrastructure is there.
» pip install xmljson
Videos
xmltodict (full disclosure: I wrote it) can help you convert your XML to a dict+list+string structure, following this "standard". It is Expat-based, so it's very fast and doesn't need to load the whole XML tree in memory.
Once you have that data structure, you can serialize it to JSON:
import xmltodict, json
o = xmltodict.parse('<e> <a>text</a> <a>text</a> </e>')
json.dumps(o) # '{"e": {"a": ["text", "text"]}}'
Soviut's advice for lxml objectify is good. With a specially subclassed simplejson, you can turn an lxml objectify result into json.
import simplejson as json
import lxml
class objectJSONEncoder(json.JSONEncoder):
"""A specialized JSON encoder that can handle simple lxml objectify types
>>> from lxml import objectify
>>> obj = objectify.fromstring("<Book><price>1.50</price><author>W. Shakespeare</author></Book>")
>>> objectJSONEncoder().encode(obj)
'{"price": 1.5, "author": "W. Shakespeare"}'
"""
def default(self,o):
if isinstance(o, lxml.objectify.IntElement):
return int(o)
if isinstance(o, lxml.objectify.NumberElement) or isinstance(o, lxml.objectify.FloatElement):
return float(o)
if isinstance(o, lxml.objectify.ObjectifiedDataElement):
return str(o)
if hasattr(o, '__dict__'):
#For objects with a __dict__, return the encoding of the __dict__
return o.__dict__
return json.JSONEncoder.default(self, o)
See the docstring for example of usage, essentially you pass the result of lxml objectify to the encode method of an instance of objectJSONEncoder
Note that Koen's point is very valid here, the solution above only works for simply nested xml and doesn't include the name of root elements. This could be fixed.
I've included this class in a gist here: http://gist.github.com/345559
Nothing came back right away, so I went ahead and wrote a script that solves this problem.
Python already allows you to convert from JSON into a native dict (using json or, in versions < 2.6, simplejson), so I wrote a library that converts native dicts into an XML string.
https://github.com/quandyfactory/dict2xml
It supports int, float, boolean, string (and unicode), array and dict data types and arbitrary nesting (yay recursion).
If you don't have such a package, you can try:
def json2xml(json_obj, line_padding=""):
result_list = list()
json_obj_type = type(json_obj)
if json_obj_type is list:
for sub_elem in json_obj:
result_list.append(json2xml(sub_elem, line_padding))
return "\n".join(result_list)
if json_obj_type is dict:
for tag_name in json_obj:
sub_obj = json_obj[tag_name]
result_list.append("%s<%s>" % (line_padding, tag_name))
result_list.append(json2xml(sub_obj, "\t" + line_padding))
result_list.append("%s</%s>" % (line_padding, tag_name))
return "\n".join(result_list)
return "%s%s" % (line_padding, json_obj)
For example:
s='{"main" : {"aaa" : "10", "bbb" : [1,2,3]}}'
j = json.loads(s)
print(json2xml(j))
Result:
<main>
<aaa>
10
</aaa>
<bbb>
1
2
3
</bbb>
</main>