If you want two objects with the same elements but in a different order to compare equal, then the obvious thing to do is compare sorted copies of them - for instance, for the dictionaries represented by your JSON strings a and b:
import json
a = json.loads("""
{
"errors": [
{"error": "invalid", "field": "email"},
{"error": "required", "field": "name"}
],
"success": false
}
""")
b = json.loads("""
{
"success": false,
"errors": [
{"error": "required", "field": "name"},
{"error": "invalid", "field": "email"}
]
}
""")
>>> sorted(a.items()) == sorted(b.items())
False
... but that doesn't work, because in each case, the "errors" item of the top-level dict is a list with the same elements in a different order, and sorted() doesn't try to sort anything except the "top" level of an iterable.
To fix that, we can define an ordered function which will recursively sort any lists it finds (and convert dictionaries to lists of (key, value) pairs so that they're orderable):
def ordered(obj):
if isinstance(obj, dict):
return sorted((k, ordered(v)) for k, v in obj.items())
if isinstance(obj, list):
return sorted(ordered(x) for x in obj)
else:
return obj
If we apply this function to a and b, the results compare equal:
>>> ordered(a) == ordered(b)
True
Answer from Zero Piraeus on Stack OverflowIf you want two objects with the same elements but in a different order to compare equal, then the obvious thing to do is compare sorted copies of them - for instance, for the dictionaries represented by your JSON strings a and b:
import json
a = json.loads("""
{
"errors": [
{"error": "invalid", "field": "email"},
{"error": "required", "field": "name"}
],
"success": false
}
""")
b = json.loads("""
{
"success": false,
"errors": [
{"error": "required", "field": "name"},
{"error": "invalid", "field": "email"}
]
}
""")
>>> sorted(a.items()) == sorted(b.items())
False
... but that doesn't work, because in each case, the "errors" item of the top-level dict is a list with the same elements in a different order, and sorted() doesn't try to sort anything except the "top" level of an iterable.
To fix that, we can define an ordered function which will recursively sort any lists it finds (and convert dictionaries to lists of (key, value) pairs so that they're orderable):
def ordered(obj):
if isinstance(obj, dict):
return sorted((k, ordered(v)) for k, v in obj.items())
if isinstance(obj, list):
return sorted(ordered(x) for x in obj)
else:
return obj
If we apply this function to a and b, the results compare equal:
>>> ordered(a) == ordered(b)
True
Another way could be to use json.dumps(X, sort_keys=True) option:
import json
a, b = json.dumps(a, sort_keys=True), json.dumps(b, sort_keys=True)
a == b # a normal string comparison
This works for nested dictionaries and lists.
» pip install jsoncomparison
Videos
I need to compare two JSON files that contain a list of dictionaries whose basic format is:
[{"protocol": "S", "type": "", "network": "0.0.0.0", "mask": "0", "distance": "254", "metric": "0", "nexthop_ip": "192.168.122.1", "nexthop_if": "", "uptime": ""}, {"protocol": "O", "type": "", "network": "10.129.30.0", "mask": "24", "distance": "110", "metric": "2", "nexthop_ip": "172.20.10.1", "nexthop_if": "GigabitEthernet0/1", "uptime": "08:58:25"}]Though there are many, many more dictionary items in the list than that shown above. I am not quite sure how best to go about comparing the files to spot differences and return or save those difference to another file, preferably in JSON format, though a CSV would be fine too.
The one gotcha that there may be is I need to exclude, at a minimum, the uptime value as it is constantly changing so it will of course trigger anything looking for changes. Can anyone help get me started please?
» pip install json-diff
» pip install json-files-compare
I have a Python lambda function downloading a large excel file and converting it to JSON.
This file will be downloaded at least once a day (as the data can change)
I need to push the changed/updated data to an API.
Is there a way for me to compare two JSON files and output the diff?
It would be perfect if it would output multiple arrays of objects.
1 array of objects that have changed (I don’t care what has changed, just need to know that it has)
1 array of removed/deleted objects.