As of Python 3.5, you can merge two dicts with:
merged = {**dictA, **dictB}
(https://www.python.org/dev/peps/pep-0448/)
So:
jsonMerged = {**json.loads(jsonStringA), **json.loads(jsonStringB)}
asString = json.dumps(jsonMerged)
etc.
EDIT 2 Nov 2024: Since Python 3.9 you can also write merged = dictA | dictB (PEP 584). With either form, the value from dictB wins when a key appears in both dicts.
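A minimal sketch of both spellings side by side. The two JSON strings are made up for illustration and are not from the question. Both merges are shallow (nested objects are replaced, not combined), and the | form assumes Python 3.9 or later.

```python
import json

# Hypothetical inputs, for illustration only.
json_string_a = '{"name": "alpha", "retries": 1}'
json_string_b = '{"retries": 3, "verbose": true}'

dict_a = json.loads(json_string_a)
dict_b = json.loads(json_string_b)

# Python 3.5+: unpack both dicts into a new dict literal.
merged_unpack = {**dict_a, **dict_b}

# Python 3.9+: dict union operator; the right-hand operand wins on collisions.
merged_union = dict_a | dict_b

as_string = json.dumps(merged_union)
print(as_string)  # {"name": "alpha", "retries": 3, "verbose": true}
```

Neither form modifies dict_a or dict_b; each builds a new dict.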
» pip install jsonmerge
Assuming a and b are the dictionaries you want to merge:
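# In Python 3, items() returns views that cannot be added together, so convert them to lists first
c = {key: value for (key, value) in (list(a.items()) + list(b.items()))}
To convert your JSON string to a Python dictionary, use the following: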
import json
my_dict = json.loads(json_str)
Update: full code using strings:
# test cases for jsonStringA and jsonStringB according to your data input
jsonStringA = '{"error_1395946244342":"valueA","error_1395952003":"valueB"}'
jsonStringB = '{"error_%d":"Error Occurred on machine %s in datacenter %s on the %s of process %s"}' % (timestamp_number, host_info, local_dc, step, c)
# now we have two json STRINGS
import json
dictA = json.loads(jsonStringA)
dictB = json.loads(jsonStringB)
# list(...) is needed in Python 3, where items() returns views rather than lists
merged_dict = {key: value for (key, value) in (list(dictA.items()) + list(dictB.items()))}
# string dump of the merged dict
jsonString_merged = json.dumps(merged_dict)
But I have to say that in general what you are trying to do is not the best practice. Please read a bit on python dictionaries.
Alternative solution:
jsonStringA = get_my_value_as_string_from_somewhere()
errors_dict = json.loads(jsonStringA)
new_error_str = "Error Occurred in datacenter %s blah for step %s blah" % (datacenter, step)
new_error_key = "error_%d" % (timestamp_number)
errors_dict[new_error_key] = new_error_str
# and if I want to export it somewhere I use the following
write_my_dict_to_a_file_as_string(json.dumps(errors_dict))
In fact, you can avoid all of this if you just use an array (a Python list) to hold all your errors.
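A rough sketch of the array idea, assuming each error is stored as a small object with a timestamp and a message. The helper name add_error and the field names are hypothetical, chosen only for this example.

```python
import json
import time

def add_error(errors_json, message, timestamp=None):
    """Append one error record to a JSON array string and return the new string."""
    errors = json.loads(errors_json) if errors_json else []
    if timestamp is None:
        timestamp = int(time.time())
    errors.append({"timestamp": timestamp, "message": message})
    return json.dumps(errors)

# Illustrative values only.
errors_json = add_error("[]", "Error occurred in datacenter dc1 for step 2", timestamp=1395946244)
errors_json = add_error(errors_json, "Error occurred in datacenter dc2 for step 5", timestamp=1395952003)
print(errors_json)
```

With a list there are no generated keys such as error_<timestamp> to build or parse, and two errors with the same timestamp cannot overwrite each other.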
Hello everyone,
I am trying to merge two JSON files, but I couldn't find any quick package that can do this. One file contains the base policy, while the other includes additional files for excluding special configurations.
My goal is to merge these two JSON files of AntiVirus policy, which contain arrays and numerous elements, without overwriting any data. I was wondering what the best approach would be to accomplish this.
If a key holds a single element, just use the value from the other file.
If it holds an array, just append the new elements.
What is the best way to achieve this goal?
Thanks all
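One possible way to express those two rules is a small recursive merge. This is only a sketch: the function name deep_merge and the sample policy keys are invented for illustration, and it appends array items without removing duplicates.

```python
import copy

def deep_merge(base, override):
    """Return a new structure combining base and override.

    dict + dict  -> merged key by key, recursively
    list + list  -> items of override appended after items of base
    anything else -> the value from override is used
    """
    if isinstance(base, dict) and isinstance(override, dict):
        merged = {key: copy.deepcopy(value) for key, value in base.items()}
        for key, value in override.items():
            if key in merged:
                merged[key] = deep_merge(merged[key], value)
            else:
                merged[key] = copy.deepcopy(value)
        return merged
    if isinstance(base, list) and isinstance(override, list):
        return copy.deepcopy(base) + copy.deepcopy(override)
    return copy.deepcopy(override)

# Hypothetical policy fragments, not taken from a real product.
base_policy = {"scan": {"enabled": True, "exclusions": ["C:\\Temp"]}, "level": "normal"}
extra_policy = {"scan": {"exclusions": ["D:\\Build"]}, "level": "strict"}

merged_policy = deep_merge(base_policy, extra_policy)
print(merged_policy)
```

To use it on files, load each one with json.load, call deep_merge on the two results, and write the outcome with json.dump. The inputs are left unchanged because every value is copied.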
- First off, if you want reusability, turn this into a function. The function should take its respective arguments.
- Secondly, instead of allocating a variable to store all of the JSON data to write, I'd recommend directly writing the contents of each of the files directly to the merged file. This will help prevent issues with memory.
- Finally, I just have a few nitpicky tips on your variable naming. Preferably, head should have a name more along the lines of merged_files, and you shouldn't be using f as an iterator variable. Something like json_file would be better.
This is essentially alexwlchan's comment spelled out:
Parsing and serializing JSON doesn't come for free, so you may want to avoid it. I think you can just output "[", the first file, ",", the second file etc., "]" and call it a day. If all inputs are valid JSON, unless I'm terribly mistaken, this should also be valid JSON.
In code, version 1:
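def cat_json(outfile, infiles):
    # file() only exists in Python 2; open() works in both Python 2 and 3
    with open(outfile, "w") as out:
        out.write("[%s]" % (",".join([mangle(open(f).read()) for f in infiles])))

def mangle(s):
    # Drop the outer brackets, so each input file is assumed to hold a JSON array
    return s.strip()[1:-1]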
Version 2:
def cat_json(output_filename, input_filenames):
    with open(output_filename, "w") as outfile:
        # Write the opening bracket up front so an empty input list still yields "[]"
        outfile.write('[')
        first = True
        for infile_name in input_filenames:
            with open(infile_name) as infile:
                if first:
                    first = False
                else:
                    outfile.write(',')
                outfile.write(mangle(infile.read()))
        outfile.write(']')
The second version has a few advantages: its memory requirements should be something like the size of the longest input file, whereas the first requires twice the sum of all file sizes. The number of simultaneously open file handles is also smaller, so it should work for any number of files.
By using with, it also closes each file handle deterministically (and immediately!) upon leaving each with block, even in Python implementations with non-immediate garbage collection (such as PyPy and Jython).
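If the goal is the literal "[", first file, ",", second file, "]" output described above, without stripping any brackets, a streaming variant could look like the following sketch. The function name wrap_json_files is made up for this example, and it assumes every input is valid UTF-8 JSON without a byte order mark. Each file becomes one element of the resulting array.

```python
import shutil

def wrap_json_files(output_filename, input_filenames):
    """Write a JSON array whose elements are the raw contents of each input file."""
    with open(output_filename, "w", encoding="utf-8") as outfile:
        outfile.write("[")
        for index, infile_name in enumerate(input_filenames):
            if index:
                outfile.write(",")
            with open(infile_name, encoding="utf-8") as infile:
                # Copy in chunks, so no input file is held in memory as a whole.
                shutil.copyfileobj(infile, outfile)
        outfile.write("]")
```

Because shutil.copyfileobj copies in fixed-size chunks, memory use stays roughly constant regardless of file size, and only two file handles are open at any time.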
Hi there, I am new to Python and I am trying to merge approx. 350 JSON files into one. Any ideas how I can do that? Thanks for any advice!
You can't do it once they're in JSON format - JSON is just text. You need to combine them in Python first:
data = { 'obj1' : obj1, 'obj2' : obj2 }
json.dumps(data)
Not sure if I'm missing something, but I think this works (tested in python 2.5) with the output you specify:
import simplejson
finalObj = { 'obj1': obj1, 'obj2': obj2 }
simplejson.dumps(finalObj)
If your records are stored one per line in a text file, I would recommend opening the file, parsing each line, and adding the records to a dict, which you can later dump with the built-in json library.
import json

data = {'records': []}
with open("data.txt", 'r') as f:
    lines = f.readlines()
    for line in lines:
        data['records'].append(json.loads(line))
print(json.dumps(data))
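A slightly more defensive sketch of the same idea, which skips blank lines and reports the line number of any record that fails to parse. The function names read_json_lines and merge_json_lines are hypothetical.

```python
import json

def read_json_lines(path):
    """Yield one parsed record per non-empty line of a newline-delimited JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        for line_number, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                yield json.loads(line)
            except json.JSONDecodeError as exc:
                raise ValueError(f"{path}:{line_number}: invalid JSON: {exc}") from exc

def merge_json_lines(path):
    """Collect all records of the file under a single 'records' key."""
    return {"records": list(read_json_lines(path))}
```

Calling json.dumps(merge_json_lines("data.txt")) then gives the same shape as the snippet above.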
I would do it the following way. Let the content of file.txt be
{"eventVersion":"1.08","userIdentity":{"type":"AssumedRole","principalId":"AA:i-096379450e69ed082","arn":"arn:aws:sts::34502sdsdsd:assumed-role/RDSAccessRole/i-096379450e69ed082","accountId":"34502sdsdsd","accessKeyId":"ASIAVAVKXAXXXXXXXC","sessionContext":{"sessionIssuer":{"type":"Role","principalId":"AROAVAVKXAKDDDDD","arn":"arn:aws:iam::3450291sdsdsd:role/RDSAccessRole","accountId":"345029asasas","userName":"RDSAccessRole"},"webIdFederationData":{},"attributes":{"mfaAuthenticated":"false","creationDate":"2021-04-27T04:38:52Z"},"ec2RoleDelivery":"2.0"}},"eventTime":"2021-04-27T07:24:20Z","eventSource":"ssm.amazonaws.com","eventName":"ListInstanceAssociations","awsRegion":"us-east-1","sourceIPAddress":"188.208.227.188","userAgent":"aws-sdk-go/1.25.41 (go1.13.15; linux; amd64) amazon-ssm-agent/","requestParameters":{"instanceId":"i-096379450e69ed082","maxResults":20},"responseElements":null,"requestID":"a5c63b9d-aaed-4a3c-9b7d-a4f7c6b774ab","eventID":"70de51df-c6df-4a57-8c1e-0ffdeb5ac29d","readOnly":true,"resources":[{"accountId":"34502914asasas","ARN":"arn:aws:ec2:us-east-1:3450291asasas:instance/i-096379450e69ed082"}],"eventType":"AwsApiCall","managementEvent":true,"eventCategory":"Management","recipientAccountId":"345029149342"}
{"eventVersion":"1.08","userIdentity":{"type":"AssumedRole","principalId":"AROAVAVKXAKPKZ25XXXX:AmazonMWAA-airflow","arn":"arn:aws:sts::3450291asasas:assumed-role/dev-1xdcfd/AmazonMWAA-airflow","accountId":"34502asasas","accessKeyId":"ASIAVAVKXAXXXXXXX","sessionContext":{"sessionIssuer":{"type":"Role","principalId":"AROAVAVKXAKPKZXXXXX","arn":"arn:aws:iam::345029asasas:role/service-role/AmazonMWAA-dlp-dev-1xdcfd","accountId":"3450291asasas","userName":"dlp-dev-1xdcfd"},"webIdFederationData":{},"attributes":{"mfaAuthenticated":"false","creationDate":"2021-04-27T07:04:08Z"}},"invokedBy":"airflow.amazonaws.com"},"eventTime":"2021-04-27T07:23:46Z","eventSource":"logs.amazonaws.com","eventName":"CreateLogStream","awsRegion":"us-east-1","sourceIPAddress":"airflow.amazonaws.com","userAgent":"airflow.amazonaws.com","errorCode":"ResourceAlreadyExistsException","errorMessage":"The specified log stream already exists","requestParameters":{"logStreamName":"scheduler.py.log","logGroupName":"dlp-dev-DAGProcessing"},"responseElements":null,"requestID":"40b48ef9-fc4b-4d1a-8fd1-4f2584aff1e9","eventID":"ef608d43-4765-4a3a-9c92-14ef35104697","readOnly":false,"eventType":"AwsApiCall","apiVersion":"20140328","managementEvent":true,"eventCategory":"Management","recipientAccountId":"3450291asasas"}
then
with open('file.txt', 'r') as f:
    jsons = [i.strip() for i in f.readlines()]
with open('total.json', 'w') as f:
    f.write('{"Records":[')
    f.write(','.join(jsons))
    f.write(']}')
This produces total.json in the desired shape, and the result is valid JSON as long as every line of file.txt is valid JSON.
You can't just concatenate two JSON strings to make valid JSON (or combine them by tacking ',\n' to the end of each).
Instead, you could combine the two (as Python objects) into a Python list, then use json.dump to write it to a file as JSON:
import json
import glob

result = []
for f in glob.glob("*.json"):
    # Text mode: in Python 3, json.dump writes str, so binary mode would fail on output
    with open(f, "r") as infile:
        result.append(json.load(infile))
with open("merged_file.json", "w") as outfile:
    json.dump(result, outfile)
If you wanted to do it without the (unnecessary) intermediate step of parsing each JSON file, you could merge them into a list like this:
import glob

read_files = glob.glob("*.json")
# Text mode, so that read() returns str and can be joined and formatted in Python 3
with open("merged_file.json", "w") as outfile:
    outfile.write('[{}]'.format(
        ','.join([open(f, "r").read() for f in read_files])))
There is a module called jsonmerge which merges dictionaries. It can be used very simply by just providing two dictionaries, or you can define schemas that describe how to merge: for example, instead of overwriting the same keys, it can automatically create a list and append to it.
base = {
    "foo": 1,
    "bar": [ "one" ],
}
head = {
    "bar": [ "two" ],
    "baz": "Hello, world!"
}
from jsonmerge import merge
result = merge(base, head)
print(result)
>>> {'foo': 1, 'bar': ['two'], 'baz': 'Hello, world!'}
More examples with complex rules: https://pypi.org/project/jsonmerge/#description
You can do it like this, but the key order of the result is not guaranteed (dicts only preserve insertion order from Python 3.7 onward).
>>> t = [{'ComSMS': 'true'}, {'ComMail': 'true'}, {'PName': 'riyaas'}, {'phone': '1'}]
>>> [{i:j for x in t for i,j in x.items()}]
[{'ComSMS': 'true', 'phone': '1', 'PName': 'riyaas', 'ComMail': 'true'}]
Loop over your list of dicts and update an initially empty dictionary, z in this case.
z.update(i): takes the key:value pairs from i (a dict) and adds them to z.
t = [{'ComSMS': 'true'}, {'ComMail': 'true'}, {'PName': 'riyaas'}, {'phone': '1'}]
z = {}
In [13]: for i in t:
   ....:     z.update(i)
   ....:
In [14]: z
Out[14]: {'ComMail': 'true', 'ComSMS': 'true', 'PName': 'riyaas', 'phone': '1'}
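The same loop wrapped in a function, as a small sketch. The name merge_dicts is hypothetical, and the extra {'phone': '2'} entry is added here only to show what happens when a key repeats: the later dict wins.

```python
def merge_dicts(dicts):
    """Fold a list of dicts into one; later dicts overwrite earlier keys."""
    merged = {}
    for d in dicts:
        merged.update(d)
    return merged

t = [{'ComSMS': 'true'}, {'ComMail': 'true'}, {'PName': 'riyaas'},
     {'phone': '1'}, {'phone': '2'}]  # last entry added for illustration

merged = merge_dicts(t)
print(merged)
# On Python 3.7+: {'ComSMS': 'true', 'ComMail': 'true', 'PName': 'riyaas', 'phone': '2'}
```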
You should use extend instead of append. It adds the items of the passed list to result, instead of adding the list itself as a single nested element:
import json

files=['my.json','files.json',...,'name.json']

def merge_JsonFiles(filename):
    result = list()
    for f1 in filename:
        with open(f1, 'r') as infile:
            result.extend(json.load(infile))
    with open('counseling3.json', 'w') as output_file:
        json.dump(result, output_file)

merge_JsonFiles(files)
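To see the difference without touching any files, here is a tiny in-memory sketch. The two lists stand in for the parsed contents of two JSON files and are invented for this example.

```python
# Stand-ins for json.load() results from two files that each hold a JSON array.
file_a = [{"id": 1}, {"id": 2}]
file_b = [{"id": 3}]

appended = []
extended = []
for records in (file_a, file_b):
    appended.append(records)   # adds the whole list as one nested element
    extended.extend(records)   # adds each item of the list individually

print(appended)  # [[{'id': 1}, {'id': 2}], [{'id': 3}]]
print(extended)  # [{'id': 1}, {'id': 2}, {'id': 3}]
```

Note that extend only makes sense when every file contains a JSON array at the top level. If a file holds a single object, json.load returns a dict, and extend would add only its keys.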
import json
import pandas as pd
with open('example1.json') as f1:  # open the file
    data1 = json.load(f1)
with open('example2.json') as f2:  # open the file
    data2 = json.load(f2)
df1 = pd.DataFrame([data1]) # Creating DataFrames
df2 = pd.DataFrame([data2]) # Creating DataFrames
MergeJson = pd.concat([df1, df2], axis=1) # Concat DataFrames
MergeJson.to_json("MergeJsonDemo.json") # Writing Json