The JSON you posted is invalid: a comma is missing before the key named locationDetails. Once you fix that, you can use json_normalize.
import pandas as pd
df = pd.json_normalize(json_data, meta=['lossInfo']).explode('lossInfo').reset_index(drop=True)
df = df.join(pd.json_normalize(df.pop('lossInfo')))
'''
| | metaData.formName | metaData.user | report.from | report.to | locationName.name | locationName.locNbr | locationDetails.locNm | locationDetails.locAddress.locCity | locationDetails.locAddress.locStateCd | locationDetails.state | locationDetails.lossLocation |
|---:|:--------------------|:----------------|:--------------|:------------|:--------------------|----------------------:|:------------------------|:-------------------------------------|:----------------------------------------|:------------------------|:-------------------------------|
| 0 | A1 | Test User | 12/12/2021 | 12/12/2022 | test1 | 12 | xyz | abc | abcd | ab | cd |
| 1 | A1 | Test User | 12/12/2021 | 12/12/2022 | test11 | 121 | xyz1 | abc1 | abcd1 | ab1 | cd1 |
'''
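As a self-contained illustration of the same three steps (normalize, explode the list column, then normalize the exploded dictionaries and join them back), here is a minimal sketch. The `json_data` below is a made-up stand-in for the corrected JSON, not the asker's actual data, and the `.tolist()` call is just a cautious choice made for this sketch.

```python
import pandas as pd

# Hypothetical, already-corrected JSON: one record whose lossInfo is a list of dicts.
json_data = {
    "metaData": {"formName": "A1", "user": "Test User"},
    "lossInfo": [
        {"locationName": {"name": "test1", "locNbr": 12},
         "locationDetails": {"locNm": "xyz"}},
        {"locationName": {"name": "test11", "locNbr": 121},
         "locationDetails": {"locNm": "xyz1"}},
    ],
}

# Step 1: flatten nested dicts; the lossInfo list stays in a single cell.
df = pd.json_normalize(json_data)

# Step 2: one row per list element, with a fresh 0..n-1 index.
df = df.explode("lossInfo").reset_index(drop=True)

# Step 3: flatten each lossInfo dict and attach the new columns by index.
df = df.join(pd.json_normalize(df.pop("lossInfo").tolist()))
```

The `reset_index(drop=True)` matters here: `join` aligns on the index, so the exploded rows need unique labels that match the rows produced by the second `json_normalize`.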
Answer from Bushmaster on Stack Overflow
row = 1
import json

def TraverseJSONTree(jsonObject, main_title=None, count=0):
    # `worksheet` and `jsonArray` are not defined here; they are expected to exist already.
    global row  # must be declared before `row` is used inside the function
    if main_title is None:
        main_title = title = jsonObject.get('title')
    else:
        title = jsonObject.get('title')
    url = jsonObject.get('url')
    print('Title: ' + title + ' , Position: ' + str(count))
    if main_title is not None:
        # column 0 always carries the top-level (original) title
        worksheet.write_string(row, 0, main_title)
    worksheet.write_string(row, count, title)
    worksheet.write_string(row, 6, url)
    row += 1
    subCategories = jsonObject.get('subCategory', [])
    for category in subCategories:
        TraverseJSONTree(category, main_title, count + 1)

for jsonObject in json.loads(jsonArray):
    TraverseJSONTree(jsonObject)
This returns your expected output. The only thing it needed was a check: when the entry is a subcategory, write the original (top-level) title in column 0 of the Excel sheet; everything else remains the same.
Modification: the simplest way to do this would be to use the csv module. Say we have the whole parsed JSON in the variable a:
import csv

fieldnames = ['Category1', 'Category1.1', 'url']
# Python 3: open in text mode with newline='' so the csv module controls line endings
with open("category.csv", "w", newline="") as csvfile:
    csvfilewriter = csv.DictWriter(csvfile, fieldnames=fieldnames, dialect='excel')
    csvfilewriter.writeheader()
    for b in a:
        # top-level category row: the subcategory column stays empty
        csvfilewriter.writerow(dict(zip(fieldnames, [b['title'], "", b['url']])))
        for sub in b.get('subCategory', []):
            # one row per subcategory, built fresh each time and repeating the parent title
            csvfilewriter.writerow(dict(zip(fieldnames, [b['title'], sub['title'], sub['url']])))
You will have the desired CSV in the same location (the current working directory). This only handles two levels of categories, because the data you posted has only two (i.e. 1 and 1.1); if you need more levels, repeat the same pattern. I know it's not the most efficient way, but I couldn't think of a better one in such a short time.
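For more than two levels, one alternative to repeating the loop is a recursive walk. The following is only a sketch under assumptions: the helper names `iter_rows` and `write_categories`, the `Level1..LevelN` header names, and the sample data are all invented for illustration, and it assumes every node has a `title` and may have `url` and `subCategory`.

```python
import csv
import io

def iter_rows(node, path=()):
    """Yield (titles_from_root, url) for a node and all of its descendants."""
    path = path + (node["title"],)
    yield path, node.get("url", "")
    for child in node.get("subCategory", []):
        yield from iter_rows(child, path)

def write_categories(categories, out):
    """Write one CSV row per node, padding shallower rows with empty cells."""
    rows = [row for top in categories for row in iter_rows(top)]
    depth = max((len(path) for path, _ in rows), default=0)
    writer = csv.writer(out)
    writer.writerow(["Level%d" % (i + 1) for i in range(depth)] + ["url"])
    for path, url in rows:
        writer.writerow(list(path) + [""] * (depth - len(path)) + [url])

# Hypothetical data with three levels of nesting.
a = [
    {"title": "Category1", "url": "http://example.com/1", "subCategory": [
        {"title": "Category1.1", "url": "http://example.com/1/1", "subCategory": [
            {"title": "Category1.1.1", "url": "http://example.com/1/1/1"},
        ]},
    ]},
    {"title": "Category2", "url": "http://example.com/2", "subCategory": []},
]

buffer = io.StringIO()
write_categories(a, buffer)
csv_text = buffer.getvalue()
```

To write to disk instead of a string buffer, pass a file opened with `open("category.csv", "w", newline="")` as `out`.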
You can also use the pandas module to convert each dictionary:
import pandas as pd
pd.DataFrame.from_dict(dictionary_element)
Then do this for every dictionary in the JSON, merge the results, and save them to a CSV file.
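A rough sketch of that pandas route follows. It assumes every top-level dictionary has a non-empty `subCategory` list (with only scalar values, `from_dict` would need an explicit index); the sample data, the `sub.` prefix, and the output file name `category_pandas.csv` are made up for illustration.

```python
import pandas as pd

# Hypothetical sample shaped like the category data (titles and URLs are invented).
a = [
    {"title": "Category1", "url": "http://example.com/1",
     "subCategory": [
         {"title": "Category1.1", "url": "http://example.com/1/1"},
         {"title": "Category1.2", "url": "http://example.com/1/2"},
     ]},
    {"title": "Category2", "url": "http://example.com/2",
     "subCategory": [
         {"title": "Category2.1", "url": "http://example.com/2/1"},
     ]},
]

# One frame per top-level dictionary; the scalar title/url values are
# broadcast across the rows created by the subCategory list.
frames = [pd.DataFrame.from_dict(d) for d in a]
merged = pd.concat(frames, ignore_index=True)

# Expand the subCategory dictionaries into their own prefixed columns.
sub = pd.json_normalize(merged.pop("subCategory").tolist()).add_prefix("sub.")
result = merged.join(sub)

# Save the merged table without the integer index.
result.to_csv("category_pandas.csv", index=False)
```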
Expanding on what Corralien wrote, try something like this:
import json
import pandas as pd
data = json.loads(response_json)
df = pd.concat({k: pd.json_normalize(v) for k, v in data.items()}).droplevel(1)
df = pd.concat((df, pd.concat({k: pd.json_normalize(v) for k, v in df['service.services'].items()}).droplevel(1).add_prefix('service.')), axis=1).drop(columns='service.services')
This works under the assumption you will always have a list under the service.services column.
Output:
| | Price | category | service.id | service.name | service.description | service.Validity | service.order | service.selection | creditTO.id | creditTO.duration | creditTO.Type | creditTO.Tax | creditTO.total | creditTO.promotion | service.id | service.financeable | service.Execution | service.serviceId | service.label | service.benefit.id | service.benefit.name | service.benefit.Priced |
|:---|--------:|-----------:|-------------:|:---------------|:----------------------|:-------------------|----------------:|:--------------------|--------------:|--------------------:|:----------------|---------------:|-----------------:|:---------------------|-------------:|:----------------------|:--------------------|--------------------:|:----------------|---------------------:|:-----------------------|:-------------------------|
| A | 200 | 620 | 15 | KAL | Description | | 0 | False | 0 | 6 | standard | 51 | 400 | False | 100 | True | | 112 | Colab | 235 | ZSX | |
| B | 200 | 620 | 15 | BTX | Description | | 0 | False | 0 | 9 | standard | 51 | 400 | False | 100 | True | | 112 | Colab | 235 | ZSX | |
| C | 600 | 620 | 15 | FLS | Description | | 0 | False | 0 | 12 | standard | 51 | 400 | False | 100 | True | | 112 | Colab | 235 | ZSX | |
| D | 705 | 620 | 15 | TRW | Description | | 0 | False | 0 | 18 | standard | 67 | 245 | False | 100 | True | | 112 | Colab | 235 | ZSX | |
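The two-step expansion can be tried on a small invented payload. In this sketch the data is hypothetical, each `service.services` cell is assumed to be a one-element list (which an assertion checks before going further), and the prefix `service.services.` is a choice made here rather than the `service.` prefix used above, so that the inner `id` does not produce a second `service.id` column as in the table.

```python
import json
import pandas as pd

# Hypothetical payload: each top-level key holds one record whose
# service.services entry is a one-element list of dictionaries.
response_json = json.dumps({
    "A": {"Price": 200, "service": {"id": 15, "name": "KAL",
                                    "services": [{"id": 100, "label": "Colab"}]}},
    "B": {"Price": 600, "service": {"id": 15, "name": "FLS",
                                    "services": [{"id": 100, "label": "Colab"}]}},
})
data = json.loads(response_json)

# Step 1: one row per top-level key, nested dictionaries flattened.
df = pd.concat({k: pd.json_normalize(v) for k, v in data.items()}).droplevel(1)

# Guard for the stated assumption: every cell is a list (here, of length one).
assert df["service.services"].map(lambda v: isinstance(v, list) and len(v) == 1).all()

# Step 2: flatten each list and keep the outer key (A, B) as the index.
inner = pd.concat(
    {k: pd.json_normalize(v) for k, v in df["service.services"].items()}
).droplevel(1).add_prefix("service.services.")

# Step 3: put both frames side by side and drop the original list column.
df = pd.concat((df, inner), axis=1).drop(columns="service.services")
```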
You can iterate on first level records to create individual dataframes then concatenate them to get the expected output:
import json
import pandas as pd
data = json.loads(response_json)
df = pd.concat({k: pd.json_normalize(v) for k, v in data.items()}).droplevel(1)
Output:
>>> df
Price category service.id service.name service.description ... creditTO.duration creditTO.Type creditTO.Tax creditTO.total creditTO.promotion
A 200 620 15 KAL Description ... 6 standard 51 400 False
B 200 620 15 BTX Description ... 9 standard 51 400 False
C 600 620 15 FLS Description ... 12 standard 51 400 False
D 705 620 15 TRW Description ... 18 standard 67 245 False
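To try this without the original API response, here is a small sketch with an invented `response_json` (the keys mirror a few of the columns shown above, but the payload itself is an assumption). Passing a dict to `pd.concat` turns its keys into an outer index level, and `droplevel(1)` removes the inner `0` that each single-row frame contributes.

```python
import json
import pandas as pd

# Hypothetical response: one nested record per top-level key.
response_json = json.dumps({
    "A": {"Price": 200, "category": 620,
          "service": {"id": 15, "name": "KAL"},
          "creditTO": {"id": 0, "duration": 6}},
    "B": {"Price": 200, "category": 620,
          "service": {"id": 15, "name": "BTX"},
          "creditTO": {"id": 0, "duration": 9}},
})
data = json.loads(response_json)

# Each value becomes a one-row frame; the dict keys become the outer index level.
stacked = pd.concat({k: pd.json_normalize(v) for k, v in data.items()})

# Drop the inner level (always 0 here) so the rows are labelled A, B.
df = stacked.droplevel(1)
```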