nested json to csv python pandas

Conversion from nested json to csv with pandas

stackoverflow.com › questions › 56883988 › conversion-from-nested-json-to-csv-with-pandas

I actually wrote a package called cherrypicker recently to deal with this exact sort of thing since I had to do it so often!

I think the following code would give you exactly what you're after:

from cherrypicker import CherryPicker
import json
import pandas as pd

with open('file.json') as file:
    data = json.load(file)

picker = CherryPicker(data)
flat = picker['tickets'].flatten().get()
df = pd.DataFrame(flat)
print(df)

This gave me the output:

  Location_City Location_State  Name hobbies_0 hobbies_1   playerId  salary teamId  year
0   Los Angeles             CA  Liam     Piano    Sports  barkele01  870000    ATL  1985
1   Los Angeles             CA  John     Music   Running  bedrost01  550000    ATL  1985

You can install the package with:

pip install cherrypicker

...and there are more docs and guidance at https://cherrypicker.readthedocs.io.

Answer from big-o on Stack Overflow

GeeksforGeeks

geeksforgeeks.org › python › convert-nested-json-to-csv-in-python

Convert nested JSON to CSV in Python - GeeksforGeeks

July 23, 2025 - Therefore, the column "education.graduation.major" was simply renamed to "graduation". After renaming the columns, the to_csv() method saves the pandas dataframe object as CSV to the provided ...
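The truncated snippet above describes three steps: flatten, rename a dotted column, save. A minimal sketch of that sequence follows, assuming made-up records; only the column name "education.graduation.major" comes from the snippet, and the sample data is purely illustrative.

```python
import pandas as pd

# Hypothetical input records (not from the article).
records = [
    {"name": "Ada", "education": {"graduation": {"major": "Mathematics"}}},
    {"name": "Alan", "education": {"graduation": {"major": "Logic"}}},
]

# json_normalize joins nested keys with "." by default.
df = pd.json_normalize(records)

# Rename the dotted column to a shorter header.
df = df.rename(columns={"education.graduation.major": "graduation"})

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Passing a file name as the first argument of to_csv would write the same content to disk.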

Stack Overflow

stackoverflow.com › questions › 56883988 › conversion-from-nested-json-to-csv-with-pandas

python - Conversion from nested json to csv with pandas - Stack Overflow

Top answer

1 of 3

2 of 3

As you already have a function to flatten a JSON object, you just have to flatten each of the tickets:

...
with open(args.json_file, "r") as inputFile:  # open json file
    json_data = json.loads(inputFile.read())  # load json content
final_data = pd.DataFrame([flatten_json(elt) for elt in json_data['tickets']])
...

With your sample data, final_data is as expected:

  Location_City Location_State  Name hobbies_0 hobbies_1   playerId  salary teamId  year
0   Los Angeles             CA  Liam     Piano    Sports  barkele01  870000    ATL  1985
1   Los Angeles             CA  John     Music   Running  bedrost01  550000    ATL  1985
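This answer relies on a flatten_json function from the question, which is not reproduced here. The following is a sketch of what such a helper might look like, assuming nested keys and list indexes are joined with underscores (as the column names in the output suggest). The sample ticket is reconstructed from the first output row and is only illustrative.

```python
def flatten_json(obj, prefix=""):
    """Flatten nested dicts and lists into a single-level dict.

    Keys are joined with "_" and list items get their index as a key part.
    """
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten_json(value, f"{prefix}{key}_"))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten_json(value, f"{prefix}{index}_"))
    else:
        # Leaf value: drop the trailing underscore from the accumulated prefix.
        flat[prefix[:-1]] = obj
    return flat


# Hypothetical ticket shaped like the rows shown above.
ticket = {
    "Name": "Liam",
    "Location": {"City": "Los Angeles", "State": "CA"},
    "hobbies": ["Piano", "Sports"],
}
flat = flatten_json(ticket)
print(flat)
```

A list of such flat dictionaries can be passed straight to pd.DataFrame, as the answer does.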

Discussions

Convert nested JSON to CSV file in Python - Stack Overflow

I know this question has been asked many times. I tried several solutions but I couldn't solve my problem. I have a large nested JSON file (1.4GB) and I would like to make it flat and then convert... More on stackoverflow.com

stackoverflow.com

How to write nested json data to csv using python pandas? - Stack Overflow

I have tried this code to achieve my results:

import json
import csv
import pandas as pd

df = pd.read_json("sample_data.json")
df1 = pd.DataFrame(df)
df1.to_csv("nested_data.csv")

I want my result to ... More on stackoverflow.com

stackoverflow.com

Like Geeks

likegeeks.com › home › python › pandas › convert nested json to csv using python pandas

Convert Nested JSON to CSV using Python Pandas

This example shows how to handle a JSON structure with both nested objects and lists, where the goal is to create a CSV that includes details from both the nested objects and the list items. ...

import pandas as pd

data = [
    {
        "id": 1,
        "name": "Customer A",
        "contact": {"email": "a@example.com", "phone": "12345"},
        "products": [{"name": "Product 1", "quantity": 2}, {"name": "Product 2", "quantity": 3}]
    },
    {
        "id": 2,
        "name": "Customer B",
        "contact": {"email": "b@example.com", "phone": "67890"},
        "products": [{"name": "Product 3", "quantity": 1}]
    }
]

df = pd.json_normalize(
    data,
    'products',
    ['id', 'name', ['contact', 'email'], ['contact', 'phone']],
    meta_prefix='customer_'
)
csv_data = df.to_csv(index=False)
print(csv_data)

Verpex

verpex.com › blog › website tips › how to convert json to...

How to Convert JSON to CSV in Python

Convert JSON to CSV in Python easily. Step-by-step guide with examples using pandas and json libraries for simple and nested data.

Stack Overflow

stackoverflow.com › questions › 41180960 › convert-nested-json-to-csv-file-in-python

Convert nested JSON to CSV file in Python - Stack Overflow

Top answer

1 of 5

Please scroll down for the newer, faster solution

This is an older question, but I struggled the entire night to get a satisfactory result for a similar situation, and I came up with this:

import json
import pandas

def cross_join(left, right):
    # Cartesian product: give every row the same dummy key, merge on it, then drop it
    return left.assign(key=1).merge(right.assign(key=1), on='key', how='outer').drop(columns='key')

def json_to_dataframe(data_in):
    # prev_key accumulates the dotted path to the current node
    def to_frame(data, prev_key=''):
        if isinstance(data, dict):
            df = pandas.DataFrame()
            for key in data:
                df = cross_join(df, to_frame(data[key], prev_key + '.' + key))
        elif isinstance(data, list):
            df = pandas.DataFrame()
            for i in range(len(data)):
                df = pandas.concat([df, to_frame(data[i], prev_key)])
        else:
            # Leaf value: strip the leading '.' from the accumulated path
            df = pandas.DataFrame({prev_key[1:]: [data]})
        return df
    return to_frame(data_in)

if __name__ == '__main__':
    with open('somefile') as json_file:
        json_data = json.load(json_file)

    df = json_to_dataframe(json_data)
    df.to_csv('data.csv', mode='w')

Explanation:

The cross_join function is a neat way I found to do a cartesian product. (credit: here)
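The dummy-key trick in cross_join can be checked on two small frames. The sketch below uses made-up data (the frame contents are illustrative assumptions) and compares the result with merge(how="cross"), which pandas 1.2 and later provide for the same purpose.

```python
import pandas as pd

# Illustrative frames, loosely modelled on the donut example further down.
left = pd.DataFrame({"id": ["0001"], "name": ["Cake"]})
right = pd.DataFrame({"topping": ["None", "Glazed", "Sugar"]})

# Dummy-key trick: every row shares key=1, so the merge pairs
# each left row with each right row.
via_key = (
    left.assign(key=1)
    .merge(right.assign(key=1), on="key", how="outer")
    .drop(columns="key")
)

# Built-in cross join, available in pandas 1.2 and later.
via_cross = left.merge(right, how="cross")

print(via_cross)
```

Both frames contain one row per (left row, right row) pair, so one left row and three right rows give three rows.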

The json_to_dataframe function does the work, using pandas dataframes. In my case the JSON was deeply nested. I wanted dictionary key:value pairs split into columns, and lists turned into rows of a column (hence the concat). Each list is then cross joined with the level above, which multiplies the number of records so that every value from the list gets its own row while the earlier columns repeat.

The recursion builds up frames, each of which is cross joined with the one below it, until the last one is returned.

Then with the dataframe in a table format, it's easy to convert to CSV with the "df.to_csv()" dataframe object method.

This should work with deeply nested JSON, being able to normalize all of it into rows by the logic described above.

I hope this will help someone, someday. Just trying to give back to this awesome community.

---------------------------------------------------------------------------------------------

LATER EDIT: NEW SOLUTION

I'm coming back to this because, while the dataframe option kind of worked, it took the app minutes to parse JSON data that was not very large. So I decided to do what the dataframes do, but by hand:

from copy import deepcopy
import pandas


def cross_join(left, right):
    new_rows = [] if right else left
    for left_row in left:
        for right_row in right:
            temp_row = deepcopy(left_row)
            for key, value in right_row.items():
                temp_row[key] = value
            new_rows.append(deepcopy(temp_row))
    return new_rows


def flatten_list(data):
    for elem in data:
        if isinstance(elem, list):
            yield from flatten_list(elem)
        else:
            yield elem


def json_to_dataframe(data_in):
    def flatten_json(data, prev_heading=''):
        if isinstance(data, dict):
            rows = [{}]
            for key, value in data.items():
                rows = cross_join(rows, flatten_json(value, prev_heading + '.' + key))
        elif isinstance(data, list):
            rows = []
            for item in data:
                rows.extend(flatten_list(flatten_json(item, prev_heading)))
        else:
            rows = [{prev_heading[1:]: data}]
        return rows

    return pandas.DataFrame(flatten_json(data_in))


if __name__ == '__main__':
    json_data = {
        "id": "0001",
        "type": "donut",
        "name": "Cake",
        "ppu": 0.55,
        "batters":
            {
                "batter":
                    [
                        {"id": "1001", "type": "Regular"},
                        {"id": "1002", "type": "Chocolate"},
                        {"id": "1003", "type": "Blueberry"},
                        {"id": "1004", "type": "Devil's Food"}
                    ]
            },
        "topping":
            [
                {"id": "5001", "type": "None"},
                {"id": "5002", "type": "Glazed"},
                {"id": "5005", "type": "Sugar"},
                {"id": "5007", "type": "Powdered Sugar"},
                {"id": "5006", "type": "Chocolate with Sprinkles"},
                {"id": "5003", "type": "Chocolate"},
                {"id": "5004", "type": "Maple"}
            ],
        "something": []
    }
    df = json_to_dataframe(json_data)
    print(df)

OUTPUT:

      id   type  name   ppu batters.batter.id batters.batter.type topping.id              topping.type
0   0001  donut  Cake  0.55              1001             Regular       5001                      None
1   0001  donut  Cake  0.55              1001             Regular       5002                    Glazed
2   0001  donut  Cake  0.55              1001             Regular       5005                     Sugar
3   0001  donut  Cake  0.55              1001             Regular       5007            Powdered Sugar
4   0001  donut  Cake  0.55              1001             Regular       5006  Chocolate with Sprinkles
5   0001  donut  Cake  0.55              1001             Regular       5003                 Chocolate
6   0001  donut  Cake  0.55              1001             Regular       5004                     Maple
7   0001  donut  Cake  0.55              1002           Chocolate       5001                      None
8   0001  donut  Cake  0.55              1002           Chocolate       5002                    Glazed
9   0001  donut  Cake  0.55              1002           Chocolate       5005                     Sugar
10  0001  donut  Cake  0.55              1002           Chocolate       5007            Powdered Sugar
11  0001  donut  Cake  0.55              1002           Chocolate       5006  Chocolate with Sprinkles
12  0001  donut  Cake  0.55              1002           Chocolate       5003                 Chocolate
13  0001  donut  Cake  0.55              1002           Chocolate       5004                     Maple
14  0001  donut  Cake  0.55              1003           Blueberry       5001                      None
15  0001  donut  Cake  0.55              1003           Blueberry       5002                    Glazed
16  0001  donut  Cake  0.55              1003           Blueberry       5005                     Sugar
17  0001  donut  Cake  0.55              1003           Blueberry       5007            Powdered Sugar
18  0001  donut  Cake  0.55              1003           Blueberry       5006  Chocolate with Sprinkles
19  0001  donut  Cake  0.55              1003           Blueberry       5003                 Chocolate
20  0001  donut  Cake  0.55              1003           Blueberry       5004                     Maple
21  0001  donut  Cake  0.55              1004        Devil's Food       5001                      None
22  0001  donut  Cake  0.55              1004        Devil's Food       5002                    Glazed
23  0001  donut  Cake  0.55              1004        Devil's Food       5005                     Sugar
24  0001  donut  Cake  0.55              1004        Devil's Food       5007            Powdered Sugar
25  0001  donut  Cake  0.55              1004        Devil's Food       5006  Chocolate with Sprinkles
26  0001  donut  Cake  0.55              1004        Devil's Food       5003                 Chocolate
27  0001  donut  Cake  0.55              1004        Devil's Food       5004                     Maple

As for what the code above does: the cross_join function does pretty much the same thing as in the dataframe solution, but on plain lists of dictionaries, which makes it faster.

I added the flatten_list generator to make sure that the JSON arrays are all flattened and returned as a single list of dictionaries, each keyed by the heading carried over from the previous level. This pretty much mimics the pandas.concat behaviour in this case.

The logic in the main function, json_to_dataframe, is then the same as before. All that needed to change was replacing the dataframe operations with hand-coded functions.

Also, in the dataframe solution I was not prefixing nested keys with the previous heading, but unless you are 100% sure there are no conflicts in column names, doing so is pretty much mandatory.

I hope this helps :).

EDIT: Modified the cross_join function to deal with the case when a nested list is empty, basically maintaining the previous result set unmodified. The output is unchanged even after adding the empty JSON list in the example JSON data. Thank you, @Nazmus Sakib for pointing it out.

2 of 5

For the JSON data you have given, you could do this by parsing the JSON structure to just return a list of all the leaf nodes.

This assumes that your structure is consistent throughout. If each entry can have different fields, see the second approach.

For example:

import json
import csv

def get_leaves(item, key=None):
    if isinstance(item, dict):
        leaves = []
        for i in item.keys():
            leaves.extend(get_leaves(item[i], i))
        return leaves
    elif isinstance(item, list):
        leaves = []
        for i in item:
            leaves.extend(get_leaves(i, key))
        return leaves
    else:
        return [(key, item)]


with open('json.txt') as f_input, open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    write_header = True

    for entry in json.load(f_input):
        leaf_entries = sorted(get_leaves(entry))

        if write_header:
            csv_output.writerow([k for k, v in leaf_entries])
            write_header = False

        csv_output.writerow([v for k, v in leaf_entries])

If your JSON data is a list of entries in the format you have given, then you should get output as follows:

address_line_1,company_number,country_of_residence,etag,forename,kind,locality,middle_name,month,name,nationality,natures_of_control,notified_on,postal_code,premises,region,self,surname,title,year
Address 1,12345678,England,26281dhge33b22df2359sd6afsff2cb8cf62bb4a7f00,John,individual-person-with-significant-control,Henley-On-Thames,M,2,John M Smith,Vietnamese,ownership-of-shares-50-to-75-percent,2016-04-06,RG9 1DP,161,Oxfordshire,/company/12345678/persons-with-significant-control/individual/bIhuKnFctSnjrDjUG8n3NgOrl,Smith,Mrs,1977
Address 1,12345679,England,26281dhge33b22df2359sd6afsff2cb8cf62bb4a7f00,John,individual-person-with-significant-control,Henley-On-Thames,M,2,John M Smith,Vietnamese,ownership-of-shares-50-to-75-percent,2016-04-06,RG9 1DP,161,Oxfordshire,/company/12345678/persons-with-significant-control/individual/bIhuKnFctSnjrDjUG8n3NgOrl,Smith,Mrs,1977

If each entry can contain different (or possibly missing) fields, then a better approach would be to use a DictWriter. In this case, all of the entries would need to be processed to determine the complete list of possible fieldnames so that the correct header can be written.

import json
import csv

def get_leaves(item, key=None):
    if isinstance(item, dict):
        leaves = {}
        for i in item.keys():
            leaves.update(get_leaves(item[i], i))
        return leaves
    elif isinstance(item, list):
        leaves = {}
        for i in item:
            leaves.update(get_leaves(i, key))
        return leaves
    else:
        return {key : item}


with open('json.txt') as f_input:
    json_data = json.load(f_input)

# First parse all entries to get the complete fieldname list
fieldnames = set()

for entry in json_data:
    fieldnames.update(get_leaves(entry).keys())

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=sorted(fieldnames))
    csv_output.writeheader()
    csv_output.writerows(get_leaves(entry) for entry in json_data)

Stack Overflow

stackoverflow.com › questions › 71583670 › how-to-convert-csv-to-nested-json-in-python

How to convert CSV to nested JSON in Python - Stack Overflow

Top answer

1 of 3

A simple way is to add more columns, then use the to_json method in pandas:

import pandas as pd
df = pd.read_csv('your_file.csv')
df['Purchase'] = df[['b','c','d']].to_dict('records')
df['Sales'] = df[['d','e']].to_dict('records')
out = df[['a', 'Purchase', 'Sales']].to_json(orient='records', indent=4)

Output:

[
    {
        "a":1,
        "Purchase":{
            "b":2,
            "c":3,
            "d":4
        },
        "Sales":{
            "d":4,
            "e":5
        }
    },
    {
        "a":9,
        "Purchase":{
            "b":8,
            "c":7,
            "d":6
        },
        "Sales":{
            "d":6,
            "e":5
        }
    }
]

2 of 3

You don't need any third-party libraries for this; the standard csv and json modules are enough. Just specify the right dialect, e.g. for tab-separated:

import csv
import json


with open("tmp4.csv", "r") as f:
    result = [
        {
            "a": row["a"],
            "Purchase": {
                "b": row["b"],
                "c": row["c"],
            },
            "Sales": {
                "d": row["d"],
                "e": row["e"],
            },
        }
        for row in csv.DictReader(f, dialect='excel-tab')
    ]
assert (
    json.dumps(result)
    == '[{"a": "1", "Purchase": {"b": "2", "c": "3"}, "Sales": {"d": "4", "e": "5"}}, {"a": "9", "Purchase": {"b": "8", "c": "7"}, "Sales": {"d": "6", "e": "5"}}]'
)

Python.org

discuss.python.org › python help

Converting a flat csv to nested json that has multi-level hierarchy using pandas - Python Help - Discussions on Python.org

August 19, 2022 - Hello, I am relatively new to python. However, I am asked to create a script that reads csv file and composes json request. I have explored couple of solutions available and created a script (pasted below) for converting to json. However, as I am unaware of ‘agg’ and ‘lambda’ ...
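The poster's script and data are not shown in this snippet. As a plain-Python alternative to the pandas 'agg' and 'lambda' approach mentioned, here is a sketch that builds a two-level hierarchy with dict.setdefault. The column names (region, country, city, population) are hypothetical.

```python
import csv
import io
import json

# Hypothetical CSV content; the poster's actual columns are not shown.
csv_text = """region,country,city,population
Europe,France,Paris,2100000
Europe,France,Lyon,520000
Europe,Spain,Madrid,3300000
"""

nested = {}
for row in csv.DictReader(io.StringIO(csv_text)):
    # setdefault creates each level of the hierarchy the first time it is seen.
    countries = nested.setdefault(row["region"], {})
    cities = countries.setdefault(row["country"], [])
    cities.append({"city": row["city"], "population": int(row["population"])})

print(json.dumps(nested, indent=2))
```

Each additional grouping column would add one more setdefault level before the final list.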

Stack Overflow

stackoverflow.com › questions › 44548336 › how-to-write-nested-json-data-to-csv-using-python-pandas

How to write nested json data to csv using python pandas? - Stack Overflow

import json

jsons = {"data": {"product1": [{"label": "jan", "value": 13}, {"label": "Feb", "value": 15}, {"label": "Mar", "value": 1}], "product2": [{"label": "February", "value": 7}]}}
product1 = jsons['data']['product1']
product2 = jsons['data']['product2']

(Where product1 = [{'label': 'jan', 'value': 13}, {'label': 'Feb', 'value': 15}, {'label': 'Mar', 'value': 1}] and product2 = [{'label': 'February', 'value': 7}])

Then, here's some documentation to learn how to export the data to csv.
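Once the product lists are extracted, one way to export them is with csv.DictWriter. This is a sketch under an assumed layout of one row per label with an extra "product" column; the desired output is cut off in the question, so that layout is a guess. It writes to an in-memory buffer to stay self-contained.

```python
import csv
import io

jsons = {
    "data": {
        "product1": [
            {"label": "jan", "value": 13},
            {"label": "Feb", "value": 15},
            {"label": "Mar", "value": 1},
        ],
        "product2": [{"label": "February", "value": 7}],
    }
}

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["product", "label", "value"])
writer.writeheader()

# One CSV row per entry, tagged with the product it came from.
for product, entries in jsons["data"].items():
    for entry in entries:
        writer.writerow({"product": product, **entry})

csv_text = buffer.getvalue()
print(csv_text)
```

To write a real file, replace the buffer with open("nested_data.csv", "w", newline="").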

Gigasheet

gigasheet.com › post › convert-json-to-csv-python

How to Convert JSON to CSV in Python

Once done, it creates a normalized ... from the Pandas library. Here’s the output from the console after we print the jsonBody and csvBody variables: Again, these examples are fairly simple JSON objects. If you’re dealing with large objects with multiple nested fields, you might come across several problems in Python’s JSON to ...

GitHub

github.com › vinay20045 › json-to-csv

GitHub - vinay20045/json-to-csv: Nested JSON to CSV Converter · GitHub

Nested JSON to CSV Converter. This python script converts valid, preformatted JSON to CSV which can be opened in excel and other similar applications. This script can handle nested json with multiple objects and arrays.

Starred by 290 users

Forked by 213 users

Languages Python

Stack Overflow

stackoverflow.com › questions › 37706351 › nested-json-to-csv-generic-approach

python - Nested json to csv - generic approach - Stack Overflow

Top answer

1 of 3

Thanks to the great blog post by Amir Ziai which you can find here I managed to output my data in form of a flat table. With the following function:

from pandas import json_normalize

# Recursively extract values from each object into a flat dictionary
def flatten_json(data):
    flat = []  # list of flat dictionaries

    def flatten(y):
        out = {}

        def flatten2(x, name=''):
            if type(x) is dict:
                for a in x:
                    if a == "name":
                        # name/value pair: use the "name" entry as the column prefix
                        flatten2(x["value"], name + x[a] + '_')
                    else:
                        flatten2(x[a], name + a + '_')
            elif type(x) is list:
                for a in x:
                    flatten2(a, name + '_')
            else:
                out[name[:-1]] = x

        flatten2(y)
        return out

    # Loop needed to flatten multiple objects
    for i in range(len(data)):
        flat.append(flatten(data[i]).copy())

    return json_normalize(flat)

I am aware that it is not perfectly generalisable, because of the name-value if statement. However, if this special case for the name-value dictionaries is removed, the code can be used with other embedded arrays.

2 of 3

I had a task to turn a JSON file with nested keys and values into a CSV file a couple of weeks ago. For this task it was necessary to handle the nested keys properly and concatenate them so they could be used as unique headers for the values. The result was the code below, which can also be found here.

def get_flat_json(json_data, header_string, header, row):
    """Parse JSON with nested key-values into flat lists using nested column labeling"""
    for root_key, root_value in json_data.items():
        if isinstance(root_value, dict):
            get_flat_json(root_value, header_string + '_' + str(root_key), header, row)
        elif isinstance(root_value, list):
            for value_index in range(len(root_value)):
                for nested_key, nested_value in root_value[value_index].items():
                    header[0].append((header_string +
                                      '_' + str(root_key) +
                                      '_' + str(nested_key) +
                                      '_' + str(value_index)).strip('_'))
                    if nested_value is None:
                        nested_value = ''
                    row[0].append(str(nested_value))
        else:
            if root_value is None:
                root_value = ''
            header[0].append((header_string + '_' + str(root_key)).strip('_'))
            row[0].append(root_value)
    return header, row

This is a more generalized approach based on An Economist's answer to this question.

linkedin.com › pulse › converting-nested-json-data-csv-using-pythonpandas-kaleab-woldemariam

Converting Nested JSON data to CSV using python/pandas

March 8, 2018 - Finally, we keep the columns ‘post_count’ and ‘score’ and drop df2’s ‘user’ column to avoid the nested ‘user’ dictionary, since we need to expand it one level. We concatenate (pd.concat) df2 without its ‘user’ column with the expanded ‘user’ column, obtained by applying .apply(pd.Series) as df2['user'].apply(pd.Series).

df3 = pd.concat([df2.drop(['user'], axis=1), df2['user'].apply(pd.Series)], axis=1)

...

import sys
import pandas as pd
from pandas import DataFrame
import json

data = r'C:\Users\Kaleab\Desktop\data.json'
print("This is json data input", data)
# Reads and converts json to dict.

Stack Overflow

stackoverflow.com › questions › 49650304 › nested-json-to-csv-using-pandas-normalize › 49654981

python - nested json to csv using pandas normalize - Stack Overflow

Top answer

1 of 1

Assuming the multiple URLs delineate between rows and all else meta data repeats, consider a recursive function call to extract every key-value pair in nested json object, d.

The recursive function uses global to update the module-level objects, which are then bound into a list of dictionaries for the pd.DataFrame() call. The loop at the end copies the recursive function's dictionary, inner, and merges it with each of the different URLs (stored in multi).

import json 
import pandas as pd 

# load json object
with open('nvdcve-1.0-modified.json') as f:
   d = json.load(f)

multi = []
inner = {}

def recursive_extract(i):
    global multi, inner

    if type(i) is list:
        if len(i) == 1:
            for k,v in i[0].items():
                if type(v) in [list, dict]:
                    recursive_extract(v)
                else:                
                    inner[k] = v
        else:
            multi = i

    if type(i) is dict:
        for k,v in i.items():
            if type(v) in [list, dict]:
                recursive_extract(v)
            else:                
                inner[k] = v

recursive_extract(d['CVE_Items'])

data_dict = []
for i in multi:    
    tmp = inner.copy()
    tmp.update(i)
    data_dict.append(tmp)

df = pd.DataFrame(data_dict)
df.to_csv('Output.csv')

Output (all columns the same except for URL, widened for emphasis)

CodeProject

codeproject.com › Questions › 5325794 › Is-there-a-way-to-generically-convert-nested-JSON

Is there a way to generically convert nested JSON file to CSV in Python ? - CodeProject

February 22, 2022 - Free source code and tutorials for Software developers and Architects.; Updated: 22 Feb 2022

Posit Community

forum.posit.co › general

nested json to csv - General - Posit Community

February 13, 2024 - I would like to convert the nested JSON to CSV. { "dataElements": [ { "name": "004-DN02. Names of deceased", "id": "HCbRydAAt1T", "categoryCombo": { "name": "default", "id": "bjDvmb4bfuf", "categoryOptionCombos": [ { "name": "default2", "id": ...

Enterprise DNA

blog.enterprisedna.co › python-convert-json-to-csv

Python: Convert JSON to CSV, Step-by-Step Guide – Master Data Skills + AI

This code would create a CSV file ... | 22 | Los Angeles | 3,990,000 | +---------+-----+---------------+-----------------+ Using read_json and handling nested objects with json_normalize, you can effectively convert various JSON data structures into tabular data format (CSV) ...

reddit.com › r/learnpython › converting nested json data?

r/learnpython on Reddit: Converting nested JSON data?

July 28, 2024 -

Hi, hope this is appropriate. I am doing a Python course at the minute and thought I would set some challenges for myself during the summer break. I am trying to convert a JSON file to CSV but am running into some problems with nested data.

The JSON dataset I am working with has a structure like the one below:

{
    "PolicyNumber": "00001",
    "PolicyData": [
        {
            "TableName": "MEMBERS",
            "Properties": {
                "Surname": "Bloggs",
                "Forename": "Joe",
                "Gender": "Male"
            }
        },
        {
            "TableName": "MEMBERS",
            "Properties": {
                "Surname": "Jobs",
                "Forename": "Steve",
                "Gender": "Male"
            }
        }
    ]
}

As you can see, for each policy there are tables for any Member on the policy. There can be up to 5 members on any policy in this case. The property names are naturally repeated for all members.

Basically I want a column in the CSV file for each parameter. However, as there can be multiple "Members" on each "Policy", the same parameters are used multiple times. The code I wrote only makes one column for each parameter and overwrites each value with the next. So in the above example the only data that pulls through is for the second member, "Steve Jobs". Ideally I would like to be able to create columns such as "Member1_Forename", "Member2_Forename" etc., but I am unsure how to do this.

Any pointers or tips?

Top answer

1 of 3

Does the PolicyData array contain other types of objects besides "Members"? If so I would extract the members to their own list first. That's a one-liner using a list comprehension. If not then never mind, you already have a list of members.

I usually use csv.DictWriter when building CSV files. That way you just have to make a dict with a uniform set of keys for each row.

So in this case, when making each row dict, you could iterate over a range of all possible members and use that as an index into the members list. Since there will probably be fewer actual members on a given policy than the maximum possible, you'll need to check whether the index is out of bounds before you use it to access elements in the members list. If it's out of bounds, populate the relevant values for each column for that member with None.
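The approach described in this answer can be sketched as follows. The sample policy is taken from the question; the helper name policy_to_row and the MemberN_Field column naming are illustrative assumptions, and the output goes to an in-memory buffer rather than a file.

```python
import csv
import io

MAX_MEMBERS = 5  # the question says a policy can have up to 5 members
FIELDS = ["Surname", "Forename", "Gender"]

policies = [
    {
        "PolicyNumber": "00001",
        "PolicyData": [
            {"TableName": "MEMBERS",
             "Properties": {"Surname": "Bloggs", "Forename": "Joe", "Gender": "Male"}},
            {"TableName": "MEMBERS",
             "Properties": {"Surname": "Jobs", "Forename": "Steve", "Gender": "Male"}},
        ],
    }
]


def policy_to_row(policy):
    """Build one flat row with a uniform set of MemberN_Field keys."""
    # Keep only the member tables, in case other table types are present.
    members = [t["Properties"] for t in policy["PolicyData"]
               if t["TableName"] == "MEMBERS"]
    row = {"PolicyNumber": policy["PolicyNumber"]}
    for i in range(MAX_MEMBERS):
        # Out-of-range member slots are filled with None.
        props = members[i] if i < len(members) else {}
        for field in FIELDS:
            row[f"Member{i + 1}_{field}"] = props.get(field)
    return row


rows = [policy_to_row(p) for p in policies]

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)  # None values are written as empty cells
print(buffer.getvalue())
```

Because every row has the same keys, the header can be taken from the first row.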

2 of 3

Any pointers or tips? When you ask for help, always include the code you have so far, example input, and an example of what you want out. You scored 2/3 on this post, better than most, but we still need to see your code to help you fix it.

PyPI

pypi.org › project › libjson2csv

April 25, 2017

Spark By {Examples}

sparkbyexamples.com › home › pandas › pandas – convert json to csv

Pandas - Convert JSON to CSV - Spark By {Examples}

November 1, 2024 - The pd.json_normalize() function in Pandas can be used to flatten nested JSON structures. In this article, you have learned steps on how to convert JSON to CSV in pandas using the pandas library.
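As a small illustration of the flattening this snippet mentions, the sketch below uses the sep parameter of pd.json_normalize to get underscore-joined headers. The record is a made-up example, not taken from the article.

```python
import pandas as pd

# Hypothetical nested record.
data = [{"id": 1, "user": {"name": "Ann", "address": {"city": "Oslo"}}}]

# sep controls how nested key names are joined (the default is ".").
df = pd.json_normalize(data, sep="_")

csv_text = df.to_csv(index=False)
print(csv_text)
```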

reddit.com › r › learnpython › comments › a7xjm3 › nested_json_to_csv

r/learnpython - Nested JSON to CSV

December 20, 2018 -

Hello,

I have a large JSON file that is nested. I tried to read it with pandas and export it to CSV but I got the error "ValueError: Expected object or value". Any suggestions on how to tackle this?
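The thread does not show the file, so the cause of the error is unknown. As a general diagnostic step (an assumption, not something suggested in the thread), parsing the text with the standard json module first can show whether the JSON is malformed and where. The helper name diagnose_json is hypothetical.

```python
import json


def diagnose_json(text):
    """Return None if text is valid JSON, else a message locating the problem."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return f"line {err.lineno}, column {err.colno}: {err.msg}"
    return None


print(diagnose_json('{"a": 1}'))   # valid input gives None
print(diagnose_json('{"a": 1,}'))  # trailing comma gives a located error message
print(diagnose_json(''))           # empty input is also reported as invalid
```

For a real file, the text would come from open(path, encoding="utf-8").read(); an error at that stage would point to a wrong path rather than bad JSON.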

Top answer

1 of 3

Please show your code and the full error output, preferably using pastebin.com

2 of 3

Mind that csv is not the best structure to store a tree, especially as the nodes on the same level can have different types of siblings: types, structures.