Please scroll down for the newer, faster solution

This is an older question, but I struggled the entire night to get a satisfactory result for a similar situation, and I came up with this:

import json
import pandas

def cross_join(left, right):
    return left.assign(key=1).merge(right.assign(key=1), on='key', how='outer').drop(columns='key')

def json_to_dataframe(data_in):
    def to_frame(data, prev_key=''):
        if isinstance(data, dict):
            df = pandas.DataFrame()
            for key in data:
                df = cross_join(df, to_frame(data[key], prev_key + '.' + key))
        elif isinstance(data, list):
            df = pandas.DataFrame()
            for item in data:
                df = pandas.concat([df, to_frame(item, prev_key)])
        else:
            df = pandas.DataFrame({prev_key[1:]: [data]})
        return df
    return to_frame(data_in)

if __name__ == '__main__':
    with open('somefile') as json_file:
        json_data = json.load(json_file)

    df = json_to_dataframe(json_data)
    df.to_csv('data.csv', mode='w')

Explanation:

The cross_join function is a neat way I found to do a cartesian product. (credit: here)
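The constant-key trick can be seen in isolation below. Note that pandas 1.2+ also supports `how='cross'` directly, which produces the same result (a minimal sketch using throwaway frames):

```python
import pandas as pd

left = pd.DataFrame({'a': [1, 2]})
right = pd.DataFrame({'b': ['x', 'y']})

# A shared constant key makes every left row match every right row,
# turning the merge into a cartesian product; the key is dropped after.
via_key = (left.assign(key=1)
               .merge(right.assign(key=1), on='key', how='outer')
               .drop(columns='key'))

# pandas 1.2+ has this built in:
via_cross = left.merge(right, how='cross')

assert via_key.equals(via_cross)  # 2 x 2 = 4 rows
```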

The json_to_dataframe function does the actual work, using pandas dataframes. In my case the JSON was deeply nested, and I wanted dictionary key/value pairs to become columns, while lists should become rows for a column (hence the concat). The list's frame is then cross-joined with the level above, multiplying the number of records so that each value from the list gets its own row while the columns from the previous levels are repeated.

The recursion builds a frame at each nesting level and cross-joins it with the level below, until the top-level frame is returned.

Then, with the data in a flat table, it is easy to convert it to CSV with the df.to_csv() dataframe method.
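For example (a minimal sketch; passing index=False is optional but keeps the pandas row index out of the file):

```python
import pandas as pd

df = pd.DataFrame({'id': ['0001'], 'type': ['donut']})

# With no path argument, to_csv returns the CSV text instead of
# writing a file; index=False omits the pandas row index column.
csv_text = df.to_csv(index=False)
assert csv_text.splitlines() == ['id,type', '0001,donut']
```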

This should work with deeply nested JSON, being able to normalize all of it into rows by the logic described above.
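As a quick sanity check, here is the approach condensed into a standalone snippet and run on a tiny hand-made document. It is slightly restructured from the code above: the empty starting frame is handled explicitly and a list is concatenated in one go, so this sketch assumes non-empty arrays.

```python
import pandas

def cross_join(left, right):
    if left.empty:          # first fragment: nothing to join against yet
        return right
    return (left.assign(key=1)
                .merge(right.assign(key=1), on='key', how='outer')
                .drop(columns='key'))

def json_to_dataframe(data_in):
    def to_frame(data, prev_key=''):
        if isinstance(data, dict):
            df = pandas.DataFrame()
            for key in data:
                df = cross_join(df, to_frame(data[key], prev_key + '.' + key))
        elif isinstance(data, list):
            df = pandas.concat([to_frame(item, prev_key) for item in data],
                               ignore_index=True)
        else:
            # prev_key carries a leading '.', stripped off for the column name
            df = pandas.DataFrame({prev_key[1:]: [data]})
        return df
    return to_frame(data_in)

df = json_to_dataframe({'name': 'Cake', 'toppings': [{'id': 1}, {'id': 2}]})
assert list(df.columns) == ['name', 'toppings.id']
assert list(df['name']) == ['Cake', 'Cake']
assert list(df['toppings.id']) == [1, 2]
```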

I hope this will help someone, someday. Just trying to give back to this awesome community.

---------------------------------------------------------------------------------------------

LATER EDIT: NEW SOLUTION

I'm coming back to this because, while the dataframe option kind of worked, it took the app minutes to parse JSON data that was not even that large. So I decided to do what the dataframes do, but myself:

from copy import deepcopy
import pandas


def cross_join(left, right):
    new_rows = [] if right else left
    for left_row in left:
        for right_row in right:
            temp_row = deepcopy(left_row)
            for key, value in right_row.items():
                temp_row[key] = value
            new_rows.append(deepcopy(temp_row))
    return new_rows


def flatten_list(data):
    for elem in data:
        if isinstance(elem, list):
            yield from flatten_list(elem)
        else:
            yield elem


def json_to_dataframe(data_in):
    def flatten_json(data, prev_heading=''):
        if isinstance(data, dict):
            rows = [{}]
            for key, value in data.items():
                rows = cross_join(rows, flatten_json(value, prev_heading + '.' + key))
        elif isinstance(data, list):
            rows = []
            for item in data:
                rows.extend(flatten_list(flatten_json(item, prev_heading)))
        else:
            rows = [{prev_heading[1:]: data}]
        return rows

    return pandas.DataFrame(flatten_json(data_in))


if __name__ == '__main__':
    json_data = {
        "id": "0001",
        "type": "donut",
        "name": "Cake",
        "ppu": 0.55,
        "batters":
            {
                "batter":
                    [
                        {"id": "1001", "type": "Regular"},
                        {"id": "1002", "type": "Chocolate"},
                        {"id": "1003", "type": "Blueberry"},
                        {"id": "1004", "type": "Devil's Food"}
                    ]
            },
        "topping":
            [
                {"id": "5001", "type": "None"},
                {"id": "5002", "type": "Glazed"},
                {"id": "5005", "type": "Sugar"},
                {"id": "5007", "type": "Powdered Sugar"},
                {"id": "5006", "type": "Chocolate with Sprinkles"},
                {"id": "5003", "type": "Chocolate"},
                {"id": "5004", "type": "Maple"}
            ],
        "something": []
    }
    df = json_to_dataframe(json_data)
    print(df)

OUTPUT:

      id   type  name   ppu batters.batter.id batters.batter.type topping.id              topping.type
0   0001  donut  Cake  0.55              1001             Regular       5001                      None
1   0001  donut  Cake  0.55              1001             Regular       5002                    Glazed
2   0001  donut  Cake  0.55              1001             Regular       5005                     Sugar
3   0001  donut  Cake  0.55              1001             Regular       5007            Powdered Sugar
4   0001  donut  Cake  0.55              1001             Regular       5006  Chocolate with Sprinkles
5   0001  donut  Cake  0.55              1001             Regular       5003                 Chocolate
6   0001  donut  Cake  0.55              1001             Regular       5004                     Maple
7   0001  donut  Cake  0.55              1002           Chocolate       5001                      None
8   0001  donut  Cake  0.55              1002           Chocolate       5002                    Glazed
9   0001  donut  Cake  0.55              1002           Chocolate       5005                     Sugar
10  0001  donut  Cake  0.55              1002           Chocolate       5007            Powdered Sugar
11  0001  donut  Cake  0.55              1002           Chocolate       5006  Chocolate with Sprinkles
12  0001  donut  Cake  0.55              1002           Chocolate       5003                 Chocolate
13  0001  donut  Cake  0.55              1002           Chocolate       5004                     Maple
14  0001  donut  Cake  0.55              1003           Blueberry       5001                      None
15  0001  donut  Cake  0.55              1003           Blueberry       5002                    Glazed
16  0001  donut  Cake  0.55              1003           Blueberry       5005                     Sugar
17  0001  donut  Cake  0.55              1003           Blueberry       5007            Powdered Sugar
18  0001  donut  Cake  0.55              1003           Blueberry       5006  Chocolate with Sprinkles
19  0001  donut  Cake  0.55              1003           Blueberry       5003                 Chocolate
20  0001  donut  Cake  0.55              1003           Blueberry       5004                     Maple
21  0001  donut  Cake  0.55              1004        Devil's Food       5001                      None
22  0001  donut  Cake  0.55              1004        Devil's Food       5002                    Glazed
23  0001  donut  Cake  0.55              1004        Devil's Food       5005                     Sugar
24  0001  donut  Cake  0.55              1004        Devil's Food       5007            Powdered Sugar
25  0001  donut  Cake  0.55              1004        Devil's Food       5006  Chocolate with Sprinkles
26  0001  donut  Cake  0.55              1004        Devil's Food       5003                 Chocolate
27  0001  donut  Cake  0.55              1004        Devil's Food       5004                     Maple

The cross_join function does essentially the same thing as in the dataframe solution, but on plain lists of dictionaries instead of dataframes, which makes it much faster.
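A slightly condensed restatement of that function, with its behaviour checked on plain lists of dicts (the update call replaces the inner key-by-key copy loop, but the result is the same):

```python
from copy import deepcopy

def cross_join(left, right):
    # Every left row combined with every right row.
    # An empty right leaves the current result set unchanged.
    new_rows = [] if right else left
    for left_row in left:
        for right_row in right:
            temp_row = deepcopy(left_row)
            temp_row.update(right_row)
            new_rows.append(temp_row)
    return new_rows

left = [{'name': 'Cake'}]
right = [{'topping': 'Glazed'}, {'topping': 'Maple'}]
assert cross_join(left, right) == [
    {'name': 'Cake', 'topping': 'Glazed'},
    {'name': 'Cake', 'topping': 'Maple'},
]
assert cross_join(left, []) == left  # empty list passes rows through
```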

I added the flatten_list generator to make sure nested JSON arrays end up fully flattened into a single list of dictionaries, each dictionary carrying the heading accumulated from the levels above. It essentially mimics the pandas.concat behaviour from the first solution.
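For example, using the generator defined above:

```python
def flatten_list(data):
    # Recursively yields the non-list elements of arbitrarily nested lists.
    for elem in data:
        if isinstance(elem, list):
            yield from flatten_list(elem)
        else:
            yield elem

nested = [{'a': 1}, [{'a': 2}, [{'a': 3}]]]
assert list(flatten_list(nested)) == [{'a': 1}, {'a': 2}, {'a': 3}]
```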

The logic in the main function, json_to_dataframe, is the same as before; all that changed is that the operations previously delegated to dataframes are now plain functions.

Also, in the dataframe solution I was not prepending the parent heading to nested objects' keys, but unless you are 100% sure you have no conflicts in column names, doing so is pretty much mandatory.
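A tiny illustration using the two "id" fields from the example data above:

```python
# Both nested objects carry an "id" key. Without the parent heading,
# the later value silently overwrites the earlier one:
row = {}
row.update({'id': '1001'})   # from batters.batter
row.update({'id': '5001'})   # from topping -- clobbers '1001'
assert row == {'id': '5001'}

# Prefixing each key with its path keeps the columns distinct:
row = {}
row.update({'batters.batter.id': '1001'})
row.update({'topping.id': '5001'})
assert row == {'batters.batter.id': '1001', 'topping.id': '5001'}
```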

I hope this helps :).

EDIT: Modified the cross_join function to deal with the case when a nested list is empty, basically maintaining the previous result set unmodified. The output is unchanged even after adding the empty JSON list in the example JSON data. Thank you, @Nazmus Sakib for pointing it out.

Answer from Bogdan Mircea on Stack Overflow



For the JSON data you have given, you could do this by parsing the JSON structure to just return a list of all the leaf nodes.

This assumes that your structure is consistent throughout; if each entry can have different fields, see the second approach below.

For example:

import json
import csv

def get_leaves(item, key=None):
    if isinstance(item, dict):
        leaves = []
        for i in item.keys():
            leaves.extend(get_leaves(item[i], i))
        return leaves
    elif isinstance(item, list):
        leaves = []
        for i in item:
            leaves.extend(get_leaves(i, key))
        return leaves
    else:
        return [(key, item)]


with open('json.txt') as f_input, open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.writer(f_output)
    write_header = True

    for entry in json.load(f_input):
        leaf_entries = sorted(get_leaves(entry))

        if write_header:
            csv_output.writerow([k for k, v in leaf_entries])
            write_header = False

        csv_output.writerow([v for k, v in leaf_entries])

If your JSON data is a list of entries in the format you have given, then you should get output as follows:

address_line_1,company_number,country_of_residence,etag,forename,kind,locality,middle_name,month,name,nationality,natures_of_control,notified_on,postal_code,premises,region,self,surname,title,year
Address 1,12345678,England,26281dhge33b22df2359sd6afsff2cb8cf62bb4a7f00,John,individual-person-with-significant-control,Henley-On-Thames,M,2,John M Smith,Vietnamese,ownership-of-shares-50-to-75-percent,2016-04-06,RG9 1DP,161,Oxfordshire,/company/12345678/persons-with-significant-control/individual/bIhuKnFctSnjrDjUG8n3NgOrl,Smith,Mrs,1977
Address 1,12345679,England,26281dhge33b22df2359sd6afsff2cb8cf62bb4a7f00,John,individual-person-with-significant-control,Henley-On-Thames,M,2,John M Smith,Vietnamese,ownership-of-shares-50-to-75-percent,2016-04-06,RG9 1DP,161,Oxfordshire,/company/12345678/persons-with-significant-control/individual/bIhuKnFctSnjrDjUG8n3NgOrl,Smith,Mrs,1977
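On a small hand-made entry, get_leaves reduces everything to a flat list of (key, value) pairs (re-declared here so the snippet runs standalone):

```python
def get_leaves(item, key=None):
    # Walk the structure, remembering the nearest dict key on the way down.
    if isinstance(item, dict):
        leaves = []
        for k in item:
            leaves.extend(get_leaves(item[k], k))
        return leaves
    elif isinstance(item, list):
        leaves = []
        for elem in item:
            leaves.extend(get_leaves(elem, key))
        return leaves
    else:
        return [(key, item)]

entry = {'name': 'John', 'natures_of_control': ['a', 'b']}
assert sorted(get_leaves(entry)) == [
    ('name', 'John'),
    ('natures_of_control', 'a'),
    ('natures_of_control', 'b'),
]
```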

If each entry can contain different (or possibly missing) fields, then a better approach would be to use a DictWriter. In this case, all of the entries would need to be processed to determine the complete list of possible fieldnames so that the correct header can be written.

import json
import csv

def get_leaves(item, key=None):
    if isinstance(item, dict):
        leaves = {}
        for i in item.keys():
            leaves.update(get_leaves(item[i], i))
        return leaves
    elif isinstance(item, list):
        leaves = {}
        for i in item:
            leaves.update(get_leaves(i, key))
        return leaves
    else:
        return {key : item}


with open('json.txt') as f_input:
    json_data = json.load(f_input)

# First parse all entries to get the complete fieldname list
fieldnames = set()

for entry in json_data:
    fieldnames.update(get_leaves(entry).keys())

with open('output.csv', 'w', newline='') as f_output:
    csv_output = csv.DictWriter(f_output, fieldnames=sorted(fieldnames))
    csv_output.writeheader()
    csv_output.writerows(get_leaves(entry) for entry in json_data)
r/learnpython on Reddit: How to convert JSON to csv in Python?
June 17, 2024 -

Hey everybody,

I try to convert a huge file (~65 GB) into smaller subsets. The goal is to split the files into the smaller subsets e.g. 1 mil tweets per file and convert this data to a csv format. I currently have a working splitting code that splits the ndjson file to smaller ndjson files, but I have trouble to convert the data to csv. The important part is to create columns for each exsisting varaiable, so columns named __crawled_url or w1_balanced. There are quite a few nested variabels in the data, like w1_balanced is contained in the variable theme_topic, that need to be flattened.

Splitting code:

import json
#function to split big ndjson file to multiple smaller files
def split_file(input_file, lines_per_file): #variables that the function calls
    file_count = 0
    line_count = 0
    output_lines = []
    with open(input_file, 'r', encoding="utf8") as infile:
        for line in infile:
            output_lines.append(line)
            line_count += 1
            if line_count == lines_per_file:
                with open(f'1mio_split_{file_count}.ndjson', 'w', encoding="utf8") as outfile:
                    outfile.writelines(output_lines)
                file_count += 1
                line_count = 0
                output_lines = []
        #handle any remaining lines
        if output_lines:
            with open(f'1mio_split_{file_count}.ndjson', 'w',encoding="utf8") as outfile:
                outfile.writelines(output_lines)
#file containing tweets
input_file = input("path to big file:" )
#example filepath: C:/Users/YourName/Documents/tweet.ndjson
#how many lines/tweets should the new file contain?
lines_per_file = int(input ("Split after how many lines?: "))
split_file(input_file, lines_per_file)
print("Splitting done!")

Here are 2 sample lines from the data I use:

[{"__crawled_url":"https://twitter.com/example1","theme_topic":{"w1_balanced":{"label":"__label__a","confidence":0.3981},"w5_balanced":{"label":"__label__c","confidence":1}},"author":"author1","author_userid":"116718988","author_username":"author1","canonical_url":"https://twitter.com/example1","collected_by":"User","collection_method":"tweety 1.0.9.4","collection_time":"2024-05-27T14:40:32","collection_time_epoch":1716813632,"isquoted":false,"isreply":true,"isretweet":false,"language":"de","mentioning/replying":"twitteruser","num_likes":"0","num_retweets":"0","plain_text":"@twitteruser here is an exmaple text 🤔","published_time":"2024-04-18T20:14:51","published_time_epoch":1713471291,"published_time_original":"2024-04-18 20:14:51+00:00","replied_tweet":{"author":"Twitter User","author_userid":"1053198649700827136","author_username":"twitteruser"},"spacy_annotations":{"de_core_news_lg":{"noun_chunks":[{"text":"@twitteruser","start_char":0,"end_char":9},{"text":"more exapmle text","start_char":20,"end_char":34},{"text":"Gel","start_char":40,"end_char":43},{"text":"Haar","start_char":47,"end_char":51}],"named_entities":[{"text":"@twitteruser","start_char":0,"end_char":9,"label_":"MISC"}]},"xx_ent_wiki_sm":{"named_entities":{}},"da_core_news_lg":{"noun_chunks":{},"named_entities":{}},"en_core_web_lg":{"noun_chunks":{},"named_entities":{}},"fr_core_news_lg":{"noun_chunks":{},"named_entities":{}},"it_core_news_lg":{"noun_chunks":{},"named_entities":{}},"pl_core_news_lg":{"named_entities":{}},"es_core_news_lg":{"noun_chunks":{},"named_entities":{}},"fi_core_news_lg":{"noun_chunks":{},"named_entities":{}}},"tweet_id":"1781053802398814682","hashtags":{},"outlinks":{},"quoted_tweet":{"outlinks":{},"hashtags":{},"mentioning/replying":{},"replied_tweet":{}}}]

[{"__crawled_url":"https://twitter.com/example2","theme_topic":{"w1_balanced":{"label":"__label__a","confidence":0.3981},"w5_balanced":{"label":"__label__c","confidence":1}},"author":"author2","author_userid":"116712288","author_username":"author2","canonical_url":"https://twitter.com/example2","collected_by":"User","collection_method":"tweety 1.0.9.4","collection_time":"2024-05-27T14:40:32","collection_time_epoch":1716813632,"isquoted":false,"isreply":true,"isretweet":false,"language":"de","mentioning/replying":"twitteruser","num_likes":"0","num_retweets":"0","plain_text":"@twitteruser here is another exmaple text 🤔","published_time":"2024-04-18T20:14:51","published_time_epoch":1713471291,"published_time_original":"2024-04-18 20:14:51+00:00","replied_tweet":{"author":"Twitter User","author_userid":"1053198649700827136","author_username":"twitteruser"},"spacy_annotations":{"de_core_news_lg":{"noun_chunks":[{"text":"@twitteruser","start_char":0,"end_char":9},{"text":"more exapmle text","start_char":20,"end_char":34},{"text":"Gel","start_char":40,"end_char":43},{"text":"Haar","start_char":47,"end_char":51}],"named_entities":[{"text":"@twitteruser","start_char":0,"end_char":9,"label_":"MISC"}]},"xx_ent_wiki_sm":{"named_entities":{}},"da_core_news_lg":{"noun_chunks":{},"named_entities":{}},"en_core_web_lg":{"noun_chunks":{},"named_entities":{}},"fr_core_news_lg":{"noun_chunks":{},"named_entities":{}},"it_core_news_lg":{"noun_chunks":{},"named_entities":{}},"pl_core_news_lg":{"named_entities":{}},"es_core_news_lg":{"noun_chunks":{},"named_entities":{}},"fi_core_news_lg":{"noun_chunks":{},"named_entities":{}}},"tweet_id":"1781053802398814682","hashtags":{},"outlinks":{},"quoted_tweet":{"outlinks":{},"hashtags":{},"mentioning/replying":{},"replied_tweet":{}}}]

As you can see, the lines contain things like emojis and are in different languages, so the file must be opened with encoding="utf8". Below are a few examples of what I tried and the error messages I got. I should also mention that, since every line is its own list, accessing elements the way you would with a normal JSON object didn't work.
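Since each line of the file is a complete JSON document on its own (here, a one-element list), a minimal sketch of the workaround is to parse line by line instead of handing the whole file to `json.load`. The string below is just a stand-in for the real file's contents:

```python
import json

# Each line of the .ndjson file is its own JSON document (a one-element
# list), so parse the lines one at a time instead of using json.load().
ndjson_text = '[{"a": 1}]\n\n[{"a": 2}]\n'  # stand-in for the file contents

records = []
for line in ndjson_text.splitlines():
    if not line.strip():
        continue                      # skip blank lines between records
    records.extend(json.loads(line))  # each line parses to a list of dicts

print(records)  # [{'a': 1}, {'a': 2}]
```

This is exactly why `json.load` raises "Extra data": it expects a single JSON document and stops at the end of the first line.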

Thanks a lot for every answer, and even just for reading this post!

#try1
import json
import csv

data = "C:/Users/Sample-tweets.ndjson"
json_data = json.loads(data)  # parses the path string itself, not the file contents -> "Expecting value"
csv_file = "try3.csv"
csv_obj = open(csv_file, "w", encoding="utf8")
csv_writer = csv.writer(csv_obj)
header = json_data[0].keys()
csv_writer.writerow(header)
for item in json_data:
    csv_writer.writerow(item.values())
csv_obj.close()
#raise JSONDecodeError("Expecting value", s, err.value) from None
#json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)


#try2
import json
import csv

with open('Sample-tweets.ndjson', encoding="utf8") as ndfile:
    data = json.load(ndfile)

csv_data = data['emp_details']  # 'emp_details' is left over from the tutorial this was adapted from
data_file = open('try1.csv', 'w', newline='', encoding="utf8")
csv_writer = csv.writer(data_file)
count = 0
for emp in csv_data:
    if count == 0:
        header = emp.keys()
        csv_writer.writerow(header)  # this line was mis-indented, so the script wouldn't even run
        count += 1
    csv_writer.writerow(emp.values())
data_file.close()

with open('Sample-tweets.ndjson', encoding="utf8") as ndfile:
    jsondata = json.load(ndfile)  # fails here: the file holds one JSON document per line -> "Extra data"

data_file = open('try2.csv', 'w', newline='', encoding="utf8")
csv_writer = csv.writer(data_file)

count = 0
for item in jsondata:
    if count == 0:
        header = item.keys()
        csv_writer.writerow(header)
        count += 1
    csv_writer.writerow(item.values())
data_file.close()
#error message: raise JSONDecodeError("Extra data", s, end)
#json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 1908)



#try3, to see if parsing the whole file as one JSON document works
import json

with open('C:/Users/Sample1-tweets.ndjson', 'r', encoding="utf8") as f:
    json_in = f.read()
json_in = json.loads(json_in)  # same problem: more than one JSON document in the file
print(json_in[2])
#error message: raise JSONDecodeError("Extra data", s, end)
#json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 1908)
#->same error message as above
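Putting the pieces together, here is a sketch of the whole conversion I would expect to work for the file layout shown above (one JSON list of tweet dicts per line, utf8-encoded). It flattens nested dicts into dotted column names so csv gets plain scalar values; list values are written as-is, so csv will stringify them, and keys whose value is an empty dict simply drop out:

```python
import csv
import json

def flatten(obj, prefix=""):
    """Recursively flatten nested dicts into one dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat

def ndjson_to_csv(in_path, out_path):
    rows = []
    with open(in_path, encoding="utf8") as f:   # utf8 handles the emojis
        for line in f:
            if not line.strip():
                continue                        # skip blank lines
            for record in json.loads(line):     # each line is a list of dicts
                rows.append(flatten(record))
    # union of all keys, in first-seen order, as the CSV header
    header = list(dict.fromkeys(k for row in rows for k in row))
    with open(out_path, "w", newline="", encoding="utf8") as f:
        writer = csv.DictWriter(f, fieldnames=header, restval="")
        writer.writeheader()
        writer.writerows(rows)
```

`DictWriter` with `restval=""` covers records that lack some of the keys, which will happen when different tweets carry different nested fields.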