There are many possible solutions. Generally though, you'll probably want to:

  1. Not loop over fields; instead let Pandas split the fields for you
  2. Use an actual missing value
    • But later if you want to represent it differently, you can do that, e.g. using the na_rep parameter to df.style.format
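As a minimal sketch of point 2, assuming a small hypothetical DataFrame: the data keeps real NaN values, and a placeholder is swapped in only when rendering. This uses DataFrame.to_string, which also accepts na_rep, instead of df.style.format, so that it runs without the Styler's optional dependency.

```python
import numpy as np
import pandas as pd

# Hypothetical data: missing entries are stored as real NaN, not as a string like "N/A"
df = pd.DataFrame({"field1": ["value1", np.nan], "field2": [np.nan, "value2"]})

# The placeholder is applied only when rendering; the data itself is unchanged
rendered = df.to_string(na_rep="-")
print(rendered)

# isna() still finds the missing cells because they were never overwritten
missing_count = int(df.isna().sum().sum())
print(missing_count)
```

Keeping NaN in the data means methods such as isna, fillna and dropna keep working, which would not be the case with a placeholder string.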

For the first step, you can look at Split / Explode a column of dictionaries into separate columns with pandas. I'll use Lech Birek's solution (json_normalize) then drop the "id" columns and rename the "value" columns.

headers_mapping = {'1': 'field1', '2': 'field2', '3': 'field3', '4': 'field4'}
(
    pd.json_normalize(df['json_field'])
    .filter(like='value')
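    # removesuffix (Python 3.9+) drops the exact suffix; rstrip would strip a set of characters instead
    .rename(columns=lambda label: headers_mapping[label.removesuffix('.value')])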
)
   field1  field2  field3  field4
0  value1  value2     NaN     NaN
1  value1     NaN  value3     NaN
2     NaN     NaN  value3  value4

If you also need to sort the columns, tack this on at the end:

.reindex(columns=headers_mapping.values())
Answer from wjandrea on Stack Overflow
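The answer above does not show its input, so here is a self-contained sketch of the same json_normalize approach. The contents of json_field (nested dictionaries with "id" and "value" keys) and the "id" values are assumptions inferred from the output shown, not the asker's actual data.

```python
import pandas as pd

# Hypothetical input: each cell maps a field number to an {"id", "value"} dictionary
df = pd.DataFrame({
    "json_field": [
        {"1": {"id": "a", "value": "value1"}, "2": {"id": "b", "value": "value2"}},
        {"1": {"id": "a", "value": "value1"}, "3": {"id": "c", "value": "value3"}},
        {"3": {"id": "c", "value": "value3"}, "4": {"id": "d", "value": "value4"}},
    ]
})
headers_mapping = {"1": "field1", "2": "field2", "3": "field3", "4": "field4"}

wide = (
    pd.json_normalize(df["json_field"].tolist())  # columns such as "1.id", "1.value"
    .filter(like="value")                         # keep only the "*.value" columns
    .rename(columns=lambda label: headers_mapping[label.split(".")[0]])
    .reindex(columns=list(headers_mapping.values()))  # fix the column order
)
print(wide)
```

Fields that are absent from a row come out as NaN without any explicit loop over the fields.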
Stack Overflow
stackoverflow.com › questions › 64916148 › how-to-split-a-json-string-column-in-pandas-spark-dataframe
python - How to split a json string column in pandas/spark dataframe? - Stack Overflow
import json
import sys

import pandas as pd

raw_data = [
    {'id': 1, 'name': 'NATALIE', 'json_result': '{"0": {"_source": {"person_id": 101, "firstname": "NATALIE", "lastname": "OSHO", "city_name": "WESTON"}}}'},
    {'id': 2, 'name': 'MARK', 'json_result': '{"0": {"_source": {"person_id": 102, "firstname": "MARK", "lastname": "BROWN", "city_name": "NEW YORK"}}}'},
    {'id': 3, 'name': 'NANCY', 'json_result': '{"0": {"_source": {"person_id": 103, "firstname": "NANCY", "lastname": "GATES", "city_name": "LA"}}}'},
]
df = pd.DataFrame.from_dict(raw_data)

# Parse each JSON string and flatten it into a one-row DataFrame
ser = df['json_result'].apply(lambda s: pd.json_normalize(json.loads(s)))

a = df.drop(columns=['json_result'])
b = pd.concat(list(ser), ignore_index=True)
c = a.join(b)

c.to_csv(sys.stdout, index=False)
Stack Overflow
stackoverflow.com › questions › 54971005 › splitting-a-pandas-data-frames-column-containing-json-data-into-multiple-column
python 3.x - Splitting a pandas data frame's column containing json data into multiple columns - Stack Overflow
March 4, 2019 - I loaded and normalized a json data as: json_string = json.loads(data) df_norm = json_normalize(json_string, errors='ignore') Say it has now 2 columns: Group Members A [{'id':'1', '
Top answer
1 of 4
1

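You should pass ignore_index=True to explode (available since pandas 1.1.0) so that the rows get a fresh 0..n-1 index. Otherwise the exploded rows keep duplicated index labels and the following join is misaligned.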

df = pd.DataFrame(data).explode('countries', ignore_index=True)
df = df.join(pd.json_normalize(df.pop('countries')))
print(df)
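A minimal sketch of this answer with made-up data, since the original data variable is not shown. The country names and populations below are placeholders.

```python
import pandas as pd

# Hypothetical input: one row per year, each holding a list of country dictionaries
data = [
    {"year": 1800, "countries": [{"country": "A", "population": 1},
                                 {"country": "B", "population": 2}]},
    {"year": 1900, "countries": [{"country": "A", "population": 3},
                                 {"country": "B", "population": 4}]},
]

# ignore_index=True renumbers the exploded rows 0..3 instead of 0, 0, 1, 1
df = pd.DataFrame(data).explode("countries", ignore_index=True)

# json_normalize returns a fresh 0..3 index, which now lines up with df for the join
countries = df.pop("countries")
flat = df.join(pd.json_normalize(countries.tolist()))
print(flat)
```

Without ignore_index=True, the join would match the labels 0 and 1 of the exploded frame against rows 0 and 1 of the normalized frame, so each year would be paired with the wrong country rows.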
2 of 4
0

You could try this with explode:

df = df.explode('countries')
# Add each row's year to its dictionary under the key 'year'
df['countries'] = [{**dc, 'year': y} for dc, y in zip(df['countries'], df['year'])]
pd.DataFrame(df['countries'].tolist())

Example:

j = [{'continent': 'europe',
      'country': 'Yugoslavia',
      'income': None,
      'life_exp': None,
      'population': 4687422},
     {'continent': 'asia',
      'country': 'United Korea (former)',
      'income': None,
      'life_exp': None,
      'population': 13740000}]
df=pd.DataFrame({'countries':[j,j],'year':[1800,1900]})
print(df)

df=df.explode('countries')
print(df)

#Here we add the key 'year' with the respective year row value to each dictionary
df['countries']=[{**dc,**{'year':y}} for dc,y in zip(df['countries'],df['year'])]
print(df['countries'])

finaldf=pd.DataFrame(df['countries'].tolist())
print(finaldf)

Output:

original df:
                                           countries  year
0  [{'continent': 'europe', 'country': 'Yugoslavi...  1800
1  [{'continent': 'europe', 'country': 'Yugoslavi...  1900


    

df(after explode): 
                                                                                            
                                           countries  year
0  {'continent': 'europe', 'country': 'Yugoslavia...  1800
0  {'continent': 'asia', 'country': 'United Korea...  1800
1  {'continent': 'europe', 'country': 'Yugoslavia...  1900
1  {'continent': 'asia', 'country': 'United Korea...  1900


df.countries(with year added):
0    {'continent': 'europe', 'country': 'Yugoslavia', 'income': None, 'life_exp': None, 'population': 4687422, 'year': 1800}
0    {'continent': 'asia', 'country': 'United Korea (former)', 'income': None, 'life_exp': None, 'population': 13740000, 'year': 1800}
1    {'continent': 'europe', 'country': 'Yugoslavia', 'income': None, 'life_exp': None, 'population': 4687422, 'year': 1900}
1    {'continent': 'asia', 'country': 'United Korea (former)', 'income': None, 'life_exp': None, 'population': 13740000, 'year': 1900}
Name: countries, dtype: object

finaldf
  continent                country income life_exp  population  year
0    europe             Yugoslavia   None     None     4687422  1800
1      asia  United Korea (former)   None     None    13740000  1800
2    europe             Yugoslavia   None     None     4687422  1900
3      asia  United Korea (former)   None     None    13740000  1900
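As an alternative sketch, not part of the answer above, pd.json_normalize can do the explode and the year bookkeeping in one call through its record_path and meta parameters. The data mirrors the example in the answer.

```python
import pandas as pd

countries = [
    {"continent": "europe", "country": "Yugoslavia", "income": None,
     "life_exp": None, "population": 4687422},
    {"continent": "asia", "country": "United Korea (former)", "income": None,
     "life_exp": None, "population": 13740000},
]
df = pd.DataFrame({"countries": [countries, countries], "year": [1800, 1900]})

# record_path names the list to unpack into rows; meta names the parent
# fields to repeat on every unpacked row
flat = pd.json_normalize(df.to_dict("records"), record_path="countries", meta=["year"])
print(flat)
```

The meta columns come back with object dtype, so a cast such as flat["year"].astype(int) may be wanted afterwards.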
CopyProgramming
copyprogramming.com › howto › pandas-dataframe-split-json-into-columns
Json: Splitting JSON into columns in a Pandas dataframe
June 24, 2023 - Extracting nested JSON/dictionary from a Pandas dataframe and separating them into individual columns · A guide on transforming json data into multiple columns for easy splitting
YouTube
youtube.com › watch
How to Split JSON String Column in a Pandas DataFrame into Multiple Columns - YouTube
March 17, 2025 - Discover how to effectively `split a JSON string column` in a Pandas DataFrame into multiple columns using Python. Free up your data processes with this simp...
Top answer
1 of 2
1

I hope I've understood your question well. Try:

from ast import literal_eval

import pandas as pd

df["experimental_properties"] = df["experimental_properties"].apply(
    lambda x: {d["name"]: d["property"] for d in literal_eval(x)}
)
df = pd.concat([df, df.pop("experimental_properties").apply(pd.Series)], axis=1)

print(df)

Prints:

            Boiling Point                                Density
0                115.3 °C                                    NaN
1  91 °C @ Press: 20 Torr                                    NaN
2  58 °C @ Press: 12 Torr  0.8753 g/cm<sup>3</sup> @ Temp: 20 °C
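The snippet above assumes an existing df. A self-contained sketch follows, with two sample strings in the format shown in the other answer to this question. It builds the result with the DataFrame constructor instead of .apply(pd.Series), a swapped-in variant that tends to be faster on large frames.

```python
from ast import literal_eval

import pandas as pd

# Sample cells: Python-literal strings (single quotes), so literal_eval rather than json.loads
df = pd.DataFrame({"experimental_properties": [
    "[{'name': 'Boiling Point', 'property': '115.3 °C', 'sourceNumber': 1}]",
    "[{'name': 'Boiling Point', 'property': '58 °C @ Press: 12 Torr', 'sourceNumber': 1}, "
    "{'name': 'Density', 'property': '0.8753 g/cm<sup>3</sup> @ Temp: 20 °C', 'sourceNumber': 1}]",
]})

# Parse each cell into a list of dictionaries
parsed = df["experimental_properties"].apply(literal_eval)

# One {name: property} dictionary per row, then one column per distinct name
props = pd.DataFrame(
    [{d["name"]: d["property"] for d in row} for row in parsed],
    index=df.index,
)
print(props)
```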
2 of 2
0

Is the expected output really what you are looking for? Another way to visualise the data would be to have "name", "property", and "sourceNumber" as column names.

import json
import pandas as pd

data = [
'''[{'name': 'Boiling Point', 'property': '115.3 °C', 'sourceNumber': 1}]''',
'''[{'name': 'Boiling Point', 'property': '91 °C @ Press: 20 Torr', 'sourceNumber': 1}]''',
'''[{'name': 'Boiling Point', 'property': '58 °C @ Press: 12 Torr', 'sourceNumber': 1}, {'name': 'Density', 'property': '0.8753 g/cm<sup>3</sup> @ Temp: 20 °C', 'sourceNumber': 1}]''']

# Parse each string into a list of dictionaries.
# Swapping quotes only works while no value contains an apostrophe.
naiveList = []
for raw in data:
    naiveList.append(json.loads(raw.replace("'", '"')))

# Flatten the nested lists into a single list of dictionaries
newListOfDictionaries = []
for records in naiveList:
    for record in records:
        newListOfDictionaries.append(record)

df = pd.DataFrame(newListOfDictionaries)
print(df)

Which gives you

            name                               property  sourceNumber
0  Boiling Point                               115.3 °C             1
1  Boiling Point                 91 °C @ Press: 20 Torr             1
2  Boiling Point                 58 °C @ Press: 12 Torr             1
3        Density  0.8753 g/cm<sup>3</sup> @ Temp: 20 °C             1
Stack Overflow
stackoverflow.com › questions › 73420669 › how-to-extract-database-column-in-json-format-into-multiple-columns-in-dataframe
python - how to extract database column in json format into multiple columns in dataframe - Stack Overflow
I have a database column that's been converted to a Pandas dataframe and it looks like the example below. My actual data has many more columns and rows with different key: value pairs.

df["Records"]
{"ID":"1","ID_1":"40309","type":"type1"}
{"ID":"2","ID_1":"40310","type":"type1"}
{"ID":"3","ID_1":"40311","type":"type1"}

I want to split this into multiple columns in a dataframe.

df1:
ID  ID_1   type
1   40309  type1
2   40310  type1
3   40311  type1

...

json_Str = df.to_dict()
json_dump = json.dumps(json_Str)
json_dump = json_dump.replace("\\", "")
with open("H:\\df2.json", 'w') as fp:
    # json.dump(result, fp, indent=4)
    print(json_dump, file=fp)
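One possible way to get the df1 layout described in this question, sketched under the assumption that every cell of Records is a valid JSON string:

```python
import json

import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({"Records": [
    '{"ID":"1","ID_1":"40309","type":"type1"}',
    '{"ID":"2","ID_1":"40310","type":"type1"}',
    '{"ID":"3","ID_1":"40311","type":"type1"}',
]})

# Parse each string into a dictionary, then let the constructor build one column per key
parsed = df["Records"].apply(json.loads)
df1 = pd.DataFrame(parsed.tolist(), index=df.index)
print(df1)
```

The values stay strings because they are quoted in the JSON; convert them with astype or pd.to_numeric if numbers are needed.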
DataScientYst
datascientyst.com › normalize-json-dict-new-columns-pandas
How to Normalize JSON or Dict to New Columns in Pandas
April 9, 2024 - To convert JSON, dicts and lists to tabular form, we can use several different options. Let's cover the most popular of them in the next steps. In the post, we'll use the following DataFrame, which has columns: ...

import pandas as pd

data = {
    'col_json': {0: {'x': 1, 'y': 0, 'xy': 1},
                 1: {'x': 0, 'y': 1, 'xy': 1},
                 2: {'x': 1, 'y': 1, 'xy': 1}},
    'col_str_dict': {0: "{'x': 1, 'y': 0, 'xy':1}",
                     1: "{'x': 0, 'y': 1, 'xy':1}",
                     2: "{'x': 1, 'y': 1, 'xy':1}"},
    'col_str_dict_list': {0: '{"x": [1,1]}',
                          1: '{"x": [0,1]}',
                          2: '{"x": [1,0]}'},
}
df = pd.DataFrame(data)
df
Top answer
1 of 3
1

You can use an RDD to extract the column names and the data, then create a DataFrame from them.

rdd = sc.textFile('test.txt')

import json
cols = rdd.map(lambda x: json.loads(x)['columns']).take(1)[0]
data = rdd.map(lambda x: json.loads(x)['data']).take(1)[0]

df = spark.createDataFrame(data, cols)
df.show(truncate=False)

+--------------+-----------+--------------+---------------------+------------+-----------------+-------------------------------+-----------+--------------+---------+------------+----------------------+------------+-------------+---------------------------+------------------+----------------------+-------------+----------------------------+-------------------------------+-------------------------+-----------+------------------+-------------------+--------------------------------+-----------------------+---------------------------+------------------------------------+------------------+-------------------+------------------------------------+---------------------------------+------------------------------+----------------+--------------+
|ApplicationNum|eads59Us01S|HouseDeal_flag|Liability_Asset_Ratio|CBRAvailPcnt|CMSFairIsaacScore|OweTaxes_or_IRAWithdrawalHistry|eads14Fi02S|GuarantorCount|CBRRevMon|CBRInstalMon|CMSApprovedToRequested|SecIncSource|eads59Us01S_4|Liability_Asset_Ratio_40_90|CBRAvailPcnt_20_95|CMSFairIsaacScore_Fund|eads14Fi02S_2|InstalMonthlyPayments_400_3k|RevolvingMonthlyPayments_1k_cap|ApprovedToRequested_0_100|NoSecIncome|coef_eads59Us01S_4|coef_HouseDeal_flag|coef_Liability_Asset_Ratio_40_90|coef_CBRAvailPcnt_20_95|coef_CMSFairIsaacScore_Fund|coef_OweTaxes_or_IRAWithdrawalHistry|coef_eads14Fi02S_2|coef_GuarantorCount|coef_RevolvingMonthlyPayments_1k_cap|coef_InstalMonthlyPayments_400_3k|coef_ApprovedToRequested_0_100|coef_NoSecIncome|coef_Intercept|
+--------------+-----------+--------------+---------------------+------------+-----------------+-------------------------------+-----------+--------------+---------+------------+----------------------+------------+-------------+---------------------------+------------------+----------------------+-------------+----------------------------+-------------------------------+-------------------------+-----------+------------------+-------------------+--------------------------------+-----------------------+---------------------------+------------------------------------+------------------+-------------------+------------------------------------+---------------------------------+------------------------------+----------------+--------------+
|569325.0      |2          |0.0           |1                    |92          |825              |0.0                            |4          |1.0           |74       |854         |0.51                  |2           |2.0          |0.9                        |92.0              |825.0                 |4.0          |854.0                       |1000.0                         |0.51                     |0.0        |0.11716245        |0.299528064        |0.392119645                     |-0.010826643           |-0.004957868               |0.339407077                         |0.061509795       |0.3685047          |1.67603E-4                          |2.25742E-4                       |0.902205454                   |-0.371734864    |2.788087559   |
+--------------+-----------+--------------+---------------------+------------+-----------------+-------------------------------+-----------+--------------+---------+------------+----------------------+------------+-------------+---------------------------+------------------+----------------------+-------------+----------------------------+-------------------------------+-------------------------+-----------+------------------+-------------------+--------------------------------+-----------------------+---------------------------+------------------------------------+------------------+-------------------+------------------------------------+---------------------------------+------------------------------+----------------+--------------+
2 of 3
1

You can use json.loads to turn the JSON string into a dictionary of column-data pairs, then create new columns from that dictionary with .apply(pd.Series).

import json
import pandas as pd

df = pd.DataFrame([["""{"columns":["ApplicationNum","eads59Us01S","HouseDeal_flag","Liability_Asset_Ratio","CBRAvailPcnt","CMSFairIsaacScore","OweTaxes_or_IRAWithdrawalHistry","eads14Fi02S","GuarantorCount","CBRRevMon","CBRInstalMon","CMSApprovedToRequested","SecIncSource","eads59Us01S_4","Liability_Asset_Ratio_40_90","CBRAvailPcnt_20_95","CMSFairIsaacScore_Fund","eads14Fi02S_2","InstalMonthlyPayments_400_3k","RevolvingMonthlyPayments_1k_cap","ApprovedToRequested_0_100","NoSecIncome","coef_eads59Us01S_4","coef_HouseDeal_flag","coef_Liability_Asset_Ratio_40_90","coef_CBRAvailPcnt_20_95","coef_CMSFairIsaacScore_Fund","coef_OweTaxes_or_IRAWithdrawalHistry","coef_eads14Fi02S_2","coef_GuarantorCount","coef_RevolvingMonthlyPayments_1k_cap","coef_InstalMonthlyPayments_400_3k","coef_ApprovedToRequested_0_100","coef_NoSecIncome","coef_Intercept"],"data":[[569325.0,2,0.0,1,92,825,0.0,4,1.0,74,854,0.51,2,2.0,0.9,92.0,825.0,4.0,854.0,1000.0,0.51,0.0,0.11716245,0.299528064,0.392119645,-0.010826643,-0.004957868,0.339407077,0.061509795,0.3685047,0.000167603,0.000225742,0.902205454,-0.371734864,2.788087559]]}"""]], columns=['json_string'])
df['json_loads'] = df['json_string'].apply(json.loads)
df['column_names'] = df['json_loads'].apply(lambda x: x['columns'])
df['data'] = df['json_loads'].apply(lambda x: x['data'][0])

# turning it into a dictionary
df['dict_values']=df.apply(lambda x: dict(zip(x['column_names'],x['data'])), axis=1)

df = pd.concat([df, df['dict_values'].apply(pd.Series)], axis=1)

print(df.head())
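When the whole payload is a single JSON string with "columns" and "data" keys, as in this question, a shorter route is to hand both parts straight to the DataFrame constructor. This is a sketch with the payload shortened to its first three columns for readability.

```python
import json

import pandas as pd

# Shortened version of the payload from the question (first three columns only)
json_string = (
    '{"columns": ["ApplicationNum", "eads59Us01S", "HouseDeal_flag"], '
    '"data": [[569325.0, 2, 0.0]]}'
)

payload = json.loads(json_string)

# "data" is a list of rows and "columns" holds the matching header names
df = pd.DataFrame(payload["data"], columns=payload["columns"])
print(df)
```

This avoids the intermediate helper columns and the row-wise apply, at the cost of handling one JSON string at a time.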
Stack Overflow
stackoverflow.com › questions › 61815837 › how-to-split-json-column
python - How to split JSON column - Stack Overflow
1 Separate JSON elements into columns of pandas dataframe · 0 split json within dataframe column into multiple column in python · 2 How to split a json response into different columns with pandas? 1 Split out nested json/dictionary from Pandas dataframe into separate columns ·
Stack Overflow
stackoverflow.com › questions › 38752135 › python2-7-how-to-split-a-column-into-multiple-column-based-on-special-strings-l
python - Python2.7: How to split a column into multiple column based on special strings like this? - Stack Overflow
August 4, 2016 - I was thinking of using str.split functions to split it into pieces and merge everything later. But I'm not sure that is the best way to go, and I wanted to see if there is a more sophisticated way to make a dataframe like this. Any advice is appreciated! ... When print(dframe['info']), it shows like this. ... Please don't use images to share data. ... It looks like the content of the info column is JSON-formatted, so you can parse that into a dict object easily: