Use DataFrame.groupby with GroupBy.apply and DataFrame.to_dict on all columns except Col1 (selected via Index.difference), create a DataFrame with DataFrame.reset_index, and finally use DataFrame.to_dict for dictionary output or DataFrame.to_json for JSON output:
cols = df.columns.difference(['Col1'])
d = (df.groupby('Col1')[cols]
       .apply(lambda x: x.to_dict('records'))
       .reset_index(name='Other_details')
       .to_dict(orient='records'))
cols = df.columns.difference(['Col1'])
d = (df.groupby('Col1')[cols]
       .apply(lambda x: x.to_dict('records'))
       .reset_index(name='Other_details')
       .to_json(orient='records'))
Answer from jezrael on Stack Overflow.
Use DataFrame.to_json with parameters orient='records' and lines=True:
df.to_json(file, orient='records', lines=True)
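A runnable sketch with a hypothetical two-row frame, printing to a string instead of a file so the result is visible:

```python
import pandas as pd

# Hypothetical two-row frame
df = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})

# lines=True emits one JSON object per line (JSON Lines / ndjson format)
s = df.to_json(orient='records', lines=True)
print(s)
```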
json_str_list = []
for i in df.index:
    json_str = df.loc[i].to_json()
    json_str_list.append(json_str)
The above code builds one JSON string per row. If you want a JSON object instead of a string, import json and do:
json_obj = json.loads(json_str)
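Put together with a hypothetical frame, the loop above yields one JSON string per row:

```python
import pandas as pd

# Hypothetical frame (not from the original answer)
df = pd.DataFrame({'a': [1, 2], 'b': ['x', 'y']})

json_str_list = []
for i in df.index:
    # Each row Series serializes with its column labels as keys
    json_str_list.append(df.loc[i].to_json())

print(json_str_list)
```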
Use Series.to_json, and if you need to change the key name, add rename:
print (k.set_index('A').rename(columns={'B':'index1'}).to_json())
{"index1":{"1":"a","2":"b","3":"c","4":"d"}}
If you need to export to a file:
k.set_index('A').rename(columns={'B':'index1'}).to_json('file.json')
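A self-contained sketch of the same pipeline, assuming a hypothetical frame `k` with columns A and B as in the answer's output:

```python
import pandas as pd

# Hypothetical frame matching the answer's `k`
k = pd.DataFrame({'A': [1, 2, 3, 4], 'B': ['a', 'b', 'c', 'd']})

# A becomes the JSON keys, B the values, renamed to index1
out = k.set_index('A').rename(columns={'B': 'index1'}).to_json()
print(out)
```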
Although what I am writing is not the answer to the question asked, I am providing a solution to a small problem I was facing, which I googled and reached here.
Problem: how to create a dictionary from a pandas DataFrame with a column as the key and a constant value (1 in my case) as the, you guessed it, value.
Solution:
f = pd.Series(data=[1] * df.shape[0], index=df['col_name'])
x = f.to_json()  # default orient='index' for a Series ('columns' is not valid here)
Output:
{"one":1, "two":1, "three": 1}
Why would I do that? Because lookup in a dictionary is highly optimized (yes, I could use a set as well).
P.S. Novice in Python so please be gentle with me :).
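A runnable variant of the same idea (hypothetical column values), using Series.to_dict to get a plain Python dict for fast membership tests:

```python
import pandas as pd

# Hypothetical column of keys
df = pd.DataFrame({'col_name': ['one', 'two', 'three']})

# Broadcast the scalar 1 across an index built from the column values
f = pd.Series(1, index=df['col_name'])
membership = f.to_dict()
print(membership)
```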
Pandas is equipped for this out of the box.
pandas.DataFrame.to_json
Here is the example DataFrame:
import json
import pandas as pd

df = pd.DataFrame(
    [["a", "b"], ["c", "d"]],
    index=["row 1", "row 2"],
    columns=["col 1", "col 2"],
)
Here is the result using to_json():
result = df.to_json(orient="split")
parsed = json.loads(result)
print(json.dumps(parsed, indent=4))
{
    "columns": [
        "col 1",
        "col 2"
    ],
    "index": [
        "row 1",
        "row 2"
    ],
    "data": [
        [
            "a",
            "b"
        ],
        [
            "c",
            "d"
        ]
    ]
}
Here is the link: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_json.html
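For comparison, a sketch of the same frame serialized under a few other orients (variable names here are illustrative):

```python
import json
import pandas as pd

df = pd.DataFrame(
    [["a", "b"], ["c", "d"]],
    index=["row 1", "row 2"],
    columns=["col 1", "col 2"],
)

# records: list of row objects; index: keyed by row label; columns: keyed by column label
records = json.loads(df.to_json(orient="records"))
by_index = json.loads(df.to_json(orient="index"))
by_columns = json.loads(df.to_json(orient="columns"))
print(records)
print(by_index)
print(by_columns)
```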
As per the function provided by @Parsa T. here, you can just change the column names and use the function to get the required result.
def set_for_keys(my_dict, key_arr, val):
    """
    Set value at the path in my_dict defined by the string (or serializable object) array key_arr
    """
    current = my_dict
    for i in range(len(key_arr)):
        key = key_arr[i]
        if key not in current:
            current[key] = val if i == len(key_arr) - 1 else {}
        else:
            if type(current[key]) is not dict:
                print("Given dictionary is not compatible with key structure requested")
                raise ValueError("Dictionary key already occupied")
        current = current[key]
    return my_dict

def to_formatted_json(df, sep="."):
    result = []
    for _, row in df.iterrows():
        parsed_row = {}
        for idx, val in row.items():  # iteritems() was removed in pandas 2.0
            keys = idx.split(sep)
            parsed_row = set_for_keys(parsed_row, keys, val)
        result.append(parsed_row)
    return result
df.columns = ['ID', 'PERSONAL.NAME', 'PERSONAL.LAST', 'GEO.ADDRESS', 'GEO.COUNTY']
#Where df was parsed from json-dict using json_normalize
print(to_formatted_json(df, sep="."))
OUTPUT:
[{'ID': '0',
'PERSONAL': {'NAME': 'jimmy', 'LAST': 'neutron'},
'GEO': {'ADDRESS': '101 ocean avenue', 'COUNTY': 'yellow card park'}},
{'ID': '1',
'PERSONAL': {'NAME': 'james', 'LAST': 'baxter'},
'GEO': {'ADDRESS': '202 bubble gum county', 'COUNTY': 'candy kingdom'}},
{'ID': '2',
'PERSONAL': {'NAME': 'joben', 'LAST': 'segel'},
'GEO': {'ADDRESS': '303 china town', 'COUNTY': 'universal studio'}}]
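For reference, dotted column names like the ones to_formatted_json consumes are exactly what pd.json_normalize produces from nested records; a sketch of that inverse direction with one record of the shape shown above:

```python
import pandas as pd

# One nested record matching the output above
nested = [
    {"ID": "0",
     "PERSONAL": {"NAME": "jimmy", "LAST": "neutron"},
     "GEO": {"ADDRESS": "101 ocean avenue", "COUNTY": "yellow card park"}},
]

# json_normalize flattens nested keys into dotted column names
flat = pd.json_normalize(nested, sep=".")
print(sorted(flat.columns))
```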
Simple solution
a={'data':
[
{'Region': 'West', 'Airport': 'LAX', 'Score': 3, 'index': 0},
{'Region': 'West', 'Airport': 'SFO', 'Score': 6, 'index': 1},
{'Region': 'East', 'Airport': 'YYZ', 'Score': 9, 'index': 2}
]
}
pd.DataFrame(a['data'])
You can also read the JSON data back directly:
pd.read_json(your_json,orient='split')
You can also use the built-in json_normalize in pandas (pd.io.json.json_normalize is deprecated in favor of pd.json_normalize):
pd.json_normalize(a, 'data')
Airport Region Score index
0 LAX West 3 0
1 SFO West 6 1
2 YYZ East 9 2
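A runnable read-back sketch of the records above; note that recent pandas versions expect a file-like object rather than a raw JSON string, hence the StringIO wrapper:

```python
import json
from io import StringIO
import pandas as pd

# Same payload as above
a = {'data': [
    {'Region': 'West', 'Airport': 'LAX', 'Score': 3, 'index': 0},
    {'Region': 'West', 'Airport': 'SFO', 'Score': 6, 'index': 1},
    {'Region': 'East', 'Airport': 'YYZ', 'Score': 9, 'index': 2},
]}

# Wrap the JSON string in StringIO to avoid the literal-string deprecation
df = pd.read_json(StringIO(json.dumps(a['data'])), orient='records')
print(df)
```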
Use json.loads or ast.literal_eval to convert the strings to lists of dicts:
import ast, json
import pandas as pd

df = pd.DataFrame(rows)
df['Sales_Plan_Details'] = df['Sales_Plan_Details'].apply(json.loads)
#alternative solution
#df['Sales_Plan_Details'] = df['Sales_Plan_Details'].apply(ast.literal_eval)
j = df.to_json(orient='records')
print (j)
[{"Sales_Plan_Details":[{"Month":"2019-1","Quantity":10,"Product_Gid":3}],
"customer_name":"ABI2","employee_name":"ASU2","location_name":"Cherai2"},
{"Sales_Plan_Details":[{"Month":"2019-1","Quantity":10,"Product_Gid":3}],
"customer_name":"ABI","employee_name":"ASU","location_name":"Cherai"}]
Setup:
rows= [{
"customer_name": "ABI2",
"location_name": "Cherai2",
"employee_name": "ASU2",
"Sales_Plan_Details": "[{\"Month\": \"2019-1\", \"Quantity\": 10, \"Product_Gid\": 3}]"
},
{
"customer_name": "ABI",
"location_name": "Cherai",
"employee_name": "ASU",
"Sales_Plan_Details": "[{\"Month\": \"2019-1\", \"Quantity\": 10, \"Product_Gid\": 3}]"
}]
You can use list comprehensions to map the Sales_Plan_Details values.
You can use json.loads() to deserialize the list value from the string.
import json
dataframe_json = [
{
"customer_name": "ABI2",
"location_name": "Cherai2",
"employee_name": "ASU2",
"Sales_Plan_Details": "[{\"Month\": \"2019-1\", \"Quantity\": 10, \"Product_Gid\": 3}]"
},
{
"customer_name": "ABI",
"location_name": "Cherai",
"employee_name": "ASU",
"Sales_Plan_Details": "[{\"Month\": \"2019-1\", \"Quantity\": 10, \"Product_Gid\": 3}]"
}]
# get the "Sales_Plan_Details" key values from the list
sales_plan_details_nested_list = [json.loads(item["Sales_Plan_Details"]) for item in dataframe_json]
# flatten the list
sales_plan_details_list = [item for sublist in sales_plan_details_nested_list for item in sublist]
# pretty print the list now
print(json.dumps(sales_plan_details_list, indent=True))
JSON documents are treated as dicts in Python; the JSON you specified has duplicate keys, so it can only be produced as a string (and not with the Python json library). The following code:
import json
from io import StringIO

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(1, 10).reshape((3, 3)), columns=['col1', 'col2', 'col3'])
io = StringIO()
df.to_json(io, orient='columns')
parsed = json.loads(io.getvalue())
with open("pretty.json", '+w') as of:
json.dump(parsed, of, indent=4)
will produce the following JSON:
{
    "col1": {
        "0": 1,
        "1": 4,
        "2": 7
    },
    "col2": {
        "0": 2,
        "1": 5,
        "2": 8
    },
    "col3": {
        "0": 3,
        "1": 6,
        "2": 9
    }
}
which you could later load into Python. Alternatively, this script will produce exactly the string you want:
with open("exact.json", "w+") as of:
of.write('[\n\t{\n' + '\t},\n\t{\n'.join(["".join(["\t\t\"%s\": %s,\n"%(c, df[c][i]) for i in df.index]) for c in df.columns])+'\t}\n]')
and the output would be:
[
    {
        "col1": 1,
        "col1": 4,
        "col1": 7,
    },
    {
        "col2": 2,
        "col2": 5,
        "col2": 8,
    },
    {
        "col3": 3,
        "col3": 6,
        "col3": 9,
    }
]
You need to do
df.to_json('file.json', orient='records')
Note that this will give you an array of objects, one object per row (each object holds all columns of that row):
[
    {
        "col1": 1,
        "col2": 2,
        "col3": 3
    },
    {
        "col1": 4,
        "col2": 5,
        "col3": 6
    },
    {
        "col1": 7,
        "col2": 8,
        "col3": 9
    }
]
You can also do
df.to_json('file.json', orient='records', lines=True)
if you want one JSON object per line (no commas between lines):
{"col1":1,"col2":2,"col3":3}
{"col1":4,"col2":5,"col3":6}
{"col1":7,"col2":8,"col3":9}
To prettify the output, install the jq command-line tool (e.g. via your system package manager; note that pip install jq installs Python bindings, not the CLI) and run:
cat file.json | jq '.' > new_file.json
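If jq isn't available, the standard library's json module can prettify the content just as well; a minimal sketch with a hypothetical compact payload:

```python
import json

# Hypothetical compact JSON, as to_json would write it
compact = '[{"col1":1,"col2":2,"col3":3},{"col1":4,"col2":5,"col3":6}]'

# Round-trip through loads/dumps to add indentation
pretty = json.dumps(json.loads(compact), indent=4)
print(pretty)
```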