I found a quick and easy solution to what I wanted using json_normalize() included in pandas 1.01.
from urllib2 import Request, urlopen
import json
import pandas as pd
path1 = '42.974049,-81.205203|42.974298,-81.195755'
request=Request('http://maps.googleapis.com/maps/api/elevation/json?locations='+path1+'&sensor=false')
response = urlopen(request)
elevations = response.read()
data = json.loads(elevations)
df = pd.json_normalize(data['results'])
This gives a nice flattened dataframe with the json data that I got from the Google Maps API.
Answer from pbreach on Stack OverflowI found a quick and easy solution to what I wanted using json_normalize() included in pandas 1.01.
from urllib2 import Request, urlopen
import json
import pandas as pd
path1 = '42.974049,-81.205203|42.974298,-81.195755'
request=Request('http://maps.googleapis.com/maps/api/elevation/json?locations='+path1+'&sensor=false')
response = urlopen(request)
elevations = response.read()
data = json.loads(elevations)
df = pd.json_normalize(data['results'])
This gives a nice flattened dataframe with the json data that I got from the Google Maps API.
Check this snip out.
# reading the JSON data using json.load()
file = 'data.json'
with open(file) as train_file:
dict_train = json.load(train_file)
# converting json dataset from dictionary to dataframe
train = pd.DataFrame.from_dict(dict_train, orient='index')
train.reset_index(level=0, inplace=True)
Hope it helps :)
Convert Pandas DataFrame to JSON format - Stack Overflow
Python pandas: convert a list of JSON strings into a dataframe. How do I do that?
python - How to read a json data into a dataframe using pandas - Stack Overflow
Python - How to convert JSON File to Dataframe - Stack Overflow
Videos
In newer versions of pandas (0.20.0+, I believe), this can be done directly:
df.to_json('temp.json', orient='records', lines=True)
Direct compression is also possible:
df.to_json('temp.json.gz', orient='records', lines=True, compression='gzip')
The output that you get after DF.to_json is a string. So, you can simply slice it according to your requirement and remove the commas from it too.
out = df.to_json(orient='records')[1:-1].replace('},{', '} {')
To write the output to a text file, you could do:
with open('file_name.txt', 'w') as f:
f.write(out)
I need to convert a list of JSON strings into a dataframe. How do I do that?
you can use:
with open('json_example.json') as json_data:
data = json.load(json_data)
df=pd.DataFrame.from_dict(data,orient='index').reset_index().rename(columns={'index':'Sentence',0:'Label'})
Easy way that I remember
import pandas as pd
import json
with open("./data.json", "r") as f:
data = json.load(f)
df = pd.DataFrame({"Sentence": data.keys(), "Label": data.values()})
With read_json
To read straight from the file using read_json, you can use something like:
pd.read_json("./data.json", lines=True)\
.T\
.reset_index()\
.rename(columns={"index": "Sentence", 0: "Labels"})
Explanation
A little dirty but as you probably noticed, lines=True isn't completely sufficient so the above transposes the result so that you have
| (index) | 0 |
|---|---|
| Text1 | 4 |
| Text2 | 1 |
| TextN | 123 |
So then resetting the index moves the index over to be a column named "index" and then renaming the columns.
Creating dataframe from dictionary object.
import pandas as pd
data = [{'name': 'vikash', 'age': 27}, {'name': 'Satyam', 'age': 14}]
df = pd.DataFrame.from_dict(data, orient='columns')
df
Out[4]:
age name
0 27 vikash
1 14 Satyam
If you have nested columns then you first need to normalize the data:
data = [
{
'name': {
'first': 'vikash',
'last': 'singh'
},
'age': 27
},
{
'name': {
'first': 'satyam',
'last': 'singh'
},
'age': 14
}
]
df = pd.DataFrame.from_dict(pd.json_normalize(data), orient='columns')
df
Out[8]:
age name.first name.last
0 27 vikash singh
1 14 satyam singh
Source:
pandas.DataFrame.from_dictpandas.json_normalize
import pandas as pd
print(pd.json_normalize(your_json))
This will Normalize semi-structured JSON data into a flat table
Output
FirstName LastName MiddleName password username
John Mark Lewis 2910 johnlewis2
From your code, it looks like you're loading a JSON file which has JSON data on each separate line. read_json supports a lines argument for data like this:
data_df = pd.read_json('C:/Users/Alberto/nutrients.json', lines=True)
Note
Removelines=Trueif you have a single JSON object instead of individual JSON objects on each line.
Using the json module you can parse the json into a python object, then create a dataframe from that:
import json
import pandas as pd
with open('C:/Users/Alberto/nutrients.json', 'r') as f:
data = json.load(f)
df = pd.DataFrame(data)