I found a quick and easy solution to what I wanted using json_normalize(), which is included in pandas 1.0.1.
from urllib.request import Request, urlopen  # Python 3 location of the old urllib2 names
import json
import pandas as pd
path1 = '42.974049,-81.205203|42.974298,-81.195755'
request=Request('http://maps.googleapis.com/maps/api/elevation/json?locations='+path1+'&sensor=false')
response = urlopen(request)
elevations = response.read()
data = json.loads(elevations)
df = pd.json_normalize(data['results'])
This gives a nice flattened dataframe with the json data that I got from the Google Maps API.
Answer from pbreach on Stack Overflow
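The snippet above depends on a live network call, so here is a minimal offline sketch of the same flattening step. The `sample` payload is made up for illustration and only assumes that each entry in `results` carries a nested `location` object; the numbers are not real API output.

```python
import pandas as pd

# Hypothetical payload: invented values, shaped like a "results" list
# whose entries contain a nested "location" dictionary.
sample = {
    "results": [
        {"elevation": 243.3, "location": {"lat": 42.974049, "lng": -81.205203}, "resolution": 19.1},
        {"elevation": 248.5, "location": {"lat": 42.974298, "lng": -81.195755}, "resolution": 19.1},
    ],
    "status": "OK",
}

# json_normalize expands nested dictionaries into dotted column names.
flat = pd.json_normalize(sample["results"])
print(flat.columns.tolist())
```

The nested keys end up as columns named `location.lat` and `location.lng`, next to `elevation` and `resolution`.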
Check this snip out.
import json
import pandas as pd
# reading the JSON data using json.load()
file = 'data.json'
with open(file) as train_file:
    dict_train = json.load(train_file)
# converting json dataset from dictionary to dataframe
# (orient='index' turns each top-level key into a row label)
train = pd.DataFrame.from_dict(dict_train, orient='index')
# move the row labels into a regular column
train.reset_index(level=0, inplace=True)
Hope it helps :)
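To make the effect of `orient='index'` concrete, here is a small sketch that uses an in-memory dictionary instead of `data.json`. The keys, the field names, and the `id` column name are all invented for illustration.

```python
import pandas as pd

# Hypothetical data: each top-level key maps to one record.
dict_train = {
    "a1": {"label": "cat", "score": 0.9},
    "b2": {"label": "dog", "score": 0.4},
}

# With orient="index", the top-level keys become the row labels.
train = pd.DataFrame.from_dict(dict_train, orient="index")

# reset_index moves the labels into a column called "index";
# renaming it to "id" is just a readability choice.
train = train.reset_index().rename(columns={"index": "id"})
print(train)
```

This layout only works when the JSON file is a dictionary of dictionaries; a top-level list of records can be passed to `pd.DataFrame` directly.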
In newer versions of pandas (0.20.0+, I believe), this can be done directly:
df.to_json('temp.json', orient='records', lines=True)
Direct compression is also possible:
df.to_json('temp.json.gz', orient='records', lines=True, compression='gzip')
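As a rough sketch of what `lines=True` produces, the following writes a small, made-up DataFrame to a temporary file, looks at the raw lines, and reads them back with `pd.read_json`. The temporary directory and the sample values are assumptions for the example only.

```python
import os
import tempfile

import pandas as pd

# Made-up sample data.
df = pd.DataFrame([{"test": "w", "param": 1}, {"test": "w2", "param": 2}])

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "temp.json")

    # One JSON object per line, with no enclosing list and no commas between records.
    df.to_json(path, orient="records", lines=True)

    with open(path) as f:
        raw_lines = f.read().splitlines()

    # The same flag is needed to read the file back.
    restored = pd.read_json(path, lines=True)

print(raw_lines)
print(restored)
```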
When no path is given, df.to_json returns a string, so you can slice off the enclosing brackets and replace the commas between records:
out = df.to_json(orient='records')[1:-1].replace('},{', '} {')
To write the output to a text file, you could do:
with open('file_name.txt', 'w') as f:
    f.write(out)
I believe you need to create a dict and then convert it to JSON:
import json
d = df1.to_dict(orient='records')
j = json.dumps(d)
Or if possible:
j = df1.to_json(orient='records')
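A quick way to check that the two routes agree is to parse both strings and compare the results. In this sketch `df1` is a made-up frame standing in for the one in the question.

```python
import json

import pandas as pd

# Hypothetical stand-in for df1.
df1 = pd.DataFrame([{"test": "w", "param": 1}, {"test": "w2", "param": 2}])

# Route 1: go through a list of dicts, then serialize with the json module.
via_dict = json.dumps(df1.to_dict(orient="records"))

# Route 2: let pandas serialize directly.
via_pandas = df1.to_json(orient="records")

# The raw strings differ in whitespace, so compare the parsed data instead.
same = json.loads(via_dict) == json.loads(via_pandas)
print(same)
```

The dict route is handy when you want to modify the records before serializing; `to_json` is shorter when you do not.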
Here's what worked for me:
import pandas as pd
import json
df = pd.DataFrame([{"test":"w","param":1},{"test":"w2","param":2}])
print(df)
  test  param
0    w      1
1   w2      2
So now we convert to a json string:
d = df.to_json(orient='records')
print(d)
[{"test":"w","param":1},{"test":"w2","param":2}]
And now we parse this string to a list of dicts:
data = json.loads(d)
print(data)
[{'test': 'w', 'param': 1}, {'test': 'w2', 'param': 2}]
Hi All,
I am trying to learn how to build data pipelines. As a hobby project, I scraped a football stats website and extracted some data in JSON format. However, the format is such that I am unable to convert it into a pandas DataFrame. Could someone help me with it? A sample of the data is below:
[
{"player_name": ["Kylian Mbapp\u00e9", "Erling Haaland", "Harry Kane", "Jadon Sancho", "Mohamed Salah", "Romelu Lukaku", "Kevin De Bruyne", "Neymar", "Frenkie de Jong", "Bruno Fernandes", "Joshua Kimmich", "Raheem Sterling", "Marcus Rashford", "Sadio Man\u00e9", "Heung-min Son", "Jo\u00e3o F\u00e9lix", "Phil Foden", "Lautaro Mart\u00ednez", "Marcos Llorente", "Lionel Messi", "Mason Mount", "Matthijs de Ligt", "Trent Alexander-Arnold", "R\u00faben Dias", "Marquinhos"], "player_age": ["Age", "22", "20", "27", "21", "29", "28", "30", "29", "24", "26", "26", "26", "23", "29", "28", "21", "21", "23", "26", "34", "22", "21", "22", "24", "27"], "market_value": ["\u20ac160.00m", "\u20ac130.00m", "\u20ac120.00m", "\u20ac100.00m", "\u20ac100.00m", "\u20ac100.00m", "\u20ac100.00m", "\u20ac100.00m", "\u20ac90.00m", "\u20ac90.00m", "\u20ac90.00m", "\u20ac90.00m", "\u20ac85.00m", "\u20ac85.00m", "\u20ac85.00m", "\u20ac80.00m", "\u20ac80.00m", "\u20ac80.00m", "\u20ac80.00m", "\u20ac80.00m", "\u20ac75.00m", "\u20ac75.00m", "\u20ac75.00m", "\u20ac75.00m", "\u20ac75.00m"]},
{"player_name": ["Pedri", "Alphonso Davies", "Rodri", "Mikel Oyarzabal", "Kai Havertz", "Sergej Milinkovi\u0107-Savi\u0107", "Bernardo Silva", "Rapha\u00ebl Varane", "Serge Gnabry", "Leon Goretzka", "Jan Oblak", "Casemiro", "Bukayo Saka", "Fede Valverde", "Declan Rice", "Nicol\u00f2 Barella", "Kingsley Coman", "Andrew Robertson", "Jack Grealish", "Timo Werner", "Ansu Fati", "Jules Kound\u00e9", "Achraf Hakimi", "Gabriel Jesus", "Dayot Upamecano"], "player_age": ["Age", "18", "20", "25", "24", "22", "26", "26", "28", "25", "26", "28", "29", "19", "22", "22", "24", "25", "27", "25", "25", "18", "22", "22", "24", "22"], "market_value": ["\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac70.00m", "\u20ac65.00m", "\u20ac65.00m", "\u20ac65.00m", "\u20ac65.00m", "\u20ac65.00m", "\u20ac65.00m", "\u20ac65.00m", "\u20ac65.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m"]},
{"player_name": ["Federico Chiesa", "Gianluigi Donnarumma", "Alessandro Bastoni", "Wilfred Ndidi", "Jos\u00e9 Mar\u00eda Gim\u00e9nez", "Fabinho", "Milan Skriniar", "Leroy San\u00e9", "Antoine Griezmann", "Paul Pogba", "Thibaut Courtois", "Alisson", "Marc-Andr\u00e9 ter Stegen", "Koke", "Robert Lewandowski", "Eduardo Camavinga", "Jude Bellingham", "Richarlison", "Franck Kessi\u00e9", "James Maddison", "Youri Tielemans", "N'Golo Kant\u00e9", "Jo\u00e3o Cancelo", "Virgil van Dijk", "Marco Verratti"], "player_age": ["Age", "23", "22", "22", "24", "26", "27", "26", "25", "30", "28", "29", "28", "29", "29", "32", "18", "18", "24", "24", "24", "24", "30", "27", "29", "28"], "market_value": ["\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac60.00m", "\u20ac55.00m", "\u20ac55.00m", "\u20ac55.00m", "\u20ac55.00m", "\u20ac55.00m", "\u20ac55.00m", "\u20ac55.00m", "\u20ac55.00m", "\u20ac55.00m", "\u20ac55.00m"]}]
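One possible approach, sketched under some assumptions: each element of the list looks like one scraped page, and `player_age` has one more entry than the other lists because it starts with the column header "Age". The sketch drops that header, builds one frame per page, and concatenates them. The function name `pages_to_frame` is hypothetical, and the sample is a shortened copy of the data above.

```python
import pandas as pd

def pages_to_frame(pages):
    """Turn a list of per-page column dictionaries into one DataFrame."""
    frames = []
    for page in pages:
        # Assumption: the literal "Age" entry is a scraped header, not a value.
        ages = [age for age in page["player_age"] if age != "Age"]
        frames.append(pd.DataFrame({
            "player_name": page["player_name"],
            "player_age": ages,
            "market_value": page["market_value"],
        }))
    df = pd.concat(frames, ignore_index=True)
    df["player_age"] = df["player_age"].astype(int)
    return df

# Shortened sample in the same shape as the scraped data.
sample = [
    {"player_name": ["Kylian Mbapp\u00e9", "Erling Haaland"],
     "player_age": ["Age", "22", "20"],
     "market_value": ["\u20ac160.00m", "\u20ac130.00m"]},
    {"player_name": ["Pedri"],
     "player_age": ["Age", "18"],
     "market_value": ["\u20ac70.00m"]},
]

players = pages_to_frame(sample)
print(players)
```

With the full file, the list would come from `json.load` on the scraped output. `pd.DataFrame` raises an error if the lists on a page still differ in length after the header is removed, which is a useful signal that the scrape needs checking.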
Suppose that I have the following dataframe:
info   drinks  reviews  score  menu  funfacts
input  1.0     1        2      4     1. funfacts, 2. fun fact, 3. fun fact
How could I transform this into the required JSON format? I tried pandas (`df.to_json`), but the default output does not match the format I need.
Snippet: `df3.to_json('File Name.json', orient='records')`
Expected output:
{
  "info":[
    {
      "drinks":[
        "1.0"
      ],
      "reviews":[
        "1"
      ],
      "score":[
        "2"
      ],
      "menu":[
        "4"
      ],
      "funfacts":[
        "1. funfacts",
        "2. fun fact",
        "3. fun fact"
      ]
    }
  ]
}
Current output:
[
  {
    "drinks": "1.0",
    "reviews": "1",
    "score": "2",
    "menu": "4",
    "funfacts": "1. funfacts ,2. fun facts ,3 fun facts"
  }
]
Are there any arguments in pandas that I could use to get the desired format or do I need to use a different solution? Thanks
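As far as I know, no `orient` value of `to_json` wraps every cell in a list, so one option is to build the nested structure by hand and serialize it with the `json` module. This is only a sketch: it assumes the frame holds string values in the five columns shown in the current output, and that list items inside a cell are separated by commas. The helper name `row_to_lists` is hypothetical.

```python
import json

import pandas as pd

# Assumed reconstruction of the question's frame (string values).
df3 = pd.DataFrame([{
    "drinks": "1.0",
    "reviews": "1",
    "score": "2",
    "menu": "4",
    "funfacts": "1. funfacts, 2. fun fact, 3. fun fact",
}])

def row_to_lists(row):
    """Split every cell on commas so each value becomes a list of strings."""
    return {
        column: [part.strip() for part in str(value).split(",")]
        for column, value in row.items()
    }

# Wrap the converted rows under the top-level "info" key.
nested = {"info": [row_to_lists(row) for row in df3.to_dict(orient="records")]}
nested_json = json.dumps(nested, indent=2)
print(nested_json)
```

Splitting on commas is a simplification; it would break a value that legitimately contains a comma, in which case the split rule would need to be specific to the `funfacts` column.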
I need to convert a list of JSON strings into a dataframe. How do I do that?
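A minimal sketch, assuming every string in the list is one flat JSON object: parse each string with `json.loads` and pass the resulting list of dictionaries to `pd.DataFrame`. The `json_strings` values are invented for the example.

```python
import json

import pandas as pd

# Hypothetical input: one JSON object per string.
json_strings = [
    '{"test": "w", "param": 1}',
    '{"test": "w2", "param": 2}',
]

# Parse each string into a dict, then build the frame from the list of dicts.
records = [json.loads(s) for s in json_strings]
df = pd.DataFrame(records)
print(df)
```

If the objects contain nested dictionaries, passing `records` to `pd.json_normalize` instead would flatten them into dotted column names.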