Iterate over the file, loading each line as JSON in the loop:
import json

tweets = []
with open('tweets.json', 'r') as file:
    for line in file:
        tweets.append(json.loads(line))
This avoids storing intermediate Python objects. As long as the code that writes the file appends exactly one complete tweet per line, this should work.
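If the file might contain blank lines or an occasional broken record, a slightly more defensive version of the same loop can help. This is only a sketch: it assumes the file is newline-delimited JSON, and the helper name load_json_lines is made up for illustration.

```python
import json


def load_json_lines(path):
    """Load a file that holds one JSON document per line (hypothetical helper)."""
    records = []
    with open(path, 'r', encoding='utf-8') as file:
        for lineno, line in enumerate(file, start=1):
            line = line.strip()
            if not line:
                continue  # tolerate blank lines, e.g. a trailing newline
            try:
                records.append(json.loads(line))
            except ValueError as exc:
                # json.JSONDecodeError is a subclass of ValueError
                raise ValueError(
                    'line %d of %s is not valid JSON: %s' % (lineno, path, exc)
                )
    return records
```

Calling load_json_lines('tweets.json') would then return the same list as the loop above, or fail with the number of the offending line.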
As you can see in the following example, json.loads (and json.load) does not decode multiple JSON objects in a single call.
>>> json.loads('{}')
{}
>>> json.loads('{}{}') # == json.loads(json.dumps({}) + json.dumps({}))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 368, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 3 - line 1 column 5 (char 2 - 4)
If you want to dump multiple dictionaries, wrap them in a list and dump the list once (instead of dumping each dictionary separately):
>>> dict1 = {}
>>> dict2 = {}
>>> json.dumps([dict1, dict2])
'[{}, {}]'
>>> json.loads(json.dumps([dict1, dict2]))
[{}, {}]
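The same idea applies when writing to a file rather than a string. A minimal sketch, assuming a scratch file in a temporary directory (the path and the sample records are placeholders, not from the question):

```python
import json
import os
import tempfile

records = [{'id': 1}, {'id': 2}]

# Hypothetical scratch file; any writable path works.
path = os.path.join(tempfile.mkdtemp(), 'records.json')

with open(path, 'w', encoding='utf-8') as file:
    json.dump(records, file)  # one top-level value: the list

with open(path, 'r', encoding='utf-8') as file:
    loaded = json.load(file)  # a single json.load call reads everything back

print(loaded)
```

Because the file holds exactly one top-level value, json.load has no "extra data" to complain about.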
You have two records in your JSON file, and json.loads() cannot decode more than one top-level value. You need to decode the file record by record.
See Python json.loads shows ValueError: Extra data
Or you need to reformat your JSON so that the records sit inside an array:
{
  "foo": [
    {"name": "XYZ", "address": "54.7168,94.0215", "country_of_residence": "PQR", "countries": "LMN;PQRST", "date": "28-AUG-2008", "type": null},
    {"name": "OLMS", "address": null, "country_of_residence": null, "countries": "Not identified;No", "date": "23-FEB-2017", "type": null}
  ]
}
A file in this form would be acceptable again, but there cannot be several top-level objects.
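If you cannot change the file and the records are simply written one after another, the record-by-record route can be sketched with json.JSONDecoder.raw_decode, which parses one value and reports where it stopped. The helper name iter_json_documents is hypothetical, and the sketch assumes the whole input fits in memory as a string.

```python
import json


def iter_json_documents(text):
    """Yield each top-level JSON value found in text, in order (hypothetical helper)."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        # Returns the decoded value and the index just past it.
        obj, pos = decoder.raw_decode(text, pos)
        yield obj
```

For example, list(iter_json_documents(open('file_path').read())) would give a list of records whether they are separated by newlines or not; malformed input still raises the usual decode error.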
I was parsing JSON from a REST API call and got this error. It turns out the API had become "fussier" (e.g. about the order of parameters) and so was returning malformed results. Check that you are getting what you expect :)
json.decoder.JSONDecodeError: Extra data: line 1 column 300 (char 299)
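One way to check what you actually received is to look at the text around the position the error reports. A rough sketch, assuming Python 3.5+ (where json.JSONDecodeError exposes msg and pos); the helper name describe_json_error and the context width are made up for illustration:

```python
import json


def describe_json_error(text, context=40):
    """Return None if text parses, else a message showing where parsing stopped."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # exc.pos is the character offset at which the decoder gave up.
        start = max(exc.pos - context, 0)
        snippet = text[start:exc.pos + context]
        return '%s at char %d, near: %r' % (exc.msg, exc.pos, snippet)
    return None
```

Printing describe_json_error(body) for the raw response body usually makes it obvious whether the "extra data" is a second JSON object, an error page, or some other unexpected trailer.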
Hi, I have an assignment for class that I need to do, and I'm not asking for people to do it for me. But simply loading the JSON file into Python using Elasticsearch goes wrong for me. I get the following error: "json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 212)" when using this code:
import json
from elasticsearch import Elasticsearch

es = Elasticsearch([{'host': 'localhost', 'port': 9200, 'scheme': 'http'}])
index_name = 'car_data'
with open("file_path", 'r') as file:
    data = json.load(file)
Here "file_path" contains a real file path.
I'm still totally new to coding so forgive me if I make very basic mistakes. Can anyone help me? If you need additional information that is also fine. I thought of looking at line 2 column 1 but couldn't find anything.