Iterate over the file, loading each line as JSON in the loop:
import json

tweets = []
with open('tweets.json', 'r') as file:
    for line in file:
        tweets.append(json.loads(line))
This avoids storing intermediate Python objects. It works as long as each tweet was written to the file as one complete JSON object on its own line.
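If the file might contain blank lines (a trailing empty line, for instance), json.loads('') will raise an error on them. Here is a minimal sketch that skips them, assuming one JSON object per line. The helper name load_json_lines is made up for illustration and is not part of the json module:

```python
import json

def load_json_lines(lines):
    """Parse an iterable of text lines, one JSON document per line."""
    records = []
    for line in lines:
        line = line.strip()
        if not line:  # skip blank lines instead of failing on them
            continue
        records.append(json.loads(line))
    return records

# Usage with a file (file name taken from the example above):
# with open('tweets.json', 'r') as file:
#     tweets = load_json_lines(file)
```

Because the function accepts any iterable of strings, you can pass it an open file object or a plain list of lines.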
As the following example shows, json.loads (and json.load) cannot decode multiple JSON objects in one call.
>>> json.loads('{}')
{}
>>> json.loads('{}{}') # == json.loads(json.dumps({}) + json.dumps({}))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Python27\lib\json\__init__.py", line 338, in loads
    return _default_decoder.decode(s)
  File "C:\Python27\lib\json\decoder.py", line 368, in decode
    raise ValueError(errmsg("Extra data", s, end, len(s)))
ValueError: Extra data: line 1 column 3 - line 1 column 5 (char 2 - 4)
If you want to dump multiple dictionaries, wrap them in a list and dump the list (instead of dumping the dictionaries one after another):
>>> dict1 = {}
>>> dict2 = {}
>>> json.dumps([dict1, dict2])
'[{}, {}]'
>>> json.loads(json.dumps([dict1, dict2]))
[{}, {}]
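The same idea applies when you keep adding records to a file: rather than appending a second json.dump to the end, read the list back, append to it, and rewrite the file. This is only a rough sketch. The helper name append_record is hypothetical, and it assumes the file is either missing or already contains a valid JSON list:

```python
import json
import os

def append_record(path, record):
    """Add one record to a JSON file whose top-level value is a list."""
    if os.path.exists(path):
        with open(path, 'r') as f:
            records = json.load(f)
    else:
        records = []  # first write: start a new list
    records.append(record)
    # Rewrite the whole file so it stays a single valid JSON document
    with open(path, 'w') as f:
        json.dump(records, f)
    return records
```

Rewriting the whole file is fine for small files; for large, frequently appended data, the one-object-per-line layout is usually the cheaper choice.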
I have a JSON file with this content:
{
    "Instagram": {
        "Social": "Instagram",
        "E-Mail": "prova123@gmail.com",
        "Password": "passw1234"
    },
    "Facebook": {
        "Social": "Facebook",
        "E-Mail": "ctry1231",
        "Password": "ctry123"
    }
}{
    "Instagram": {
        "Social": "Instagram",
        "E-Mail": "prova123@gmail.com",
        "Password": "passw1234"
    },
    "Facebook": {
        "Social": "Facebook",
        "E-Mail": "ctry1231",
        "Password": "ctry123"
    }
}
(This is just a test; later on, all the values will be stored under a single "Manager".)
But when I try to read it with json.load, I get this error:
json.decoder.JSONDecodeError: Extra data: line 14 column 2 (char 320)
The problem seems to happen when I add multiple dictionaries; json.load works without any problem when the file contains only one dictionary.
What can I do?
Hello, I need help with my code. I need to read a JSON file, but I have a problem.
You have two records in your json file, and json.loads() is not able to decode more than one. You need to do it record by record.
See Python json.loads shows ValueError: Extra data
Or you need to reformat your JSON so that it contains an array:
{
    "foo" : [
        {"name": "XYZ", "address": "54.7168,94.0215", "country_of_residence": "PQR", "countries": "LMN;PQRST", "date": "28-AUG-2008", "type": null},
        {"name": "OLMS", "address": null, "country_of_residence": null, "countries": "Not identified;No", "date": "23-FEB-2017", "type": null}
    ]
}
That would be acceptable again, but there cannot be several top-level objects.
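If you cannot change the file and have to decode it record by record, json.JSONDecoder.raw_decode can help: it parses one value starting at a given index and returns the value together with the index where it stopped. A minimal sketch, assuming the whole text fits in memory (the generator name iter_json_objects is made up here):

```python
import json

def iter_json_objects(text):
    """Yield each top-level JSON value found in a string of concatenated values."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not accept leading whitespace, so skip it first
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos >= len(text):
            break
        # Returns the decoded value and the index just past it
        obj, pos = decoder.raw_decode(text, pos)
        yield obj

# Usage (hypothetical file name):
# with open('records.json') as f:
#     records = list(iter_json_objects(f.read()))
```

This handles objects placed back to back ("}{") as well as objects separated by newlines, and it still raises an error if the text contains something that is not valid JSON.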
I was parsing JSON from a REST API call and got this error. It turned out the API had become "fussier" (e.g., about the order of parameters) and so was returning malformed results. Check that you are getting what you expect :)
Whatever you receive, it does not seem to end where it should end; example:
>>> import json
>>> json.loads(""" {"Hello" : "World"} \ """)
....
ValueError: Extra data: line 1 column 21 - line 1 column 23 (char 21 - 23)
I'd suggest inspecting your output before it gets parsed, to pin down the problem.
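One way to do that inspection is to catch the exception and print the text around the position where parsing stopped. This is a sketch that assumes Python 3.5 or later, where json.JSONDecodeError exists and exposes msg and pos; the function name describe_json_error and the context parameter are made up for illustration:

```python
import json

def describe_json_error(text, context=20):
    """Return None if text parses, else a short description of where it fails."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # err.pos is the index in text where the decoder gave up
        start = max(err.pos - context, 0)
        snippet = text[start:err.pos + context]
        return "%s at char %d, near %r" % (err.msg, err.pos, snippet)
    return None
```

Printing the result for the raw response usually makes it obvious whether you received trailing junk, a second object, or something that is not JSON at all (an HTML error page, for example).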
PS. There are simpler ways to get JSON data from a server (assuming your server returns parsable JSON, which it might not). Here is an example using the requests library:
>>> import json, requests
>>> u = "http://gdata.youtube.com/feeds/api/standardfeeds/most_popular?alt=json"
>>> json.loads(requests.get(u).text) # <-- request + parse
{u'feed': {u'category': [{u'term': u'http://gdata.youtube.com/...
.....
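In the json library, the decode function in decoder.py contains:
    if end != len(s):
        raise ValueError(errmsg("Extra data", s, end, len(s)))
This means the exception is raised whenever the parsed value ends before the end of your string (end != len(s)); end is the position where the first complete JSON value stops. If the extra data comes after the last "}" in your string, you can use:
string = "".join([string.rsplit("}", 1)[0], "}"])
This cuts off everything after the last "}". It does not help if the extra data itself contains a "}".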
I am trying to load .json tweet data dictionaries as follows:
import json
import gzip
filename = "tweets.json.txt.gz"
for line in gzip.open(filename, 'rt', encoding='utf-8'):
    tweet = json.loads(line.strip())
I receive the following error:
raise JSONDecodeError("Extra data", s, end)
JSONDecodeError: Extra data
How can I solve this?
As you already found out: that is not valid JSON.
You have to modify it to make it valid, specifically, you have to wrap your top-level objects in an array. Try this:
import json
from pprint import pprint

with open('myfile.json') as f:
    # Turn each "}\n{" boundary into "},\n{" and wrap everything in [ ] to form one array
    data = json.loads("[" + f.read().replace("}\n{", "},\n{") + "]")
pprint(data)
Your JSON data set is not valid. You can merge the objects into one array of objects, for example:
[
    {
        "host": "a.com",
        "ip": "1.2.2.3",
        "port": 8
    }, {
        "host": "b.com",
        "ip": "2.5.0.4",
        "port": 3
    }, {
        "host": "c.com",
        "ip": "9.17.6.7",
        "port": 4
    }
]
In JSON you cannot have multiple top-level objects, but an array of objects is valid.
You can see more JSON data set examples in this link.
If you want to know more about JSON arrays, you can read the w3schools JSON tutorial.