You are creating those bytes objects yourself:
item['title'] = [t.encode('utf-8') for t in title]
item['link'] = [l.encode('utf-8') for l in link]
item['desc'] = [d.encode('utf-8') for d in desc]
items.append(item)
Each of those t.encode(), l.encode() and d.encode() calls creates a bytes object. Don't do this; leave the values as str and let the json module serialise them.
Next, you are making several other errors: you are encoding far more than you need to. Leave it to the json module and the standard file object returned by the open() call to handle encoding.
You also don't need to convert your items list to a dictionary; it'll already be an object that can be JSON encoded directly:
import json

class W3SchoolPipeline(object):
    def __init__(self):
        # Text mode with an explicit encoding; the file object encodes on write
        self.file = open('w3school_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # One JSON document per line
        line = json.dumps(item) + '\n'
        self.file.write(line)
        return item
I'm guessing you followed a tutorial that assumed Python 2, while you are using Python 3. I strongly suggest you find a different tutorial: not only is it written for an outdated version of Python, but if it is advocating line.decode('unicode_escape') it is teaching some extremely bad habits that will lead to hard-to-track bugs. I can recommend Think Python, 2nd edition as a good, free book for learning Python 3.
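To make the difference concrete, here is a minimal sketch. The item contents are made up for illustration, and ensure_ascii=False is an optional extra that is not part of the pipeline above; it only keeps non-ASCII characters readable in the output.

```python
import json

# Hypothetical scraped data: plain str values, no .encode() calls
item = {"title": ["Größe"], "link": ["https://example.com/"]}

# str values serialise directly
line = json.dumps(item, ensure_ascii=False)

# The same data with encoded values is rejected by the json module
encoded_item = {"title": [t.encode("utf-8") for t in item["title"]]}
try:
    json.dumps(encoded_item)
    bytes_error = None
except TypeError as exc:
    bytes_error = str(exc)  # mentions that bytes is not JSON serializable
```

Writing line to a file opened with encoding='utf-8' then produces UTF-8 output without any manual encoding step.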
Simply write <variable name>.decode("utf-8").
For example:
myvar = b'asdqweasdasd'
myvar.decode("utf-8")
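If the bytes values are scattered through a larger structure, one possible approach is to pass a fallback function through the default parameter of json.dumps(), which is called for any object the encoder cannot handle. The function name decode_bytes and the sample payload are hypothetical, and the sketch assumes the bytes really are UTF-8 text.

```python
import json

def decode_bytes(obj):
    """Fallback for json.dumps: decode bytes to str, reject anything else."""
    if isinstance(obj, (bytes, bytearray)):
        return obj.decode("utf-8")  # assumes the bytes hold UTF-8 text
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# Hypothetical payload mixing bytes and str values
payload = {"name": b"asdqweasdasd", "tags": [b"a", "b"]}
text = json.dumps(payload, default=decode_bytes)
```

Note that default is only consulted for values; bytes used as dictionary keys still raise a TypeError.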
You have to use the bytes.decode() method.
You are trying to serialize an object of type bytes to JSON. JSON has no such type, so you have to convert the bytes to a string first.
You should also use json.dumps() instead of json.dump(), because you don't want to write to a file.
In your example:
import json
import base64

data = {}  # assumed to be an ordinary dict, as in your code
with open(r"C:/Users/Documents/pdf2txt/outputImage.jpg", "rb") as img:
    image = base64.b64encode(img.read())
    data['ProcessedImage'] = image.decode()  # not just image
    print(json.dumps(data))
First of all, I think you should use json.dumps(), because you're calling json.dump() with the wrong arguments and it doesn't return anything to print.
Secondly, as the error message indicates, json.dumps() can't serialize objects of type bytes. To do this properly you need to decode the binary data into a Python string with some encoding. To preserve the data, you can use latin1, because any sequence of bytes is valid latin1: it can always be decoded to Unicode and then encoded back to the original bytes (as pointed out in this answer by Sven Marnach).
Here's your code showing how to do that (plus corrections for the other not-directly-related problems it had):
import json
import base64

image_path = "C:/Users/Documents/pdf2txt/outputImage.jpg"
data = {}
with open(image_path, "rb") as img:
    image = base64.b64encode(img.read()).decode('latin1')
    data['ProcessedImage'] = image
print(json.dumps(data))
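A quick way to convince yourself that nothing is lost is a round trip. This sketch uses an in-memory stand-in for the image bytes instead of reading a file. Base64 output consists only of ASCII characters, so decoding it as ascii, utf-8 or latin1 gives the same string.

```python
import base64
import json

# Stand-in for image data: every possible byte value once
raw = bytes(range(256))

# bytes -> Base64 bytes -> str, which JSON can hold
encoded = base64.b64encode(raw).decode("ascii")
document = json.dumps({"ProcessedImage": encoded})

# And back again: JSON -> str -> original bytes
restored = base64.b64decode(json.loads(document)["ProcessedImage"])

# latin1 maps each byte to exactly one code point, so it round-trips too
latin1_roundtrip = raw.decode("latin1").encode("latin1")
```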
path_to_json = ''  # old path
path_to_newjson = ''  # new path
key_id = ''  # aws key
json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')]  # loading all jsons in old path
jsons_data = pd.DataFrame(columns=['password'])  # looking for password

def encode_password(obj):
    if 'password' in obj:
        listpasswords = obj['password']
        jsons_data.loc[index] = [listpasswords]
        print('Current Loaded password: ' + jsons_data)
        print(listpasswords)
        text = listpasswords
        # sample_string_bytes = text.encode('ascii')
        # base64_bytes = base64.b64encode(sample_string_bytes)
        # finalpassword64password = base64_bytes.decode('ascii')
        # obj['password'] = finalpassword64password
        session = boto3.session.Session()
        client = session.client('kms', 'us-west-1')
        encryption_result = client.encrypt(KeyId=key_id,
                                           Plaintext=text)
        print('password updated: ' + text)
        blob = encryption_result['CiphertextBlob']
        obj['password'] = blob
        print(base64.b64encode(blob))
    return obj

for index, js in enumerate(json_files):
    with open(os.path.join(path_to_json, js)) as json_file:
        json_text = commentjson.load(json_file, object_hook=encode_password)
    with open(os.path.join(path_to_newjson, js), 'w') as f:
        commentjson.dump(json_text, f, indent=4)  # updates the password
print('finished')

I'm trying to get a value (the password field) from a list of JSON files and update them all in one go. The issue is that I need to encrypt the old password taken from each JSON file with AWS encryption, Base64-encode the result, and write it back. When I try this, it throws "TypeError: Object of type bytes is not JSON serializable". If I only Base64-encode the string and update the JSON, it works, but once I comment out that code and add the AWS encryption, it throws the error.
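In the code above, blob is the bytes value taken from encryption_result['CiphertextBlob'], and it is stored in obj['password'] as is, while the Base64 result is only printed. A minimal sketch of one way around this is to store the Base64 text instead. No AWS call is made here: fake_blob is a made-up stand-in for the ciphertext, and both helper names are hypothetical.

```python
import base64
import json

def ciphertext_to_text(blob: bytes) -> str:
    """Base64-encode raw ciphertext and return it as a JSON-safe str."""
    return base64.b64encode(blob).decode("ascii")

def text_to_ciphertext(text: str) -> bytes:
    """Reverse step: recover the raw ciphertext before decrypting it."""
    return base64.b64decode(text)

# Stand-in for encryption_result['CiphertextBlob']; real ciphertext is arbitrary bytes
fake_blob = b"\x01\x02\x00\xff example ciphertext"

obj = {"user": "demo", "password": ciphertext_to_text(fake_blob)}
serialised = json.dumps(obj, indent=4)  # no TypeError: every value is a str
```

Whoever reads the file later would need to Base64-decode the password field before passing it to a decrypt call.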
I am getting this error when I try to run the register and login pages from this tutorial. As far as I can tell, my code matches the tutorial code, so I don't know why I get an error.
register.py:
from flask_wtf import FlaskForm
from wtforms import StringField, PasswordField, SubmitField, BooleanField
from wtforms.validators import DataRequired, Length, Email, EqualTo

class RegistrationForm(FlaskForm):
    username = StringField('Username',
                           validators=[DataRequired(), Length(min=2, max=20)])
    email = StringField('Email',
                        validators=[DataRequired(), Email()])
    password = PasswordField('Password', validators=[DataRequired()])
    confirm_password = PasswordField('Confirm Password',
                                     validators=[DataRequired(), EqualTo('password')])
    submit = SubmitField('Sign Up')

class LoginForm(FlaskForm):
    email = StringField('Email',
                        validators=[DataRequired(), Email()])
    password = PasswordField('Password', validators=[DataRequired()])
    remember = BooleanField('Remember Me')
    submit = SubmitField('Login')

run.py:
from flask import Flask, render_template, url_for, flash, redirect
from forms import RegistrationForm, LoginForm

app = Flask(__name__)
app.config['SECRET_KEY'] = '5791628bb0b13ce0c676dfde280ba245'

posts = [
    {
        'author': 'Corey Schafer',
        'title': 'Blog Post 1',
        'content': 'First post content',
        'date_posted': 'April 20, 2018'
    },
    {
        'author': 'Jane Doe',
        'title': 'Blog Post 2',
        'content': 'Second post content',
        'date_posted': 'April 21, 2018'
    }
]

@app.route("/")
@app.route("/home")
def home():
    return render_template('home.html', posts=posts)

@app.route("/about")
def about():
    return render_template('about.html', title='About')

@app.route("/register", methods=['GET', 'POST'])
def register():
    form = RegistrationForm()
    if form.validate_on_submit():
        flash(f'Account created for {form.username.data}!', 'success')
        return redirect(url_for('home'))
    return render_template('register.html', title='Register', form=form)

@app.route("/login", methods=['GET', 'POST'])
def login():
    form = LoginForm()
    if form.validate_on_submit():
        if form.email.data == 'admin@blog.com' and form.password.data == 'password':
            flash('You have been logged in!', 'success')
            return redirect(url_for('home'))
        else:
            flash('Login Unsuccessful. Please check username and password', 'danger')
    return render_template('login.html', title='Login', form=form)

if __name__ == '__main__':
    app.run(debug=True)

The error message in full:
File "c:\python37\lib\site-packages\flask\app.py", line 2328, in __call__
return self.wsgi_app(environ, start_response)
File "c:\python37\lib\site-packages\flask\app.py", line 2314, in wsgi_app
response = self.handle_exception(e)
File "c:\python37\lib\site-packages\flask\app.py", line 1760, in handle_exception
reraise(exc_type, exc_value, tb)
File "c:\python37\lib\site-packages\flask\_compat.py", line 36, in reraise
raise value
File "c:\python37\lib\site-packages\flask\app.py", line 2311, in wsgi_app
response = self.full_dispatch_request()
File "c:\python37\lib\site-packages\flask\app.py", line 1834, in full_dispatch_request
rv = self.handle_user_exception(e)
File "c:\python37\lib\site-packages\flask\app.py", line 1737, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "c:\python37\lib\site-packages\flask\_compat.py", line 36, in reraise
raise value
File "c:\python37\lib\site-packages\flask\app.py", line 1832, in full_dispatch_request
rv = self.dispatch_request()
File "c:\python37\lib\site-packages\flask\app.py", line 1818, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "C:\code\codeinstitute\python\flask\run.py", line 36, in register
form = RegistrationForm()
File "c:\python37\lib\site-packages\wtforms\form.py", line 212, in __call__
return type.__call__(cls, *args, **kwargs)
File "c:\python37\lib\site-packages\flask_wtf\form.py", line 88, in __init__
super(FlaskForm, self).__init__(formdata=formdata, **kwargs)
File "c:\python37\lib\site-packages\wtforms\form.py", line 278, in __init__
self.process(formdata, obj, data=data, **kwargs)
File "c:\python37\lib\site-packages\wtforms\form.py", line 132, in process
field.process(formdata)
File "c:\python37\lib\site-packages\wtforms\csrf\core.py", line 43, in process
self.current_token = self.csrf_impl.generate_csrf_token(self)
File "c:\python37\lib\site-packages\flask_wtf\csrf.py", line 134, in generate_csrf_token
token_key=self.meta.csrf_field_name
File "c:\python37\lib\site-packages\flask_wtf\csrf.py", line 47, in generate_csrf
setattr(g, field_name, s.dumps(session[field_name]))
File "c:\python37\lib\site-packages\itsdangerous\serializer.py", line 166, in dumps
payload = want_bytes(self.dump_payload(obj))
File "c:\python37\lib\site-packages\itsdangerous\url_safe.py", line 42, in dump_payload
json = super(URLSafeSerializerMixin, self).dump_payload(obj)
File "c:\python37\lib\site-packages\itsdangerous\serializer.py", line 133, in dump_payload
return want_bytes(self.serializer.dumps(obj, **self.serializer_kwargs))
File "c:\python37\lib\site-packages\itsdangerous\_json.py", line 18, in dumps
return json.dumps(obj, **kwargs)
File "c:\python37\lib\json\__init__.py", line 238, in dumps
**kw).encode(obj)
File "c:\python37\lib\json\encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "c:\python37\lib\json\encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "c:\python37\lib\json\encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
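The traceback ends inside json.dumps(), reached from s.dumps(session[field_name]) in flask_wtf, which suggests that the value being serialised is, or contains, a bytes object. When it is not obvious where such a value sits in a nested structure, a small helper along these lines can locate it. The name find_bytes and the sample data are hypothetical; this is a debugging aid, not a fix for the tutorial code.

```python
def find_bytes(obj, path="root"):
    """Yield the path of every bytes value (or key) inside nested dicts, lists and tuples."""
    if isinstance(obj, (bytes, bytearray)):
        yield path
    elif isinstance(obj, dict):
        for key, value in obj.items():
            if isinstance(key, (bytes, bytearray)):
                yield f"{path} (key {key!r})"
            yield from find_bytes(value, f"{path}[{key!r}]")
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield from find_bytes(value, f"{path}[{index}]")

# Made-up structure with two offending values
sample = {"csrf_token": b"abc", "nested": {"ok": "text", "items": [1, b"x"]}}
offenders = list(find_bytes(sample))
```

Each reported path points at a value that needs .decode() (or Base64 encoding, for binary data) before json.dumps() will accept the structure.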