You are creating those bytes objects yourself:

item['title'] = [t.encode('utf-8') for t in title]
item['link'] = [l.encode('utf-8') for l in link]
item['desc'] = [d.encode('utf-8') for d in desc]
items.append(item)

Each of those t.encode(), l.encode() and d.encode() calls creates a bytes object, which the json module cannot serialise. Do not do this; keep the values as str and leave it to the json module to serialise them.
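To make the failure concrete, here is a minimal sketch (the titles list is a made-up stand-in for scraped values, not data from the question). It shows the encoded list being rejected and the plain str list serialising fine:

```python
import json

titles = ["Café", "naïve"]  # hypothetical scraped values, for illustration only

# Encoding each string produces bytes objects, which json.dumps() rejects.
encoded = [t.encode("utf-8") for t in titles]
try:
    json.dumps({"title": encoded})
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(error_message)

# Leaving the values as str works; ensure_ascii=False keeps non-ASCII readable.
as_text = json.dumps({"title": titles}, ensure_ascii=False)
print(as_text)
```

Without ensure_ascii=False the output is still valid JSON; the non-ASCII characters are just written as \uXXXX escapes.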

Next, you are making several other errors: you are encoding far more than you need to. Leave it to the json module and the standard file object returned by the open() call to handle encoding.

You also don't need to convert your items list to a dictionary; it'll already be an object that can be JSON encoded directly:

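import json

class W3SchoolPipeline(object):
    def __init__(self):
        # Text mode: the file object encodes str to UTF-8 on write
        self.file = open('w3school_data_utf8.json', 'w', encoding='utf-8')

    def process_item(self, item, spider):
        # json.dumps() returns a str, so no manual encoding is needed
        line = json.dumps(item) + '\n'
        self.file.write(line)
        return item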
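If you want to try the idea outside Scrapy, the following is a rough, self-contained sketch. JsonLinesWriter, its close() method, the temporary path and the sample item are all hypothetical names made up for this illustration; plain dicts stand in for the scraped items. It writes one JSON document per line through a file opened with encoding='utf-8' and reads the lines back:

```python
import json
import os
import tempfile


class JsonLinesWriter:
    """Hypothetical stand-in for the pipeline; not tied to Scrapy."""

    def __init__(self, path):
        # The text-mode file object handles the str -> UTF-8 conversion.
        self.file = open(path, "w", encoding="utf-8")

    def process_item(self, item):
        # One JSON document per line; no .encode() calls anywhere.
        self.file.write(json.dumps(item, ensure_ascii=False) + "\n")
        return item

    def close(self):
        self.file.close()


path = os.path.join(tempfile.mkdtemp(), "items.jsonl")
writer = JsonLinesWriter(path)
writer.process_item({"title": ["HTML 教程"], "link": ["/html/index.asp"]})
writer.close()

# Reading back with the same encoding recovers the original strings.
with open(path, encoding="utf-8") as fh:
    loaded = [json.loads(line) for line in fh]
print(loaded)
```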

I'm guessing you followed a tutorial that assumed Python 2, but you are using Python 3. I strongly suggest you find a different tutorial; not only is it written for an outdated version of Python, but if it is advocating line.decode('unicode_escape'), it is teaching some extremely bad habits that'll lead to hard-to-track bugs. I can recommend Think Python, 2nd edition, as a good, free book on learning Python 3.

Answer from Martijn Pieters on Stack Overflow
🌐
GitHub
github.com › adamobeng › wddbfs › issues › 1
TypeError: Object of type bytes is not JSON serializable · Issue #1 · adamobeng/wddbfs
February 18, 2024 - I'm seeing this error in my console - everything still works, but presumably that's caused by tables with BLOB columns that can't be represented as JSON? I use this format to solve that: { "id": 1, "blob_column": { "$base64": true, "enco...
Author   simonw
Discussions

Why am i getting this error TypeError: Object of type bytes is not JSON serializable
I have the below code; when I run it I get this error in my Jupyter notebook: TypeError: Object of type bytes is not JSON serializable. import pyodbc conn = pyodbc.connect('Driver={SQL Server Native Client 11.0};' 'Server=LDN01939344;' 'Database=EData;' 'Trusted_Connection=yes;' ) sql = 'SELECT ... More on community.plotly.com
🌐 community.plotly.com
0
0
October 15, 2019
TypeError: Object of type bytes is not JSON serializable when trying to update JSON with encrypted AWS key
Are you storing blob or base64.b64encode(blob)? blob = encryption_result['CiphertextBlob'] obj['password'] = blob print(base64.b64encode(blob)) Also, look at the example you have in the comments: # base64_bytes = base64.b64encode(sample_string_bytes) # finalpassword64password = base64_bytes.decode('ascii') # obj['password'] = finalpassword64password The result from b64encode() needs to be turned back into a "str" before being stored. More on reddit.com
🌐 r/learnpython
2
1
January 11, 2022
TypeError: Object of type bytes is not JSON serializable
hello everyone I use UCS 5 and am on the latest Update. i have an issue creating and editing users. the first noticeable problem is that I always get An Error Occured; Internal server error: The module process died unexpectedly. inside the UMC when I’m in the users app. then when I try to ... More on help.univention.com
🌐 help.univention.com
1
0
August 2, 2021
JSON serialization / byte type error
I have received an TypeError “Object of type ‘bytes’ is not JSON serializable” for the first time. As it stands I have no way to know what part of my code is at fault, since the message provides no traceback. Is there a way to diagnose or correct this without inspecting every variable ... More on discourse.bokeh.org
🌐 discourse.bokeh.org
0
0
September 11, 2018
Top answer
1 of 3
8

You have to use the bytes.decode() method.

You are trying to serialize an object of type bytes to JSON. JSON has no bytes type, so you have to convert the bytes to a string first.

Also, you should use json.dumps() instead of json.dump() because you don't want to write to a file.

In your example:

import json
import base64

data = {}  # dictionary that will be serialized

with open(r"C:/Users/Documents/pdf2txt/outputImage.jpg", "rb") as img:
    image = base64.b64encode(img.read())  # still a bytes object at this point
    data['ProcessedImage'] = image.decode() # not just image

print(json.dumps(data))
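As a quick check that nothing is lost along the way, here is a small round-trip sketch. The raw bytes are a hypothetical stand-in for the image file contents, so it runs without any file on disk:

```python
import base64
import json

# Hypothetical stand-in for the bytes read from the image file.
raw = bytes([0xFF, 0xD8, 0xFF, 0xE0, 0x00, 0x10])

# b64encode() returns bytes; .decode("ascii") turns it into a str for JSON.
payload = json.dumps({"ProcessedImage": base64.b64encode(raw).decode("ascii")})

# The receiving side reverses the steps: parse JSON, then base64-decode.
restored = base64.b64decode(json.loads(payload)["ProcessedImage"])
print(restored == raw)
```

Base64 output only contains ASCII characters, so decoding it as ASCII (or with the default UTF-8) is always safe.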
2 of 3
7

First of all, I think you should use json.dumps() because you're calling json.dump() with the incorrect arguments and it doesn't return anything to print.

Secondly, as the error message indicates, you can't serialize objects of type bytes; json.dumps() expects str for text values. To do this properly you need to decode the binary data into a Python string with some encoding. To preserve the data properly, you should use latin1 encoding, because any arbitrary binary string is valid latin1: it can always be decoded to Unicode and then encoded back to the original bytes again (as pointed out in this answer by Sven Marnach).

Here's your code showing how to do that (plus corrections for the other not-directly-related problems it had):

import json
import base64

image_path = "C:/Users/Documents/pdf2txt/outputImage.jpg"
data = {}

with open(image_path, "rb") as img:
    image = base64.b64encode(img.read()).decode('latin1')
    data['ProcessedImage'] = image

print(json.dumps(data))
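The latin1 claim can be checked on its own, without base64. This is only a sketch; the "blob" key is an arbitrary name chosen for the illustration. It pushes every possible byte value through latin1, JSON, and back:

```python
import json

raw = bytes(range(256))  # every possible byte value, 0x00 through 0xFF

# latin1 maps each byte to the code point with the same number,
# so decoding never fails and never loses information.
text = raw.decode("latin1")

payload = json.dumps({"blob": text})
restored = json.loads(payload)["blob"].encode("latin1")
print(restored == raw)
```

For data that is already base64-encoded, as in the code above, latin1, ascii and utf-8 all give the same result, because base64 output is pure ASCII.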
🌐
Splunk Community
community.splunk.com › t5 › Getting-Data-In › TypeError-Object-of-type-bytes-is-not-JSON-serializable › m-p › 687395
TypeError: Object of type bytes is not JSON serializable
June 9, 2024 - This error results from the Python code trying to do something it's not supposed to. In this case, it's trying to serialize to JSON an object which is not serializable (not all classes can be serialized).
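When it isn't obvious which value is the offender, a small walker over the data can point at it. The following is only a sketch: find_bytes and the sample record are hypothetical names made up here, and it only looks inside dicts, lists and tuples:

```python
def find_bytes(value, path="$"):
    """Yield the path of every bytes/bytearray value in nested dicts, lists and tuples."""
    if isinstance(value, (bytes, bytearray)):
        yield path
    elif isinstance(value, dict):
        for key, item in value.items():
            yield from find_bytes(item, f"{path}[{key!r}]")
    elif isinstance(value, (list, tuple)):
        for index, item in enumerate(value):
            yield from find_bytes(item, f"{path}[{index}]")


# Hypothetical record with two bytes values hidden at different depths.
record = {"id": 1, "rows": [{"name": "a"}, {"name": b"b"}], "blob": b"\x00"}
bad_paths = list(find_bytes(record))
print(bad_paths)
```

Running it over the object just before the json.dumps() call should list the paths that need decoding or base64 encoding.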
🌐
Bobby Hadz
bobbyhadz.com › blog › python-typeerror-object-of-type-bytes-is-not-json-serializable
TypeError: Object of type bytes is not JSON serializable | bobbyhadz
April 8, 2024 -

import json

class BytesEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, bytes):
            return obj.decode('utf-8')
        return json.JSONEncoder.default(self, obj)

If the passed-in value is a bytes object, we decode it to a str and return the result. The isinstance function returns True if the passed-in object is an instance or a subclass of the passed-in class. In all other cases, we let the base class's default method do the serialization.
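A short usage sketch for an encoder like this one, passing it through the cls argument of json.dumps(). The sample dictionaries are made up for illustration. Note that it assumes the bytes really are UTF-8 text; arbitrary binary data makes the decode step fail:

```python
import json


class BytesEncoder(json.JSONEncoder):
    def default(self, obj):
        # default() is only called for objects json cannot handle natively.
        if isinstance(obj, bytes):
            return obj.decode("utf-8")
        return json.JSONEncoder.default(self, obj)


# bytes values are decoded wherever they appear, including inside lists.
encoded = json.dumps({"name": b"Alice", "tags": [b"a", "b"]}, cls=BytesEncoder)
print(encoded)

# Bytes that are not valid UTF-8 raise UnicodeDecodeError instead.
try:
    json.dumps({"raw": b"\xff"}, cls=BytesEncoder)
    failed = False
except UnicodeDecodeError:
    failed = True
print(failed)
```

For binary payloads, base64 is usually the safer route than decoding as text.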
🌐
Reddit
reddit.com › r/learnpython › typeerror: object of type bytes is not json serializable when trying to update json with encrypted aws key
r/learnpython on Reddit: TypeError: Object of type bytes is not JSON serializable when trying to update JSON with encrypted AWS key
January 11, 2022 -
 path_to_json = '' #old path
        path_to_newjson ='' #new path
        key_id = '' #aws key
        json_files = [pos_json for pos_json in os.listdir(path_to_json) if pos_json.endswith('.json')] #loading all jsons in old path
 
 
        jsons_data = pd.DataFrame(columns=['password']) #looking for password
 
 
        def encode_password(obj):
 
            if 'password' in obj:
                listpasswords = obj['password']
                jsons_data.loc[index] = [listpasswords]
                print('Current Loaded password: ' + jsons_data)
                print(listpasswords)
                text = listpasswords
                # sample_string_bytes = text.encode('ascii')
                # base64_bytes = base64.b64encode(sample_string_bytes)
                # finalpassword64password = base64_bytes.decode('ascii')
                # obj['password'] = finalpassword64password
               

    
                session = boto3.session.Session()
                client = session.client('kms', 'us-west-1')
                encryption_result = client.encrypt(KeyId=key_id,
                Plaintext=text)
                print('password updated: ' + text)
 
                blob = encryption_result['CiphertextBlob']
                obj['password'] = blob            
                print(base64.b64encode(blob))
            return obj
 
        for index, js in enumerate(json_files):
            with open(os.path.join(path_to_json, js)) as json_file:
                json_text = commentjson.load(json_file, object_hook=encode_password)
                with open(os.path.join(path_to_newjson, js),'w') as f:
                    commentjson.dump(json_text,f,indent = 4) #updates the password  
                    print('finished')

I'm trying to get a value (the password field) from a list of JSON files and update them all in one go. The issue I am having is that I need to encrypt the old password taken from the JSON and replace it with the AWS-encrypted value, along with base64 encoding. When I try this, it throws "TypeError: Object of type bytes is not JSON serializable". If I only base64-encode the string and update the JSON, it works, but once I comment out that code and add the AWS encryption, it throws the error.
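One way this could be handled is sketched below, assuming the encryption call returns raw bytes, as CiphertextBlob does in the code above. fake_encrypt is a hypothetical stand-in for the KMS call so the sketch runs offline, and the standard json module is used in place of commentjson:

```python
import base64
import json


def fake_encrypt(plaintext):
    """Hypothetical stand-in for the KMS call; returns bytes like CiphertextBlob."""
    return plaintext.encode("utf-8")[::-1]


def encode_password(obj):
    if "password" in obj:
        blob = fake_encrypt(obj["password"])
        # Store the base64 text, not the raw bytes, so the dump step succeeds.
        obj["password"] = base64.b64encode(blob).decode("ascii")
    return obj


updated = json.loads('{"user": "demo", "password": "hunter2"}', object_hook=encode_password)
serialized = json.dumps(updated, indent=4)
print(serialized)
```

Whoever reads the file later has to base64-decode the field before handing it to the decrypt call.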

🌐
Researchdatapod
researchdatapod.com › home › how to solve python typeerror: object of type bytes is not json serializable
How to Solve Python TypeError: Object of type bytes is not JSON serializable - The Research Scientist Pod
June 7, 2022 - The simplest way to solve this error is to call the decode() method on the bytes object returned by base64.b64encode to get a base64 string. We can then store the base64 string in the dictionary and serialize the data.
🌐
Prospera Soft
prosperasoft.com › blog › web-scrapping › scrapy › bytes-json-serializable-error
Understanding TypeError: Object of Type 'Bytes' is Not JSON Serializable
This TypeError arises when you attempt to serialize an object that contains bytes as part of your data structure. JSON, by design, does not support the bytes data type, and instead favors strings.
🌐
GitHub
github.com › keylime › keylime › issues › 115
Python 3 Port - TypeError: Object of type bytes is not JSON serializable · Issue #115 · keylime/keylime
April 18, 2019 - This takes the return value from crypto.generate_random_key, encodes it as base64 and attempts to load that into a JSON object; this is then refused with TypeError: Object of type bytes is not JSON serializable
Author   lukehinds
🌐
GitHub
github.com › celery › kombu › discussions › 1421
Kombu 5.2.0 Object of type bytes is not JSON serializable · celery/kombu · Discussion #1421
November 5, 2021 - I just found out that the new version uses Python's built-in json module as the default encoder, whereas the older version (5.1.0) used simplejson, and Python's built-in json cannot dump bytes data.
Author   celery
🌐
Johnlekberg
johnlekberg.com › blog › 2020-12-11-stdlib-json.html
Using Python's json module
December 11, 2020 - TypeError: Object of type bytes is not JSON serializable · import datetime json.dumps(datetime.datetime(2019, 6, 17, 3, 46, 23)) TypeError: Object of type datetime is not JSON serializable · You could manually encode the data.
🌐
Microsoft Learn
learn.microsoft.com › en-us › dotnet › api › system.text.json.jsonserializer.serializetoutf8bytes
JsonSerializer.SerializeToUtf8Bytes Method (System.Text.Json) | Microsoft Learn
... A JSON string representation of the value, encoded as UTF-8 bytes. ... There is no compatible JsonConverter for inputType or its serializable members. For more information, see How to serialize and deserialize JSON.
🌐
Streamlit
discuss.streamlit.io › using streamlit
Object of type 'bytes' is not JSON serializable - Using Streamlit - Streamlit
August 25, 2021 - When I run the same code in a Jupyter notebook, it works fine, but when I run it as a script with Streamlit I keep getting “Object of type bytes is not JSON serializable”. This is the script code: def lime_explain(x_train, x_val, y_train, feat, model, i): ml_model = pickle.load(open(model, 'rb')) explainer = lime.lime_tabular.LimeTabularExplainer(x_train.values, feature_names = feat, class_names = ['True', 'False'], mode='classification', training_labels=x_train.columns.values...
🌐
Reddit
reddit.com › r/learnpython › object of type 'bytes' is not json serializable
r/learnpython on Reddit: Object of type 'bytes' is not JSON serializable
June 22, 2019 -

I am getting this error when I try to run the register and login pages from this tutorial. As far as I can tell, my code matches the tutorial code, so I don't know why I get an error.

register.py:

from flask_wtf import FlaskForm
from wtforms import StringField, PasswordField, SubmitField, BooleanField
from wtforms.validators import DataRequired, Length, Email, EqualTo


class RegistrationForm(FlaskForm):
    username = StringField('Username',
                           validators=[DataRequired(), Length(min=2, max=20)])
    email = StringField('Email',
                        validators=[DataRequired(), Email()])
    password = PasswordField('Password', validators=[DataRequired()])
    confirm_password = PasswordField('Confirm Password',
                                     validators=[DataRequired(), EqualTo('password')])
    submit = SubmitField('Sign Up')


class LoginForm(FlaskForm):
    email = StringField('Email',
                        validators=[DataRequired(), Email()])
    password = PasswordField('Password', validators=[DataRequired()])
    remember = BooleanField('Remember Me')
    submit = SubmitField('Login')    

run.py:

from flask import Flask, render_template, url_for, flash, redirect
from forms import RegistrationForm, LoginForm

app = Flask(__name__)
app.config['SECRET_KEY'] = '5791628bb0b13ce0c676dfde280ba245'

posts = [
    {
        'author': 'Corey Schafer',
        'title': 'Blog Post 1',
        'content': 'First post content',
        'date_posted': 'April 20, 2018'
    },
    {
        'author': 'Jane Doe',
        'title': 'Blog Post 2',
        'content': 'Second post content',
        'date_posted': 'April 21, 2018'
    }
]


@app.route("/")
@app.route("/home")
def home():
    return render_template('home.html', posts=posts)


@app.route("/about")
def about():
    return render_template('about.html', title='About')


@app.route("/register", methods=['GET', 'POST'])
def register():
    form = RegistrationForm()
    if form.validate_on_submit():
        flash(f'Account created for {form.username.data}!', 'success')
        return redirect(url_for('home'))
    return render_template('register.html', title='Register', form=form)


@app.route("/login", methods=['GET', 'POST'])
def login():
    form = LoginForm()
    if form.validate_on_submit():
        if form.email.data == 'admin@blog.com' and form.password.data == 'password':
            flash('You have been logged in!', 'success')
            return redirect(url_for('home'))
        else:
            flash('Login Unsuccessful. Please check username and password', 'danger')
    return render_template('login.html', title='Login', form=form)


if __name__ == '__main__':
    app.run(debug=True)

The error message in full:

File "c:\python37\lib\site-packages\flask\app.py", line 2328, in __call__
return self.wsgi_app(environ, start_response)
File "c:\python37\lib\site-packages\flask\app.py", line 2314, in wsgi_app
response = self.handle_exception(e)
File "c:\python37\lib\site-packages\flask\app.py", line 1760, in handle_exception
reraise(exc_type, exc_value, tb)
File "c:\python37\lib\site-packages\flask\_compat.py", line 36, in reraise
raise value
File "c:\python37\lib\site-packages\flask\app.py", line 2311, in wsgi_app
response = self.full_dispatch_request()
File "c:\python37\lib\site-packages\flask\app.py", line 1834, in full_dispatch_request
rv = self.handle_user_exception(e)
File "c:\python37\lib\site-packages\flask\app.py", line 1737, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "c:\python37\lib\site-packages\flask\_compat.py", line 36, in reraise
raise value
File "c:\python37\lib\site-packages\flask\app.py", line 1832, in full_dispatch_request
rv = self.dispatch_request()
File "c:\python37\lib\site-packages\flask\app.py", line 1818, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "C:\code\codeinstitute\python\flask\run.py", line 36, in register
form = RegistrationForm()
File "c:\python37\lib\site-packages\wtforms\form.py", line 212, in __call__
return type.__call__(cls, *args, **kwargs)
File "c:\python37\lib\site-packages\flask_wtf\form.py", line 88, in __init__
super(FlaskForm, self).__init__(formdata=formdata, **kwargs)
File "c:\python37\lib\site-packages\wtforms\form.py", line 278, in __init__
self.process(formdata, obj, data=data, **kwargs)
File "c:\python37\lib\site-packages\wtforms\form.py", line 132, in process
field.process(formdata)
File "c:\python37\lib\site-packages\wtforms\csrf\core.py", line 43, in process
self.current_token = self.csrf_impl.generate_csrf_token(self)
File "c:\python37\lib\site-packages\flask_wtf\csrf.py", line 134, in generate_csrf_token
token_key=self.meta.csrf_field_name
File "c:\python37\lib\site-packages\flask_wtf\csrf.py", line 47, in generate_csrf
setattr(g, field_name, s.dumps(session[field_name]))
File "c:\python37\lib\site-packages\itsdangerous\serializer.py", line 166, in dumps
payload = want_bytes(self.dump_payload(obj))
File "c:\python37\lib\site-packages\itsdangerous\url_safe.py", line 42, in dump_payload
json = super(URLSafeSerializerMixin, self).dump_payload(obj)
File "c:\python37\lib\site-packages\itsdangerous\serializer.py", line 133, in dump_payload
return want_bytes(self.serializer.dumps(obj, **self.serializer_kwargs))
File "c:\python37\lib\site-packages\itsdangerous\_json.py", line 18, in dumps
return json.dumps(obj, **kwargs)
File "c:\python37\lib\json\__init__.py", line 238, in dumps
**kw).encode(obj)
File "c:\python37\lib\json\encoder.py", line 199, in encode
chunks = self.iterencode(o, _one_shot=True)
File "c:\python37\lib\json\encoder.py", line 257, in iterencode
return _iterencode(o, 0)
File "c:\python37\lib\json\encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type bytes is not JSON serializable
🌐
Protocol Buffers
protobuf.dev › programming-guides › proto3
Language Guide (proto 3) | Protocol Buffers Documentation
Delete a oneof field and add it back: This may clear your currently set oneof field after the message is serialized and parsed. Split or merge oneof: This has similar issues to moving singular fields. If you want to create an associative map as part of your data definition, protocol buffers provides a handy shortcut syntax: ... …where the key_type can be any integral or string type (so, any scalar type except for floating point types and bytes).