Brave Search

Difference between `open` and `io.BytesIO` in binary streams

stackoverflow.com › questions › 42800250 › difference-between-open-and-io-bytesio-in-binary-streams

For simplicity's sake, let's consider writing instead of reading for now.

So when you use open() like say:

with open("test.dat", "wb") as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")

After this runs, a file called test.dat exists and contains "Hello World" three times. The data won't be kept in memory once it has been written to the file (unless you also keep it under a name of your own).
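As a quick check of that claim, here is a minimal sketch that writes the same three chunks and then inspects the result. The temporary directory is an assumption added so the example cleans up after itself; the answer itself writes test.dat into the current directory.

```python
import os
import tempfile

# Assumption: a temporary directory stands in for the current directory
# so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.dat")

    with open(path, "wb") as f:
        for _ in range(3):
            f.write(b"Hello World")

    # The file object is closed at this point; the bytes live on disk.
    size_on_disk = os.path.getsize(path)

    with open(path, "rb") as f:
        contents = f.read()

print(size_on_disk)  # 33, i.e. 3 x 11 bytes
print(contents == b"Hello World" * 3)
```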

Now when you consider io.BytesIO() instead:

with io.BytesIO() as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")

Here, instead of being written to a file, the contents are written to an in-memory buffer, in other words a chunk of RAM. Writing the following would be roughly equivalent:

buffer = b""
buffer += b"Hello World"
buffer += b"Hello World"
buffer += b"Hello World"

To match the example that uses the with statement, there would also be a del buffer at the end, because leaving the with block closes the BytesIO and discards its contents.
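The effect of the with statement on a BytesIO can be seen in a short sketch: the contents have to be read with getvalue() before the block ends, because the buffer is closed on exit. The variable names data and closed_error are only illustrative.

```python
import io

with io.BytesIO() as f:
    f.write(b"Hello World")
    f.write(b"Hello World")
    f.write(b"Hello World")
    # Read the accumulated bytes while the buffer is still open.
    data = f.getvalue()

# Leaving the with block closed the buffer, so its contents are gone.
try:
    f.getvalue()
    closed_error = None
except ValueError as exc:
    closed_error = exc

print(len(data))
print(f.closed)
print(type(closed_error).__name__)
```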

The key difference here is optimization and performance. Because bytes objects are immutable, every += has to produce a new object, whereas io.BytesIO is able to do some optimizations that make it faster than concatenating all the b"Hello World" chunks one by one.

Just to prove it, here's a small benchmark (the timings are from one machine and will differ on yours):

Concat: 1.3529 seconds
BytesIO: 0.0090 seconds

import io
import time

begin = time.time()
buffer = b""
for i in range(0, 50000):
    buffer += b"Hello World"
end = time.time()
seconds = end - begin
print("Concat:", seconds)

begin = time.time()
buffer = io.BytesIO()
for i in range(0, 50000):
    buffer.write(b"Hello World")
end = time.time()
seconds = end - begin
print("BytesIO:", seconds)
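A variant of the same benchmark using the standard timeit module might look like the sketch below. The function names and the choice of number=3 are assumptions made for illustration, and the absolute timings depend on the machine and Python version, so no expected numbers are given.

```python
import io
import timeit

CHUNK = b"Hello World"
COUNT = 50000


def build_with_concat():
    # Repeated concatenation of immutable bytes objects.
    buffer = b""
    for _ in range(COUNT):
        buffer += CHUNK
    return buffer


def build_with_bytesio():
    # Appending to a growable in-memory stream.
    buffer = io.BytesIO()
    for _ in range(COUNT):
        buffer.write(CHUNK)
    return buffer.getvalue()


# Both approaches must produce identical bytes for the comparison to be fair.
same_result = build_with_concat() == build_with_bytesio()

# number=3 is an arbitrary choice to keep the run short.
concat_seconds = timeit.timeit(build_with_concat, number=3)
bytesio_seconds = timeit.timeit(build_with_bytesio, number=3)
print("Concat: ", concat_seconds)
print("BytesIO:", bytesio_seconds)
```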

Besides the performance gain, using BytesIO instead of concatenating has the advantage that BytesIO can be used in place of a file object. So say you have a function that expects a file object to write to. Then you can give it that in-memory buffer instead of a file.
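To illustrate that point, here is a small sketch in which one function writes to either target. write_greeting is a hypothetical helper invented for this example, and the temporary directory is likewise an assumption.

```python
import io
import os
import tempfile


def write_greeting(fileobj):
    # Hypothetical helper: it only needs an object with a write() method.
    fileobj.write(b"Hello World")


# Same function, in-memory target.
memory_target = io.BytesIO()
write_greeting(memory_target)
in_memory = memory_target.getvalue()

# Same function, real file target.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "greeting.dat")
    with open(path, "wb") as f:
        write_greeting(f)
    with open(path, "rb") as f:
        on_disk = f.read()

print(in_memory == on_disk)
```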

The difference is that open("myfile.jpg", "rb") returns a file object backed by myfile.jpg on disk, and reading from that object loads the file's contents; whereas BytesIO, again, is just a buffer in memory containing some data.
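On the reading side, a sketch of the comparison could look like this. The payload and the file name myfile.bin are placeholders, not data from the question.

```python
import io
import os
import tempfile

payload = b"\x89PNG placeholder bytes"  # Placeholder data, not a real image.

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "myfile.bin")
    with open(path, "wb") as f:
        f.write(payload)

    # open() returns a file object; the contents are fetched by read().
    with open(path, "rb") as f:
        type_name = type(f).__name__
        from_disk = f.read()

# A BytesIO can start out pre-filled and is read the same way.
from_memory = io.BytesIO(payload).read()

print(type_name)
print(from_disk == from_memory)
```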

Since BytesIO is just a buffer - if you wanted to write the contents to a file later - you'd have to do:

buffer = io.BytesIO()
# ...
with open("test.dat", "wb") as f:
    f.write(buffer.getvalue())
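getvalue() returns a copy of the buffer's contents as a new bytes object. For large buffers, one possible alternative is getbuffer(), which exposes the contents as a memoryview without copying. The sketch below assumes a temporary directory for the output file.

```python
import io
import os
import tempfile

buffer = io.BytesIO()
buffer.write(b"Hello World" * 3)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "test.dat")

    with open(path, "wb") as f:
        # The view must be released before the buffer is resized or closed;
        # using it as a context manager takes care of that.
        with buffer.getbuffer() as view:
            f.write(view)

    with open(path, "rb") as f:
        written = f.read()

print(written == buffer.getvalue())
```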

Also, you didn't mention a version; I'm using Python 3. As for the examples, I'm using the with statement instead of calling f.close() explicitly.

Answer from vallentin on Stack Overflow

Python

docs.python.org › 3 › library › io.html

io — Core tools for working with streams

Its subclasses, BufferedWriter, ... interface to seekable streams. Another BufferedIOBase subclass, BytesIO, is a stream of in-memory bytes....

Stack Overflow

stackoverflow.com › questions › 42800250 › difference-between-open-and-io-bytesio-in-binary-streams

python - Difference between `open` and `io.BytesIO` in binary streams - Stack Overflow

Top answer

1 of 2

208

2 of 2

Using open opens a file on your hard drive. Depending on what mode you use, you can read or write (or both) from the disk.

A BytesIO object isn't associated with any real file on the disk. It's just a chunk of memory that behaves like a file does. It has the same API as a file object returned from open (with mode r+b, allowing reading and writing of binary data).
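A short sketch of that file-like behaviour: the stream keeps a cursor, so after writing you rewind with seek() before reading, just as with a file opened in r+b mode.

```python
import io

stream = io.BytesIO()
stream.write(b"Hello World")
position_after_write = stream.tell()  # The cursor sits at the end: 11

# Rewind before reading back what was written.
stream.seek(0)
first_word = stream.read(5)
rest = stream.read()

# Writing at the current cursor position overwrites in place.
stream.seek(0)
stream.write(b"HELLO")
final = stream.getvalue()

print(position_after_write, first_word, rest, final)
```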

BytesIO (and its close sibling StringIO, which is always in text mode) can be useful when you need to pass data to or from an API that expects to be given a file object, but where you'd prefer to pass the data directly. You can load your input data into the BytesIO before giving it to the library. After the library returns, you can get any data it wrote to the file from the BytesIO using the getvalue() method. (Usually you'd only need to do one of those, of course.)
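As one possible illustration of both directions, the standard zipfile module accepts file objects, so an archive can be built and read back entirely in memory. The choice of zipfile and the member name hello.txt are assumptions made for this sketch; the answer does not name a particular library.

```python
import io
import zipfile

# Output direction: let zipfile write an archive into memory.
archive_buffer = io.BytesIO()
with zipfile.ZipFile(archive_buffer, "w") as archive:
    archive.writestr("hello.txt", b"Hello World")
archive_bytes = archive_buffer.getvalue()

# Input direction: wrap existing bytes so zipfile can read them like a file.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    names = archive.namelist()
    extracted = archive.read("hello.txt")

print(names, extracted)
```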

How to read a bytes object like Python's open?

How the write(), read() and getvalue() methods of Python io.BytesIO work?

python - Why io.BytesIO is not a subclass of typing.BinaryIO, and io.StringIO is neither a subclass of typing.TextIO?

How to use io.BytesIO in Python to write to an existing buffer?

python 3.x - Difference between FileIO object and object returned by open(filename, mode)

Discussions

io.BytesIO doesn't support the buffer protocol

BPO 5506 Nosy @birkenfeld, @amauryfa, @pitrou, @benjaminp Files bytesiobuf2.patch Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current sta... More on github.com

github.com

March 18, 2009

Using BytesIO make bytes object save as a png to memory?

I don't understand your question, could you elaborate? Assuming BytesIO works like StringIO, you pass it around like a file, like you know where you'd do with open("file.png", "wb") as f: f.write(data) you'd do instead f = BytesIO.BytesIO() f.write(data) More on reddit.com

r/learnpython

June 15, 2016

Videos

01:31

YouTube

PYTHON : Difference between `open` and `io.BytesIO` in binary streams ...

December 8, 2021

youtube.com

How to use io.StringIO and io.BytesIO in Python? (Memory | IO | ...

October 25, 2024

08:01

YouTube

Python 3 - File IO #04 - YouTube

September 12, 2017

01:16

YouTube

PYTHON : Writing then reading in-memory bytes (BytesIO) gives a ...

December 9, 2021

01:41

YouTube

Converting Bytes Object to _io.BytesIO in Python - YouTube

October 6, 2025

GeeksforGeeks

geeksforgeeks.org › python › stringio-and-bytesio-for-managing-data-as-file-object

Stringio And Bytesio For Managing Data As File Object - GeeksforGeeks

July 24, 2025 - BytesIO is like a virtual file that exists in the computer's memory, just like `StringIO`. However, it's tailored to handle binary data (bytes) instead of text. It lets you perform operations on these bytes, such as reading and writing, as if ...

Medium

medium.com › @abhishekshaw020 › understanding-bytesio-handling-in-memory-files-like-a-pro-e1b767339468

Understanding BytesIO: Handling In-Memory Files Like a Pro | by Abhishek Shaw | Medium

March 31, 2025 - Think of BytesIO as a virtual file that lives in your computer’s memory (RAM) instead of your hard drive. It lets you read and write data just like a normal file, but everything stays in memory—fast, efficient, and no cleanup required!

DigitalOcean

digitalocean.com › community › tutorials › python-io-bytesio-stringio

Python io.BytesIO and io.StringIO: Memory File Guide | DigitalOcean

August 3, 2022 - There are many ways in which we can use the io module to perform stream and buffer operations in Python. We will demonstrate a lot of examples here to prove the point. Let’s get started. Just like what we do with variables, data can be kept as bytes in an in-memory buffer when we use the io module’s Byte IO operations. Here is a sample program to demonstrate this: import io stream_str = io.BytesIO(b"JournalDev Python: \x00\x01") print(stream_str.getvalue())

Beautiful Soup

tedboy.github.io › python_stdlib › generated › generated › io.BytesIO.html

io.BytesIO — Python Standard Library

class io.BytesIO

Medium

medium.com › @sarthakshah1920 › harnessing-the-power-of-in-memory-buffers-with-bytesio-0ac6d5493178

Harnessing the Power of In-Memory Buffers with BytesIO | by Sarthak Shah | Medium

December 24, 2023 - Whether dealing with images or files, the traditional approach of saving data to disk can introduce various challenges such as slower I/O operations, security concerns, and the need for manual file cleanup. This article explores a more efficient alternative using in-memory buffers, exemplified by the Python BytesIO module.

reddit.com › r/learnpython › vs

r/learnpython on Reddit: <class 'bytes'> vs <class '_io.BytesIO'>

December 1, 2022 -

Hello all,

I'm trying to wrap my head around the practical differences between:

<class 'bytes'> and <class '_io.BytesIO'>.

I read through the documentation:

https://docs.python.org/3/library/io.html?highlight=bytesio#binary-i-o

Binary I/O (also called buffered I/O) expects bytes-like objects and produces bytes objects. No encoding, decoding, or newline translation is performed. This category of streams can be used for all kinds of non-text data, and also when manual control over the handling of text data is desired.

It provides some examples:

The easiest way to create a binary stream is with open() with 'b' in the mode string:

f = open("myfile.jpg", "rb")

and

f = io.BytesIO(b"some initial binary data: \x00\x01")

So I read all this, but so what? Why would you use the io.BytesIO data type over a standard bytes data type?

EDIT: Let me provide some additional context that I just discovered after reading the documentation on lxml.

https://lxml.de/parsing.html#parsing-html

I'm using the requests object and parsing the results with lxml. Here is the example code:

from io import BytesIO
from lxml import etree
#* etree - https://lxml.de/parsing.html
#? etree stands for element tree

import requests

#? Need to know concepts
#?  What are bytes
#?  HTTP status codes
#?  HTTP methods (GET, POST, PUT, DELETE)
#?  bytes - https://docs.python.org/3/library/stdtypes.html?highlight=bytes#bytes-objects

url = 'http://localhost'
#! The URL https://nostarch.com/ doesn't seem to work

resp = requests.get(url=url)
html_bytes = resp.content
parser = etree.HTMLParser()
content = etree.parse(BytesIO(html_bytes), parser=parser)

print(type(html_bytes))
print(type(BytesIO(html_bytes)))

for link in content.findall('//a'):
    print(f"{link.get('href')} -> {link.text}")

Kind regards
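One way to see the practical difference asked about in this post: a bytes object is a plain immutable value, while BytesIO wraps the same data in a file-like object that parsers expecting a file can consume. The sketch below uses the standard library's xml.etree.ElementTree as a stand-in for lxml (an assumption, so that it runs without third-party packages), and the markup is a made-up, well-formed sample rather than a real HTTP response.

```python
import io
import xml.etree.ElementTree as ET

# Made-up sample standing in for resp.content.
html_bytes = (
    b"<html><body>"
    b"<a href='/docs'>Docs</a><a href='/blog'>Blog</a>"
    b"</body></html>"
)

# bytes is an immutable value: it has no read() method and no cursor.
bytes_has_read = hasattr(html_bytes, "read")

# BytesIO wraps the same data in a file-like object with a read position.
stream = io.BytesIO(html_bytes)
stream_has_read = hasattr(stream, "read")

# parse() expects a filename or a file object, so the wrapper is needed here.
tree = ET.parse(stream)
links = [(a.get("href"), a.text) for a in tree.findall(".//a")]

print(bytes_has_read, stream_has_read)
print(links)
```

When the data is already in memory as bytes and the library offers a function that takes bytes directly, the wrapper is unnecessary; it matters when the API only accepts file objects.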