Brave Search

When should one use BytesIO .getvalue() instead of .getbuffer()?

stackoverflow.com › questions › 61319551 › when-should-one-use-bytesio-getvalue-instead-of-getbuffer

This question is old, but it looks like nobody has answered this sufficiently.

Simply:

obj.getbuffer() creates a memoryview object.
Every time you write, or if there is a memoryview of obj present, obj.getvalue() will need to create a new, complete value.
If you have not written (since creation or since the last obj.getvalue() call) and there is no memoryview present, obj.getvalue() is the fastest method of access, and requires no copies.

That being the case:

When creating another io.BytesIO, use obj.getvalue()
For random-access reading and writing, DEFINITELY use obj.getbuffer()
Avoid interleaving reads and writes frequently. If you must, then DEFINITELY use obj.getbuffer(), unless your file is tiny.
Avoid using obj.getvalue() while a buffer is lying around.
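
As a rough illustration of the advice above (a sketch that is not part of the original answer; the variable names are made up for the example), the snippet below edits a stream in place through the memoryview returned by getbuffer(), shows that CPython refuses to resize the stream while that view is alive, and then takes a snapshot with getvalue() once the view has been released:

```python
import io

stream = io.BytesIO(b"hello world")

# getbuffer() returns a writable memoryview over the stream's contents.
view = stream.getbuffer()
view[0:5] = b"HELLO"        # same-length slice assignment edits in place
first_byte = view[0]        # indexing yields an int (72 == ord("H"))

# While the view is alive, the stream cannot be resized or closed;
# CPython raises BufferError on write().
try:
    stream.write(b"!" * 100)
    resize_blocked = False
except BufferError:
    resize_blocked = True

view.release()              # drop the export so the stream is usable again
snapshot = stream.getvalue()
print(snapshot, first_byte, resize_blocked)
```

Releasing the view explicitly (or using it in a with block) keeps it from "lying around" and slowing down later getvalue() calls.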

Here, we see that it's all fast, and all well and good if no buffer is lying around:


# time getvalue()
>>> import io
>>> i = io.BytesIO(b'f' * 1_000_000)
>>> %timeit i.getvalue()
34.6 ns ± 0.178 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

# time getbuffer()
>>> %timeit i.getbuffer()
118 ns ± 0.495 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

# time getbuffer() and getvalue() together
>>> %timeit i.getbuffer(); i.getvalue()
173 ns ± 0.829 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

Everything is fine, and working about as you'd expect. But let's see what happens when there's a buffer just lying around:

>>> x = i.getbuffer()
>>> %timeit i.getvalue()
33 µs ± 675 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

Notice that we're no longer measuring in nanoseconds; we're measuring in microseconds. That's roughly three orders of magnitude slower. If you del x, we're back to being fast. This is because, while a memoryview exists, Python has to account for the possibility that the BytesIO has been modified through it. So, to give the caller a definite state, it copies the buffer.
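
The measurements above use IPython's %timeit magic. A minimal sketch of the same comparison with the standard timeit module, for readers outside IPython, might look like this (the helper name time_getvalue and the repetition count are assumptions for the example, and absolute numbers will vary by machine and Python version):

```python
import io
import timeit


def time_getvalue(stream, number=200):
    """Return the average seconds per getvalue() call on `stream`."""
    return timeit.timeit(stream.getvalue, number=number) / number


stream = io.BytesIO(b"f" * 1_000_000)

without_view = time_getvalue(stream)

view = stream.getbuffer()          # a live memoryview now exists
with_view = time_getvalue(stream)

view.release()                     # similar in effect to `del x` in the answer
after_release = time_getvalue(stream)

print(f"no memoryview:     {without_view * 1e9:12.1f} ns per call")
print(f"memoryview alive:  {with_view * 1e9:12.1f} ns per call")
print(f"after release():   {after_release * 1e9:12.1f} ns per call")
```

On CPython, the middle figure is expected to be far larger than the other two, since each call copies the full megabyte.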

Answer from Mr. B on Stack Overflow

Python

docs.python.org › 3 › library › io.html

io — Core tools for working with streams

January 30, 2026 - Its subclasses, BufferedWriter, BufferedReader, and BufferedRWPair buffer raw binary streams that are writable, readable, and both readable and writable, respectively. BufferedRandom provides a buffered interface to seekable streams. Another BufferedIOBase subclass, BytesIO, is a stream of in-memory bytes.

Python⇒Speed

pythonspeed.com › articles › bytesio-reduce-memory-usage

The surprising way to save memory with BytesIO

February 27, 2025 - Another option is BytesIO.getvalue(), which returns the contents of the BytesIO as a bytes object.

Beautiful Soup

tedboy.github.io › python_stdlib › generated › generated › io.BytesIO.getvalue.html

io.BytesIO.getvalue — Python Standard Library

Retrieve the entire contents of the BytesIO object.

DigitalOcean

digitalocean.com › community › tutorials › python-io-bytesio-stringio

Python io.BytesIO and io.StringIO: Memory File Guide | DigitalOcean

August 3, 2022 -

import io
stream_str = io.BytesIO(b"JournalDev Python: \x00\x01")
print(stream_str.getvalue())

Stack Overflow

stackoverflow.com › questions › 6479317 › is-it-normal-for-pythons-io-bytesio-getvalue-to-return-str-instead-of-bytes

Is it normal for python's io.BytesIO.getvalue() to return str instead of bytes? - Stack Overflow

Top answer

1 of 3

The issue is that you are positioned at the end of the stream. Think of the position like a cursor. Once you have written b' world', your cursor is at the end of the stream. When you try to .read(), you are reading everything after the position of the cursor - which is nothing, so you get the empty bytestring.

To navigate around the stream you can use the .seek method:

>>> import io
>>> in_memory = io.BytesIO(b'hello')
>>> in_memory.write(b' world')
>>> in_memory.seek(0)  # go to the start of the stream
>>> print(in_memory.read())
b' world'

Note that the initial bytes b'hello' have been overwritten by your writing of b' world': a BytesIO created with initial data starts at position 0, so writes replace the existing bytes in place, much like a file opened in 'r+' mode.

.getvalue() just returns the entire contents of the stream regardless of current position.

2 of 3

This is a memory stream, but it is still a stream. The position is stored, so, as with any other stream, if you try to read after having written, you have to reposition:

import io
in_memory = io.BytesIO(b'hello')
in_memory.seek(0,2)   # seek to end, else we overwrite
in_memory.write(b' world')
in_memory.seek(0)    # seek to start
print( in_memory.read() )

prints:

b'hello world'

By contrast, in_memory.getvalue() doesn't need the final seek(0), as it returns the entire contents of the stream regardless of the current position.
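
To make the difference between read() and getvalue() concrete, here is a small sketch (the variable names are chosen for the example) that tracks the stream position with tell() at each step:

```python
import io

stream = io.BytesIO(b"hello")
start = stream.tell()            # a new BytesIO starts at position 0

stream.seek(0, io.SEEK_END)      # move to the end so the write appends
stream.write(b" world")

at_end = stream.read()           # nothing lies after the cursor
whole = stream.getvalue()        # full contents, cursor position ignored
position_after = stream.tell()   # getvalue() does not move the cursor

stream.seek(0)
from_start = stream.read()       # now read() sees everything

print(start, at_end, whole, position_after, from_start)
```

In short, read() is relative to the cursor, while getvalue() always returns the whole buffer and leaves the cursor where it was.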

GitHub

github.com › python › cpython › blob › main › Modules › _io › bytesio.c

cpython/Modules/_io/bytesio.c at main · python/cpython

_io.BytesIO.getvalue: Retrieve the entire contents of the BytesIO object.

Author python

GeeksforGeeks

geeksforgeeks.org › python › convert-from-_io-bytesio-to-a-bytes-like-object-in-python

Convert from '_Io.Bytesio' to a Bytes-Like Object in Python - GeeksforGeeks

July 23, 2025 - To convert from _io.BytesIO to a bytes-like object using the getvalue() method, we can directly obtain the byte data stored in the BytesIO buffer.

Pynerds

pynerds.com › io-bytesio-in-python

io.BytesIO in Python

March 28, 2024 - In the above example we created ... rather than a regular string. The stream.getvalue() method returns the entire contents of the bytes stream as a bytes object....

ProgramCreek

programcreek.com › python › example › 1734 › io.BytesIO

Python Examples of io.BytesIO

def _serialize_data(self, data):
    # Default to raw bytes
    type_ = _BYTES
    if isinstance(data, np.ndarray):
        # When the data is a numpy array, use the more compact native
        # numpy format.
        buf = io.BytesIO()
        np.save(buf, data)
        data = buf.getvalue()
        type_ = _NUMPY
    elif not isinstance(data, (bytearray, bytes)):
        # Everything else except byte data is serialized in pickle format.

Simon Willison

simonwillison.net › 2025 › Jan › 31 › save-memory-with-bytesio

The surprising way to save memory with BytesIO

January 31, 2025 - The surprising way to save memory ... of that object, doubling the amount of memory used - but calling .getvalue() returns a bytes object that uses no additional memory, instead using copy-on-write....

GitHub

github.com › orgs › micropython › discussions › 13708

How to clear an io.BytesIO buffer. · micropython · Discussion #13708

Class StaticBuffer is (for now) modeled after class io.BytesIO.

class StaticBufferException( Exception ):
    pass

# 0 <= next <= lngt <= size
class StaticBuffer():
    def __init__( self, size ):
        self._bfr= bytearray( size )
        self.size= size  # Allocated size
        self.lngt= 0     # Current length
        self.next= 0     # Ordinal to start next write
    def clear( self ):
        self.lngt= 0
        self.next= 0
    def flush( self ):
        pass
    def getvalue( self ):
        return bytes(self._bfr[:self.lngt])
    def seek( self, ordinal ):
        if ordinal < 0 or ordinal > self.next:
            raise StaticBufferException( b'Seek ordinal out of range' )
        self.next= ordinal
    def

Python Assets

pythonassets.com › posts › what-is-io-bytesio-useful-for

What Is `io.BytesIO` Useful For? | Python Assets

July 19, 2024 - A quick guide to understand what the io.BytesIO standard class is and how to use it, with Pandas and Flask examples.

Python

wiki.python.org › moin › BytesIO

BytesIO - Python Wiki

April 28, 2011 - There is no BytesIO.getvalue() method because it's not needed.

Medium

medium.com › @sarthakshah1920 › harnessing-the-power-of-in-memory-buffers-with-bytesio-0ac6d5493178

Harnessing the Power of In-Memory Buffers with BytesIO | by Sarthak Shah | Medium

December 24, 2023 - ... Finally, we create an HTTP ... is set to the value obtained from file_buffer.getvalue(), which retrieves the entire content of the in-memory buffer....

Google Groups

groups.google.com › g › dev-python › c › snzL-qcw5sM

[Python-Dev] io.BytesIO slower than monkey-patching io.RawIOBase

On Tue, Jul 17, 2012 at 2:57 PM, John O'Connor <jxo...@rit.edu> wrote:
>> The second approach is consistently 10-20% faster than the first one
>> (depending on input) for trunk Python 3.3
>
> I think the difference is that StringIO spends extra time reallocating
> memory during the write loop as it grows, whereas bytes.join computes
> the allocation size first since it already knows the final length.

BytesIO is actually missing an optimisation that is already used in StringIO: the StringIO C implementation uses a fragment accumulator internally, and collapses that into a single string object when getvalue() is called.

GeeksforGeeks

geeksforgeeks.org › stringio-and-bytesio-for-managing-data-as-file-object

Stringio And Bytesio For Managing Data As File Object - GeeksforGeeks

April 28, 2025 - After reset: I am New Website! BytesIO is like a virtual file that exists in the computer's memory, just like `StringIO`. However, it's tailored to handle binary data (bytes) instead of text.

Jamie Phillips

phillipsj.net › posts › odd-issue-using-bytesio-with-boto3

Odd Issue Using BytesIO With Boto3 • Jamie Phillips

September 28, 2023 - Then I realized that I probably shouldn’t be using read at all; I can just call getvalue and it will get all the contents of the stream. Here is the final example, which is a little cleaner.

import csv
from io import BytesIO, StringIO

import boto3

# Build the CSV in an in-memory text stream.
stream = StringIO()
writer = csv.writer(stream)
writer.writerow(['Test1', 'Test2'])
writer.writerow(['TestA', 'TestB'])

# getvalue() returns the full text regardless of the stream position.
client = boto3.client('s3')
client.upload_fileobj(BytesIO(stream.getvalue().encode()), 'upload_bucket', 'csv_key')

GitHub

github.com › streamlit › streamlit › issues › 2235

Automatically reset buffer when file is read · Issue #2235 · streamlit/streamlit

October 20, 2020 - Problem: read_csv does not work when multiple files are uploaded Reason: It works currently because we are creating a new BytesIO (or StringIO) object each time in deltagenerator. the object gets passed to and gets processed. when anothe...

Author karriebear