As @jonrsharpe has stated, b'123' is an immutable sequence of bytes, in this case of 3 bytes. Your confusion appears to be because len() and sys.getsizeof(b'123') are not the same thing.
len() queries the number of items contained in a container. For a string that is the number of characters; here you have 3 bytes, so its length will be 3.
Return the length (the number of items) of an object. The argument may be a sequence (string, tuple or list) or a mapping (dictionary).
sys.getsizeof(), on the other hand, returns the memory size of the object. That means you are not just getting the size of the bytes: since it is a full Python object, it also carries per-object overhead (type pointer, reference count, and other bookkeeping), which is all counted in that size.
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
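A quick illustration of the difference (the second number varies by Python version and platform, so treat it as roughly "a few dozen bytes"):

import sys

data = b'123'
print(len(data))            # 3 -- the number of bytes it contains
print(sys.getsizeof(data))  # noticeably larger -- object overhead plus those 3 bytes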
The b prefix before 123 indicates that the literal should become a bytes literal rather than a string literal.
str literals = a sequence of Unicode characters (Latin-1, UCS-2 or UCS-4, depending on the widest character in the string)
bytes literals = a sequence of octets (integers between 0 and 255)
Put simply - you use str for text, and bytes for low level binary data.
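A small sketch of the distinction (Python 3 semantics):

text = '123'    # str: a sequence of Unicode characters
data = b'123'   # bytes: a sequence of integers in range(0, 256)

print(type(text), type(data))         # <class 'str'> <class 'bytes'>
print(text[0], data[0])               # 1 49 -- indexing bytes yields the integer value of the byte
print(text.encode('ascii') == data)   # True -- encoding the text gives the same raw bytes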
Just use the sys.getsizeof function defined in the sys module.
sys.getsizeof(object[, default]): Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
The default argument allows to define a value which will be returned if the object type does not provide means to retrieve the size and would cause a TypeError.
getsizeof calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
See recursive sizeof recipe for an example of using getsizeof() recursively to find the size of containers and all their contents.
Usage example, in Python 3.0:
>>> import sys
>>> x = 2
>>> sys.getsizeof(x)
24
>>> sys.getsizeof(sys.getsizeof)
32
>>> sys.getsizeof('this')
38
>>> sys.getsizeof('this also')
48
If you are on Python < 2.6 and don't have sys.getsizeof, you can use this extensive module instead. Never used it though.
How do I determine the size of an object in Python?
The answer, "Just use sys.getsizeof", is not a complete answer.
That answer works for builtin objects directly, but it does not account for what those objects may contain: specifically, what custom objects, tuples, lists, dicts, and sets contain. They can contain instances of each other, as well as numbers, strings, and other objects.
A More Complete Answer
Using 64-bit Python 3.6 from the Anaconda distribution, with sys.getsizeof, I have determined the minimum size of the following objects, and note that sets and dicts preallocate space so empty ones don't grow again until after a set amount (which may vary by implementation of the language):
Python 3:
Bytes  type        empty + scaling notes
28     int         +4 bytes about every 30 powers of 2
37     bytes       +1 byte per additional byte
49     str         +1-4 per additional character (depending on max width)
48     tuple       +8 per additional item
64     list        +8 for each additional
224    set         5th increases to 736; 21st, 2272; 85th, 8416; 341st, 32992
240    dict        6th increases to 368; 22nd, 1184; 43rd, 2280; 86th, 4704; 171st, 9320
136    func def    does not include default args and other attrs
1056   class def   no slots
56     class inst  has a __dict__ attr, same scaling as dict above
888    class def   with slots
16     __slots__   seems to store in mutable tuple-like structure
                   first slot grows to 48, and so on.
How do you interpret this? Well, say you have a set with 10 items, each 100 bytes. How big is the whole data structure? The set itself is 736 bytes, because it has sized up once to 736. Then you add the size of the items, so that's 1736 bytes in total.
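If you want to verify the jump points on your own build (the thresholds and byte values above come from one particular 64-bit CPython 3.6 build and may differ on yours), a quick sketch:

import sys

s = set()
last = sys.getsizeof(s)
print(0, last)
for n in range(1, 30):
    s.add(n)
    size = sys.getsizeof(s)
    if size != last:            # report only when the set's internal table resizes
        print(n, size)
        last = size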
Some caveats for function and class definitions:
Note each class definition has a proxy __dict__ (48 bytes) structure for class attrs. Each slot has a descriptor (like a property) in the class definition.
Slotted instances start out with 48 bytes on their first element, and increase by 8 each additional. Only empty slotted objects have 16 bytes, and an instance with no data makes very little sense.
Also, each function definition has code objects, maybe docstrings, and other possible attributes, even a __dict__.
Also note that we use sys.getsizeof() because we care about the marginal space usage, which includes the garbage collection overhead for the object. From the docs:
getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
Also note that resizing lists (e.g. repetitively appending to them) causes them to preallocate space, similarly to sets and dicts. From the CPython listobject.c source code:
/* This over-allocates proportional to the list size, making room
* for additional growth. The over-allocation is mild, but is
* enough to give linear-time amortized behavior over a long
* sequence of appends() in the presence of a poorly-performing
* system realloc().
* The growth pattern is: 0, 4, 8, 16, 25, 35, 46, 58, 72, 88, ...
* Note: new_allocated won't overflow because the largest possible value
* is PY_SSIZE_T_MAX * (9 / 8) + 6 which always fits in a size_t.
*/
new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
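You can watch this over-allocation from Python itself (a rough sketch; the exact byte values depend on the interpreter version and platform):

import sys

lst = []
last = sys.getsizeof(lst)
print(len(lst), last)
for i in range(20):
    lst.append(i)
    size = sys.getsizeof(lst)
    if size != last:            # the list resized, i.e. it preallocated extra slots
        print(len(lst), size)
        last = size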
Historical data
Python 2.7 analysis, confirmed with guppy.hpy and sys.getsizeof:
Bytes  type        empty + scaling notes
24     int         NA
28     long        NA
37     str         + 1 byte per additional character
52     unicode     + 4 bytes per additional character
56     tuple       + 8 bytes per additional item
72     list        + 32 for first, 8 for each additional
232    set         sixth item increases to 744; 22nd, 2280; 86th, 8424
280    dict        sixth item increases to 1048; 22nd, 3352; 86th, 12568 *
120    func def    does not include default args and other attrs
64     class inst  has a __dict__ attr, same scaling as dict above
16     __slots__   class with slots has no dict, seems to store in
                   mutable tuple-like structure.
904    class def   has a proxy __dict__ structure for class attrs
104    old class   makes sense, less stuff, has real dict though.
Note that dictionaries (but not sets) got a more compact representation in Python 3.6.
I think 8 bytes per additional item to reference makes a lot of sense on a 64 bit machine. Those 8 bytes point to the place in memory the contained item is at. The 4 bytes are fixed width for unicode in Python 2, if I recall correctly, but in Python 3, str becomes a unicode of width equal to the max width of the characters.
And for more on slots, see this answer.
A More Complete Function
We want a function that searches the elements in lists, tuples, sets, dicts, obj.__dict__'s, and obj.__slots__, as well as other things we may not have yet thought of.
We want to rely on gc.get_referents to do this search because it works at the C level (making it very fast). The downside is that get_referents can return redundant members, so we need to ensure we don't double count.
Classes, modules, and functions are singletons - they exist one time in memory. We're not so interested in their size, as there's not much we can do about them - they're a part of the program. So we'll avoid counting them if they happen to be referenced.
We're going to use a blacklist of types so we don't include the entire program in our size count.
import sys
from types import ModuleType, FunctionType
from gc import get_referents
# Custom objects know their class.
# Function objects seem to know way too much, including modules.
# Exclude modules as well.
BLACKLIST = type, ModuleType, FunctionType
def getsize(obj):
    """sum size of object & members."""
    if isinstance(obj, BLACKLIST):
        raise TypeError('getsize() does not take argument of type: ' + str(type(obj)))
    seen_ids = set()
    size = 0
    objects = [obj]
    while objects:
        need_referents = []
        for obj in objects:
            if not isinstance(obj, BLACKLIST) and id(obj) not in seen_ids:
                seen_ids.add(id(obj))
                size += sys.getsizeof(obj)
                need_referents.append(obj)
        objects = get_referents(*need_referents)
    return size
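A quick usage sketch, reusing the getsize defined above (the printed numbers vary by Python build, so only the relative sizes are meaningful):

nested = {'a': [1, 2, 3], 'b': ('x' * 10, {'c': 4.5})}
print(sys.getsizeof(nested))  # just the outer dict object
print(getsize(nested))        # the dict plus every key, value, and nested container it references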
To contrast this with the whitelisted function that follows: most objects know how to traverse themselves for the purposes of garbage collection, which is approximately what we're looking for when we want to know how expensive certain objects are in memory (this functionality is used by gc.get_referents). However, this measure is going to be much more expansive in scope than we intended if we are not careful.
For example, functions know quite a lot about the modules they are created in.
Another point of contrast is that strings that are keys in dictionaries are usually interned so they are not duplicated. Checking for id(key) will also allow us to avoid counting duplicates, which we do in the next section. The blacklist solution skips counting keys that are strings altogether.
Whitelisted Types, Recursive visitor
To cover most of these types myself, instead of relying on the gc module, I wrote this recursive function to try to estimate the size of most Python objects, including most builtins, types in the collections module, and custom types (slotted and otherwise).
This sort of function gives much more fine-grained control over the types we're going to count for memory usage, but has the danger of leaving important types out:
import sys
from numbers import Number
from collections import deque
from collections.abc import Set, Mapping
ZERO_DEPTH_BASES = (str, bytes, Number, range, bytearray)
def getsize(obj_0):
    """Recursively iterate to sum size of object & members."""
    _seen_ids = set()
    def inner(obj):
        obj_id = id(obj)
        if obj_id in _seen_ids:
            return 0
        _seen_ids.add(obj_id)
        size = sys.getsizeof(obj)
        if isinstance(obj, ZERO_DEPTH_BASES):
            pass # bypass remaining control flow and return
        elif isinstance(obj, (tuple, list, Set, deque)):
            size += sum(inner(i) for i in obj)
        elif isinstance(obj, Mapping) or hasattr(obj, 'items'):
            size += sum(inner(k) + inner(v) for k, v in getattr(obj, 'items')())
        # Check for custom object instances - may subclass above too
        if hasattr(obj, '__dict__'):
            size += inner(vars(obj))
        if hasattr(obj, '__slots__'): # can have __slots__ with __dict__
            size += sum(inner(getattr(obj, s)) for s in obj.__slots__ if hasattr(obj, s))
        return size
    return inner(obj_0)
And I tested it rather casually (I should unittest it):
>>> getsize(['a', tuple('bcd'), Foo()])
344
>>> getsize(Foo())
16
>>> getsize(tuple('bcd'))
194
>>> getsize(['a', tuple('bcd'), Foo(), {'foo': 'bar', 'baz': 'bar'}])
752
>>> getsize({'foo': 'bar', 'baz': 'bar'})
400
>>> getsize({})
280
>>> getsize({'foo':'bar'})
360
>>> getsize('foo')
40
>>> class Bar():
...     def baz():
...         pass
>>> getsize(Bar())
352
>>> getsize(Bar().__dict__)
280
>>> sys.getsizeof(Bar())
72
>>> getsize(Bar.__dict__)
872
>>> sys.getsizeof(Bar.__dict__)
280
This implementation breaks down on class definitions and function definitions because we don't go after all of their attributes, but since they should only exist once in memory for the process, their size really doesn't matter too much.
How to count bytes
Python: Get size of string in bytes - Stack Overflow
Get size in Bytes needed for an integer in Python - Stack Overflow
Is there a way to get the bytes of any object in python?
This bytestring should have 23 bytes.
b'BMF;\x00\x00\x00\x00\x00\x006\x00\x00\x00(\x00\x00\x00H\x00\x00\x00F'
How is this counted? I.e. what are the bytes in this string? How are they separated?
In the first part (b'BMF;) is BMF a byte?
Thanks for helping!
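As for how that literal is counted: each plain character (like B, M, F, ;) is one byte, and each \x00 escape is also exactly one byte; iterating over a bytes object shows the individual values as integers. A quick illustrative check:

data = b'BMF;\x00\x00\x00\x00\x00\x006\x00\x00\x00(\x00\x00\x00H\x00\x00\x00F'
print(len(data))    # 23 -- one count per byte; escapes like \x00 each count as a single byte
print(list(data))   # the integer value of each byte, e.g. 66 for 'B', 0 for '\x00'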
If you want the number of bytes in a string, this function should do it for you pretty solidly.
def utf8len(s):
    return len(s.encode('utf-8'))
The reason you got weird numbers from sys.getsizeof is that strings are actual objects in Python, so the figure includes per-object bookkeeping (type pointer, reference count, cached hash, and so on) on top of the text itself. Encoding the string first, as in the solution above, gives you just the raw bytes of the text, without that object overhead.
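For example, a multi-byte character shows the difference between characters and bytes (the euro sign is one character but three bytes in UTF-8):
>>> utf8len('hello')
5
>>> utf8len('€')
3
>>> len('€')
1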
You can use len(s.encode()), but there's a caveat.
The size in bytes of a string depends on the encoding you choose (by default "utf-8").
For some multi-byte encodings (e.g., UTF-16), string.encode will add a byte-order mark (BOM) at the start, which is a sequence of special bytes that inform the reader on the byte endianness used. So the length you get is actually len(BOM) + len(encoded_word).
If you don't want to count the BOM bytes, you can use either the little-endian version of the encoding (adding the suffix "-le") or the big-endian version (adding the suffix "-be").
>>> len('ciao'.encode('utf-16'))
10
>>> len('ciao'.encode('utf-16-le'))
8
def byte_length(i):
    return (i.bit_length() + 7) // 8
Of course, as Jon Clements points out, this isn't the size of the actual PyIntObject, which has a PyObject header, stores the value as a bignum in whatever way is easiest to deal with rather than most compact, and needs at least one pointer (4 or 8 bytes) to it on top of the actual object, and so on.
But this is the byte length of the number itself. It's almost certainly the most efficient answer, and probably also the easiest to read.
Or is ceil(i.bit_length() / 8.0) more readable?
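Both spellings give the same result; a small sketch comparing them (math.ceil is only needed for the second form, and byte_length is the same function defined above):

import math

def byte_length(i):
    return (i.bit_length() + 7) // 8

for n in (5, 255, 256, 2**64):
    assert byte_length(n) == math.ceil(n.bit_length() / 8.0)
    print(n, byte_length(n))   # 1, 1, 2, and 9 bytes respectively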
Unless you're dealing with an array.array or a numpy.array - the size always has object overhead. And since Python deals with BigInts naturally, it's really, really hard to tell...
>>> i = 5
>>> import sys
>>> sys.getsizeof(i)
24
So on a 64bit platform it requires 24 bytes to store what could be stored in 3 bits.
However, if you did,
>>> s = '\x05'
>>> sys.getsizeof(s)
38
So no, not really - you've got the memory-overhead of the definition of the object rather than raw storage...
If you then take:
>>> import array
>>> a = array.array('i', [3])
>>> a
array('i', [3])
>>> sys.getsizeof(a)
60L
>>> a = array.array('i', [3, 4, 5])
>>> sys.getsizeof(a)
68L
Then you get what would be called normal byte boundaries, etc.. etc... etc...
If you just want what "purely" should be stored, minus object overhead, then from Python 2.6/2.7 you can use some_int.bit_length() (otherwise just bit-shift it as other answers have shown) and then work from there.
I know how to get the number of the bytes (getsizeof) but I don't know how to get the bytes themselves.