The memory assigned is not disproportional; you are creating 100,000 objects! As you can see, they take up roughly 34 megabytes of space:
>>> sys.getsizeof(Test())+sys.getsizeof(Test().__dict__)
344
>>> (sys.getsizeof(Test())+sys.getsizeof(Test().__dict__)) * 100000 / 10**6
34.4 #megabytes
You can get a minor improvement with __slots__, but you will still need about 20MB of memory to store those 100,000 objects.
>>> sys.getsizeof(Test2())+sys.getsizeof(Test2().__slots__)
200
>>> (sys.getsizeof(Test2())+sys.getsizeof(Test2().__slots__)) * 100000 / 10**6
20.0 #megabytes
(With credit to mensi's answer: sys.getsizeof does not take referenced objects into account, which is why the instance's __dict__ or __slots__ storage is measured separately above. Tab autocompletion in a REPL will show you most of an object's attributes.)
See the SO answer "Usage of __slots__?" and the language reference: http://docs.python.org/release/2.5.2/ref/slots.html
To use __slots__:
class Test2(object):  # subclass object so __slots__ takes effect on Python 2
    __slots__ = ['a', 'b', 'c', 'd', 'e']
    def __init__(self):
        ...
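For reference, here is a self-contained sketch of the comparison above. The original question's Test class is not shown in this thread, so a plain five-attribute class is assumed here; the exact byte counts vary by Python version and platform:

import sys

class Test(object):
    def __init__(self):
        # five ordinary instance attributes, stored in __dict__
        self.a = self.b = self.c = self.d = self.e = 0

class Test2(object):
    __slots__ = ['a', 'b', 'c', 'd', 'e']
    def __init__(self):
        # same five attributes, stored in fixed slots instead of a dict
        self.a = self.b = self.c = self.d = self.e = 0

t, t2 = Test(), Test2()
print(sys.getsizeof(t) + sys.getsizeof(t.__dict__))  # instance plus its attribute dict
print(sys.getsizeof(t2))                             # slotted instance has no __dict__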
Answer from ninjagecko on Stack Overflow

Hi Pyople! Yesterday I learned about the sys.getsizeof() function and tried some code. More specifically:
lst = [i for i in range(1000000000)]  # one billion numbers; takes about a minute to create
When I use sys.getsizeof(lst), it returns 8058558880, which is correct. But when I look at my system resources on Linux CentOS 7 in IPython (Python 3.4), I see: ipython Memory: 39592564 K, Shared Mem: 5176 K. That's freaking 40 GB.
I don't understand why an object that is 8 GB in size takes 40 GB of system memory. I tried it with a list of around 400 MB, and the system took approximately 400 * 5 = 2 GB.
Why is it taking five times more memory than it should? Or is the problem only because I tried it in IPython / Konsole, and in a program it wouldn't be a problem?
sys.getsizeof gives you the amount of memory allocated to the list object itself, but the list only stores pointers: the 1,000,000,000 int objects it points to are separate objects with their own memory.
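A rough back-of-the-envelope sketch of where the extra memory goes, assuming a 64-bit CPython where a small int object is about 28 bytes (per-object allocator overhead is extra):

import sys

n = 1000000000
pointer_bytes = n * 8                     # one 8-byte pointer per list slot, ~8 GB
int_bytes = n * sys.getsizeof(123456789)  # ~28 bytes per distinct int object, ~28 GB
print((pointer_bytes + int_bytes) / 1e9)  # ~36 GB before allocator overhead; ~40 GB observed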
The size of an object does not include the size of all the objects that that object refers to. For example:
>>> import sys
>>> foo = ['a' * 1000000]
>>> sys.getsizeof(foo)
40
>>> sys.getsizeof(foo[0])
1000025
foo is a list object that contains one item. Its size is 40 bytes, because that's how much memory it takes to store a list big enough to hold a reference to one object. That object happens to be about a megabyte in size, but it's a completely separate object from the list object and doesn't count towards the size of the list object.
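Building on the numbers above, a quick way to include the single contained string in this particular case (this only goes one level deep and ignores shared references):

>>> sys.getsizeof(foo) + sum(sys.getsizeof(item) for item in foo)
1000065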
Every instance references a dict for its __dict__, which is 272 bytes on my machine for your example. Multiply that by 100,000.
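To check that figure yourself, assuming Test is the attribute-bearing class from the answer above (the exact number varies by Python version and platform):

>>> import sys
>>> sys.getsizeof(Test().__dict__)  # the per-instance attribute dict
272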
What is the difference between len() and sys.getsizeof() methods in Python?
They are not the same thing at all.
len() queries for the number of items contained in a container. For a string that's the number of characters:
Return the length (the number of items) of an object. The argument may be a sequence (string, tuple or list) or a mapping (dictionary).
sys.getsizeof() on the other hand returns the memory size of the object:
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
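A small illustration of the difference; the getsizeof figure is implementation-specific, and these values match the Python 2 build discussed below:

>>> import sys
>>> s = 'hello'
>>> len(s)            # number of characters
5
>>> sys.getsizeof(s)  # character bytes plus the object header
42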
Python string objects are not simple sequences of characters, 1 byte per character.
Specifically, the sys.getsizeof() function includes the garbage collector overhead if any:
getsizeof() calls the object's __sizeof__ method and adds an additional garbage collector overhead if the object is managed by the garbage collector.
String objects do not need to be tracked (they cannot create circular references), but they do need more memory than just the bytes for the characters themselves. In Python 2, the __sizeof__ method returns (in C code):
Py_ssize_t res;
res = PyStringObject_SIZE + PyString_GET_SIZE(v) * Py_TYPE(v)->tp_itemsize;
return PyInt_FromSsize_t(res);
where PyStringObject_SIZE is the C struct header size for the type, PyString_GET_SIZE basically is the same as len() and Py_TYPE(v)->tp_itemsize is the per-character size. In Python 2.7, for byte strings, the size per character is 1, but it's PyStringObject_SIZE that is confusing you; on my Mac that size is 37 bytes:
>>> sys.getsizeof('')
37
For unicode strings the per-character size goes up to 2 or 4 (depending on compilation options). On Python 3.3 and newer, Unicode strings take up between 1 and 4 bytes per character, depending on the contents of the string.
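A short sketch illustrating the flexible storage on Python 3.3+; the exact header sizes are implementation details, so only the per-character growth matters here:

import sys

ascii_s = 'a' * 100           # fits in 1 byte per character
bmp_s = '\u20ac' * 100        # Euro sign, needs 2 bytes per character
wide_s = '\U0001F600' * 100   # emoji, needs 4 bytes per character

# all three strings have len() == 100, but their byte sizes
# grow by roughly 100, 200 and 400 bytes respectively
for s in (ascii_s, bmp_s, wide_s):
    print(len(s), sys.getsizeof(s))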
For containers such as dictionaries or lists that reference other objects, the memory size given covers only the memory used by the container and the pointer values used to reference those other objects. There is no straightforward method of including the memory size of the ‘contained’ objects because those same objects could have many more references elsewhere and are not necessarily owned by a single container.
The documentation states it like this:
Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
If you need to calculate the memory footprint of a container and anything referenced by that container you’ll have to use some method of traversing to those contained objects and get their size; the documentation points to a recursive recipe.
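A minimal sketch along the lines of that recipe; it handles only the common built-in containers and counts each object once (a full version would also need to walk __dict__, __slots__ and custom types):

import sys

def total_size(obj, seen=None):
    """Recursively sum sys.getsizeof over an object and its contents."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # already counted; avoids double-counting shared objects
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(total_size(k, seen) + total_size(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple, set, frozenset)):
        size += sum(total_size(item, seen) for item in obj)
    return size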
The key difference is that len() gives the number of elements in a container, whereas sys.getsizeof() gives the amount of memory the object itself occupies.
For more information, read the Python docs, available at https://docs.python.org/3/library/sys.html#module-sys
Can someone help me understand what's going on here? My OS reports that a python process whose only large object is a 9 GB list is consuming 40.6 GB of system memory. I repeated this test several times with both the interactive and standard interpreters and the results are pretty consistent.
import sys
import psutil
#memory in use prior to generating list
prior_used = psutil.virtual_memory().used
print(f"Prior used: {round(prior_used/1e9, 2)} GB")
z = [*range(1000000000)]
list_size = sys.getsizeof(z)
print(f"List size: {round(list_size/1e9, 2)} GB")
#memory in use after generating list
post_used = psutil.virtual_memory().used
print(f"Post used: {round(post_used/1e9, 2)} GB")
difference = post_used - prior_used
print(f"Memory used by list: {round(difference/1e9, 2)} GB")
#clear the list
z = None
after_deleting = psutil.virtual_memory().used
print(f"Memory used after clearing the list: {round(after_deleting/1e9, 2)} GB")
# output:
# Prior used: 3.27 GB
# List size: 9.0 GB
# Post used: 43.87 GB
# Memory used by list: 40.6 GB
# Memory used after clearing the list: 3.28 GB

I will attempt to answer your question from a broader point of view. You're referring to two functions and comparing their outputs. Let's take a look at their documentation first:
- len():
Return the length (the number of items) of an object. The argument may be a sequence (such as a string, bytes, tuple, list, or range) or a collection (such as a dictionary, set, or frozen set).
So in the case of a string, you can expect len() to return the number of characters.
- sys.getsizeof():
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
So in the case of a string (as with many other objects), you can expect sys.getsizeof() to return the size of the object in bytes. There is no reason to think that it should be the same as the number of characters.
Let's have a look at some examples:
>>> first = "First"
>>> len(first)
5
>>> sys.getsizeof(first)
42
This example confirms that the size is not the same as the number of characters.
>>> second = "Second"
>>> len(second)
6
>>> sys.getsizeof(second)
43
We can notice that if we look at a string one character longer, its size is one byte bigger as well. We don't know if it's a coincidence or not though.
>>> together = first + second
>>> print(together)
FirstSecond
>>> len(together)
11
If we concatenate the two strings, their combined length is equal to the sum of their lengths, which makes sense.
>>> sys.getsizeof(together)
48
Contrary to what one might expect, though, the size of the combined string is not equal to the sum of their individual sizes. But it still seems to be the length plus something; in particular, something worth 37 bytes. Now you need to realize that it's 37 bytes in this particular case, using this particular Python implementation, etc. You should not rely on that at all. Still, we can take a look at why it's 37 bytes and what those bytes are (approximately) used for.
In CPython (probably the most widely used implementation of Python), string objects are implemented as PyStringObject. This is the C source code (I'm using the 2.7.9 version):
typedef struct {
    PyObject_VAR_HEAD
    long ob_shash;
    int ob_sstate;
    char ob_sval[1];

    /* Invariants:
     *     ob_sval contains space for 'ob_size+1' elements.
     *     ob_sval[ob_size] == 0.
     *     ob_shash is the hash of the string or -1 if not computed yet.
     *     ob_sstate != 0 iff the string object is in stringobject.c's
     *         'interned' dictionary; in this case the two references
     *         from 'interned' to this object are *not counted* in ob_refcnt.
     */
} PyStringObject;
You can see that there is something called PyObject_VAR_HEAD, one int, one long and a char array. The char array always contains one more character, to store the '\0' at the end of the string. This, along with the int, the long and PyObject_VAR_HEAD, takes up the additional 37 bytes. PyObject_VAR_HEAD is defined in another C source file and refers to other implementation-specific stuff; you need to explore further if you want to find out exactly where the 37 bytes are. Plus, the documentation mentions that sys.getsizeof()
adds an additional garbage collector overhead if the object is managed by the garbage collector.
Overall, you don't need to know exactly what the extra bytes (the 37 bytes here) are used for, but this answer should give you some idea of why the numbers differ and where to find more information should you really need it.
To quote the documentation:
Return the size of an object in bytes. The object can be any type of object. All built-in objects will return correct results, but this does not have to hold true for third-party extensions as it is implementation specific.
Built-in strings are not simple character sequences; they are full-fledged objects, with garbage collection overhead, which probably explains the size discrepancy you're noticing.
It's not clear from your question whether you want the compressed or the uncompressed size of the file, but in the former case it's easy with the os.path.getsize function from the os module:
>>> import os
>>> os.path.getsize('flickrapi-1.2.tar.gz')
35382L
To get the result in megabytes you can shift it right by 20, e.g.
os.path.getsize('large.tar.gz') >> 20
Note that the shift is done in integer arithmetic; if you want to preserve fractions of a megabyte, divide by (1024*1024.0) instead. (The .0 makes the divisor a float.)
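For example, reusing the same hypothetical archive:

>>> os.path.getsize('large.tar.gz') / (1024 * 1024.0)  # fractional megabytes, as a float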
Update: In the comments below, Johnsyweb points out a useful recipe for more generally producing human readable representations of file sizes.
To get file size in MB, I created this function:

import os

def get_size(path):
    size = os.path.getsize(path)
    if size < 1024:
        return f"{size} bytes"
    elif size < pow(1024, 2):
        return f"{round(size/1024, 2)} KB"
    elif size < pow(1024, 3):
        return f"{round(size/pow(1024, 2), 2)} MB"
    elif size < pow(1024, 4):
        return f"{round(size/pow(1024, 3), 2)} GB"
    else:
        return f"{round(size/pow(1024, 4), 2)} TB"

>>> get_size("k.tar.gz")
'1.4 MB'