The short answer
You're getting the size of the class itself, not of an instance of the class. Call int() to create an instance and measure that:
>>> sys.getsizeof(int())
28
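To see the difference concretely, you can compare the class object with an instance. The exact numbers vary by CPython version and platform, so treat them as illustrative:

```python
import sys

# The type object itself is a large C struct full of slots and method tables,
# so it is far bigger than any small instance.
print(sys.getsizeof(int))

# An instance only carries the object header plus its digit array.
print(sys.getsizeof(int()))
```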
If that size still seems a little bit large, remember that a Python int is very different from an int in (for example) C. In Python, an int is a fully-fledged object. This means there's extra overhead.
Every Python object contains at least a refcount and a reference to the object's type in addition to other storage; on a 64-bit machine, just those two things alone take up 16 bytes! The int internals (as determined by the standard CPython implementation) have also changed over time, so that the amount of additional storage taken depends on your version.
int objects in CPython 3.11
Integer objects are internally PyLongObject C types representing blocks of memory. The code that defines this type is spread across multiple files. Here are the relevant parts:
typedef struct _longobject PyLongObject;

struct _longobject {
    PyObject_VAR_HEAD
    digit ob_digit[1];
};

#define PyObject_VAR_HEAD PyVarObject ob_base;

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size; /* Number of items in variable part */
} PyVarObject;

typedef struct _object PyObject;

struct _object {
    _PyObject_HEAD_EXTRA
    union {
        Py_ssize_t ob_refcnt;
#if SIZEOF_VOID_P > 4
        PY_UINT32_T ob_refcnt_split[2];
#endif
    };
    PyTypeObject *ob_type;
};

/* _PyObject_HEAD_EXTRA expands to nothing on non-debug builds */
#define _PyObject_HEAD_EXTRA

typedef uint32_t digit;
If we expand all the macros and replace all the typedef statements, this is the struct we end up with:
struct PyLongObject {
    Py_ssize_t ob_refcnt;
    PyTypeObject *ob_type;
    Py_ssize_t ob_size;   /* Number of items in variable part */
    uint32_t ob_digit[1];
};
uint32_t means "unsigned 32-bit integer", and uint32_t ob_digit[1]; declares an array of 32-bit integers that holds the (absolute) value of the integer. The "1" in "ob_digit[1]" declares space for a single digit; CPython allocates extra room at the end of the struct for larger integers (the pre-C99 flexible-array idiom).
So we have the following bytes to store an integer object in Python (on a 64-bit system):
- 8 bytes (64 bits, Py_ssize_t, signed) for ob_refcnt - the reference count
- 8 bytes (64 bits, PyTypeObject*) for ob_type - the pointer to the int class itself
- 8 bytes (64 bits, Py_ssize_t, signed) for ob_size - which stores how many 32-bit digits are used to store the integer

and finally a variable-length array (with at least 1 element) of

- 4 bytes (32 bits) to store each part of the integer
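Adding these fields up for a small integer reproduces the value getsizeof reports. This sketch assumes a 64-bit CPython 3.11 build; other builds will differ:

```python
import sys

# 8 (ob_refcnt) + 8 (ob_type) + 8 (ob_size) + 4 (one 32-bit digit)
header_and_one_digit = 8 + 8 + 8 + 4
print(header_and_one_digit)

# Should match for a one-digit int on a 64-bit CPython 3.11 build.
print(sys.getsizeof(1))
```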
The comment that accompanies this definition summarizes Python 3.11's representation of integers. Zero is represented by an object with ob_size zero (the struct still reserves space for one digit, though, which is why sys.getsizeof(0) is 28 rather than 24). Negative numbers are represented by objects with a negative ob_size. The comment further explains that only 30 bits of each uint32_t digit are used for storing the value.
>>> sys.getsizeof(0)
28
>>> sys.getsizeof(1)
28
>>> sys.getsizeof(2 ** 30 - 1)
28
>>> sys.getsizeof(2 ** 30)
32
>>> sys.getsizeof(2 ** 60 - 1)
32
>>> sys.getsizeof(2 ** 60)
36
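Because the sign lives in ob_size rather than in the digit array, negating a number does not change its reported size. A quick sketch to check this on your own build:

```python
import sys

for n in (1, 2 ** 30, 2 ** 60):
    # The sign is encoded in ob_size, so +n and -n occupy the same memory.
    assert sys.getsizeof(n) == sys.getsizeof(-n)
    print(n, sys.getsizeof(n))
```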
On CPython 3.10 and older, sys.getsizeof(0) incorrectly returned 24 instead of 28; this was a bug that has since been fixed. Python 2 had a second, separate integer type (long) that worked a bit differently, but along generally similar lines.
You will get slightly different results on a 32-bit system.
Answer from senderle on Stack Overflow

Going through the docs I didn't find a definite answer.
If I do
x = 234234
sys.getsizeof(x)
It returns 28, so is that 224 bits?
Apparently getsizeof() returns the size of an object in bytes; is this size consistent across all ints in Python, though?
I know that there are long ints, and that they are converted automatically to long ints at a certain value in python, but I'm just wondering about ints for now.
In Java the size that I got was 32 bit, so 4 bytes.
Is this correct, that:

python int = 224 bits
java int = 32 bits
And is this because the python int has more methods etc?
Basically what I was trying to state was that across programming languages the data type sizes were fairly conventional and that an int in C# would be the same size as an int in Python / Java. I think this might be wrong though.
Thanks!
In Python, integers have arbitrary magnitude. You need not worry about how many bits are used in memory to store the integer; integers can be arbitrarily large (or large negative).
In this case, the 224 bits you are seeing are not all used to represent the integer; a lot of that is overhead of representing a Python object (as opposed to a native value in C-like languages).
EDIT: See also: https://docs.python.org/3/library/stdtypes.html#typesnumeric
is this because the python int has more methods etc?
Almost but not quite.
Internally, all python objects contain at least a reference count used by the garbage collector and a pointer to a [statically-allocated] type object (which is essentially an array of function pointers to the methods available on objects of that type). Objects with variable size (like lists, dicts, and python 3's variable-precision ints) also contain a size value.
So that's 3 values stored in the "header" of each python object. On a 32-bit platform these 3 values will most likely take 3*32 bits, and on a 64-bit platform they will take 3*64 bits. So any variable-size python object with zero size is going to take at least 24 bytes on a 64-bit machine.
Beyond that, the value of a python long integer is actually stored in an array of 15- or 30-bit chunks (depending on your build). On my 64-bit build, you can see the size of integers increase steadily as I increase their size by increments of 30 bits.
>>> [sys.getsizeof(2**(n*30) - 1) for n in range(100)]
[24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76, 80, 84, 88, 92, 96,
100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140, 144, 148, 152, 156,
160, 164, 168, 172, 176, 180, 184, 188, 192, 196, 200, 204, 208, 212, 216,
220, 224, 228, 232, 236, 240, 244, 248, 252, 256, 260, 264, 268, 272, 276,
280, 284, 288, 292, 296, 300, 304, 308, 312, 316, 320, 324, 328, 332, 336,
340, 344, 348, 352, 356, 360, 364, 368, 372, 376, 380, 384, 388, 392, 396,
400, 404, 408, 412, 416, 420]
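Rather than inferring the chunk size from getsizeof deltas, you can ask the interpreter directly: sys.int_info (available since Python 3.1) reports the digit width for your build.

```python
import sys

info = sys.int_info
print(info.bits_per_digit)  # 30 on typical 64-bit builds, 15 on some 32-bit ones
print(info.sizeof_digit)    # bytes of storage per digit (4 or 2)
```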
And there you have it, demystified I hope.
What is the size of an int in Python? - Stack Overflow
integer - How does Python manage int and long? - Stack Overflow
Get size in Bytes needed for an integer in Python - Stack Overflow
How much memory do int's take up in python?
int and long were "unified" a few versions back. Before that it was possible to overflow an int through math ops.
3.x has further advanced this by eliminating long altogether and only having int.
- Python 2: sys.maxint contains the maximum value a Python int can hold.
  - On a 64-bit Python 2.7, the size is 24 bytes. Check with sys.getsizeof().
- Python 3: sys.maxsize contains the maximum size in bytes a Python int can be.
  - This will be gigabytes in 32 bits, and exabytes in 64 bits.
  - Such a large int would have a value similar to 8 to the power of sys.maxsize.
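A quick way to inspect this limit on your own interpreter (on a typical 64-bit build, sys.maxsize is 2**63 - 1):

```python
import sys

# sys.maxsize is the largest value a Py_ssize_t can hold.
print(sys.maxsize)

# ints themselves are unbounded, so exceeding sys.maxsize is perfectly fine.
print(sys.maxsize + 1 > sys.maxsize)
```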
This PEP should help.
Bottom line is that you really shouldn't have to worry about it in python versions > 2.4
def byte_length(i):
    # Round the bit length up to the nearest whole byte.
    return (i.bit_length() + 7) // 8
Of course, as Jon Clements points out, this isn't the size of the actual PyIntObject, which has a PyObject header, stores the value as a bignum in whatever way is easiest to deal with rather than most compact, and has to be reached through at least one pointer (4 or 8 bytes) on top of the actual object, and so on.
But this is the byte length of the number itself. It's almost certainly the most efficient answer, and probably also the easiest to read.
Or is ceil(i.bit_length() / 8.0) more readable?
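A quick sanity check of byte_length (redefined here so the snippet is self-contained), including the edge case of zero:

```python
def byte_length(i):
    # Round the bit length up to the nearest whole byte.
    return (i.bit_length() + 7) // 8

print(byte_length(0))       # 0 -- zero has a bit_length of 0
print(byte_length(255))     # 1 -- fits in 8 bits
print(byte_length(256))     # 2 -- needs 9 bits
print(byte_length(234234))  # 3 -- needs 18 bits
```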
Unless you're dealing with an array.array or a numpy.array - the size always has object overhead. And since Python deals with BigInts naturally, it's really, really hard to tell...
>>> i = 5
>>> import sys
>>> sys.getsizeof(i)
24
So on a 64-bit platform it requires 24 bytes to store what could be stored in 3 bits.
However, if you did,
>>> s = '\x05'
>>> sys.getsizeof(s)
38
So no, not really: you've got the memory overhead of the object itself rather than raw storage...
If you then take:
>>> a = array.array('i', [3])
>>> a
array('i', [3])
>>> sys.getsizeof(a)
60L
>>> a = array.array('i', [3, 4, 5])
>>> sys.getsizeof(a)
68L
Then you get what would be called normal byte boundaries, etc.. etc... etc...
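The per-element cost in an array.array is its itemsize; the object-header overhead is paid once, not per element. A small sketch (the itemsize of 'i' is platform-dependent, commonly 4):

```python
import array
import sys

empty = array.array('i')
three = array.array('i', [3, 4, 5])

print(empty.itemsize)        # bytes per element, typically 4 for 'i'
print(sys.getsizeof(empty))  # fixed object overhead, paid once
# The difference is roughly itemsize * number of elements.
print(sys.getsizeof(three) - sys.getsizeof(empty))
```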
If you just want what "purely" should be stored, minus object overhead, then from Python 2.6/2.7 onwards you can use some_int.bit_length() (otherwise just bit-shift it as other answers have shown) and work from there.