Properties of a Python float can be requested via sys.float_info. It returns information such as the max/min value, the max/min exponent, etc. These properties can be used to calculate the byte size of a float. I have never encountered anything other than 64-bit floats, though, on many different architectures.
The items of a NumPy array may have different sizes, but you can check their size in bytes via a.itemsize, where a is a NumPy array.
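As a sketch of both ideas: the exponent width can be inferred from sys.float_info.max_exp (for a biased IEEE-754 exponent, max_exp == 2 ** (nexp - 1)), and NumPy reports the item size directly. This assumes an IEEE-754 platform where Python's float is a C double:

```python
import math
import sys

import numpy as np

fi = sys.float_info

# IEEE-754 layout: 1 sign bit + exponent bits + mantissa bits.
# mant_dig counts the implicit leading bit, so subtract 1 for storage.
exp_bits = int(math.log2(fi.max_exp)) + 1   # 11 for a C double
mant_bits = fi.mant_dig - 1                 # 52 for a C double
total_bytes = (1 + exp_bits + mant_bits) // 8
print(total_bytes)                          # 8 on IEEE-754 platforms

# NumPy reports the item size directly:
a = np.array([3.14])
print(a.itemsize)                           # 8 for the default float64 dtype
```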
numpy.finfo lists sizes and other attributes of float32 (and the other floating-point types), including
nexp : number of bits in the exponent, including its sign and bias.
nmant : number of bits in the mantissa.
On a machine with IEEE-754 standard floating point,
import numpy as np
for f in (np.float32, np.float64, float):
    finfo = np.finfo(f)
    print(finfo.dtype, finfo.nexp, finfo.nmant)
will print e.g.
float32 8 23
float64 11 52
float64 11 52
(Try float16 and float128 too.)
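As a sanity check, 1 sign bit + nexp + nmant should account for the dtype's full storage width; a small sketch (assuming NumPy is available):

```python
import numpy as np

# For each IEEE-754 type, sign (1) + exponent bits + mantissa bits
# should equal the total storage width in bits.
for f in (np.float16, np.float32, np.float64):
    fi = np.finfo(f)
    total_bits = 1 + fi.nexp + fi.nmant
    print(fi.dtype, total_bits, np.dtype(f).itemsize * 8)
```

This prints matching bit counts (16, 32, 64) in both columns.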
Hi, not sure if this belongs here or in the main python sub, but decided to try here first. Anyways, I'm working on a project where I have a large text file which essentially contains a roughly 22,000 x 6,000 matrix of decimal values. This file is >2GB in size, however it's all encoded as text, so the size of the data should become considerably smaller once I read them in and represent them as floats (they are mostly a single-digit integer part with a 14-digit fractional part, along with the decimal point itself, each character of which UTF-8 can represent as a single byte since it's in the ASCII range, so each value should use 16 bytes if my calculations are correct). I figure that Python represents floats as 8-byte double-precision values, as the documentation says CPython will use a C double for your platform to represent them. My googling seems to support this assumption. That would mean that my original matrix in the file should be about 16 * 22,000 * 6,000 ~= 2 GB (not including the whitespace separators), and the matrix in memory as floats should be 8 * 22,000 * 6,000 ~= 1 GB, which while not small, should fit comfortably in memory on my laptop. However, this is not the case and I'm getting an out-of-memory error as a result.
As a result, I did some investigation:
>>> x = 3.14
>>> sys.getsizeof(x)
24
What the hell, how can this be??? What could CPython possibly be doing with 24 bytes? All I can figure is that there's an 8-byte pointer to a 16-byte double (huge!) and getsizeof() is counting both? Note that this is 64-bit CPython running on Linux. I tried again on a 32-bit CPython running on Windows and the answer I got was 16 bytes, so I figure this means a 4-byte pointer to an 8-byte double? Somewhat frustrated with this result, I tried again, this time using numpy.float16, because these are relatively small values and I don't need all that space anyways. For the 64-bit Linux CPython I got 24 bytes again, and on the 32-bit Windows CPython I got 12 bytes. No idea at all what's happening here, because if my previous guesses were correct I should have 8- and 4-byte pointers to 2-byte float16s, giving 10 and 6 bytes, respectively. Clearly I'm missing something here. Can anyone explain what's happening, and why so much memory is being used to store these values?
Also, I'm aware that my use case is memory intensive and I'm working on refactoring so that I don't need to have everything in memory at once. However, I'm still extremely curious about this behavior as it seems strange to me. Thanks!
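One way around the per-object overhead described above is to parse the text straight into a NumPy array, which stores each float64 in a flat 8-byte slot with no 24-byte Python object per value. A minimal sketch, assuming whitespace-separated values and using a tiny in-memory stand-in for the big file:

```python
import io

import numpy as np

# Hypothetical miniature stand-in for the large text file described above.
text = "3.14159265358979 2.71828182845904\n1.41421356237309 1.73205080756887\n"

# np.loadtxt parses whitespace-separated text into one contiguous array,
# so a 22,000 x 6,000 matrix would take ~8 * 22,000 * 6,000 bytes (~1 GB),
# rather than one 24-byte float object (plus list overhead) per value.
a = np.loadtxt(io.StringIO(text))
print(a.shape)   # (2, 2)
print(a.nbytes)  # 32 bytes: 4 values * 8 bytes each
```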
Running
sys.getsizeof(float)
does not return the size of an individual float; it returns the size of the float class itself. That class object contains far more data than any single float, so the reported size is much larger.
If you just want to know the size of a single float, the easiest way is to simply instantiate some arbitrary float. For example:
sys.getsizeof(float())
Note that
float()
simply returns 0.0, so this is actually equivalent to:
sys.getsizeof(0.0)
This returns 24 bytes in your case (and probably for most other people as well). In CPython (the most common Python implementation), every float object contains a reference counter and a pointer to its type (a pointer to the float class), each of which takes 8 bytes on 64-bit CPython or 4 bytes on 32-bit CPython. The remaining bytes (24 - 8 - 8 = 8 in your case, which is very likely 64-bit CPython) are used for the actual float value itself.
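That breakdown can be checked empirically; a quick sketch, assuming 64-bit CPython, where the reference counter and type pointer are 8 bytes each:

```python
import struct
import sys

# On 64-bit CPython, a float is: 8-byte refcount + 8-byte type pointer
# + 8-byte C double.  Subtracting the double isolates the object overhead.
overhead = sys.getsizeof(0.0) - struct.calcsize("d")
print(sys.getsizeof(0.0))  # typically 24 on 64-bit CPython
print(overhead)            # typically 16: refcount + type pointer
```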
This is not guaranteed to work out the same way for other Python implementations though. The language reference says:
These represent machine-level double precision floating point numbers. You are at the mercy of the underlying machine architecture (and C or Java implementation) for the accepted range and handling of overflow. Python does not support single-precision floating point numbers; the savings in processor and memory usage that are usually the reason for using these are dwarfed by the overhead of using objects in Python, so there is no reason to complicate the language with two kinds of floating point numbers.
and I'm not aware of any runtime methods to accurately tell you the number of bytes used. However, note that the quote above from the language reference does say that Python only supports double precision floats, so in most cases (depending on how critical it is for you to always be 100% right) it should be comparable to double precision in C.
Since CPython floats are backed by a C double, you can also ask ctypes for the size of the underlying C type directly:
import ctypes
ctypes.sizeof(ctypes.c_double)  # 8 on virtually all modern platforms
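The same check works for the other C floating-point types, which is handy when reasoning about NumPy dtypes; note that the size of long double varies by platform (16 bytes on x86-64 Linux, 8 on MSVC):

```python
import ctypes

# Sizes of the C floating-point types CPython and NumPy build on.
for ctype in (ctypes.c_float, ctypes.c_double, ctypes.c_longdouble):
    print(ctype.__name__, ctypes.sizeof(ctype))
```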
For float, have a look at sys.float_info:
>>> import sys
>>> sys.float_info
sys.float_info(max=1.7976931348623157e+308, max_exp=1024, max_10_exp=308,
min=2.2250738585072014e-308, min_exp=-1021, min_10_exp=-307, dig=15, mant_dig=53,
epsilon=2.220446049250313e-16, radix=2, rounds=1)
Specifically, sys.float_info.max:
>>> sys.float_info.max
1.7976931348623157e+308
If that's not big enough, there's always positive infinity:
>>> infinity = float("inf")
>>> infinity
inf
>>> infinity / 10000
inf
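Arithmetic that exceeds sys.float_info.max also overflows to infinity rather than raising an error, which is worth knowing when checking range limits:

```python
import sys

# Arithmetic past the largest finite float overflows to inf.
big = sys.float_info.max
print(big * 2)             # inf
print(big + big)           # inf
print(float("inf") > big)  # True
```

(Some operations, such as `big ** 2`, raise OverflowError instead, so the behavior depends on the operation.)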
int has unlimited precision, so it is limited only by available memory.
sys.maxsize (previously sys.maxint) is not the largest integer supported by Python. In Python 3, sys.maxsize is the largest value of the platform's Py_ssize_t type, which is used for container sizes and indices; ints themselves can grow far beyond it.
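A quick demonstration that Python ints sail straight past sys.maxsize with no overflow:

```python
import sys

n = sys.maxsize + 1     # no overflow; Python ints have arbitrary precision
print(n > sys.maxsize)  # True
print(2 ** 100)         # 1267650600228229401496703205376
```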