a = np.array([0.123456789121212,2,3], dtype=np.float16)
print("16bit: ", a[0])
a = np.array([0.123456789121212,2,3], dtype=np.float32)
print("32bit: ", a[0])
b = np.array([0.123456789121212121212,2,3], dtype=np.float64)
print("64bit: ", b[0])
- 16bit: 0.1235
- 32bit: 0.12345679
- 64bit: 0.12345678912121212
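The digit counts above can be read straight out of NumPy. As a quick check, `np.finfo` reports how many decimal digits each type holds reliably:

```python
import numpy as np

# np.finfo reports, per float type, the number of decimal digits
# it can represent reliably ("precision") and its size in bits.
for dtype in (np.float16, np.float32, np.float64):
    info = np.finfo(dtype)
    print(dtype.__name__, "->", info.precision, "decimal digits,",
          info.bits, "bits")
# float16 -> 3 decimal digits, 16 bits
# float32 -> 6 decimal digits, 32 bits
# float64 -> 15 decimal digits, 64 bits
```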
float32 is a 32-bit number; float64 uses 64 bits.
That means a float64 takes up twice as much memory, and operations on it may be considerably slower on some machine architectures.
However, float64 can represent numbers much more accurately than a 32-bit float.
It also allows much larger numbers to be stored.
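Both claims (accuracy and range) can be checked with `np.finfo`; the range difference in particular is dramatic:

```python
import numpy as np

# Largest finite value each type can hold:
print(np.finfo(np.float32).max)  # about 3.4e38
print(np.finfo(np.float64).max)  # about 1.8e308
```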
For your Python/NumPy project, you presumably know your input variables and their nature.
To make a decision, we as programmers need to ask ourselves:
- What precision does my output need?
- Does speed matter at all?
- What precision is needed, in parts per million?
A naive example: suppose I store weather data for my city as [12.3, 14.5, 11.1, 9.9, 12.2, 8.2].
The next day's predicted output could be 11.5 or 11.5164374.
Do you think storing these as float64 rather than float32 would be necessary?
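To make that trade-off concrete, a small sketch (using the hypothetical temperature list from above):

```python
import numpy as np

# Hypothetical daily temperatures from the example above.
temps = [12.3, 14.5, 11.1, 9.9, 12.2, 8.2]

a32 = np.array(temps, dtype=np.float32)
a64 = np.array(temps, dtype=np.float64)

# float64 uses exactly twice the memory...
print(a32.nbytes, a64.nbytes)  # 24 48

# ...but for data measured to one decimal place, float32's
# ~6-7 significant digits are already far more than enough:
print(np.allclose(a32, a64, atol=1e-4))  # True
```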
I know JavaScript has problems when numbers get big; the usual symptom is that trailing digits are truncated to zero. I wondered what that looks like in Go, so I wrote a program:
see https://go.dev/play/p/2rbKFNiupQ_6
package main

import "fmt"

func main() {
    var v1 float64 = 1876219900889841660
    var v2 float64 = 1876219900889841661
    var v3 float64 = 1876219900889841662
    var v4 float64 = 1876219900889841663
    var v5 float64 = 1876219900889841664
    var v6 float64 = 1876219900889841665
    var v7 float64 = 1876219900889841666
    var v8 float64 = 1876219900889841667
    var v9 float64 = 1876219900889841668
    fmt.Printf("v1==v2: %v\n", v1 == v2) // true
    fmt.Printf("v2==v3: %v\n", v2 == v3) // true
    fmt.Printf("v3==v4: %v\n", v3 == v4) // true
    fmt.Printf("v4==v5: %v\n", v4 == v5) // true
    fmt.Printf("v5==v6: %v\n", v5 == v6) // true
    fmt.Printf("v6==v7: %v\n", v6 == v7) // true
    fmt.Printf("v7==v8: %v\n", v7 == v8) // true
    fmt.Printf("v8==v9: %v\n", v8 == v9) // true
    fmt.Printf("int64(v4): %d\n", int64(v4))   // 1876219900889841664
    fmt.Printf("int64(v9): %d\n", int64(v9))   // 1876219900889841664
    fmt.Printf("float64(v9): %.0f\n", v9)      // 1876219900889841664
}

Why are all these float64 numbers printed as 1876219900889841664? In JavaScript the same value prints as 1876219900889841700. Can anyone give an explanation, please? Thanks.
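This is standard IEEE 754 behaviour rather than anything Go-specific, and a quick Python sketch can illustrate it. A float64 has a 53-bit significand, so above 2**53 it can no longer represent every integer; near 1.9e18 adjacent doubles are 256 apart, and every literal in the program rounds to the same representable value:

```python
import math

# 1876219900889841660..1876219900889841668 all round to the
# nearest representable double, which happens to be ...664:
print(int(float(1876219900889841661)))  # 1876219900889841664

# The gap between adjacent doubles at this magnitude is 256,
# so nine consecutive integers cannot be told apart:
print(math.ulp(1876219900889841664.0))  # 256.0
```

JavaScript stores exactly the same double; it prints ...700 only because its number-to-string algorithm outputs the shortest decimal that round-trips back to that double, not the exact value.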
>>> numpy.float64(5.9975).hex()
'0x1.7fd70a3d70a3dp+2'
>>> (5.9975).hex()
'0x1.7fd70a3d70a3dp+2'
They are the same number. What differs is the textual representation obtained via their __repr__ methods: the native Python type outputs the minimal number of digits needed to uniquely distinguish the value, while NumPy (before version 1.14.0, released in 2018) didn't try to minimise the number of digits output.
NumPy's float64 dtype inherits from Python's float, which is implemented as a C double internally. You can verify that as follows:
isinstance(np.float64(5.9975), float) # True
So even if their string representation is different, the values they store are the same.
On the other hand, np.float32 wraps a C float (which has no analog in pure Python), and no NumPy integer dtype (np.int32, np.int64, etc.) inherits from Python's int, because in Python 3 int is unbounded:
isinstance(np.float32(5.9975), float) # False
isinstance(np.int32(1), int) # False
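A minimal illustration of why no fixed-width NumPy integer dtype can subclass Python's int:

```python
import numpy as np

# Python 3's int is arbitrary-precision: this is computed exactly.
print(2**100)  # 1267650600228229401496703205376

# np.int64 is a fixed 64-bit type with hard bounds:
print(np.iinfo(np.int64).max)  # 9223372036854775807 == 2**63 - 1
```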
So why define np.float64 at all?
np.float64 provides most of the attributes and methods of np.ndarray. From the following code, you can see that np.float64 implements all but four of np.ndarray's public methods and attributes:
[m for m in set(dir(np.array([]))) - set(dir(np.float64())) if not m.startswith("_")]
# ['argpartition', 'ctypes', 'partition', 'dot']
So if you have a function that expects to use ndarray methods, you can pass it an np.float64, whereas a plain float won't work.
For example:
def my_cool_function(x):
    return x.sum()
my_cool_function(np.array([1.5, 2])) # <--- OK
my_cool_function(np.float64(5.9975)) # <--- OK
my_cool_function(5.9975) # <--- AttributeError