The problem is that you never convert the type of the numpy array itself. You calculate a float32 value and store it as an entry in a float64 numpy array, and numpy then converts it back to float64.
Try something like this:
import numpy as np
a = np.zeros(4, dtype="float64")
print(a.dtype)
print(type(a[0]))
a = np.float32(a)  # converts the whole array; a.astype(np.float32) does the same here
print(a.dtype)
print(type(a[0]))
The output (tested with python 2.7)
float64
<type 'numpy.float64'>
float32
<type 'numpy.float32'>
In your case, a is the array tree.tree_.threshold.
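To make the point about element assignment concrete, here is a minimal sketch. The array and variable names are made up for illustration and are not taken from the question. It shows that writing a float32 value into a float64 array leaves the array's dtype unchanged, while astype returns a new array with the requested dtype.

```python
import numpy as np

a = np.zeros(4, dtype=np.float64)

# Storing a float32 scalar does not change the array's dtype:
# the value is upcast to float64 on assignment.
a[0] = np.float32(0.1)
stored_type = type(a[0])

# Converting the whole array produces a new float32 array;
# the original array keeps its float64 dtype.
b = a.astype(np.float32)

print(a.dtype, stored_type, b.dtype)
```

Note that the stored value keeps the float32 rounding error, so a[0] is not equal to the float64 value 0.1.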
Answer from Glostas on Stack Overflow
I tried hard but could not do this, because 'sklearn.tree._tree.Tree' objects are not writable.
This caused a precision issue when generating a PMML file, so I raised a bug there, and they provided an updated solution that avoids the internal conversion to float64.
For more info, you can follow this link: Precision Issue
The problem is not that you're failing to set a float64 dtype. The error message says:
Input contains NaN, infinity or a value too large for dtype('float32').
So try checking for those conditions first:
assert not np.any(np.isnan(x_arr))
assert np.all(np.isfinite(x_arr))
assert np.all(x_arr <= np.finfo('float32').max)
assert np.all(x_arr >= np.finfo('float32').min)
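The assertions above stop at the first failure. If you would rather see a count for each condition, something like the following sketch may help. The function name float32_problems and the sample input are hypothetical; only np.finfo, np.isnan, np.isinf and np.isfinite are real NumPy calls.

```python
import numpy as np

def float32_problems(x_arr):
    """Count entries that cannot be represented as finite float32 values."""
    x = np.asarray(x_arr, dtype=np.float64)
    info = np.finfo(np.float32)
    finite = np.isfinite(x)
    return {
        "nan": int(np.isnan(x).sum()),
        "inf": int(np.isinf(x).sum()),
        # finite in float64, but outside the float32 range
        "too_large": int((finite & (np.abs(x) > info.max)).sum()),
    }

report = float32_problems([1.0, np.nan, np.inf, 1e39])
print(report)
```

A report with all counts at zero means the data should pass the float32 check quoted in the error message.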
I got here because of the question in the title which I still feel remains unanswered.
To convert float32 to float64 in a numpy.ndarray object:
array32 = np.ndarray(shape=(2,2), dtype=np.float32, order='F')
print("Type of an object of 'array32': " + str(type(array32[0][0])))
# Convert to float64
array64 = array32.astype(np.float64)
print("Type of an object of 'array64': " + str(type(array64[0][0])))
# Convert back to float32
array32again = array64.astype(np.float32)
print("Type of an object of 'array32again': " + str(type(array32again[0][0])))
Will give you:
Type of an object of 'array32': <class 'numpy.float32'>
Type of an object of 'array64': <class 'numpy.float64'>
Type of an object of 'array32again': <class 'numpy.float32'>
Yes, when you use Python's native float to specify the dtype for an array, numpy converts it to float64. As given in the documentation:
Note that, above, we use the Python float object as a dtype. NumPy knows that int refers to np.int_, bool means np.bool_, that float is np.float_ and complex is np.complex_. The other data-types do not have Python equivalents.
And -
float_ - Shorthand for float64.
This is why, even though you use float to convert the whole array, it still uses np.float64.
As required by the other question, the best solution is to convert each scalar value to a normal float object after extracting it:
float(new_array[0])
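A short sketch of both points, with a made-up array for illustration: dtype=float resolves to float64, and float() on an element gives a plain Python float.

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=float)  # Python float as dtype

elem = arr[0]        # a numpy.float64 scalar
first = float(elem)  # a plain Python float

print(arr.dtype, type(elem), type(first))
```

One thing to keep in mind: numpy.float64 is a subclass of Python's float, so isinstance(arr[0], float) is True even before the conversion. Compare with type(...) is float if you need to tell the two apart.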
Another solution I could think of is to create a subclass of float and use that for casting (though to me it looks bad). I would prefer the previous solution over this if possible. Example -
In [20]: import numpy as np
In [21]: na = np.array([1., 2., 3.])
In [22]: na = np.array([1., 2., 3., np.inf, np.inf])
In [23]: type(na[-1])
Out[23]: numpy.float64
In [24]: na[-1] - na[-2]
C:\Anaconda3\Scripts\ipython-script.py:1: RuntimeWarning: invalid value encountered in double_scalars
if __name__ == '__main__':
Out[24]: nan
In [25]: class x(float):
   ....:     pass
   ....:
In [26]: na_new = na.astype(x)
In [28]: type(na_new[-1])
Out[28]: float #No idea why its showing float, I would have thought it would show '__main__.x' .
In [29]: na_new[-1] - na_new[-2]
Out[29]: nan
In [30]: na_new
Out[30]: array([1.0, 2.0, 3.0, inf, inf], dtype=object)
You can create an anonymous type float like this
>>> new_array = my_array.astype(type('float', (float,), {}))
>>> type(new_array[0])
<type 'float'>
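Both subclass tricks above end up with an array of dtype=object, as the Out[30] line shows. Assuming that is acceptable, the same effect can probably be had without defining a class by casting to object directly. This is a sketch of that alternative, not what the answers above used:

```python
import numpy as np

na = np.array([1.0, 2.0, 3.0, np.inf, np.inf])

# Casting to object boxes every element as a plain Python float.
na_obj = na.astype(object)
elem_type = type(na_obj[-1])

print(na_obj.dtype, elem_type)
```

The trade-off is that object arrays lose NumPy's fast vectorized arithmetic, so this is best kept for small arrays or for the final hand-off to code that insists on native types.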
Use val.item() to convert most NumPy values to a native Python type:
import numpy as np
# for example, numpy.float32 -> python float
val = np.float32(0)
pyval = val.item()
print(type(pyval)) # <class 'float'>
# and similar...
type(np.float64(0).item()) # <class 'float'>
type(np.uint32(0).item()) # <class 'int'>
type(np.int16(0).item()) # <class 'int'>
type(np.complex128(0).item()) # <class 'complex'>
type(np.datetime64(0, 'D').item()) # <class 'datetime.date'>
type(np.datetime64('2001-01-01 00:00:00').item()) # <class 'datetime.datetime'>
type(np.timedelta64(0, 'D').item()) # <class 'datetime.timedelta'>
...
(A related method np.asscalar(val) was deprecated with 1.16, and removed with 1.23).
For the curious, to build a table of conversions of NumPy array scalars for your system:
for name in dir(np):
    obj = getattr(np, name)
    if hasattr(obj, 'dtype'):
        try:
            # datetime64/timedelta64 need a unit argument
            if 'time' in name:
                npn = obj(0, 'D')
            else:
                npn = obj(0)
            nat = npn.item()
            print('{0} ({1!r}) -> {2}'.format(name, npn.dtype.char, type(nat)))
        except Exception:
            # skip names that cannot be constructed from 0
            pass
There are a few NumPy types that have no native Python equivalent on some systems, including: clongdouble, clongfloat, complex192, complex256, float128, longcomplex, longdouble and longfloat. These need to be converted to their nearest NumPy equivalent before using .item().
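For the extended-precision case, a minimal sketch of the "convert to the nearest equivalent first" step might look like this. The variable names are illustrative, and the choice of float64 as the nearest type is an assumption that fits longdouble; a complex type would go to complex128 instead.

```python
import numpy as np

wide = np.longdouble(1.5)

# Narrow to float64 first (this may drop precision the wider type carried),
# then .item() yields a plain Python float.
narrowed = np.float64(wide)
native = narrowed.item()

print(type(native), native)
```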
If you want to convert a value (numpy array, numpy scalar, or an already native type) to a native type, you can simply do:
converted_value = getattr(value, "tolist", lambda: value)()
tolist will convert your scalar or array to python native type. The default lambda function takes care of the case where value is already native.
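Wrapped in a function, the one-liner can be tried on each kind of input. The name to_native is made up for this sketch:

```python
import numpy as np

def to_native(value):
    # Use value.tolist() when it exists (numpy arrays and scalars);
    # otherwise fall back to returning the value unchanged.
    return getattr(value, "tolist", lambda: value)()

from_scalar = to_native(np.float32(1.5))       # Python float
from_array = to_native(np.array([1, 2, 3]))    # list of Python ints
from_native = to_native(7)                     # returned unchanged

print(type(from_scalar), from_array, from_native)
```

Be aware that a plain Python list has no tolist attribute, so a list that contains numpy.float64 objects is returned as it is, with its elements unconverted.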
Will numpy.float32 help?
>>> import numpy
>>> PI = 3.1415926535897
>>> print PI*PI
9.86960440109
>>> PI32=numpy.float32(PI)
>>> print PI32*PI32
9.86961
If you want to do math in float32, converting the operands to float32 may help.
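Here is the same experiment as a script that checks the result types rather than relying on printed digits, which vary between Python and NumPy versions. The names sq64, sq32 and diff are illustrative:

```python
import numpy as np

PI = 3.1415926535897
PI32 = np.float32(PI)

sq64 = PI * PI        # Python float arithmetic (double precision)
sq32 = PI32 * PI32    # both operands are float32, so the result is float32

# The single-precision result differs from the double one only slightly.
diff = abs(float(sq32) - sq64)

print(type(sq64), type(sq32), diff)
```

Both operands are float32 here on purpose. What happens when you mix a float32 scalar with a Python float depends on the NumPy version, so converting both sides explicitly is the safer habit.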
Use numpy.ndarray.astype:
import numpy as np
arr_f64 = np.array([1.0000123456789, 2.0000123456789, 3.0000123456789], dtype=np.float64)
arr_f32 = arr_f64.astype(np.float32)
Pay attention to precision:
np.set_printoptions(precision=16)
print("arr_f64 = ", arr_f64)
print("arr_f32 = ", arr_f32)
gives
arr_f64 = [1.0000123456789 2.0000123456789 3.0000123456789]
arr_f32 = [1.0000124000000 2.0000124000000 3.0000124000000]
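If you want a number for how much precision to expect instead of eyeballing the printout, np.finfo reports the approximate count of reliable decimal digits per type. A small sketch reusing the arrays from above (the names digits32, digits64 and round_trip_error are illustrative):

```python
import numpy as np

arr_f64 = np.array([1.0000123456789, 2.0000123456789, 3.0000123456789],
                   dtype=np.float64)
arr_f32 = arr_f64.astype(np.float32)

# Approximate number of decimal digits each type can represent reliably.
digits32 = np.finfo(np.float32).precision
digits64 = np.finfo(np.float64).precision

# Largest error introduced by the float64 -> float32 conversion.
round_trip_error = np.abs(arr_f32.astype(np.float64) - arr_f64).max()

print(digits32, digits64, round_trip_error)
```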
The tolist() method should do what you want. If you have a numpy array, just call tolist():
In [17]: a
Out[17]:
array([ 0. , 0.14285714, 0.28571429, 0.42857143, 0.57142857,
0.71428571, 0.85714286, 1. , 1.14285714, 1.28571429,
1.42857143, 1.57142857, 1.71428571, 1.85714286, 2. ])
In [18]: a.dtype
Out[18]: dtype('float64')
In [19]: b = a.tolist()
In [20]: b
Out[20]:
[0.0,
0.14285714285714285,
0.2857142857142857,
0.42857142857142855,
0.5714285714285714,
0.7142857142857142,
0.8571428571428571,
1.0,
1.1428571428571428,
1.2857142857142856,
1.4285714285714284,
1.5714285714285714,
1.7142857142857142,
1.857142857142857,
2.0]
In [21]: type(b)
Out[21]: list
In [22]: type(b[0])
Out[22]: float
If, in fact, you really have a Python list of numpy.float64 objects, then @Alexander's answer is great, or you could convert the list to an array and then use the tolist() method. E.g.
In [46]: c
Out[46]:
[0.0,
0.33333333333333331,
0.66666666666666663,
1.0,
1.3333333333333333,
1.6666666666666665,
2.0]
In [47]: type(c)
Out[47]: list
In [48]: type(c[0])
Out[48]: numpy.float64
@Alexander's suggestion, a list comprehension:
In [49]: [float(v) for v in c]
Out[49]:
[0.0,
0.3333333333333333,
0.6666666666666666,
1.0,
1.3333333333333333,
1.6666666666666665,
2.0]
Or, convert to an array and then use the tolist() method.
In [50]: np.array(c).tolist()
Out[50]:
[0.0,
0.3333333333333333,
0.6666666666666666,
1.0,
1.3333333333333333,
1.6666666666666665,
2.0]
If you are concerned with the speed, here's a comparison. The input, x, is a python list of numpy.float64 objects:
In [8]: type(x)
Out[8]: list
In [9]: len(x)
Out[9]: 1000
In [10]: type(x[0])
Out[10]: numpy.float64
Timing for the list comprehension:
In [11]: %timeit list1 = [float(v) for v in x]
10000 loops, best of 3: 109 µs per loop
Timing for conversion to numpy array and then tolist():
In [12]: %timeit list2 = np.array(x).tolist()
10000 loops, best of 3: 70.5 µs per loop
So it is faster to convert the list to an array and then call tolist().
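The timings above come from one machine and an older setup, so it may be worth re-running the comparison yourself. Outside IPython, a sketch with the standard timeit module could look like this. The input built with np.linspace is an assumption standing in for the original x:

```python
import timeit
import numpy as np

# A Python list of 1000 numpy.float64 objects.
x = list(np.linspace(0.0, 2.0, 1000))

list1 = [float(v) for v in x]
list2 = np.array(x).tolist()

t_comprehension = timeit.timeit(lambda: [float(v) for v in x], number=200)
t_tolist = timeit.timeit(lambda: np.array(x).tolist(), number=200)

print("list comprehension:", t_comprehension)
print("array + tolist():  ", t_tolist)
```

Both approaches produce the same list of Python floats, so the choice between them comes down to whichever is faster on your system.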
You could use a list comprehension:
floats = [float(np_float) for np_float in np_float_list]