You can convert most of the columns by just calling convert_objects:
CopyIn [36]:
df = df.convert_objects(convert_numeric=True)
df.dtypes
Out[36]:
Date object
WD int64
Manpower float64
2nd object
CTR object
2ndU float64
T1 int64
T2 int64
T3 int64
T4 float64
dtype: object
For column '2nd' and 'CTR' we can call the vectorised str methods to replace the thousands separator and remove the '%' sign and then astype to convert:
CopyIn [39]:
df['2nd'] = df['2nd'].str.replace(',','').astype(int)
df['CTR'] = df['CTR'].str.replace('%','').astype(np.float64)
df.dtypes
Out[39]:
Date object
WD int64
Manpower float64
2nd int32
CTR float64
2ndU float64
T1 int64
T2 int64
T3 int64
T4 object
dtype: object
In [40]:
df.head()
Out[40]:
Date WD Manpower 2nd CTR 2ndU T1 T2 T3 T4
0 2013/4/6 6 NaN 2645 5.27 0.29 407 533 454 368
1 2013/4/7 7 NaN 2118 5.89 0.31 257 659 583 369
2 2013/4/13 6 NaN 2470 5.38 0.29 354 531 473 383
3 2013/4/14 7 NaN 2033 6.77 0.37 396 748 681 458
4 2013/4/20 6 NaN 2690 5.38 0.29 361 528 541 381
Or you can do the string handling operations above without the call to astype and then call convert_objects to convert everything in one go.
UPDATE
Since version 0.17.0 convert_objects is deprecated and there isn't a top-level function to do this so you need to do:
df.apply(lambda col:pd.to_numeric(col, errors='coerce'))
See the docs and this related question: pandas: to_numeric for multiple columns
Answer from EdChum on Stack OverflowYou can convert most of the columns by just calling convert_objects:
CopyIn [36]:
df = df.convert_objects(convert_numeric=True)
df.dtypes
Out[36]:
Date object
WD int64
Manpower float64
2nd object
CTR object
2ndU float64
T1 int64
T2 int64
T3 int64
T4 float64
dtype: object
For column '2nd' and 'CTR' we can call the vectorised str methods to replace the thousands separator and remove the '%' sign and then astype to convert:
CopyIn [39]:
df['2nd'] = df['2nd'].str.replace(',','').astype(int)
df['CTR'] = df['CTR'].str.replace('%','').astype(np.float64)
df.dtypes
Out[39]:
Date object
WD int64
Manpower float64
2nd int32
CTR float64
2ndU float64
T1 int64
T2 int64
T3 int64
T4 object
dtype: object
In [40]:
df.head()
Out[40]:
Date WD Manpower 2nd CTR 2ndU T1 T2 T3 T4
0 2013/4/6 6 NaN 2645 5.27 0.29 407 533 454 368
1 2013/4/7 7 NaN 2118 5.89 0.31 257 659 583 369
2 2013/4/13 6 NaN 2470 5.38 0.29 354 531 473 383
3 2013/4/14 7 NaN 2033 6.77 0.37 396 748 681 458
4 2013/4/20 6 NaN 2690 5.38 0.29 361 528 541 381
Or you can do the string handling operations above without the call to astype and then call convert_objects to convert everything in one go.
UPDATE
Since version 0.17.0 convert_objects is deprecated and there isn't a top-level function to do this so you need to do:
df.apply(lambda col:pd.to_numeric(col, errors='coerce'))
See the docs and this related question: pandas: to_numeric for multiple columns
convert_objects is deprecated.
For pandas >= 0.17.0, use pd.to_numeric
Copydf["2nd"] = pd.to_numeric(df["2nd"])
python - How to transform the type of a column from object to float64? - Stack Overflow
How do I convert a object to a float.
python - How to convert object data type to Float64 and Int64 after updating the Pandas for interpolation for NAN values - Stack Overflow
python - Pandas convert data type from object to float - Stack Overflow
Hi guys, im struggeling with my dataframe.
Im trying to get a percentage of 2 columns, but when I divide the 2 I get this error: unsupported operand type(s) for /: 'float' and 'str'
The dtypes of the columns are an 'object'
Im aware that I have empty data in my columns, but is that an issue?
Would appreciate your help!
1. Using pandas.Series.astype
You can use
pandas.Series.astypeYou can do something like this :
weather["Temp"] = weather.Temp.astype(float)
2. Using pd.to_numeric
You can also use
pd.to_numericthat will convert the column from object to floatFor details on how to use it checkout this link :http://pandas.pydata.org/pandas-docs/version/0.20/generated/pandas.to_numeric.html
Example :
s = pd.Series(['apple', '1.0', '2', -3]) print(pd.to_numeric(s, errors='ignore')) print("=========================") print(pd.to_numeric(s, errors='coerce'))Output:
0 apple 1 1.0 2 2 3 -3 ========================= dtype: object 0 NaN 1 1.0 2 2.0 3 -3.0 dtype: float64In your case you can do something like this:
weather["Temp"] = pd.to_numeric(weather.Temp, errors='coerce')
3. Using convert_objects
Other option is to use
convert_objectsExample is as follows
>> pd.Series([1,2,3,4,'.']).convert_objects(convert_numeric=True) 0 1 1 2 2 3 3 4 4 NaN dtype: float64You can use this as follows:
weather["Temp"] = weather.Temp.convert_objects(convert_numeric=True)I have showed you examples because if any of your column won't have a number then it will be converted to
NaN... so be careful while using it.
I tried all methods suggested here but sadly none worked. Instead, found this to be working:
df['column'] = pd.to_numeric(df['column'],errors = 'coerce')
And then check it using:
print(df.info())