Use astype(np.int64):
s = pd.Series(['', 8.00735e+09, 4.35789e+09, 6.10644e+09])
mask = pd.to_numeric(s).notnull()
s.loc[mask] = s.loc[mask].astype(np.int64)
s
0
1 8007350000
2 4357890000
3 6106440000
dtype: object
(Answer from piRSquared on Stack Overflow.)
In Pandas/NumPy, integers are not allowed to take NaN values, and arrays/series (including dataframe columns) are homogeneous in their datatype --- so having a column of integers where some entries are None/np.nan is downright impossible.
EDIT:
data.phone.astype('object')
should do the trick; in this case, Pandas treats your column as a series of generic Python objects, rather than a specific datatype (e.g. str/float/int), at the cost of performance if you intend to run any heavy computations with this data (probably not in your case).
Assuming you want to keep those NaN entries, your approach of converting to strings is a valid possibility:
data.phone.astype(str).str.split('.', expand = True)[0]
should give you what you're looking for (there are alternative string methods you can use, such as .replace or .extract, but .split seems the most straightforward in this case).
Alternatively, if you are only interested in the display of floats (unlikely I'd suppose), you can do pd.set_option('display.float_format','{:.0f}'.format), which doesn't actually affect your data.
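Since pandas 0.24 there is also a nullable integer extension dtype, 'Int64' (capital I), which can hold missing values directly; a minimal sketch of that alternative:

```python
import pandas as pd
import numpy as np

s = pd.Series([8007350000.0, np.nan, 6106440000.0])
# 'Int64' (capital I) is the nullable extension dtype;
# plain 'int64' would raise here because of the NaN
out = s.astype('Int64')
print(out)
```

The missing entry is shown as `<NA>` while the rest stay true integers, avoiding the object-dtype performance cost mentioned above.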
You have a few options...
1) convert everything to integers.
df.astype(int)
<=35 >35
Cut-off
Calcium 0 1
Copper 1 0
Helium 0 8
Hydrogen 0 1
2) Use round:
>>> df.round()
<=35 >35
Cut-off
Calcium 0 1
Copper 1 0
Helium 0 8
Hydrogen 0 1
but not always great...
>>> (df - .2).round()
<=35 >35
Cut-off
Calcium -0 1
Copper 1 -0
Helium -0 8
Hydrogen -0 1
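If the negative zeros bother you, one workaround (a sketch, not part of the original answer) is to add 0.0 after rounding, since -0.0 + 0 is +0.0 under IEEE 754:

```python
import pandas as pd

df = pd.DataFrame({'x': [-0.2, 0.8, 7.8]})
rounded = df.round() + 0  # adding zero normalizes -0.0 to 0.0
print(rounded)
```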
3) Change your display precision option in Pandas.
pd.set_option('display.precision', 0)
>>> df
<=35 >35
Cut-off
Calcium 0 1
Copper 1 0
Helium 0 8
Hydrogen 0 1
Since pandas 0.17.1 you can set the displayed numerical precision by modifying the style of the particular data frame rather than setting the global option:
import pandas as pd
import numpy as np
np.random.seed(24)
df = pd.DataFrame(np.random.randn(5, 3), columns=list('ABC'))
df

df.style.set_precision(2)

It is also possible to apply column specific styles
df.style.format({
'A': '{:,.1f}'.format,
'B': '{:,.3f}'.format,
})

Hello everyone,
I am still new to python and learning.
So I practiced some exercises and made an app that calculates the percentage from the number the user enters.
My question is: how can I drop the .0 part if the user enters an int, and keep the decimal part if they enter a float?
So, for example, 5% of 100 is 5 (int)
and 5.1% of 100 is 5.1 (float).
Here is the pandas series:
0 ?
1 10
2 15
3 12
4 90
...
79537 8.0
79538 14.0
79539 15.0
79540 12.0
79541 15.0
Name: Age, Length: 79542, dtype: object

I filled the NaN values in this series with '?' (hence the value at index 0) and all the values in this series are strings. I'm eventually going to make another column comprised of a base string added to the values in this series. Like this:
df['new column'] = 'age is: ' + df['Age']
this will make a new column with values like:
0 age is ?
1 age is 10
2 age is 15
3 age is 12
4 age is 90
...
79537 age is 8.0
79538 age is 14.0
79539 age is 15.0
79540 age is 12.0
79541 age is 15.0
Name: new column, Length: 79542, dtype: object

As for my project, I guess this is okay, but it's bothering me that some of the values have a trailing zero. Again, everything here is a string.
I tried doing this:
df['new column'] = df['new column'].str.rstrip('0')
But it returns this:
0 age is ?
1 age is 1
2 age is 15
3 age is 12
4 age is 9
...
79537 age is 8.
79538 age is 14.
79539 age is 15.
79540 age is 12.
79541 age is 15.
Name: new column, Length: 79542, dtype: object

Note how the value at index 1 changed from 'age is 10' to 'age is 1', and the value at index 4 changed from 'age is 90' to 'age is 9'. This would be fatal for my project. Also, the trailing decimal point on the other values still bothers me.
I even tried this to get rid of the trailing decimal point:
df['new column'] = df['new column'].str.rstrip('.0')
But the problem isn't fully solved because indexes 1 and 4 are still erroneous:
0 age is ?
1 age is 1
2 age is 15
3 age is 12
4 age is 9
...
79537 age is 8
79538 age is 14
79539 age is 15
79540 age is 12
79541 age is 15
Name: new column, Length: 79542, dtype: object

I presumably have many other values in this series that end in a zero (10, 20, 30, 40, etc.) but I can't check manually because there are over 79k rows.
All i need is a way to drop trailing zeros after a decimal point, and also drop the decimal point itself, while also preserving values that are multiples of 10 and that do not have a trailing zero or decimal point. Basically, I need it to be like this:
0 age is ?
1 age is 10
2 age is 15
3 age is 12
4 age is 90
...
79537 age is 8
79538 age is 14
79539 age is 15
79540 age is 12
79541 age is 15
Name: new column, Length: 79542, dtype: object

Again, please note how the value at index 1 is 'age is 10' and not 'age is 1', and the value at index 4 is 'age is 90' and not 'age is 9'. Also, there are no decimal points or trailing zeros after a decimal point.
Any help would be appreciated. Thank you
Use astype with replace:
df = pd.DataFrame({'ID':[805096730.0,805096730.0]})
df['ID'] = df['ID'].astype(str).replace(r'\.0$', '', regex=True)
print (df)
ID
0 805096730
1 805096730
Or add parameter dtype:
df = pd.read_excel(file, dtype={'ID':str})
Check type of your numbers before converting them to strings. It seems that they are floats, rather than integers. If this is the case, convert your numbers to integers:
df = pd.DataFrame([123.0, 456.0])
df = df[0].astype(int)
0 123
1 456
Then, convert it into strings:
df = df.apply(str)
print(df.iloc[1])
'456'
If you want to convert integer and float numbers to strings with no trailing 0, use the '{0:g}' format with map or apply:
df = pd.DataFrame({'col1':[1.00, 1, 0.5, 1.50]})
df['new'] = df['col1'].map('{0:g}'.format)
#alternative solution
#df['new'] = df['col1'].apply('{0:g}'.format)
print (df)
col1 new
0 1.0 1
1 1.0 1
2 0.5 0.5
3 1.5 1.5
print (df['new'].apply(type))
0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
3 <class 'str'>
Name: new, dtype: object
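Applied to the Age series from the question above, the same '{0:g}' trick can be combined with to_numeric so the '?' placeholders survive (a sketch; the data and column name are abbreviated stand-ins for the real 79k-row series):

```python
import pandas as pd

s = pd.Series(['?', '10', '15', '8.0', '90'], name='Age')
nums = pd.to_numeric(s, errors='coerce')                  # '?' becomes NaN
# format the numeric entries, but fall back to the original string
# wherever to_numeric produced NaN (i.e. the '?' rows)
clean = nums.map('{0:g}'.format).where(nums.notna(), s)
new_col = 'age is: ' + clean
print(new_col)
```

This keeps 'age is: 10' and 'age is: 90' intact while turning '8.0' into 'age is: 8'.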
I think something like this should work:
from decimal import Decimal

if val.is_integer():
    val = int(val)
else:
    val = Decimal(str(val)).normalize()
Assuming that val is a float value inside the dataframe's column: whole-number floats are simply cast to integer, and for the remaining floats Decimal(...).normalize() cuts the extra trailing zeros (building the Decimal from str(val) avoids binary-float artifacts).