If you want to only modify the format of your values without doing any operation in pandas, you should just execute the following instruction:
pd.options.display.float_format = "{:,.2f}".format
This forces it not to use scientific notation (exponential notation) and always displays 2 places after the decimal point. It also adds commas.
You should be able to get more info here:
https://pandas.pydata.org/docs/user_guide/options.html#number-formatting
Examples (value -> displayed as):
0.0012 -> 0.00
0.0123 -> 0.01
1.2345 -> 1.23
12.345 -> 12.35
100 -> 100.00
1234567890.123456 -> 1,234,567,890.12
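As a minimal sketch of the idea above: the format callable can be tried on plain numbers first, and then scoped to a single block with pd.option_context instead of being set globally. The variable names here (fmt, values, shown) are only illustrative and not part of the original answer.

```python
import pandas as pd

# The same callable the answer assigns to pd.options.display.float_format.
fmt = "{:,.2f}".format

values = [0.0012, 0.0123, 1.2345, 100, 1234567890.123456]
formatted = [fmt(v) for v in values]
print(formatted)

s = pd.Series(values, dtype="float64")

# option_context applies the display option only inside the with-block,
# so other code keeps the default float display.
with pd.option_context("display.float_format", fmt):
    shown = s.to_string()
print(shown)

# Only the display changed; the stored value keeps its full precision.
print(s.iloc[4] == 1234567890.123456)
```

Scoping the option this way avoids surprising other parts of a notebook that rely on the default display.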
Answer from Zombraz on Stack Overflow
Hello, I am having trouble increasing the decimal precision of a python data frame, and I cannot figure out what I am doing wrong.
I make the following data frame in python:
import pandas as pd
data = [['Blue', 34], ['Green', 61], ['Red', 22]]
df = pd.DataFrame(data, columns=['Color', 'Value'])
df
and see:
   Color  Value
0   Blue     34
1  Green     61
2    Red     22
I then want to divide the "Value" column by 7, yielding new values in each row that will not divide evenly and have many decimal places of precision.
I then divide by 7 with:
df['Value'] = df['Value']/7
df
And I see:
   Color  Value
0   Blue   4.86
1  Green   8.71
2    Red   3.14
But I want to see more decimal precision than this. I would like to see 5 decimal places of precision.
And so I try using the .round() function with:
df.round(5)
And nothing has changed to the precision of the values, still just 2 decimal places.
I am not sure why selecting 5 decimal places here is not being applied to my indicated data frame.
I also tried running:
pd.set_option('display.precision', 4)
before calling the data frame, and still the decimal places do not change. How can I fix this code so that I get values with 5 decimal places of precision? Thanks!
Try:
import pandas as pd
pd.set_option('display.precision', 2)
This causes it to use scientific (exponential) notation when appropriate, and keeps 2 decimal places. It makes the decision about whether to use scientific notation or not on a per-column basis, so if 1 value requires scientific notation, the whole column is displayed that way.
Examples (value -> displayed as):
0.0012 -> 1.20e-03
0.0123 -> 1.23e-02
100 -> 1.00e+02
1234567890.123456 -> 1.23e+09
This can be changed through the print options for floats. Note, however, that this affects how every float is printed:
pd.set_option('display.float_format', '{:.10f}'.format)
Keep in mind that this is only the way it's printed. The value is stored in the dataframe, with every decimal.
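To illustrate the point that only the printed form changes, here is a small sketch using the data frame from the question. One possible explanation for a display stuck at 2 decimals is a display.float_format set earlier in the session, since that option takes precedence over display.precision; that is an assumption about the asker's session, not something the question confirms. The sketch resets it to None inside the block for that reason.

```python
import pandas as pd

data = [["Blue", 34], ["Green", 61], ["Red", 22]]
df = pd.DataFrame(data, columns=["Color", "Value"])
df["Value"] = df["Value"] / 7

# Display-only change: five decimal places inside this block.
# float_format is reset to None so a previously set formatter
# cannot override display.precision.
with pd.option_context("display.precision", 5, "display.float_format", None):
    shown = df.to_string()
print(shown)

# The stored value still carries full float64 precision.
print(repr(df.loc[0, "Value"]))
```

If the frame prints with 5 decimals here but not outside the block, a global float_format is the likely culprit.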
On the other hand, you can restrict decimals by:
df.Value = df.Value.round(4)
But this rounds up or down depending on the fifth decimal. A last option would be to use np.ceil or np.floor, but since these do not take a number of decimals, an approach with multiplication and division is required:
import numpy as np
precision = 4
# Scale up, round towards +inf / -inf, then scale back down
df['Value_ceil'] = np.ceil(df.Value * 10**precision) / (10**precision)
df['Value_floor'] = np.floor(df.Value * 10**precision) / (10**precision)
Fixed the issue; it seems to be related to how Decimal converts from float. Converting the Value column to strings first and then to Decimal got me the result I wanted.
from decimal import Decimal
import pandas as pd
def get_df(table_filepath):
    df = pd.read_csv(table_filepath)
    # Go through str first so Decimal sees the printed digits,
    # not the float's exact binary expansion
    df['Value'] = df['Value'].apply(str)
    df['Value'] = df['Value'].apply(Decimal)
    return df
| Key | Value |
|---|---|
| A | 1.2089 |
| B | 5.6718 |
| B | 7.3084 |
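A short sketch of why the detour through strings helps: Decimal(float) captures the float's exact binary value, while Decimal(str) sees only the digits as written. The converters argument of read_csv shown afterwards is an alternative that the original answer does not use; it hands each raw cell string directly to Decimal, so no float is created at all. The inline CSV text is made up to match the table above.

```python
import io
from decimal import Decimal

import pandas as pd

# Decimal built from a float exposes the float's binary expansion.
from_float = Decimal(1.2089)
# Decimal built from text keeps exactly the digits that were written.
from_text = Decimal(str(1.2089))
print(from_float)
print(from_text)

# Alternative (assumption, not from the original answer): let read_csv
# pass the raw cell text to Decimal through a converter.
csv_text = "Key,Value\nA,1.2089\nB,5.6718\nB,7.3084\n"
df = pd.read_csv(io.StringIO(csv_text), converters={"Value": Decimal})
print(df["Value"].tolist())
```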
If PersonalID is the header of the problematic column, try this:
import numpy as np
df1 = pd.read_csv('Client.csv', dtype={'PersonalID': np.int32})
Edit: Since integer columns cannot hold NaN values, you can try this on each problematic column:
df1[col] = df1[col].fillna(-9999)  # or 0 or any value you want here
df1[col] = df1[col].astype(int)
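If a sentinel such as -9999 is undesirable, pandas' nullable integer dtype "Int64" (capital I) may be an option: it keeps the column integer-typed while representing missing entries as <NA>. This is a sketch of an alternative, not what the answer above proposes, and the inline CSV stands in for Client.csv.

```python
import io

import pandas as pd

# Hypothetical stand-in for Client.csv, with one missing PersonalID.
csv_text = "PersonalID,Name\n1001,Ann\n,Bob\n1003,Cy\n"

# "Int64" is the nullable integer extension dtype, so the blank cell
# becomes <NA> instead of forcing the column to float.
df1 = pd.read_csv(io.StringIO(csv_text), dtype={"PersonalID": "Int64"})
print(df1)
print(df1.dtypes)
```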
You could go through each value, and if it is a number x, subtract int(x) from it, and if this difference is 0.0 (that is, x is a whole number), convert the number x to int(x). Or, if you're not dealing with any non-integers, you could just convert all values that are numbers to ints.
For an example of the latter (when your original data does not contain any non-integer numbers):
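# Cast each float column as a whole: an int assigned cell by cell into a
# float column is converted straight back to float.
# Assumes missing values have already been filled.
for col in df1.select_dtypes(include='float').columns:
    df1[col] = df1[col].astype(int)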
For an example of the former (if you want to keep non-integer numbers as non-integer numbers, but want to guarantee that integer numbers stay as integers):
import sys
# Only float columns are candidates; assumes missing values were already filled
for col in df1.select_dtypes(include='float').columns:
    foundNonInt = False
    for x in df1[col]:
        # A fractional part above epsilon means the column holds real non-integers
        if abs(x - int(x)) > sys.float_info.epsilon:
            foundNonInt = True
            break
    if not foundNonInt:
        df1[col] = df1[col].astype(int)
Note, the above method is not fool-proof: if by chance, a non-integer number column from the original data set contains non-integers that are all x.0000000, all the way to the last decimal place, this will fail.
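The same check can be sketched without explicit Python loops over the values. This version is an assumption-laden variant rather than the answer's method: the helper name whole_floats_to_int is made up, and it uses the nullable "Int64" dtype so that columns with missing values can be converted too.

```python
import numpy as np
import pandas as pd


def whole_floats_to_int(frame):
    """Return a copy where float columns holding only whole numbers become Int64."""
    out = frame.copy()
    for col in out.select_dtypes(include="float").columns:
        present = out[col].dropna()
        # x % 1 == 0 is True only for whole numbers.
        if (present % 1 == 0).all():
            out[col] = out[col].astype("Int64")
    return out


df1 = pd.DataFrame(
    {
        "PersonalID": [1001.0, np.nan, 1003.0],
        "Score": [1.5, 2.0, 3.0],
        "Name": ["Ann", "Bob", "Cy"],
    }
)
converted = whole_floats_to_int(df1)
print(converted.dtypes)
```

The caveat from the note above still applies: a column of genuine measurements that all happen to be whole numbers would be converted as well.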
No, 34.98774564765 is merely being printed by default with six decimal places:
>>> pandas.DataFrame([34.98774564765])
0
0 34.987746
The data itself has more precision:
>>> pandas.DataFrame([34.98774564765])[0].iloc[0]
34.98774564765
You can change the default used for printing frames by altering pandas.options.display.precision.
For example:
>>> pandas.set_option("display.precision", 8)
>>> pandas.DataFrame([34.98774564765])
0
0 34.98774564765
You can also use the 'display.float_format' option:
with pd.option_context('display.float_format', '{:0.20f}'.format):
    print(pd.DataFrame([34.98774564765]))
0
0 34.98774564765000150146
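A brief sketch of the difference between the two approaches shown above: set_option changes the display for the rest of the session, while option_context restores the previous setting when the block ends. The variable names are illustrative only.

```python
import pandas as pd

default_precision = pd.get_option("display.precision")
frame = pd.DataFrame([34.98774564765])

with pd.option_context("display.precision", 10):
    # Inside the block the temporary setting is active.
    inside = pd.get_option("display.precision")
    text = frame.to_string()

# After the block the earlier setting is back.
after = pd.get_option("display.precision")

print(text)
print(inside, after)
```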
It seems you need DataFrame.round:
df = df.round(2)
print(df)
NO Topic A Topic B Topic C
0 0.0 1.00 1.00 1.00
1 1.0 0.55 0.64 0.55
2 2.0 0.57 0.74 0.68
3 3.0 0.85 0.86 0.85
4 4.0 0.20 0.20 0.20
5 5.0 0.85 0.84 0.85
6 6.0 0.45 0.53 0.45
7 7.0 0.62 0.66 0.70
8 8.0 0.57 0.50 0.57
9 9.0 0.85 0.90 0.88
10 10.0 0.95 0.97 0.96
The round method only behaves the way I think you want if the values in each column (i.e., in each pandas.Series) of the DataFrame already have more decimal places than the number you pass to round.
For instance:
pd.Series([1.09185, 2.31476]).round(2)
returns:
0 1.09
1 2.31
dtype: float64
But if the Series has fewer decimal places than the number you are rounding to, you will not get the desired visual result. For instance:
pd.Series([1.6, 2.3]).round(2)
returns:
0 1.6
1 2.3
dtype: float64
This is mathematically correct, since the numbers in the second Series already have fewer than 2 decimal places. But it is not what you visually expect.
If you only want to change the display of a Series or DataFrame inside a notebook, you should use pandas.set_option("display.precision", 2). This changes the visual representation of the Series or DataFrame, without changing the inner precision of the actual numbers.
If for some reason you need to save a Series or DataFrame with the numbers already formatted to the desired number of decimal places, you can apply a function that converts each value to a formatted string:
pd.Series([1.6, 2.3]).apply(lambda x: f"{x:.2f}")
which returns a new Series of dtype object instead of float:
0 1.60
1 2.30
dtype: object
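If the goal is specifically to write the numbers to a file with a fixed number of decimals, a sketch of an alternative is the float_format argument of to_csv, which formats on output and leaves the Series as float64. This is not part of the answer above; the string conversion is repeated next to it for comparison.

```python
import pandas as pd

s = pd.Series([1.6, 2.3])

# Format only while writing; with no path given, to_csv returns the text.
csv_lines = s.to_csv(index=False, header=False, float_format="%.2f").splitlines()
print(csv_lines)

# The string-conversion approach from above, for comparison:
# the result holds text, so numeric operations no longer apply to it.
as_text = s.map("{:.2f}".format)
print(as_text.tolist())

# The original Series is still numeric.
print(s.dtype)
```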
Since version 0.17.0 you can use .round(n), which returns a rounded copy and leaves df itself unchanged:
df.round(2)
0 1 2 3
0 0.06 0.67 0.77 0.71
1 0.80 0.56 0.97 0.15
2 0.03 0.59 0.11 0.95
3 0.33 0.19 0.46 0.92
df
0 1 2 3
0 0.057116 0.669422 0.767117 0.708115
1 0.796867 0.557761 0.965837 0.147157
2 0.029647 0.593893 0.114066 0.950810
3 0.325707 0.193619 0.457812 0.920403
import numpy as np
np.round(p_table, decimals=2)
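A small sketch tying the last two snippets together, on a made-up two-column frame (the column names a and b are assumptions): round returns a new object, it also accepts a per-column dict, and np.round on a DataFrame gives the same result as the method.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [0.057116, 0.796867], "b": [0.669422, 0.557761]})

# round returns a new DataFrame; df keeps its original values.
rounded = df.round(2)

# A dict selects a different number of decimals per column.
per_column = df.round({"a": 1, "b": 3})

# np.round dispatches to the DataFrame's own round method.
via_numpy = np.round(df, decimals=2)

print(rounded)
print(per_column)
print(df)
```

To keep the rounded values, assign the result back, for example df = df.round(2).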