Use
>>> df.iloc[:, 1:] = df.iloc[:, 1:].div(df['total'], axis=0).mul(100).round(2).astype(str).add(' %')
>>> df
   ID     col1     col2     col3    total
0   1  28.57 %  42.86 %  28.57 %  100.0 %
1   2  16.67 %  33.33 %   50.0 %  100.0 %
2   3   25.0 %   37.5 %   37.5 %  100.0 %
Answer from timgeb on Stack Overflow
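For reference, here is a self-contained sketch of the same idea that keeps the numeric percentages in their own frame and builds the strings only for display. The sample values are assumed to match the data used elsewhere in this thread; `value_cols`, `pct` and `display_df` are names introduced here for illustration.

```python
import pandas as pd

# Sample data (assumed to match the frame used in this thread).
df = pd.DataFrame({
    'ID': range(1, 4),
    'col1': [10, 5, 10],
    'col2': [15, 10, 15],
    'col3': [10, 15, 15],
    'total': [35, 30, 40],
})

value_cols = ['col1', 'col2', 'col3', 'total']

# Numeric percentages: each row is divided by its own total
# (axis=0 aligns the 'total' Series with the row index).
pct = df[value_cols].div(df['total'], axis=0).mul(100).round(2)

# String version for display only; pct stays numeric for further work.
display_df = pct.astype(str).add(' %')
display_df.insert(0, 'ID', df['ID'])
print(display_df)
```

Keeping `pct` separate means later calculations do not have to parse the `' %'` strings back into numbers.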
You can use:
import pandas as pd
df = pd.DataFrame({
'ID': range(1, 4),
'col1': [10, 5, 10],
'col2': [15, 10, 15],
'col3': [10, 15, 15],
'total': [35, 30, 40]
})
idx = ['col1', 'col2', 'col3', 'total']
df[idx] = df[idx].apply(lambda x: x / x['total'], axis=1)
df
which gives you:
| | ID | col1 | col2 | col3 | total |
|---:|-----:|---------:|---------:|---------:|--------:|
| 0 | 1 | 0.285714 | 0.428571 | 0.285714 | 1 |
| 1 | 2 | 0.166667 | 0.333333 | 0.5 | 1 |
| 2 | 3 | 0.25 | 0.375 | 0.375 | 1 |
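As a hedged aside, the row-wise `apply` can usually be replaced by a vectorized `.div` call, which tends to be faster on large frames. The sketch below reuses the frame from this answer; `via_apply` and `via_div` are illustrative names.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': range(1, 4),
    'col1': [10, 5, 10],
    'col2': [15, 10, 15],
    'col3': [10, 15, 15],
    'total': [35, 30, 40],
})
idx = ['col1', 'col2', 'col3', 'total']

# Row-wise apply, as in the answer above.
via_apply = df[idx].apply(lambda row: row / row['total'], axis=1)

# Vectorized equivalent: divide every column by 'total', aligned on the row index.
via_div = df[idx].div(df['total'], axis=0)

print(via_div)
```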
You can do this using basic pandas operators .div and .sum, using the axis argument to make sure the calculations happen the way you want:
cols = ['<80%', '80-90', '>90']
df[cols] = df[cols].div(df[cols].sum(axis=1), axis=0).multiply(100)
- Calculate the sum of each row (df[cols].sum(axis=1)). axis=1 makes the summation occur across the columns of each row, rather than down each column.
- Divide the dataframe by the resulting series (df[cols].div(df[cols].sum(axis=1), axis=0)). axis=0 aligns the series with the row index, so every row is divided by its own total.
- To finish, multiply the results by 100 so they are percentages between 0 and 100 instead of proportions between 0 and 1 (or you can skip this step and store them as proportions).
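The steps above can be sketched end to end as follows. The counts and the `group` column are made up for illustration; only the three bucket column names come from the answer.

```python
import pandas as pd

# Hypothetical counts per bucket; column names follow the answer above.
df = pd.DataFrame({
    'group': ['a', 'b'],
    '<80%': [2, 1],
    '80-90': [3, 1],
    '>90': [5, 2],
})
cols = ['<80%', '80-90', '>90']

# One total per row.
row_totals = df[cols].sum(axis=1)

# Divide each row by its own total, then scale to 0-100.
df[cols] = df[cols].div(row_totals, axis=0).multiply(100)
print(df)
```

After this, the three bucket columns in each row add up to 100.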
df/df.sum()
This divides each value by the sum of its column. If you want to divide by the sum of each row instead, transpose the dataframe first.
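A small sketch of the difference, on a made-up numeric-only frame. It also shows that `.div(..., axis=0)` gives the row-wise result without transposing; all variable names are illustrative.

```python
import pandas as pd

# Hypothetical numeric-only frame.
df = pd.DataFrame({'a': [1, 3], 'b': [3, 1], 'c': [4, 4]})

# Column-wise: every column sums to 1.
by_column = df / df.sum()

# Row-wise via transpose: every row sums to 1.
by_row_transposed = (df.T / df.T.sum()).T

# Row-wise without transposing.
by_row_div = df.div(df.sum(axis=1), axis=0)

print(by_column)
print(by_row_div)
```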
Hi - how do I convert decimal numbers such as 0.0555 to percent, so that they display with a percentage sign, as 5.55%?
I tried the code below, but I have two problems:
- Some values have one and some have two decimals in my output Excel file.
- When I export to Excel, the column is a string, not a value. When I convert it to a value manually in Excel, I do get to see two decimal places.
How can I solve this in pandas, so that the column displays two decimal places with a percent sign and is still a value? Keep in mind that the Excel file will feed a client database and the column can't be text or string.
df["Value"]= df["Value"].astype(float)
df["Value"]=df["Value"]*100
df["Value"]=df["Value"].round(2)
df["Value"]=df["Value"].astype(str) + "%"
Hey folks, I downloaded a CSV file from the internet and I wanted to convert one column into percentage with the first value in the column being 100 %. My approach looks as follows:
In the first step I fetch the first value of the column and make it a variable:
DAX_first=CSV_DAX['EL4F'].iloc[0]
In the second step I try to turn everything into percentages by the following:
plt.plot(date_DAX, CSV_DAX['EL4F']/DAX_first*100)
date_DAX contains the values for the X-axis, obviously. The plot that I obtain doesn't start at 100 % but instead at around 60 %. There's probably a blatant mistake somewhere, but I don't see it at the moment. Can anyone spot the mistake?
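One possible cause, which the post does not confirm, is that the CSV lists the newest row first. In that case `.iloc[0]` is the most recent value, while the plot starts at the oldest date on the left. The sketch below uses made-up data and a hypothetical `Date` column, and sorts chronologically before taking the base value.

```python
import pandas as pd

# Hypothetical stand-in for the downloaded CSV, newest row first.
CSV_DAX = pd.DataFrame({
    'Date': pd.to_datetime(['2021-01-03', '2021-01-02', '2021-01-01']),
    'EL4F': [200.0, 150.0, 100.0],
})

# Sort chronologically so that iloc[0] really is the earliest observation.
CSV_DAX = CSV_DAX.sort_values('Date').reset_index(drop=True)

DAX_first = CSV_DAX['EL4F'].iloc[0]
CSV_DAX['EL4F_pct'] = CSV_DAX['EL4F'] / DAX_first * 100
print(CSV_DAX)
```

It is also worth checking `CSV_DAX['EL4F'].dtype`: if the column was read as strings, the division would need a numeric conversion first.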
The accepted answer suggests modifying the raw data for presentation purposes, which is something you generally do not want. Imagine you need to make further analyses with these columns and you need the precision you lost with rounding.
You can modify the formatting of individual columns in data frames, in your case:
output = df.to_string(formatters={
'var1': '{:,.2f}'.format,
'var2': '{:,.2f}'.format,
'var3': '{:,.2%}'.format
})
print(output)
For your information '{:,.2%}'.format(0.214) yields 21.40%, so no need for multiplying by 100.
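A runnable sketch of the `to_string` approach, using made-up values for the three columns named above. It also checks that the underlying data keeps its full precision.

```python
import pandas as pd

# Hypothetical frame using the column names from the answer above.
df = pd.DataFrame({
    'var1': [1234.5678, 2.5],
    'var2': [0.1, 10.0],
    'var3': [0.214, 0.05],
})

# Formatting is applied only to the rendered text.
output = df.to_string(formatters={
    'var1': '{:,.2f}'.format,
    'var2': '{:,.2f}'.format,
    'var3': '{:,.2%}'.format,
})
print(output)

# The stored values are unchanged.
print(df.loc[0, 'var1'])
```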
You don't have a nice HTML table anymore but a text representation. If you need to stay with HTML use the to_html function instead.
from IPython.display import display, HTML
output = df.to_html(formatters={
'var1': '{:,.2f}'.format,
'var2': '{:,.2f}'.format,
'var3': '{:,.2%}'.format
})
display(HTML(output))
Update
As of pandas 0.17.1, life got easier and we can get a beautiful HTML table right away:
df.style.format({
'var1': '{:,.2f}'.format,
'var2': '{:,.2f}'.format,
'var3': '{:,.2%}'.format,
})
You could also set the default display format for floats:
pd.options.display.float_format = '{:.2%}'.format
Use '{:.2%}' instead of '{:.2f}%' - The former converts 0.41 to 41.00% (correctly), the latter to 0.41% (incorrectly)
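A quick check of the two format specs, plus a sketch that scopes the display option with `pd.option_context` so it does not change every float printed in the session. The variable names are illustrative.

```python
import pandas as pd

value = 0.41

# '%' presentation type: multiplies by 100 and appends '%'.
as_percent = '{:.2%}'.format(value)

# Fixed-point with a literal '%': no scaling happens.
as_fixed = '{:.2f}%'.format(value)

print(as_percent, as_fixed)

# Apply the float format only inside this block.
with pd.option_context('display.float_format', '{:.2%}'.format):
    rendered = pd.Series([value]).to_string()
print(rendered)
```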
You were very close with your df attempt. Try changing:
df['col'] = df['col'].astype(float)
to:
df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
# .str.rstrip('%') uses str funcs to eliminate the '%'; / 100.0 divides by 100
# could also be: .str[:-1].astype(...
Pandas supports Python's string processing functions on string columns. Just precede the string function you want with .str and see if it does what you need. (This includes string slicing, too, of course.)
Above we utilize .str.rstrip() to get rid of the trailing percent sign, then we divide the array in its entirety by 100.0 to convert from percentage to actual value. For example, 45% is equivalent to 0.45.
Although .str.rstrip('%') could also just be .str[:-1], I prefer to explicitly remove the '%' rather than blindly removing the last char, just in case...
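A self-contained sketch of the conversion on made-up data. The second part is an assumption about messier input, not something from the question: `pd.to_numeric` with `errors='coerce'` turns unparseable entries into NaN instead of raising.

```python
import pandas as pd

# Hypothetical column of percent strings.
df = pd.DataFrame({'col': ['34%', '50%', '32%', '12%']})
df['col'] = df['col'].str.rstrip('%').astype('float') / 100.0
print(df['col'].tolist())

# Assumed messy input: non-numeric entries become NaN rather than an error.
messy = pd.Series(['34%', 'n/a', '12%'])
cleaned = pd.to_numeric(messy.str.rstrip('%'), errors='coerce') / 100.0
print(cleaned.tolist())
```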
You can define a custom function to convert your percents to floats at read_csv() time:
import io
import pandas as pd

# dummy data
temp1 = """index col
113 34%
122 50%
123 32%
301 12%"""
# Custom function taken from https://stackoverflow.com/questions/12432663/what-is-a-clean-way-to-convert-a-string-percent-to-a-float
def p2f(x):
    return float(x.strip('%')) / 100
# Pass to `converters` param as a dict...
df = pd.read_csv(io.StringIO(temp1), sep=r'\s+', index_col=[0], converters={'col': p2f})
df
        col
index
113    0.34
122    0.50
123    0.32
301    0.12
# Check that dtypes really are floats
df.dtypes
col float64
dtype: object
My percent to float code is courtesy of ashwini's answer: What is a clean way to convert a string percent to a float?
If indeed percentage of 10 is what you want, the simplest way is to adjust your intake of the data slightly:
>>> p = pd.DataFrame(a.items(), columns=['item', 'score'])
>>> p['perc'] = p['score']/10
>>> p
Out[370]:
     item  score  perc
0  Test 2      1   0.1
1  Test 3      1   0.1
2  Test 1      4   0.4
3  Test 4      9   0.9
For real percentages, instead:
>>> p['perc']= p['score']/p['score'].sum()
>>> p
Out[427]:
     item  score      perc
0  Test 2      1  0.066667
1  Test 3      1  0.066667
2  Test 1      4  0.266667
3  Test 4      9  0.600000
First, make the keys of your dictionary the index of your dataframe:
import pandas as pd
a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}
p = pd.DataFrame([a])
p = p.T  # transpose
p.columns = ['score']
Then, compute the percentage and assign to a new column.
def compute_percentage(x):
    # x is one row; take its score as a scalar before dividing
    pct = x['score'] / p['score'].sum() * 100
    return round(pct, 2)
p['percentage'] = p.apply(compute_percentage, axis=1)
This gives you:
        score  percentage
Test 1      4       26.67
Test 2      1        6.67
Test 3      1        6.67
Test 4      9       60.00
[4 rows x 2 columns]
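As a hedged alternative, the same column can be computed without `apply` by dividing the whole column at once. Building the frame from a Series is a choice made here for brevity, not part of the answer above.

```python
import pandas as pd

a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}

# The dict keys become the index; the values become the 'score' column.
p = pd.Series(a, name='score').to_frame()

# Vectorized share of the total, as a percentage rounded to two decimals.
p['percentage'] = (p['score'] / p['score'].sum() * 100).round(2)
print(p)
```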