To make all floats display with comma separators by default in pandas versions 0.23 through 0.25, set the following:
pd.options.display.float_format = '{:,}'.format
https://pandas.pydata.org/pandas-docs/version/0.23.4/options.html
In pandas 1.0, this leads to strange formatting in some cases.
Answer from jeffhale on Stack Overflow
python - Format a number with commas to separate thousands - Stack Overflow
Adding commas to a column in a dataframe
python - Formatting thousand separator for integers in a pandas dataframe - Stack Overflow
python - Can you format pandas integers for display, like `pd.options.display.float_format` for floats? - Stack Overflow
df.head().style.format("{:,.0f}") (for all columns)
df.head().style.format({"col1": "{:,.0f}", "col2": "{:,.0f}"}) (per column)
https://pbpython.com/styling-pandas.html
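The format strings used above are ordinary Python format specifications, so their effect can be checked on plain numbers before handing them to pandas. The sketch below is illustrative only: the sample value is made up, and whether the full-precision output of a bare '{:,}' is related to the odd display in pandas 1.0 is an assumption, not something the answer states.

```python
# Plain-Python check of the format specs used with float_format and style.format.
with_commas = '{:,}'.format        # thousands separator, precision left as-is
no_decimals = '{:,.0f}'.format     # thousands separator, rounded to whole numbers
two_decimals = '{:,.2f}'.format    # thousands separator, two decimal places

value = 1234567.891                # made-up sample value

print(with_commas(value))          # 1,234,567.891
print(no_decimals(value))          # 1,234,568
print(two_decimals(value))         # 1,234,567.89

# A bare '{:,}' keeps every digit of the float, which can look odd:
print(with_commas(0.1 + 0.2))      # 0.30000000000000004
```

Giving an explicit precision such as '{:,.2f}' keeps the displayed width predictable.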
I was able to find some code that would work for a single number, but I do not understand how to apply commas to separate thousands to an entire column in a dataframe.
This is the code I have for a single number:
var = re.sub("(\d)(?=(\d{3})+(?!\d))", r"\1,", "%d" % var)
I tried to apply it to a column in my dataframe the following way, but it doesn't work:
df['column_name'] = re.sub("(\d)(?=(\d{3})+(?!\d))", r"\1,", "%d" % df['column_name'])
Can someone help with the code? I don't understand why df['column_name'] isn't equivalent to a variable.
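One way to read the problem: re.sub works on a single string, while df['column_name'] is a whole Series, so the formatting has to be applied element by element. A minimal sketch, assuming an integer column and using made-up data; the new column name column_name_formatted is hypothetical:

```python
import pandas as pd

# Made-up data standing in for the asker's dataframe.
df = pd.DataFrame({"column_name": [1234, 5678901, 42]})

# Series.map calls the formatter once per element, so each number
# is converted to a comma-separated string individually.
df["column_name_formatted"] = df["column_name"].map("{:,}".format)

print(df)
```

The formatted column holds strings rather than numbers, so it is best kept separate from the numeric column if further arithmetic is needed.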
pandas (as of 0.20.1) does not allow overriding the default integer format in an easy way. It is hard coded in pandas.io.formats.format.IntArrayFormatter (the lambda function):
class IntArrayFormatter(GenericArrayFormatter):
    def _format_strings(self):
        formatter = self.formatter or (lambda x: '% d' % x)
        fmt_values = [formatter(x) for x in self.values]
        return fmt_values
I'm assuming what you're actually asking for is how you can override the format for all integers: modify (i.e. "monkey patch") the IntArrayFormatter to print integer values with thousands separated by comma as follows:
import pandas
class _IntArrayFormatter(pandas.io.formats.format.GenericArrayFormatter):
    def _format_strings(self):
        formatter = self.formatter or (lambda x: ' {:,}'.format(x))
        fmt_values = [formatter(x) for x in self.values]
        return fmt_values
pandas.io.formats.format.IntArrayFormatter = _IntArrayFormatter
Note:
- before 0.20.0, the formatters were in pandas.formats.format.
- before 0.18.1, the formatters were in pandas.core.format.
Aside
For floats you do not need to jump through those hoops since there is a configuration option for it:
display.float_format: The callable should accept a floating point number and return a string with the desired format of the number. This is used in some places like SeriesFormatter. See core.format.EngFormatter for an example.
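If the float format should apply only temporarily, the option can be set inside pd.option_context instead of globally. A minimal sketch with made-up values; the choice of two decimal places is an assumption for illustration:

```python
import pandas as pd

s = pd.Series([1234567.891, 0.5])  # made-up sample data

# The option is active only inside the with-block and is reset afterwards.
with pd.option_context("display.float_format", "{:,.2f}".format):
    formatted = s.to_string()

plain = s.to_string()  # rendered with the default float formatting

print(formatted)
print(plain)
```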
The formatters parameter in to_html takes a dictionary of column names mapped to formatting functions. Below is an example of a function that builds a dict mapping the same function to both float and int columns.
In [250]: num_format = lambda x: '{:,}'.format(x)
In [246]: def build_formatters(df, format):
   ...:     return {column: format
   ...:             for (column, dtype) in df.dtypes.items()
   ...:             if dtype in [np.dtype('int64'), np.dtype('float64')]}
   ...:
In [247]: formatters = build_formatters(df_int, num_format)
In [249]: print(df_int.to_html(formatters=formatters))
<table border="1" class="dataframe">
<thead>
<tr style="text-align: right;">
<th></th>
<th>A</th>
</tr>
</thead>
<tbody>
<tr>
<th>0</th>
<td>20,000</td>
</tr>
<tr>
<th>1</th>
<td>10,000</td>
</tr>
</tbody>
</table>
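The same formatters dictionary can also be passed to to_string for plain-text output. The variant below is a sketch, not the answer's code: it selects numeric columns with select_dtypes instead of comparing dtypes by hand, and the df_int contents (including the extra label column) are made up.

```python
import pandas as pd

# Made-up frame: one integer column and one text column.
df_int = pd.DataFrame({"A": [20000, 10000], "label": ["x", "y"]})

def build_formatters(df, fmt):
    """Map every numeric column name to the same formatting function."""
    numeric_columns = df.select_dtypes(include="number").columns
    return {column: fmt for column in numeric_columns}

formatters = build_formatters(df_int, "{:,}".format)

# to_string accepts the same formatters argument as to_html.
text = df_int.to_string(formatters=formatters)
print(text)
```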
You could monkey-patch pandas.io.formats.format.IntArrayFormatter:
import contextlib
import numpy as np
import pandas as pd
import pandas.io.formats.format as pf
np.random.seed(2015)

@contextlib.contextmanager
def custom_formatting():
    # Remember the current settings so they can be restored on exit
    orig_float_format = pd.options.display.float_format
    orig_int_format = pf.IntArrayFormatter

    pd.options.display.float_format = '{:0,.2f}'.format

    class IntArrayFormatter(pf.GenericArrayFormatter):
        def _format_strings(self):
            formatter = self.formatter or '{:,d}'.format
            fmt_values = [formatter(x) for x in self.values]
            return fmt_values

    pf.IntArrayFormatter = IntArrayFormatter
    try:
        yield
    finally:
        # Restore the originals even if the with-block raises
        pd.options.display.float_format = orig_float_format
        pf.IntArrayFormatter = orig_int_format

df = pd.DataFrame(np.random.randint(10000, size=(5,3)), columns=list('ABC'))
df['D'] = np.random.random(df.shape[0])*10000

with custom_formatting():
    print(df)
yields
       A      B      C         D
0  2,658  2,828  4,540  8,961.77
1  9,506  2,734  9,805  2,221.86
2  3,765  4,152  4,583  2,011.82
3  5,244  5,395  7,485  8,656.08
4  9,107  6,033  5,998  2,942.53
while outside of the with-statement:
print(df)
yields
      A     B     C            D
0  2658  2828  4540  8961.765260
1  9506  2734  9805  2221.864779
2  3765  4152  4583  2011.823701
3  5244  5395  7485  8656.075610
4  9107  6033  5998  2942.530551
Another option for Jupyter notebooks is to use df.style.format('{:,}'), but it only works on a single dataframe as far as I know, so you would have to call this every time:
table.style.format('{:,}')
         col1       col2
0s  9,246,452  6,669,310
>0  2,513,002  5,090,144
table
       col1     col2
0s  9246452  6669310
>0  2513002  5090144
Styling — pandas 1.1.2 documentation
Two points here. I've 'pd.read_csv'ed a CSV file which has three columns.
I've used the following in order to extract the data and add headings to the columns (as currently the data is just naked)
Columns 1 and 3 are text, and column 2 is a number.
How can I output the number with separators, e.g. 1,000,000 rather than 1000000?
Also, what's the best way for formatting this dataframe to be included in an email body?
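One possible approach to both points, sketched under assumptions: the CSV has no header row, the column names name, amount and code are hypothetical, and the sample rows are made up. The number column is formatted only at output time through the formatters argument of to_html, and the resulting HTML string could then serve as the HTML part of an email.

```python
import io

import pandas as pd

# Made-up stand-in for the three-column CSV file (text, number, text).
csv_text = "alpha,1000000,x\nbeta,2500,y\n"

# header=None plus names= adds headings to otherwise unlabeled columns.
df = pd.read_csv(io.StringIO(csv_text), header=None,
                 names=["name", "amount", "code"])

# Format only the numeric column; the data in df itself stays numeric.
html_body = df.to_html(formatters={"amount": "{:,}".format}, index=False)

print(html_body)
```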