Pass thousands=',' to read_csv to parse those values as thousands-separated numbers:
In [27]:
import pandas as pd
import io
t="""id;value
0;123,123
1;221,323,330
2;32,001"""
pd.read_csv(io.StringIO(t), thousands=r',', sep=';')
Out[27]:
   id      value
0   0     123123
1   1  221323330
2   2      32001
Answer from EdChum on Stack Overflow
The answer to this question should be short:
df=pd.read_csv('filename.csv', thousands=',')
Two points here. I've used pd.read_csv to load a CSV file which has three columns.
I've added headings to the columns when extracting the data (the raw data has none).
Columns 1 and 3 are text, and column 2 is a number.
How can I output the number with separators, e.g. 1,000,000 rather than 1000000?
Also, what's the best way to format this dataframe for inclusion in an email body?
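For the output side of this question, Python's format-spec grouping option can add the separators, and to_html renders the frame for an email body. A minimal sketch; the column names (name, amount, note) are assumptions, not from the question:

```python
import pandas as pd

# Hypothetical frame matching the question's shape: two text columns, one number
df = pd.DataFrame({
    "name": ["alpha", "beta"],     # column 1: text (assumed name)
    "amount": [1000000, 2345678],  # column 2: number (assumed name)
    "note": ["x", "y"],            # column 3: text (assumed name)
})

# Format the numeric column with thousands separators via the ',' format spec
df["amount"] = df["amount"].map(lambda n: f"{n:,}")
print(df)  # amount now shows 1,000,000 / 2,345,678

# For an email body, render the frame as an HTML table and attach it
# as the text/html part of a MIME message
html_table = df.to_html(index=False)
```

Note this turns the column into strings, so do it only for display, after any arithmetic.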
It seems there is a problem with quoting: if the separator is ',' and the thousands character is ',' too, the values have to be quoted in the CSV:
import pandas as pd
from io import StringIO  # pandas.compat.StringIO was removed; use the stdlib io module
import csv

temp = u"""'a','Base Amount'
'11','79,026,695.50'"""
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp),
                 dtype={'Base Amount': 'float64'},
                 thousands=',',
                 quotechar="'",
                 quoting=csv.QUOTE_ALL,
                 decimal='.',
                 encoding='ISO-8859-1')
print(df)

    a  Base Amount
0  11   79026695.5
temp = u'''"a","Base Amount"
"11","79,026,695.50"'''
# after testing, replace StringIO(temp) with 'filename.csv'
df = pd.read_csv(StringIO(temp),
                 dtype={'Base Amount': 'float64'},
                 thousands=',',
                 quotechar='"',
                 quoting=csv.QUOTE_ALL,
                 decimal='.',
                 encoding='ISO-8859-1')
print(df)

    a  Base Amount
0  11   79026695.5
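Going the other direction, to_csv with quoting=csv.QUOTE_ALL produces exactly this kind of fully quoted file, so the comma thousands separator never collides with the field separator. A sketch, not from the original answer:

```python
import csv
import io
import pandas as pd

# Frame whose string values contain commas
df = pd.DataFrame({"a": [11], "Base Amount": ["79,026,695.50"]})

buf = io.StringIO()
df.to_csv(buf, index=False, quoting=csv.QUOTE_ALL)
print(buf.getvalue())
# "a","Base Amount"
# "11","79,026,695.50"
```

The resulting file can be read back with thousands="," and quoting=csv.QUOTE_ALL as shown above.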
First of all, get rid of the comma. Example:
num = '79,026,695.50'
print(num)
# 79,026,695.50
num = num.replace(',', '')
print(num)
# 79026695.50
num = float(num)
In your case, for a whole column (with numpy imported as np):
rawdata['base_amount'] = rawdata['base_amount'].str.replace(',', '').astype(np.float64)
Use the thousands parameter:
df = pd.read_csv("file.csv", parse_dates=['Date'], thousands=',')
Use the converters parameter if you have a special format (needs from datetime import datetime):
converters = {
    'Date': lambda x: datetime.strptime(x, "%b %d, %Y"),
    'Number': lambda x: float(x.replace(',', ''))
}
df = pd.read_csv('data.csv', converters=converters)
Output:
>>> df
        Date   Number
0 2021-12-30  2345.55
>>> df.dtypes
Date      datetime64[ns]
Number           float64
dtype: object
# data.csv
Date,Number
"Dec 30, 2021","2,345.55"
Otherwise, use the standard parameters:
df = pd.read_csv("data.csv", header=None, parse_dates=[0], thousands=',', quoting=1)
Output:
>>> df
           0        1        2         3        4
0 2021-12-30  1234.11  1654.22  11876.23  1676234
>>> df.dtypes
0    datetime64[ns]
1           float64
2           float64
3           float64
4             int64
dtype: object
I have a few excel files that I want to process but they use a mix of thousands and decimal separators.
For example, 1.000.000 and 50,56 on one file and 1,000,000 and 50.56 on another.
This is the function that I use:
import os
import numpy as np
import pandas as pd

def process_file(data_file):
    try:
        df = pd.read_excel(data_file, header=None, na_values=["", "-", " "],
                           thousands=".", decimal=",")
        print("XLS")
    except Exception:
        df = pd.read_html(data_file, header=None, na_values=["", "-", " "],
                          thousands=".", decimal=",")
        df = df[0]
        print("HTML")
    df = df.iloc[3:]  # , :8]  # Remove header rows
    df.reset_index(drop=True, inplace=True)  # Because we removed top rows
    column_names = ["N", "DATE", "J1", "J2", "J3", "J4", "J5", "J6", "T",
                    "S5+1", "P5+1", "S5", "P5", "S4+1", "P4+1", "S4", "P4",
                    "S3+1", "P3+1", "S3", "P3", "S2+1", "P2+1", "S1+1", "P1+1"]
    df.columns = column_names
    numeric_columns = ["N", "J1", "J2", "J3", "J4", "J5", "J6", "T",
                       "S5+1", "P5+1", "S5", "P5", "S4+1", "P4+1", "S4", "P4",
                       "S3+1", "P3+1", "S3", "P3", "S2+1", "P2+1", "S1+1", "P1+1"]
    df.replace(u"Not available", 0, inplace=True)
    df.replace(np.nan, 0, inplace=True)
    # Convert columns with numbers
    for num_col in numeric_columns:
        try:
            df[num_col] = df[num_col].astype(float)
        except Exception as e:
            print(e)
    df.DATE = pd.to_datetime(df.DATE)
    df.sort_values(by="N", inplace=True)
    return df
When the thousands and decimal are wrong I get an error if I try to convert the columns into numbers.
Is there a way to find out which combination of separators is used in each file ?
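None of the answers below address the detection question directly. One heuristic sketch (my own assumption, not a pandas feature): in a sample value that contains both '.' and ',', the rightmost of the two must be the decimal separator, and the other the thousands separator. A value with only one separator, such as 1.000, stays ambiguous:

```python
def detect_separators(sample):
    """Guess (thousands, decimal) from a numeric string such as '1.234.567,89'.

    Heuristic: the rightmost of '.' and ',' is the decimal separator and the
    other is the thousands separator. Ambiguous when only one is present.
    """
    last_dot = sample.rfind(".")
    last_comma = sample.rfind(",")
    if last_dot > last_comma:
        return ",", "."   # e.g. 1,234,567.89
    if last_comma > last_dot:
        return ".", ","   # e.g. 1.234.567,89
    return None, None     # no separator found

thousands, decimal = detect_separators("1.000.000,50")
# Feed the guess back into the reader, e.g.:
# df = pd.read_excel(data_file, thousands=thousands, decimal=decimal)
```

In practice you would pull a sample value from a known numeric cell of each file before deciding which separator pair to pass.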
There is a thousands parameter for this. Try:
arq_pedido = pd.read_csv('Pedido.csv', delimiter=";", encoding="ISO-8859-1", thousands=".")
You may also wish to set decimal="," to handle decimal numbers correctly.
The read_csv method has parameters for just about every conceivable scenario. You're probably interested in the thousands parameter for the thousands place separator, the decimal parameter for the decimal point, and the sep parameter for the column separator.
import pandas as pd
import io
foobar = io.StringIO("foo;bar \n 1,000; 2.0")
pd.read_csv(foobar, thousands=",", decimal=".", sep=";")
#    foo  bar
# 0  1000  2.0