Try this example:
import pandas as pd

df = pd.DataFrame({
    'date': ['1/15/2016', '2/1/2016', '2/15/2016', '3/15/2016'],
    'numA': [1000, 2000, 3000, 4000.3],
    'numB': [10000, 20000.2, 30000, 40000]
})
writer = pd.ExcelWriter('c:/.../pandas_excel_test.xlsx', engine='xlsxwriter')
df.to_excel(writer, index=False, sheet_name='Sheet1')
workbook = writer.book
worksheet = writer.sheets['Sheet1']
format1 = workbook.add_format({'num_format': '0.00'})
worksheet.set_column('C:C', None, format1)  # Adds formatting to column C
writer.close()  # writer.save() was removed in pandas 2.0
You can change the format 0.00 in .add_format({'num_format': '0.00'}) to your desired format, for example 0.0000 for four decimal places. Note that this only applies to column C.
If you want to change the formatting of all columns, modify the worksheet.set_column('C:C', None, format1) call, for example:
worksheet.set_column(0, 3, None, format1)
where 0 is the first column and 3 is the last column.
Check more formatting here: https://xlsxwriter.readthedocs.io/worksheet.html
Answer from Eddy Piedad on Stack Overflow
I am new here, would you be able to round to 2 decimal places using this?
df['col_name'].apply(lambda x: round(x, 2))
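A small sketch of what that line does, in case it helps. The column name and values here are made up for illustration. Note that .apply returns a new Series, so the result has to be assigned back, and that rounding changes the stored numbers themselves, whereas an Excel num_format only changes how they are displayed.

```python
import pandas as pd

# Hypothetical data, only for illustration
df = pd.DataFrame({"col_name": [1.23456, 2.5, 1000.0]})

# apply() returns a new Series; the DataFrame itself is not modified
rounded = df["col_name"].apply(lambda x: round(x, 2))
unchanged = df["col_name"].tolist()

# Assign the result back to keep it; Series.round is the vectorised equivalent
df["col_name"] = df["col_name"].round(2)
```

After the last line the column holds the rounded values, so the extra digits are gone from whatever gets written to Excel.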
python - Read excel data into pandas dataframe to 20 decimal places - Stack Overflow
python - Setting default number format when writing to Excel from Pandas - Stack Overflow
BUG: Dataframe.to_excel treats decimal.Decimal as string instead of numeric type, the data in the Excel cell is formatted as a string, not a number
I got this to format the floats to 1 decimal place.
data = {'A Prime': {0: 3.26, 1: 3.24, 2: 3.22, 3: 3.2, 4: 3.18, 5: 3.16,
6: 3.14, 7: 1.52, 8: 1.5, 9: 1.48, 10: 1.46, 11: 1.44, 12: 1.42},
'B': {0: 0.16608, 1: 0.16575, 2: 0.1654, 3: 0.16505999999999998, 4: 0.1647, 5: 0.16434, 6: 0.16398, 7: 0.10759, 8: 0.10687, 9: 0.10614000000000001,
10: 0.10540999999999999, 11: 0.10469, 12: 0.10396}, 'Proto Name': {0: 'Alpha',
1: 'Alpha', 2: 'Alpha', 3: 'Alpha', 4: 'Alpha', 5: 'Alpha', 6: 'Alpha', 7: 'Bravo', 8: 'Bravo', 9: 'Bravo', 10: 'Bravo', 11: 'Bravo', 12: 'Bravo'}}
import pandas as pd
df = pd.DataFrame(data)
    A Prime        B Proto Name
0      3.26  0.16608      Alpha
1      3.24  0.16575      Alpha
2      3.22  0.16540      Alpha
3      3.20  0.16506      Alpha
4      3.18  0.16470      Alpha
5      3.16  0.16434      Alpha
6      3.14  0.16398      Alpha
7      1.52  0.10759      Bravo
8      1.50  0.10687      Bravo
9      1.48  0.10614      Bravo
10     1.46  0.10541      Bravo
11     1.44  0.10469      Bravo
12     1.42  0.10396      Bravo
writer = pd.ExcelWriter(r'c:\temp\output.xlsx')
# float_format rounds the values written to the sheet to one decimal place
df.to_excel(writer, sheet_name='Sheet1', float_format="%0.1f")
writer.close()  # writer.save() was removed in pandas 2.0

Alas, that does not cover all cases: no joy with, say, larger numbers and a thousands separator:
df.to_excel(writer,'Sheet1',float_format = ":,")
However the following seems to work
data = {'A Prime': {0: 326000, 1: 3240000}}
df = pd.DataFrame(data)
A Prime
0 326000
1 3240000
# add_format is part of the XlsxWriter API, so request that engine explicitly
writer = pd.ExcelWriter(r'c:\temp\output.xlsx', engine='xlsxwriter')
df.to_excel(writer, sheet_name='Sheet1')
workbook = writer.book
worksheet = writer.sheets['Sheet1']
format1 = workbook.add_format({'num_format': '#,##0.00'})
# 18 sets the column width; a range such as 'B:D' would format several columns
worksheet.set_column('B:B', 18, format1)
writer.close()

All taken from here: http://xlsxwriter.readthedocs.io/working_with_pandas.html
For what it's worth, and because the question was also tagged openpyxl: you can also edit the default style of a whole workbook in openpyxl. This can make sense for the number format, but it can have unexpected consequences if things like the font size are changed, because other GUI elements are affected. The following should work if used with caution.
wb._named_styles['Normal'].number_format = '#,##0.00'
I think my question is obsolete. Excel recognizes that these are decimal characters and automatically converts them to the specified format. I close the question with that. The code in the question already solves the "problem" by itself.
You could also turn them into strings and use replace:
df1.wordnumber = df1.wordnumber.astype(str)
df1.wordnumber = df1.wordnumber.apply(lambda x: x.replace('.', ','))
Then write to Excel.
I have a few excel files that I want to process but they use a mix of thousands and decimal separators.
For example, 1.000.000 and 50,56 on one file and 1,000,000 and 50.56 on another.
This is the function that I use:
import os
import numpy as np
import pandas as pd

def process_file(data_file):
    try:
        df = pd.read_excel(data_file, header=None, na_values=["", "-", " "],
                           thousands=".", decimal=",")
        print("XLS")
    except Exception:
        # Some files are HTML tables saved with an Excel extension
        df = pd.read_html(data_file, header=None, na_values=["", "-", " "],
                          thousands=".", decimal=",")
        df = df[0]
        print("HTML")
    df = df.iloc[3:]  # , :8]  # Remove header rows
    df.reset_index(drop=True, inplace=True)  # Because we removed top rows
    column_names = ["N", "DATE", "J1", "J2", "J3", "J4", "J5", "J6", "T",
                    "S5+1", "P5+1", "S5", "P5", "S4+1", "P4+1", "S4", "P4",
                    "S3+1", "P3+1", "S3", "P3", "S2+1", "P2+1", "S1+1", "P1+1"]
    df.columns = column_names
    numeric_columns = ["N", "J1", "J2", "J3", "J4", "J5", "J6", "T",
                       "S5+1", "P5+1", "S5", "P5", "S4+1", "P4+1", "S4", "P4",
                       "S3+1", "P3+1", "S3", "P3", "S2+1", "P2+1", "S1+1", "P1+1"]
    df.replace("Not available", 0, inplace=True)
    df.replace(np.nan, 0, inplace=True)
    # Convert columns with numbers
    for num_col in numeric_columns:
        try:
            df[num_col] = df[num_col].astype(float)
        except Exception as e:
            print(e)
    df.DATE = pd.to_datetime(df.DATE)
    df.sort_values(by="N", inplace=True)
    return df
When the thousands and decimal are wrong I get an error if I try to convert the columns into numbers.
Is there a way to find out which combination of separators is used in each file?
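One possible heuristic, sketched here and not a built-in pandas feature: look at a handful of raw cell strings and let each conclusive one vote for a decimal separator. The function name guess_separators and the voting rules are assumptions made for this illustration. A value such as "1.000" on its own is ambiguous and is skipped.

```python
from collections import Counter

def guess_separators(samples):
    """Guess (thousands, decimal) from numeric strings.

    Hypothetical helper: returns None if no sample is conclusive.
    """
    votes = Counter()
    for text in samples:
        text = text.strip()
        dots, commas = text.count("."), text.count(",")
        if dots and commas:
            # Both present: whichever appears last is the decimal separator
            decimal = "." if text.rfind(".") > text.rfind(",") else ","
        elif dots > 1 or commas > 1:
            # A separator repeated inside one number can only be grouping
            decimal = "," if dots > 1 else "."
        elif dots + commas == 1:
            sep = "." if dots else ","
            digits_after = len(text) - text.rfind(sep) - 1
            if digits_after == 3:
                continue  # ambiguous: "1.000" could mean 1000 or 1.0
            decimal = sep
        else:
            continue  # no separator at all, nothing to learn
        votes[decimal] += 1
    if not votes:
        return None
    decimal = votes.most_common(1)[0][0]
    thousands = "," if decimal == "." else "."
    return thousands, decimal

european = guess_separators(["1.000.000", "50,56"])
english = guess_separators(["1,000,000", "50.56"])
inconclusive = guess_separators(["1.000", "200"])
```

The resulting pair could then be passed as thousands= and decimal= to the reader call in process_file. This assumes the cells can be read as raw strings first, and that one file does not mix both conventions.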
In addition to the other solutions, where the string data is converted to numbers when creating or using the dataframe, it is also possible to do it using options to the xlsxwriter engine:
# Versions of Pandas >= 1.3.0:
writer = pd.ExcelWriter('output.xlsx',
                        engine='xlsxwriter',
                        engine_kwargs={'options': {'strings_to_numbers': True}})
# Versions of Pandas < 1.3.0:
writer = pd.ExcelWriter('output.xlsx',
                        engine='xlsxwriter',
                        options={'strings_to_numbers': True})
From the docs:
strings_to_numbers: Enable the worksheet.write() method to convert strings to numbers, where possible, using float() in order to avoid an Excel warning about "Numbers Stored as Text".
Consider converting numeric columns to floats, since pd.read_html reads web data as string types (i.e., objects). But before converting to floats, you need to replace hyphens with NaNs:
import pandas as pd
import numpy as np

dfs = pd.read_html('https://www.google.com/finance?q=NASDAQ%3AGOOGL' +
                   '&fstype=ii&ei=9YBMWIiaLo29e83Rr9AM', flavor='html5lib')
xlWriter = pd.ExcelWriter('Output.xlsx', engine='xlsxwriter')
workbook = xlWriter.book
for i, df in enumerate(dfs):
    for col in df.columns[1:]:                 # UPDATE ONLY NUMERIC COLS
        df.loc[df[col] == '-', col] = np.nan   # REPLACE HYPHEN WITH NaNs
        df[col] = df[col].astype(float)        # CONVERT TO FLOAT
    df.to_excel(xlWriter, sheet_name='Sheet{}'.format(i))
xlWriter.close()  # save() was removed in pandas 2.0
Hi - how do I convert decimal numbers such as 0.0555 to percentages, so that they display with a percent sign as 5.55%?
I tried the code below and have two problems:
- Some values have one and some have two decimals in my output Excel file.
- When I export to Excel, the result is a string, not a value. When I convert it to a value manually in Excel, I get to see two decimal places.
How can I solve this in pandas, so that the values display two decimal places with a percent sign and remain numeric? Keep in mind that the Excel file will feed a client database, so the values can't be text or strings.
df["Value"]= df["Value"].astype(float)
df["Value"]=df["Value"]*100
df["Value"]=df["Value"].round(2)
df["Value"]=df["Value"].astype(str) + "%"
I have an .xlsx file that I am reading in using pd.read_excel.
The format of entire Excel sheet seems to be 'General', and contains text and numbers.
When I read the data in, and extract the number data, the numbers seem to be rounded, so a number like 64.119, will appear in the DataFrame as 64.12. Is there any way to ensure I get the correct number (i.e. 64.119)?
Thanks.
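One thing worth checking first, on the assumption (which may not hold for this file) that the rounding is only a display effect: pandas prints floats with a limited number of digits, controlled by the display.precision option, while the stored value keeps its full precision. The DataFrame below is made up to show the difference.

```python
import pandas as pd

# Hypothetical one-cell frame standing in for the data read from Excel
df = pd.DataFrame({"x": [64.119]})

# With a display precision of 2 the printed text is rounded...
with pd.option_context("display.precision", 2):
    shown = df.to_string()

# ...but the value stored in the DataFrame is untouched
stored = df["x"].iloc[0]
```

If inspecting a single value directly (as with stored here) still gives 64.12, the rounding happened before pandas displayed it, and the display option is not the cause.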
Let us try
out = (s//1).astype(int).astype(str)+','+(s%1*100).astype(int).astype(str).str.zfill(2)
0 105,00
1 99,20
dtype: object
Input data
s=pd.Series([105,99.2])
s = pd.Series([105, 99.22]).apply(lambda x: f"{x:.2f}".replace('.', ','))
First, .apply takes a function and calls it on each value.
The f-string f"{x:.2f}" turns the float into a string with two decimal places, using '.' as the separator.
After that, .replace('.', ',') just replaces '.' with ','.
You can change the pd.Series([105, 99.22]) to match your dataframe.