You can use DataFrame.fillna or Series.fillna, which will replace the Python object None, not the string 'None'.
import pandas as pd
import numpy as np
For dataframe:
df = df.fillna(value=np.nan)
For a single column or Series:
df['mycol'] = df['mycol'].fillna(value=np.nan)
(Answer above from Guillaume Jacquenot on Stack Overflow.)
Here's another option:
df.replace(to_replace=[None], value=np.nan, inplace=True)
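As a quick check, both approaches turn the None objects into NaN (a minimal sketch; the column name is made up):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({'mycol': ['a', None, 'c']})

# fillna fills every NA cell (None counts as NA) with the given value
filled = df.fillna(value=np.nan)

# replace targets the None objects explicitly
replaced = df.replace(to_replace=[None], value=np.nan)

print(filled['mycol'].tolist())    # ['a', nan, 'c']
print(replaced['mycol'].tolist())  # ['a', nan, 'c']
```

Neither call touches cells that hold the literal string 'None'; both only act on the actual None object.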
Hey All,
I am using the below code to export Jira issues. Not all of my issues have all the custom fields I am trying to export so they end up being 'None'. I would like to replace the 'None' with an empty string. I am trying to use pandas df.fillna but it's not doing what I thought it would. In spite of using fillna, my output is unchanged and still contains 'None' values.
What am I doing wrong?
Thanks!
Code:
# Search all issues, and time execution
jira_issues = jira.search_issues(jql, maxResults=50)
# Convert JSON to Pandas DataFrame
for issue in jira_issues:
    try:
        issue_fields = pd.DataFrame({
            'id': [issue.id],
            'Due date': str(issue.fields.duedate),
            'Actual Start Date': str(issue.fields.customfield_10055),
            'Actual Completion Date': str(issue.fields.customfield_10057)
        })
    except AttributeError:
        pass
    issues = pd.concat([issues, issue_fields])
issues.fillna("", inplace=True)
print(issues)

Output:
0  79906        None        None        None
0  79904  2022-07-07  2022-05-18  2022-05-25
0  79903  2022-04-13  2022-04-22  2022-04-22
0  79902  2022-06-15  2022-06-04  2022-06-04
0  79901        None        None        None
0  79900        None  2022-06-14        None
0  79899  2022-05-06  2022-05-02  2022-05-06
0  79897        None        None        None
0  79896        None        None        None
Checking column types using dtypes:
id                        object
Due date                  object
Actual Start Date         object
Actual Completion Date    object
dtype: object
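A likely cause (a sketch of the underlying issue, not a tested fix for the Jira code above): wrapping each field in str() turns a missing value (the None object) into the literal string 'None', which fillna does not treat as NA. Replacing the literal string instead makes the cleanup work:

```python
import pandas as pd

# str() converts the None object into the string 'None'
issues = pd.DataFrame({'Due date': [str(None), '2022-07-07']})

# fillna has no effect here: 'None' is a real string, not an NA value
assert issues.fillna("")['Due date'].iloc[0] == 'None'

# replace the literal string instead
issues = issues.replace('None', "")
print(issues['Due date'].tolist())  # ['', '2022-07-07']
```

Alternatively, skip the str() call when the field is missing so the cell stays a real NA that fillna can handle.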
Updated answer, April 2025:
pd.to_numeric can convert arguments to a numeric type. The option errors='coerce' sets things to NaN. However, it can only work on 1D objects (i.e. scalar, list, tuple, 1-d array, or Series). Therefore, to use it on a DataFrame, we need to use df.apply to convert each column individually. Note that any **kwargs given to apply will be passed onto the function, so we can still set errors='coerce'.
Using pd.to_numeric along with df.apply will set any strings to NaN. If we want to convert those to 0 values, we can then use .fillna(0) on the resulting DataFrame.
For example (and note this also works with the strings suggested by the original question "$-" and "($24)"):
import pandas as pd
df = pd.DataFrame({
'a': (1, 'sd', 1),
'b': (2., 2., 'fg'),
'c': (4, "$-", "($24)")
})
print(df)
# a b c
# 0 1 2.0 4
# 1 sd 2.0 $-
# 2 1 fg ($24)
df = df.apply(pd.to_numeric, errors='coerce').fillna(0)
print(df)
# a b c
# 0 1.0 2.0 4.0
# 1 0.0 2.0 0.0
# 2 1.0 0.0 0.0
My original answer from 2015, which is now deprecated
You can use the convert_objects method of the DataFrame, with convert_numeric=True, to change the strings to NaNs.
From the docs:
convert_numeric: If True, attempt to coerce to numbers (including strings), with unconvertible values becoming NaN.
In [17]: df
Out[17]:
a b c
0 1. 2. 4
1 sd 2. 4
2 1. fg 5
In [18]: df2 = df.convert_objects(convert_numeric=True)
In [19]: df2
Out[19]:
a b c
0 1 2 4
1 NaN 2 4
2 1 NaN 5
Finally, if you want to convert those NaNs to 0's, you can use df.replace
In [20]: df2.replace(np.nan, 0)
Out[20]:
a b c
0 1 2 4
1 0 2 4
2 1 0 5
Use pd.to_numeric to convert the strings to numeric values (unparsable strings become NaN via the errors='coerce' option). Since to_numeric only accepts one-dimensional input, apply it column-wise on a DataFrame:
df = df.apply(pd.to_numeric, errors='coerce')
and then convert the NaN values to zeros using fillna:
df = df.fillna(0)
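On a single column this works without apply (a minimal sketch with made-up data):

```python
import pandas as pd

s = pd.Series(['1', 'sd', '3'])

# unparsable strings become NaN, then NaN becomes 0
s = pd.to_numeric(s, errors='coerce').fillna(0)
print(s.tolist())  # [1.0, 0.0, 3.0]
```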
This is sufficient:
df.fillna("", inplace=True)
df
Out[142]:
   A                    B  C  D  E
0  A  2014-01-02 02:00:00     A  1
1  B  2014-01-02 03:00:00  B  B  2
2     2014-01-02 04:00:00  C  C
3  C                          C  4
Edit 2021-07-26, completing the response following @dWitty's comment:
If you want to keep NaT and NaN values in the non-text columns, you only need to fill NA in the text columns. In this example those are A, C, and D.
You just pass fillna a dict of replacement values keyed by column; the value can differ for each column. For this case, construct the dict like so:
# default values to replace NA (None) in the text columns
values = {col: "" for col in ['A', 'C', 'D']}
df.fillna(value=values, inplace=True)
df
Out[142]:
   A                    B  C  D    E
0  A  2014-01-02 02:00:00     A  1.0
1  B  2014-01-02 03:00:00  B  B  2.0
2     2014-01-02 04:00:00  C  C  NaN
3  C                  NaT     C  4.0
It looks like None is being promoted to NaN, so you cannot use replace as usual; the following works:
In [126]:
mask = df.applymap(lambda x: x is None)
cols = df.columns[(mask).any()]
for col in df[cols]:
df.loc[mask[col], col] = ''
df
Out[126]:
A B C D E
0 A 2014-01-02 02:00:00 A 1
1 B 2014-01-02 03:00:00 B B 2
2 2014-01-02 04:00:00 C C NaN
3 C NaT C 4
So we generate a mask of the None values using applymap; we then iterate over each column of interest and use the boolean mask to set the values.
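The loop above can also be collapsed into one step with DataFrame.mask, which writes the replacement wherever the boolean mask is True (a minimal runnable sketch on made-up data; the mask is built with Series.map per column, since DataFrame.applymap is deprecated in newer pandas):

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'A': ['A', 'B', None, 'C'],
    'E': [1.0, 2.0, np.nan, 4.0],
})

# element-wise mask of the cells holding the None object
mask = df.apply(lambda col: col.map(lambda x: x is None))

# mask() replaces values where the condition is True
df = df.mask(mask, '')
print(df)
```

Note that the NaN in column E is untouched: `x is None` is False for NaN, so only genuine None objects are replaced.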
