You can use DataFrame.fillna or Series.fillna which will replace the Python object None, not the string 'None'.
import pandas as pd
import numpy as np
For dataframe:
df = df.fillna(value=np.nan)
For column or series:
df['mycol'] = df['mycol'].fillna(value=np.nan)
Answer from Guillaume Jacquenot on Stack Overflow
Here's another option:
df.replace(to_replace=[None], value=np.nan, inplace=True)
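To make the distinction between the object None and the string 'None' concrete, here is a minimal sketch. The frame and the column name mycol are made up for illustration; the column is forced to object dtype so that None is stored as-is.

```python
import numpy as np
import pandas as pd

# Hypothetical column holding one real None and one literal string 'None'.
df = pd.DataFrame({"mycol": pd.Series(["a", None, "None"], dtype=object)})

# Only the None object counts as missing; the string 'None' does not.
missing = df["mycol"].isna().tolist()
print(missing)  # [False, True, False]

# fillna therefore touches row 1 only and leaves the string in row 2 alone.
filled = df.fillna(value=np.nan)
print(filled["mycol"].tolist())
```

If the data contains the string 'None' rather than the object, fillna has nothing to fill, and a string replacement is needed instead.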
Hey All,
I am using the below code to export Jira issues. Not all of my issues have all the custom fields I am trying to export so they end up being 'None'. I would like to replace the 'None' with an empty string. I am trying to use pandas df.fillna but it's not doing what I thought it would. In spite of using fillna, my output is unchanged and still contains 'None' values.
What am I doing wrong?
Thanks!
Code:
# Search all issues, and time execution
jira_issues = jira.search_issues(jql,maxResults=50)
# Convert JSON to Pandas DataFrame
for issue in jira_issues:
    try:
        issue_fields = pd.DataFrame({
            'id': [issue.id],
            'Due date': str(issue.fields.duedate),
            'Actual Start Date': str(issue.fields.customfield_10055),
            'Actual Completion Date': str(issue.fields.customfield_10057)
        })
    except AttributeError:
        pass
    issues = pd.concat([issues,issue_fields])
issues.fillna("",inplace=True)
print(issues)
Output:
0  79906        None        None        None
0  79904  2022-07-07  2022-05-18  2022-05-25
0  79903  2022-04-13  2022-04-22  2022-04-22
0  79902  2022-06-15  2022-06-04  2022-06-04
0  79901        None        None        None
0  79900        None  2022-06-14        None
0  79899  2022-05-06  2022-05-02  2022-05-06
0  79897        None        None        None
0  79896        None        None        None
Checking column types using dtypes:
id                        object
Due date                  object
Actual Start Date         object
Actual Completion Date    object
dtype: object
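One likely reading of the code above: str(None) returns the string 'None', so by the time the values reach the DataFrame they are ordinary strings and fillna sees nothing missing. The sketch below reproduces this without Jira. The raw list and the to_text helper are made up for illustration and are not part of the original script.

```python
import pandas as pd

def to_text(value):
    """Hypothetical helper: '' for None, otherwise the usual str()."""
    return "" if value is None else str(value)

# Stand-in for the Jira fields: (id, due date), where the date may be None.
raw = [("79906", None), ("79904", "2022-07-07")]

# Same pattern as the question: str() turns None into the string 'None'.
naive = pd.DataFrame({
    "id": [r[0] for r in raw],
    "Due date": [str(r[1]) for r in raw],
})
naive_filled = naive.fillna("")     # no effect: nothing is actually missing
fixed = naive.replace("None", "")   # replace the string explicitly

# Alternative: avoid creating the string 'None' in the first place.
clean = pd.DataFrame({
    "id": [r[0] for r in raw],
    "Due date": [to_text(r[1]) for r in raw],
})
print(fixed)
```

Either replacing the string afterwards or converting conditionally up front should give empty strings in place of 'None'.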
Actually in later versions of pandas this will give a TypeError:
df.replace('-', None)
TypeError: If "to_replace" and "value" are both None then regex must be a mapping
You can do it by passing either a list or a dictionary:
In [11]: df.replace(['-'], [None])  # or .replace('-', {0: None})
Out[11]:
0
0 None
1 3
2 2
3 5
4 1
5 -5
6 -1
7 None
8 9
But I recommend using NaNs rather than None:
In [12]: df.replace('-', np.nan)
Out[12]:
0
0 NaN
1 3
2 2
3 5
4 1
5 -5
6 -1
7 NaN
8 9
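One practical reason to prefer NaN, sketched under the assumption that the column is meant to be numeric: once the placeholder is NaN, the column converts cleanly to a float dtype and aggregations skip the missing entries. The frame below mirrors the values shown in the output above.

```python
import numpy as np
import pandas as pd

# Same values as in the example output, with '-' marking invalid entries.
df = pd.DataFrame({0: ["-", 3, 2, 5, 1, -5, -1, "-", 9]})

# Replace the placeholder with NaN, then convert the column to numbers.
cleaned = df.replace("-", np.nan)
numeric = pd.to_numeric(cleaned[0])

# sum() skips NaN by default: 3 + 2 + 5 + 1 - 5 - 1 + 9
total = numeric.sum()
print(total)  # 14.0
```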
I prefer the solution using replace with a dict because of its simplicity and elegance:
df.replace({'-': None})
You can also have more replacements:
df.replace({'-': None, 'None': None})
And even for larger replacements, it is always obvious and clear what is replaced by what - which is way harder for long lists, in my opinion.
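A small sketch of the dict form with several replacements at once. It swaps in np.nan as the target value, following the recommendation to prefer NaN over None; the column name status and its values are invented for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical column with two different placeholders for "no value".
df = pd.DataFrame({"status": ["-", "None", "ok"]})

# Each key is a value to look for; each dict value is its replacement.
cleaned = df.replace({"-": np.nan, "None": np.nan})

flags = cleaned["status"].isna().tolist()
print(flags)  # [True, True, False]
```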
This is sufficient:
df.fillna("",inplace=True)
df
Out[142]:
A B C D E
0 A 2014-01-02 02:00:00 A 1
1 B 2014-01-02 03:00:00 B B 2
2 2014-01-02 04:00:00 C C
3 C C 4
Edit 2021-07-26: expanded response following @dWitty's comment.
If you want to keep the NaT and NaN values in the non-text columns, fill NA only in your text columns. In your example these are A, C and D.
Pass fillna a dict of replacement values keyed by column name; the value can be different for each column. For your case you only need to construct the dict:
# default values to replace NA (None)
# values = {"A": "", "C": "", "D": ""}
values = dict.fromkeys(['A', 'C', 'D'], "")
df.fillna(value=values, inplace=True)
df
Out[142]:
A B C D E
0 A 2014-01-02 02:00:00 A 1.0
1 B 2014-01-02 03:00:00 B B 2.0
2 2014-01-02 04:00:00 C C NaN
3 C NaT C 4.0
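For a self-contained version of the per-column approach, here is a sketch on a small frame modeled loosely on the output above. The exact cell values are assumptions, since the original input frame is not shown.

```python
import numpy as np
import pandas as pd

# Hypothetical data: text columns A, C, D; datetime column B; numeric column E.
df = pd.DataFrame({
    "A": ["A", "B", None, "C"],
    "B": pd.to_datetime(["2014-01-02 02:00:00", "2014-01-02 03:00:00",
                         "2014-01-02 04:00:00", None]),
    "C": [None, "B", "C", "C"],
    "D": ["A", "B", "C", None],
    "E": [1.0, 2.0, np.nan, 4.0],
})

# Fill only the text columns; B keeps its NaT and E keeps its NaN.
values = dict.fromkeys(["A", "C", "D"], "")
filled = df.fillna(value=values)
print(filled)
```

Returning a new frame instead of using inplace=True leaves the original df untouched, which makes it easy to compare before and after.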
It looks like None is being promoted to NaN, so you cannot use replace as usual. The following works:
In [126]:
mask = df.applymap(lambda x: x is None)
cols = df.columns[(mask).any()]
for col in df[cols]:
    df.loc[mask[col], col] = ''
df
Out[126]:
A B C D E
0 A 2014-01-02 02:00:00 A 1
1 B 2014-01-02 03:00:00 B B 2
2 2014-01-02 04:00:00 C C NaN
3 C NaT C 4
We generate a boolean mask of the None values using applymap, then iterate over each column that contains a None and use the mask to set those values to ''.
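The same idea as a standalone sketch. DataFrame.applymap is deprecated in recent pandas releases, so this version builds the mask with Series.map on each column, which is a substitution for the call used above. The frame is hypothetical, and its text columns are forced to object dtype so that None is stored as None.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: object columns hold real None values.
df = pd.DataFrame({
    "A": pd.Series(["A", "B", None, "C"], dtype=object),
    "B": pd.to_datetime(["2014-01-02 02:00:00", "2014-01-02 03:00:00",
                         "2014-01-02 04:00:00", None]),
    "D": pd.Series(["A", "B", "C", None], dtype=object),
    "E": [1.0, 2.0, np.nan, 4.0],
})

# Element-wise identity test: True only where the cell is the None object.
mask = df.apply(lambda col: col.map(lambda x: x is None))

# Visit only the columns that contain at least one None.
cols = [c for c in df.columns if mask[c].any()]
for col in cols:
    df.loc[mask[col], col] = ""

print(df)
```

Because the test is `x is None`, NaN and NaT are not matched, so the numeric and datetime columns keep their missing markers.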