Since version 0.15.0 this can now be easily done using .dt to access just the date component:
df['just_date'] = df['dates'].dt.date
The above returns datetime.date, so object dtype. If you want to keep the dtype as datetime64 then you can just normalize:
df['normalised_date'] = df['dates'].dt.normalize()
This sets the time component to midnight, i.e. 00:00:00, but the display shows just the date value.
pandas.Series.dt
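For what it's worth, here is a small sketch of the difference between the two approaches. The sample timestamps are made up for illustration; the column names follow the snippets above.

```python
import pandas as pd

# Hypothetical sample data with a non-midnight time component.
df = pd.DataFrame({
    'dates': pd.to_datetime(['2013-01-01 10:30:00', '2013-01-02 23:59:59'])
})

# .dt.date yields Python datetime.date objects, so the column dtype is object.
df['just_date'] = df['dates'].dt.date

# .dt.normalize() keeps a datetime64 dtype and resets the time to midnight.
df['normalised_date'] = df['dates'].dt.normalize()

print(df.dtypes)
```

The object column is fine for display, but the normalized column is probably the one you want if you still need datetime operations on it afterwards.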
Simple Solution:
df['date_only'] = df['date_time_column'].dt.date
Good morning, I have been struggling with converting a pandas dataframe column from Object type to Datetime. The dates in the column are in the following format:
Thursday, Mar 9
I have tried several variations of the following code:
df['date']=pd.to_datetime(df['date'], format='%A, %b %d')
Every variation of the above code returns an error saying:
time data 'Thursday, Mar 9' does not match format '%A,%b %d'
I feel like this should be simple but I'm banging my head against the wall trying to get over this. Thanks in advance for any help
The format in your error isn't the same as the one in the line above: there is no space between `%A,` and `%b`. Are you sure there isn't a typo in your original format? This conversion isn't really going to work without some year information, though: with no year in the string, the parsed dates default to 1900.
x = ['Thursday, Mar 9', 'Wednesday, Mar 10']
pd.to_datetime(x, format='%A, %b %d')
DatetimeIndex(['1900-03-09', '1900-03-10'], dtype='datetime64[ns]', freq=None)
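If you know which year the dates belong to, one possible workaround is to append it to each string before parsing. A minimal sketch, assuming the year is 2023 (that value and the sample strings are my assumptions, they are not in the original data):

```python
import pandas as pd

# Hypothetical sample in the same 'Weekday, Mon D' layout as the question.
dates = pd.Series(['Thursday, Mar 9', 'Friday, Mar 10'])

# Assumed year; the source strings carry no year information.
assumed_year = 2023

# Append the year to every string and extend the format with %Y.
parsed = pd.to_datetime(dates + f' {assumed_year}', format='%A, %b %d %Y')
print(parsed)
```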
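I could be wrong, but %d is documented as
%d Day of the month as a zero-padded decimal number.
so I wondered whether the 9 needs to be 09. The zero padding describes formatted output, though; when parsing, %d also accepts a single digit, as the 'Mar 9' example above shows.
There is also the %-d option
%-d Day of the month as a decimal number. (Platform specific)
but as far as I know that one is meant for formatting output and is not accepted in a parsing format.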
More info about time formats here
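A quick way to test this is to parse the same string with and without the space after the comma. This is only a sketch on a one-element list, and the variable names are mine; it suggests that the unpadded 9 is accepted and that the missing space is what triggers the error.

```python
import pandas as pd

x = ['Thursday, Mar 9']

# With the space after the comma, the unpadded day parses fine.
ok = pd.to_datetime(x, format='%A, %b %d')
print(ok)

# Without the space ('%A,%b %d'), the string no longer matches the format.
try:
    pd.to_datetime(x, format='%A,%b %d')
    mismatch_raised = False
except ValueError as exc:
    mismatch_raised = True
    print(exc)
```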
Essentially equivalent to @waitingkuo, but I would use pd.to_datetime here (it seems a little cleaner, and offers some additional functionality e.g. dayfirst):
In [11]: df
Out[11]:
   a        time
0  1  2013-01-01
1  2  2013-01-02
2  3  2013-01-03

In [12]: pd.to_datetime(df['time'])
Out[12]:
0   2013-01-01 00:00:00
1   2013-01-02 00:00:00
2   2013-01-03 00:00:00
Name: time, dtype: datetime64[ns]

In [13]: df['time'] = pd.to_datetime(df['time'])

In [14]: df
Out[14]:
   a                time
0  1 2013-01-01 00:00:00
1  2 2013-01-02 00:00:00
2  3 2013-01-03 00:00:00
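As a small illustration of the dayfirst option mentioned above, here is a sketch with a made-up ambiguous date string (the sample value is mine, not from the question):

```python
import pandas as pd

# '01/02/2013' could be January 2nd or the 1st of February.
ambiguous = pd.Series(['01/02/2013'])

# Default behaviour: the first number is treated as the month.
month_first = pd.to_datetime(ambiguous)

# With dayfirst=True the first number is treated as the day.
day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first.iloc[0], day_first.iloc[0])
```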
Handling ValueErrors
If running
df['time'] = pd.to_datetime(df['time'])
raises
ValueError: Unknown string format
then the column contains invalid (non-coercible) values. If you are okay with having them converted to pd.NaT, you can add an errors='coerce' argument to to_datetime:
df['time'] = pd.to_datetime(df['time'], errors='coerce')
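To see what the coercion does in practice, here is a minimal sketch with one deliberately bad value (the sample data is invented for illustration):

```python
import pandas as pd

# Hypothetical column containing one value that is not a date.
df = pd.DataFrame({'time': ['2013-01-01', 'not a date', '2013-01-03']})

# Unparseable entries become NaT instead of raising.
df['time'] = pd.to_datetime(df['time'], errors='coerce')

# Count how many values were coerced to NaT so they are not lost silently.
n_bad = df['time'].isna().sum()
print(df)
print(n_bad)
```

Checking the NaT count afterwards is a cheap way to notice how much of the column failed to parse.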
Use astype
In [31]: df
Out[31]:
   a        time
0  1  2013-01-01
1  2  2013-01-02
2  3  2013-01-03

In [32]: df['time'] = df['time'].astype('datetime64[ns]')

In [33]: df
Out[33]:
   a                time
0  1 2013-01-01 00:00:00
1  2 2013-01-02 00:00:00
2  3 2013-01-03 00:00:00
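The same session as a self-contained script, in case you want to run it outside IPython. The frame is rebuilt from the values shown above; note that, as far as I know, astype has no errors='coerce' option, so this route only suits columns where every value parses.

```python
import pandas as pd

# Rebuild the example frame with the dates stored as plain strings.
df = pd.DataFrame({
    'a': [1, 2, 3],
    'time': ['2013-01-01', '2013-01-02', '2013-01-03'],
})
print(df['time'].dtype)  # dtype before conversion

# Cast the string column to datetime64 in one step.
df['time'] = df['time'].astype('datetime64[ns]')
print(df['time'].dtype)  # dtype after conversion
```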