This has been answered in the comments where it was noted that the following works:
Copydf.astype({'date': 'datetime64[ns]'})
In addition, you can set the dtype when reading in the data:
Copypd.read_csv('path/to/file.csv', parse_dates=['date'])
This has been answered in the comments where it was noted that the following works:
Copydf.astype({'date': 'datetime64[ns]'})
In addition, you can set the dtype when reading in the data:
Copypd.read_csv('path/to/file.csv', parse_dates=['date'])
datetime
Since you can't pass datetime format to astype(), it's a little primitive and it's better to use pd.to_datetime() instead.
Copydf['date'] = pd.to_datetime(df['date'])
For example, if the dates in the data are of the format %d/%m/%Y such as 01/04/2020, astype() would incorrectly parse it as 2020-01-04 whereas with pd.to_datetime(), you can pass the correct format.
If you need to convert multiple columns into datetime64 (which is often the reason astype() is used), then you can apply pd.to_datetime().
Copydf = pd.DataFrame({'date1': ['01/04/2020'], 'date2': ['02/04/2020']})
df = df.apply(pd.to_datetime, format='%d/%m/%Y')
Even with read_csv, you have some control over the format, e.g.
Copydf = pd.read_csv('file.csv', parse_dates=['date'], dayfirst=True)
date
If you want to cast into date, then you can first cast to datetime64[ns] and then use dt.date to get a column of datetime.date objects:
Copydf['date'] = pd.to_datetime(df['date']).dt.date
The column dtype will become object though (on which you can still perform vectorized operations such as adding days, comparing dates etc.), so if you plan to work on it a lot in pandas, it's more performative to use datetime64 instead. For example, adding a day is extremely fast on datetime64 columns, not so much on date columns:
Copys_dt = pd.Series(pd.date_range('1700', None, 10000, 'D'))
s_d = s_dt.dt.date
%timeit x = s_dt + pd.Timedelta(days=1)
# 344 µs ± 17.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
%timeit y = s_d + pd.Timedelta(days=1)
# 56.1 ms ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
With that being said, if you dump it into a database (such as sqlite), an object dtype column of datetime.date objects stored as a DATE type (whereas datetime64[ns] will be stored as TIMESTAMP).
Pandas datetime dtype is from numpy datetime64, so if you have pandas<2.0, you can use the following as well (since pandas 2.0, unitless datetime64 is not supported anymore). There's no date dtype (although you can perform vectorized operations on a column that holds datetime.date values).
Copydf = df.astype({'date': np.datetime64})
# or (on a little endian system)
df = df.astype({'date': '<M8'})
# (on a big endian system)
df = df.astype({'date': '>M8'})
pd.to_datetime returns a Series of datetime64 dtype, as described here:
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html
df['DATE'] = df['Date'].dt.date
or this:
df['Date'].map(datetime.datetime.date)
You can use apply function on the dataframe column to convert the necessary column to String. For example:
df['DATE'] = df['Date'].apply(lambda x: x.strftime('%Y-%m-%d'))
Make sure to import datetime module.
apply() will take each cell at a time for evaluation and apply the formatting as specified in the lambda function.