For this kind of regression, I usually convert the dates or timestamps to a number of days since the start of the data. The result is a float, so timestamps with a time of day give fractional days.

This does the trick nicely:

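import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv('test.csv')
df['date'] = pd.to_datetime(df['date'])
# Days elapsed since the earliest date in the data, as a float
df['date_delta'] = (df['date'] - df['date'].min()) / np.timedelta64(1, 'D')
city_data = df[df['city'] == 'London']
result = smf.ols(formula='sales ~ date_delta', data=city_data).fit()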

The advantage of this method is that you're sure of the units involved in the regression (days), whereas an automatic conversion may implicitly use other units, creating confusing coefficients in your linear model. It also lets you combine data from multiple sales campaigns that started at different times into one regression (say you're interested in the effectiveness of a campaign as a function of days into the campaign). You could also pick Jan 1st as your 0 if you're interested in measuring the day-of-year trend. Picking your own 0 date puts you in control of all that.
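As a rough sketch of the campaign idea, the zero date can be set per group instead of globally. The column names `campaign`, `date`, and `sales` and the sample values below are made up for illustration; they are not from the original question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two campaigns that started on different dates
df = pd.DataFrame({
    'campaign': ['A', 'A', 'A', 'B', 'B'],
    'date': pd.to_datetime(['2014-01-01', '2014-01-03', '2014-01-08',
                            '2014-06-10', '2014-06-12']),
    'sales': [10, 12, 15, 7, 9],
})

# Use each campaign's own first date as day 0
start = df.groupby('campaign')['date'].transform('min')
df['days_into_campaign'] = (df['date'] - start) / np.timedelta64(1, 'D')

# Alternatively, use Jan 1st of each row's year as day 0 (day-of-year trend)
df['day_of_year'] = df['date'].dt.dayofyear - 1
```

Either column can then stand in for `date_delta` in the regression formula, depending on which zero point matches the question you are asking.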

statsmodels also appears to support time series from pandas directly. You may be able to apply this to linear models as well: http://statsmodels.sourceforge.net/stable/examples/generated/ex_dates.html

Also, a quick note: You should be able to read column names directly out of the csv automatically as in the sample code I posted. In your example I see there are spaces between the commas in the first line of the csv file, resulting in column names like ' date'. Remove the spaces and automatic csv header reading should just work.
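If editing the file is not convenient, the same effect can probably be had at load time. The sketch below uses an in-memory string as a stand-in for the csv file (its contents are invented for illustration) and shows two options: the `skipinitialspace` argument of `pd.read_csv`, and stripping the column names after loading.

```python
import io
import pandas as pd

# Hypothetical file contents with a space after each comma in the header
raw = "city, date, sales\nLondon, 2014-01-01, 10\nLondon, 2014-01-02, 12\n"

# Default parsing keeps the leading space in the column names
df_raw = pd.read_csv(io.StringIO(raw))

# Option 1: let the parser skip whitespace after each delimiter
df_skip = pd.read_csv(io.StringIO(raw), skipinitialspace=True)

# Option 2: strip the names after loading
df_strip = df_raw.rename(columns=str.strip)
```

Option 1 also removes the leading spaces from the data values, which helps when `city` is compared against a string such as 'London'.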

Answer from Tom Q. on Stack Overflow

Answer 2 of 4 (3 votes)

Get the date as a floating-point year

I prefer a date format that can be understood without context, hence the floating-point year representation. The nice thing is that the solution works at the NumPy level, so it should be fast.

import numpy as np
import pandas as pd

def dt64_to_float(dt64):
    """Converts numpy.datetime64 to year as float.

    Time of day is truncated, so the result has day resolution.

    Parameters
    ----------
    dt64 : np.datetime64 or np.ndarray(dtype='datetime64[X]')
        date data

    Returns
    -------
    float or np.ndarray(dtype=float)
        Year in floating point representation
    """

    year = dt64.astype('M8[Y]')
    # print('year:', year)
    days = (dt64 - year).astype('timedelta64[D]')
    # print('days:', days)
    year_next = year + np.timedelta64(1, 'Y')
    # print('year_next:', year_next)
    days_of_year = (year_next.astype('M8[D]') - year.astype('M8[D]')
                    ).astype('timedelta64[D]')
    # print('days_of_year:', days_of_year)
    dt_float = 1970 + year.astype(float) + days / (days_of_year)
    # print('dt_float:', dt_float)
    return dt_float

if __name__ == "__main__":

    dates = np.array([
        '1970-01-01', '2014-01-01', '2020-12-31', '2019-12-31', '2010-04-28'],
        dtype='datetime64[D]')

    df = pd.DataFrame({
        'date': dates,
        'number': np.arange(5)
        })

    df['date_float'] = dt64_to_float(df['date'].to_numpy())
    print('df:', df, sep='\n')
    print()

    dt64 = np.datetime64('2011-11-11')
    print('dt64:', dt64_to_float(dt64))

Output:

df:
        date  number   date_float
0 1970-01-01       0  1970.000000
1 2014-01-01       1  2014.000000
2 2020-12-31       2  2020.997268
3 2019-12-31       3  2019.997260
4 2010-04-28       4  2010.320548

dt64: 2011.8602739726027
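
For comparison, roughly the same float-year value can be computed with the pandas `.dt` accessors instead of NumPy unit casts. This is only a sketch of an alternative, not the method used above; the helper name `year_fraction` is made up, and like the function above it ignores the time of day.

```python
import numpy as np
import pandas as pd

def year_fraction(dates):
    """Return a float year for a datetime Series (day resolution)."""
    # 366 days in leap years, 365 otherwise
    days_in_year = np.where(dates.dt.is_leap_year, 366, 365)
    # dayofyear starts at 1, so subtract 1 to make Jan 1st map to .0
    return dates.dt.year + (dates.dt.dayofyear - 1) / days_in_year

dates = pd.Series(pd.to_datetime(['1970-01-01', '2020-12-31', '2010-04-28']))
date_float = year_fraction(dates)
```

This version takes a Series rather than a raw array, which may be more convenient inside a DataFrame pipeline, though it has not been benchmarked against the NumPy version.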