Answer from Tom Q. on Stack Overflow:
For this kind of regression, I usually convert the dates or timestamps to an integer number of days since the start of the data.
This does the trick nicely:
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv('test.csv')
df['date'] = pd.to_datetime(df['date'])
df['date_delta'] = (df['date'] - df['date'].min()) / np.timedelta64(1, 'D')
city_data = df[df['city'] == 'London']
result = smf.ols(formula='sales ~ date_delta', data=city_data).fit()
The advantage of this method is that you're sure of the units involved in the regression (days), whereas an automatic conversion may implicitly use other units, creating confusing coefficients in your linear model. It also allows you to combine data from multiple sales campaigns that started at different times into your regression (say you're interested in effectiveness of a campaign as a function of days into the campaign). You could also pick Jan 1st as your 0 if you're interested in measuring the day of year trend. Picking your own 0 date puts you in control of all that.
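For instance, here is a minimal sketch (with made-up sample dates) of picking Jan 1st as the zero point, which turns the regressor into a day-of-year value:

```python
import pandas as pd

# Hypothetical data; the point is choosing your own zero date.
df = pd.DataFrame({'date': pd.to_datetime(['2020-01-01', '2020-02-15', '2021-03-01'])})

# Zero each date against Jan 1st of its own year -> day-of-year regressor
year_start = df['date'].dt.to_period('Y').dt.start_time
df['day_of_year'] = (df['date'] - year_start).dt.days
```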
statsmodels also has some support for pandas time series, which you may be able to apply to linear models as well: http://statsmodels.sourceforge.net/stable/examples/generated/ex_dates.html
Also, a quick note: You should be able to read column names directly out of the csv automatically as in the sample code I posted. In your example I see there are spaces between the commas in the first line of the csv file, resulting in column names like ' date'. Remove the spaces and automatic csv header reading should just work.
Get date as floating-point year
I prefer a date format that can be understood without context; hence the floating-point year representation.
The nice thing here is that the solution works at the numpy level, so it should be fast.
import numpy as np
import pandas as pd


def dt64_to_float(dt64):
    """Convert numpy.datetime64 to year as float, rounded to days.

    Parameters
    ----------
    dt64 : np.datetime64 or np.ndarray(dtype='datetime64[X]')
        date data

    Returns
    -------
    float or np.ndarray(dtype=float)
        Year in floating point representation
    """
    year = dt64.astype('M8[Y]')
    days = (dt64 - year).astype('timedelta64[D]')
    year_next = year + np.timedelta64(1, 'Y')
    days_of_year = (year_next.astype('M8[D]')
                    - year.astype('M8[D]')).astype('timedelta64[D]')
    dt_float = 1970 + year.astype(float) + days / days_of_year
    return dt_float


if __name__ == "__main__":
    dates = np.array([
        '1970-01-01', '2014-01-01', '2020-12-31', '2019-12-31', '2010-04-28'],
        dtype='datetime64[D]')
    df = pd.DataFrame({
        'date': dates,
        'number': np.arange(5)
    })
    df['date_float'] = dt64_to_float(df['date'].to_numpy())
    print('df:', df, sep='\n')
    print()
    dt64 = np.datetime64("2011-11-11")
    print('dt64:', dt64_to_float(dt64))
output
df:
date number date_float
0 1970-01-01 0 1970.000000
1 2014-01-01 1 2014.000000
2 2020-12-31 2 2020.997268
3 2019-12-31 3 2019.997260
4 2010-04-28 4 2010.320548
dt64: 2011.8602739726027
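If you are working in pandas anyway, roughly the same idea can be sketched with the `dt` accessor; this is an alternative reimplementation, not the function above:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(['1970-01-01', '2020-12-31']))
year = dates.dt.year
# Year boundaries, built from the year number itself
start = pd.to_datetime(year.astype(str) + '-01-01')
end = pd.to_datetime((year + 1).astype(str) + '-01-01')
# Fractional year = year + elapsed days / days in that year
float_year = year + (dates - start).dt.days / (end - start).dt.days
```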
You can use time deltas to do this more directly:
In [11]: s = pd.Series(["00:10:30"])
In [12]: s = pd.to_timedelta(s)
In [13]: s
Out[13]:
0 00:10:30
dtype: timedelta64[ns]
In [14]: s / pd.offsets.Minute(1)
Out[14]:
0 10.5
dtype: float64
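The same division works for any unit; `pd.Timedelta` strings are an equivalent way to spell the divisor:

```python
import pandas as pd

s = pd.to_timedelta(pd.Series(["00:10:30"]))
minutes = s / pd.Timedelta('1min')   # 10.5
hours = s / pd.Timedelta('1h')
```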
I would convert the string to a datetime and then use the dt accessor to access the components of the time and generate your minutes column:
In [16]:
df = pd.DataFrame({'time':['00:10:30']})
df['time'] = pd.to_datetime(df['time'])
df['minutes'] = df['time'].dt.hour * 60 + df['time'].dt.minute + df['time'].dt.second/60
df
Out[16]:
time minutes
0 2015-02-05 00:10:30 10.5
One way to do this would be to use `Series.dt.seconds` and `Series.dt.days` and multiply by a factor for the desired unit:
(Series.dt.seconds / 3600) + (Series.dt.days * 24)  # value in hours
You can divide by the timedelta you want to use as the unit:
total_days = my_timedelta / pd.Timedelta('1D')
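A self-contained sketch of that division, with assumed sample values:

```python
import pandas as pd

td = pd.Series(pd.to_timedelta(['1 days 12:00:00', '0 days 06:00:00']))
total_days = td / pd.Timedelta('1D')   # fractional days as float64
```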
Convert to Timedelta and extract the total seconds from dt.total_seconds:
df
date
0 2013-01-01
1 2013-01-02
2 2013-01-03
3 2013-01-04
4 2013-01-05
5 2013-01-06
6 2013-01-07
7 2013-01-08
8 2013-01-09
9 2013-01-10
pd.to_timedelta(df.date).dt.total_seconds()
0 1.356998e+09
1 1.357085e+09
2 1.357171e+09
3 1.357258e+09
4 1.357344e+09
5 1.357430e+09
6 1.357517e+09
7 1.357603e+09
8 1.357690e+09
9 1.357776e+09
Name: date, dtype: float64
Or, maybe, the data would be more useful presented as an int type:
pd.to_timedelta(df.date).dt.total_seconds().astype(int)
0 1356998400
1 1357084800
2 1357171200
3 1357257600
4 1357344000
5 1357430400
6 1357516800
7 1357603200
8 1357689600
9 1357776000
Name: date, dtype: int64
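As a sanity check (a sketch using the first two values above), epoch seconds like these can be turned back into datetimes with `unit='s'`:

```python
import pandas as pd

secs = pd.Series([1356998400, 1357084800])
dt = pd.to_datetime(secs, unit='s')   # back to 2013-01-01, 2013-01-02
```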
Use `astype(float)`; i.e., if you have a dataframe like
df = pd.DataFrame({'date': ['1998-03-01 00:00:01', '2001-04-01 00:00:01','1998-06-01 00:00:01','2001-08-01 00:00:01','2001-05-03 00:00:01','1994-03-01 00:00:01'] })
df['date'] = pd.to_datetime(df['date'])
df['x'] = list('abcdef')
df = df.set_index('date')
Then
df.index.values.astype(float)
array([ 8.88710401e+17, 9.86083201e+17, 8.96659201e+17,
9.96624001e+17, 9.88848001e+17, 7.62480001e+17])
pd.to_datetime(df.index.values.astype(float))
DatetimeIndex(['1998-03-01 00:00:01', '2001-04-01 00:00:01',
'1998-06-01 00:00:01', '2001-08-01 00:00:01',
'2001-05-03 00:00:01', '1994-03-01 00:00:01'],
dtype='datetime64[ns]', freq=None)
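Note that float64 cannot represent every nanosecond epoch exactly; if exactness matters, an int64 conversion is a safer sketch:

```python
import pandas as pd

idx = pd.to_datetime(['1998-03-01 00:00:01', '2001-04-01 00:00:01'])
ns = idx.astype('int64')    # exact nanoseconds since the epoch
back = pd.to_datetime(ns)   # lossless round trip
```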
You can use:
df['TradeDate'] = pd.to_datetime(df['TradeDate'], format='%Y%m%d.0')
print (df)
TradeDate
0 2010-03-29
1 2010-03-28
2 2010-03-29
But if there are some bad values, add errors='coerce' to replace them with NaT:
print (df)
TradeDate
0 20100329.0
1 20100328.0
2 20100329.0
3 20153030.0
4 yyy
df['TradeDate'] = pd.to_datetime(df['TradeDate'], format='%Y%m%d.0', errors='coerce')
print (df)
TradeDate
0 2010-03-29
1 2010-03-28
2 2010-03-29
3 NaT
4 NaT
You can use to_datetime with a custom format on a string representation of the values:
import pandas as pd
pd.to_datetime(pd.Series([20100329.0, 20100328.0, 20100329.0]).astype(str), format='%Y%m%d.0')
I am plotting time series data from a pandas dataframe using matplotlib. When I plot the data and open up the figure options window from the matplotlib figure toolbar to adjust axis scales the x-axis (datetime) is given as floats, sometimes with quite a few decimal places.
https://imgur.com/a/eq0PDyo
https://imgur.com/a/JzIbbpv
I want to be able to set my x-scale from this "Figure options" window. How do I figure out the float that corresponds to my desired datetime?
If more info is needed... I am reading a csv file and converting a "Time" column of strings to datetime using `df["Time"] = pd.to_datetime(df["Time"])`
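One way to find that float (a sketch, assuming matplotlib's default epoch) is `matplotlib.dates.date2num`, which converts a datetime to the days-since-epoch float the axis uses; `num2date` inverts it:

```python
import matplotlib.dates as mdates
import pandas as pd

ts = pd.Timestamp('2021-06-15 12:00')
x = mdates.date2num(ts)    # the float shown in the figure-options axis fields
back = mdates.num2date(x)  # round trip (returned as a tz-aware UTC datetime)
```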
When you read the excel file specify the dtype of col itime as a str:
df = pd.read_excel("test.xlsx", dtype={'itime':str})
then you will have a time column of strings looking like:
df = pd.DataFrame({'itime':['2300', '0100', '0500', '1000']})
Then specify the format and convert to time:
df['Time'] = pd.to_datetime(df['itime'], format='%H%M').dt.time
itime Time
0 2300 23:00:00
1 0100 01:00:00
2 0500 05:00:00
3 1000 10:00:00
Just an add-on to Chris's answer: if you are unable to convert because there is no leading zero, apply the following to the dataframe.
df['itime'] = df['itime'].apply(lambda x: x.zfill(4))
This is because the original format does not always have four digits, e.g. 945 instead of 0945.
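A compact sketch of the whole fix (sample values assumed), using the vectorized `str.zfill` instead of `apply`:

```python
import pandas as pd

df = pd.DataFrame({'itime': ['945', '2300']})
df['itime'] = df['itime'].str.zfill(4)                           # '945' -> '0945'
df['Time'] = pd.to_datetime(df['itime'], format='%H%M').dt.time  # 09:45:00
```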