pandas datetime between

stackoverflow.com › questions › 29370057 › select-dataframe-rows-between-two-dates

There are two possible solutions:

Use a boolean mask, then use df.loc[mask]
Set the date column as a DatetimeIndex, then use df.loc[start_date:end_date]

Using a boolean mask:

Ensure df['date'] is a Series with dtype datetime64[ns]:

df['date'] = pd.to_datetime(df['date'])

Make a boolean mask. start_date and end_date can be datetime.datetimes, np.datetime64s, pd.Timestamps, or even datetime strings:

# greater than the start date and less than or equal to the end date
mask = (df['date'] > start_date) & (df['date'] <= end_date)

Select the sub-DataFrame:

df.loc[mask]

or re-assign to df

df = df.loc[mask]

For example,

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((200,3)))
df['date'] = pd.date_range('2000-1-1', periods=200, freq='D')
mask = (df['date'] > '2000-6-1') & (df['date'] <= '2000-6-10')
print(df.loc[mask])

yields

            0         1         2       date
153  0.208875  0.727656  0.037787 2000-06-02
154  0.750800  0.776498  0.237716 2000-06-03
155  0.812008  0.127338  0.397240 2000-06-04
156  0.639937  0.207359  0.533527 2000-06-05
157  0.416998  0.845658  0.872826 2000-06-06
158  0.440069  0.338690  0.847545 2000-06-07
159  0.202354  0.624833  0.740254 2000-06-08
160  0.465746  0.080888  0.155452 2000-06-09
161  0.858232  0.190321  0.432574 2000-06-10

Using a DatetimeIndex:

If you are going to do a lot of selections by date, it may be quicker to set the date column as the index first. Then you can select rows by date using df.loc[start_date:end_date].

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((200,3)))
df['date'] = pd.date_range('2000-1-1', periods=200, freq='D')
df = df.set_index(['date'])
print(df.loc['2000-6-1':'2000-6-10'])

yields

                   0         1         2
date                                    
2000-06-01  0.040457  0.326594  0.492136    # <- includes start_date
2000-06-02  0.279323  0.877446  0.464523
2000-06-03  0.328068  0.837669  0.608559
2000-06-04  0.107959  0.678297  0.517435
2000-06-05  0.131555  0.418380  0.025725
2000-06-06  0.999961  0.619517  0.206108
2000-06-07  0.129270  0.024533  0.154769
2000-06-08  0.441010  0.741781  0.470402
2000-06-09  0.682101  0.375660  0.009916
2000-06-10  0.754488  0.352293  0.339337

Python list indexing, e.g. seq[start:end], includes start but not end. In contrast, pandas df.loc[start_date:end_date] includes both endpoints in the result if they are in the index. Neither start_date nor end_date has to be in the index, however, as long as the index is sorted.
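
As a minimal sketch of the endpoint behaviour described above, the following uses a made-up sorted index with gaps (the column name value and the dates are assumptions for illustration, not part of the original answer):

```python
import pandas as pd

# Hypothetical sorted DatetimeIndex with gaps: only every third day is present
# (Jan 1, 4, 7, 10, 13, ...).
idx = pd.date_range("2000-01-01", periods=10, freq="3D")
df = pd.DataFrame({"value": range(10)}, index=idx)

# Neither bound exists in the index; the slice still works because the
# index is sorted, and it returns every row that falls inside the range.
subset = df.loc["2000-01-02":"2000-01-11"]
print(subset)

# Both bounds exist in the index, and both are included in the result.
closed = df.loc["2000-01-04":"2000-01-10"]
print(closed)

# A plain Python list slice, by contrast, excludes the end position.
print(list(range(10))[1:4])
```

Both slices here select the rows for January 4, 7, and 10.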

Also note that pd.read_csv has a parse_dates parameter which you could use to parse the date column as datetime64s. Thus, if you use parse_dates, you would not need to use df['date'] = pd.to_datetime(df['date']).
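
A short sketch of the parse_dates route, assuming a small in-memory CSV (the csv_text contents and the value column are invented for illustration):

```python
import io
import pandas as pd

# Hypothetical CSV content; in practice this would be a file path.
csv_text = "date,value\n2000-06-01,1\n2000-06-02,2\n2000-06-03,3\n"

# parse_dates converts the named column while reading, so no separate
# pd.to_datetime call is needed afterwards.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(df.dtypes)

# The column can be used in a boolean mask straight away.
mask = (df["date"] > "2000-06-01") & (df["date"] <= "2000-06-03")
print(df.loc[mask])
```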

Answer from unutbu on Stack Overflow

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.between_time.html

pandas.DataFrame.between_time — pandas 3.0.3 documentation

Select values between particular times of the day (e.g., 9:00-9:30 AM) · By setting start_time to be later than end_time, you can get the times that are not between the two times
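
To illustrate the two behaviours mentioned in this snippet, here is a minimal sketch on a made-up hourly DataFrame (the data and the value column are assumptions, not taken from the pandas documentation):

```python
import pandas as pd

# Hypothetical data: one row per hour for a single day (00:00 to 23:00).
idx = pd.date_range("2023-01-01", periods=24, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"value": range(24)}, index=idx)

# Rows whose time of day falls between 09:00 and 12:00.
morning = df.between_time("09:00", "12:00")
print(morning)

# With start_time later than end_time, the selection wraps around midnight,
# i.e. it returns the times that are NOT between 02:00 and 22:00.
overnight = df.between_time("22:00", "02:00")
print(overnight)
```

Note that between_time filters on the time of day only and requires a DatetimeIndex; it does not select a range of calendar dates.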

Stack Overflow

stackoverflow.com › questions › 29370057 › select-dataframe-rows-between-two-dates

python - Select DataFrame rows between two dates - Stack Overflow

Top answer

1 of 14

719

2 of 14

138

I feel the best option is to use direct comparisons rather than the loc function:

df = df[(df['date'] > '2000-6-1') & (df['date'] <= '2000-6-10')]

It works for me.

A major issue with using loc with a slice is that, when the index is not sorted, the limits must be present in the index; otherwise this results in a KeyError.
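
One way to sidestep the concern about slice limits, sketched under the assumption of a small unsorted index (the dates and the value column are made up), is either to compare against the index directly or to sort the index before slicing:

```python
import pandas as pd

# Hypothetical DataFrame whose DatetimeIndex is NOT sorted.
dates = pd.to_datetime(["2000-06-09", "2000-06-02", "2000-06-20", "2000-06-05"])
df = pd.DataFrame({"value": [1, 2, 3, 4]}, index=dates)

# Option 1: a boolean mask on the index works regardless of ordering
# and does not need the limits to be present.
mask = (df.index > "2000-06-01") & (df.index <= "2000-06-10")
by_mask = df[mask]
print(by_mask)

# Option 2: sort the index first, then slice with limits that need not exist.
by_slice = df.sort_index().loc["2000-06-01":"2000-06-10"]
print(by_slice)
```

The mask keeps the original row order, while the sorted slice returns rows in date order.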

Pandas

pandas.pydata.org › docs › user_guide › timeseries.html

Time series / date functionality — pandas 3.0.3 documentation

pandas allows you to capture both representations and convert between them. Under the hood, pandas represents timestamps using instances of Timestamp and sequences of timestamps using instances of DatetimeIndex. For regular time spans, pandas uses Period objects for scalar values and PeriodIndex ...

Pandas

pandas.pydata.org › docs › reference › api › pandas.DatetimeIndex.indexer_between_time.html

pandas.DatetimeIndex.indexer_between_time — pandas 3.0.3 documentation

Select values between particular times of day. ...
>>> idx = pd.date_range("2023-01-01", periods=4, freq="h")
>>> idx
DatetimeIndex(['2023-01-01 00:00:00', '2023-01-01 01:00:00',
               '2023-01-01 02:00:00', '2023-01-01 03:00:00'],
              dtype='datetime64[us]', freq='h')
>>> idx.indexer_between_time("00:00", "2:00", include_end=False)
array([0, 1])
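
Because indexer_between_time returns integer positions rather than rows, a typical next step is to pass them to iloc. A minimal sketch, assuming a hypothetical DataFrame built on the same four-hour index:

```python
import pandas as pd

# Same four hourly timestamps as in the snippet above; the DataFrame and
# its value column are assumptions added for illustration.
idx = pd.date_range("2023-01-01", periods=4, freq=pd.Timedelta(hours=1))
df = pd.DataFrame({"value": [10, 20, 30, 40]}, index=idx)

# Integer positions of the entries between 00:00 and 02:00, end excluded.
positions = idx.indexer_between_time("00:00", "2:00", include_end=False)
print(positions)

# Use the positions to pull the matching rows.
subset = df.iloc[positions]
print(subset)
```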

Statology

statology.org › home › pandas: how to select rows between two dates

Pandas: How to Select Rows Between Two Dates

August 23, 2022 - Note that if your date column is ... it to a datetime format: ... Once you’ve done this, you can proceed to use the between() function to select rows between specific dates. The following tutorials explain how to perform other common operations ...
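
A rough sketch of the between() approach this result refers to, using made-up data; the inclusive argument shown here takes string values in pandas 1.3 and later, which is an assumption about the installed version:

```python
import pandas as pd

# Hypothetical data: five consecutive days starting 2000-06-01.
df = pd.DataFrame({
    "date": pd.date_range("2000-06-01", periods=5),
    "value": range(5),
})

# By default, Series.between includes both endpoints.
both = df[df["date"].between("2000-06-02", "2000-06-04")]
print(both)

# inclusive="left" keeps the start date but drops the end date.
left = df[df["date"].between("2000-06-02", "2000-06-04", inclusive="left")]
print(left)
```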

Statology

statology.org › home › pandas: how to calculate a difference between two dates

Pandas: How to Calculate a Difference Between Two Dates

March 24, 2022 - Note that we can replace the ‘D’ in the timedelta64() function with the following values to calculate the date difference in different units: ... The following examples show how to calculate a date difference in a pandas DataFrame in practice.
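
The unit substitution mentioned here can be sketched as follows; the column names start and end and the sample timestamps are assumptions for illustration, and only the 'D' and 'h' units are shown:

```python
import numpy as np
import pandas as pd

# Hypothetical start and end timestamps.
df = pd.DataFrame({
    "start": pd.to_datetime(["2022-01-01 00:00", "2022-01-10 00:00"]),
    "end": pd.to_datetime(["2022-01-08 00:00", "2022-01-10 12:00"]),
})

# Subtracting two datetime columns gives a timedelta column.
delta = df["end"] - df["start"]

# Dividing by a one-unit timedelta64 expresses the difference in that unit.
df["days"] = delta / np.timedelta64(1, "D")
df["hours"] = delta / np.timedelta64(1, "h")
print(df)
```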

Pandas

pandas.pydata.org › pandas-docs › version › 0.23 › generated › pandas.DataFrame.between_time.html

pandas.DataFrame.between_time — pandas 0.23.1 documentation

>>> ts.between_time('0:15', '0:45')
                     A
2018-04-10 00:20:00  2
2018-04-11 00:40:00  3

Saturn Cloud

saturncloud.io › blog › how-to-get-the-number-of-days-between-two-dates-using-pandas

How to Get the Number of Days Between Two Dates Using Pandas | Saturn Cloud Blog

August 19, 2023 - We then subtract date1 from date2 to get a timedelta object representing the difference between the two dates. Finally, we extract the number of days from the timedelta object using the days attribute and store it in the num_days variable.
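
The steps described in this snippet might look like the sketch below. The names date1, date2, and num_days follow the snippet; the concrete dates and the column-wise variant are assumptions:

```python
import pandas as pd

# Hypothetical dates for illustration.
date1 = pd.Timestamp("2023-08-01")
date2 = pd.Timestamp("2023-08-19")

# Subtracting two Timestamps gives a Timedelta; .days is its whole-day part.
num_days = (date2 - date1).days
print(num_days)

# For whole columns, the same attribute is reached through the .dt accessor.
df = pd.DataFrame({
    "a": pd.to_datetime(["2023-08-01", "2023-08-10"]),
    "b": pd.to_datetime(["2023-08-19", "2023-08-11"]),
})
df["num_days"] = (df["b"] - df["a"]).dt.days
print(df)
```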

Pandas

pandas.pydata.org › docs › reference › api › pandas.date_range.html

pandas.date_range — pandas 3.0.3 documentation

Of the four parameters start, end, periods, and freq, a maximum of three can be specified at once. Of the three parameters start, end, and periods, at least two must be specified. If freq is omitted, the resulting DatetimeIndex will have periods linearly spaced elements between start and end ...
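
As a brief sketch of the parameter combinations described here (the dates and variable names are chosen arbitrarily for illustration):

```python
import pandas as pd

# start + periods + freq: three daily timestamps from the start date.
by_freq = pd.date_range(start="2000-01-01", periods=3, freq="D")
print(by_freq)

# start + end: freq defaults to daily.
by_end = pd.date_range(start="2000-01-01", end="2000-01-03")
print(by_end)

# start + end + periods with freq omitted: linearly spaced timestamps.
linear = pd.date_range(start="2000-01-01", end="2000-01-05", periods=3)
print(linear)
```

The first two calls produce the same three days, while the third yields January 1, 3, and 5.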

InterviewQs

interviewqs.com › ddi-code-snippets › select-pandas-dataframe-rows-between-two-dates

Select Pandas dataframe rows between two dates - InterviewQs

Select Pandas dataframe rows between two dates · We can perform this using a boolean mask. First, let's ensure the 'birth_date' column is in date format · df['birth_date'] = pd.to_datetime(df['birth_date']) · Next, set the desired start date and end date to filter df with; these can be in datetime (numpy and pandas), timestamp, or string format ·

GeeksforGeeks

geeksforgeeks.org › python-pandas-dataframe-between_time

Pandas dataframe.between_time() - GeeksforGeeks

November 29, 2024 - The between_time() function filters the rows with times between 09:00:00 and 12:00:00, including both the start_time and end_time by default. The resulting DataFrame only contains rows that fall within the specified time range, making it easy ...

Pandas

pandas.pydata.org › pandas-docs › version › 1.5 › reference › api › pandas.DataFrame.between_time.html

pandas.DataFrame.between_time — pandas 1.5.2 documentation

Select values between particular times of the day (e.g., 9:00-9:30 AM) · By setting start_time to be later than end_time, you can get the times that are not between the two times

GeeksforGeeks

geeksforgeeks.org › select-pandas-dataframe-rows-between-two-dates

Select Pandas dataframe rows between two dates | GeeksforGeeks

February 24, 2021 - Dates can be represented initially in several ways : ... To manipulate dates in pandas, we use the pd.to_datetime() function in pandas to convert different date representations to datetime64[ns] format.
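
A minimal sketch of converting string dates with pd.to_datetime, assuming day/month/year strings and one deliberately invalid entry (the sample values are made up):

```python
import pandas as pd

# Hypothetical raw strings in day/month/year order, plus one bad value.
raw = pd.Series(["24/02/2021", "25/02/2021", "not a date"])

# An explicit format avoids ambiguity between day-first and month-first;
# errors="coerce" turns unparseable entries into NaT instead of raising.
parsed = pd.to_datetime(raw, format="%d/%m/%Y", errors="coerce")
print(parsed)
```

Once the column is a datetime dtype, the boolean-mask and slicing approaches for selecting rows between two dates apply to it directly.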

Pandas

pandas.pydata.org › pandas-docs › stable › reference › api › pandas.Series.between_time.html

pandas.Series.between_time — pandas 2.3.3 documentation

Select values between particular times of the day (e.g., 9:00-9:30 AM) · By setting start_time to be later than end_time, you can get the times that are not between the two times

reddit.com › r/learnpython › check if value is in between two dates?

r/learnpython on Reddit: Check if value is in between two dates?

November 25, 2022

I am trying to see if the values in my Initiated Date column are between 1 and 2 weeks, again for 2 weeks to 1 month, etc.

df['Initiated Date'] = pd.to_datetime(df['Initiated Date'], format='%Y-%m-%d')

# Filter data between two dates
now = datetime.now()
one_week = now - timedelta(days=7)
two_week = now - timedelta(days=14)
filtered_df = df.loc[(df['Initiated Date'] >= one_week) & (df['Initiated Date'] < two_week)]

This method isn't working and seems overly complicated. Am I thinking about this wrong?

error: TypeError: can't compare offset-naive and offset-aware datetimes
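
This TypeError usually means one side of the comparison carries a time zone and the other does not. The sketch below assumes, hypothetically, that the column is time-zone-aware (UTC) and uses a fixed stand-in for "now" so the result is reproducible. It also orders the bounds so that two_week, the earlier date, is the lower limit; with the comparison written as in the post, no row can satisfy both conditions.

```python
import pandas as pd

# Hypothetical data: tz-aware (UTC) timestamps in the 'Initiated Date' column.
df = pd.DataFrame({
    "Initiated Date": pd.to_datetime(
        ["2022-11-05", "2022-11-14", "2022-11-16", "2022-11-24"], utc=True
    )
})

# Fixed stand-in for the current time, made tz-aware to match the column.
now = pd.Timestamp("2022-11-25", tz="UTC")
one_week = now - pd.Timedelta(days=7)    # 2022-11-18
two_week = now - pd.Timedelta(days=14)   # 2022-11-11

# Between one and two weeks ago: at or after two_week, and before one_week.
mask = (df["Initiated Date"] >= two_week) & (df["Initiated Date"] < one_week)
filtered_df = df.loc[mask]
print(filtered_df)
```

If the column is actually time-zone-naive, the same idea applies in reverse: keep now naive as well, so both sides of the comparison match.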