There are two possible solutions:

  • Use a boolean mask, then use df.loc[mask]
  • Set the date column as a DatetimeIndex, then use df.loc[start_date : end_date]

Using a boolean mask:

Ensure df['date'] is a Series with dtype datetime64[ns]:

df['date'] = pd.to_datetime(df['date'])

Make a boolean mask. start_date and end_date can be datetime.datetimes, np.datetime64s, pd.Timestamps, or even datetime strings:

# later than the start date and no later than the end date
mask = (df['date'] > start_date) & (df['date'] <= end_date)

Select the sub-DataFrame:

df.loc[mask]

or reassign the result to df:

df = df.loc[mask]

For example,

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((200,3)))
df['date'] = pd.date_range('2000-1-1', periods=200, freq='D')
mask = (df['date'] > '2000-6-1') & (df['date'] <= '2000-6-10')
print(df.loc[mask])

yields

            0         1         2       date
153  0.208875  0.727656  0.037787 2000-06-02
154  0.750800  0.776498  0.237716 2000-06-03
155  0.812008  0.127338  0.397240 2000-06-04
156  0.639937  0.207359  0.533527 2000-06-05
157  0.416998  0.845658  0.872826 2000-06-06
158  0.440069  0.338690  0.847545 2000-06-07
159  0.202354  0.624833  0.740254 2000-06-08
160  0.465746  0.080888  0.155452 2000-06-09
161  0.858232  0.190321  0.432574 2000-06-10
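
The same selection can also be written with Series.between, which some find easier to read than two chained comparisons. This is only a sketch: it assumes a pandas version (1.3 or later) in which the inclusive parameter accepts the strings 'both', 'neither', 'left' and 'right', and the seeded random data is purely illustrative.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible
df = pd.DataFrame(rng.random((200, 3)))
df['date'] = pd.date_range('2000-1-1', periods=200, freq='D')

# inclusive='right' keeps start < date <= end, like the mask in the example above
mask = df['date'].between('2000-6-1', '2000-6-10', inclusive='right')
selected = df.loc[mask]
print(selected['date'].min(), selected['date'].max())
```

Passing inclusive='both' instead would also keep rows dated exactly 2000-06-01.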

Using a DatetimeIndex:

If you are going to do a lot of selections by date, it may be quicker to set the date column as the index first. Then you can select rows by date using df.loc[start_date:end_date].

import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random((200,3)))
df['date'] = pd.date_range('2000-1-1', periods=200, freq='D')
df = df.set_index(['date'])
print(df.loc['2000-6-1':'2000-6-10'])

yields

                   0         1         2
date                                    
2000-06-01  0.040457  0.326594  0.492136    # <- includes start_date
2000-06-02  0.279323  0.877446  0.464523
2000-06-03  0.328068  0.837669  0.608559
2000-06-04  0.107959  0.678297  0.517435
2000-06-05  0.131555  0.418380  0.025725
2000-06-06  0.999961  0.619517  0.206108
2000-06-07  0.129270  0.024533  0.154769
2000-06-08  0.441010  0.741781  0.470402
2000-06-09  0.682101  0.375660  0.009916
2000-06-10  0.754488  0.352293  0.339337

Python list slicing, e.g. seq[start:end], includes start but not end. In contrast, pandas df.loc[start_date : end_date] includes both endpoints in the result if they are in the index. Neither start_date nor end_date has to be in the index, however, provided the index is sorted.
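
To illustrate slicing with endpoints that are absent from the index, here is a small sketch. The every-other-day data and the variable names are made up for illustration, and the sort_index() call reflects the assumption that label slicing is only reliable on a sorted DatetimeIndex.

```python
import pandas as pd

# Every other day, so 2000-01-04 and 2000-01-08 are NOT in the index
idx = pd.date_range('2000-01-01', periods=10, freq='2D')
df = pd.DataFrame({'value': range(10)}, index=idx)

# Shuffle the rows, then sort by the index before slicing by label
shuffled = df.sample(frac=1, random_state=0)
ordered = shuffled.sort_index()

# Both bounds are missing from the index; rows falling between them are returned
subset = ordered.loc['2000-01-04':'2000-01-08']
print(subset)
```

Only the rows dated 2000-01-05 and 2000-01-07 fall inside the requested range.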


Also note that pd.read_csv has a parse_dates parameter which you could use to parse the date column as datetime64s. Thus, if you use parse_dates, you would not need to use df['date'] = pd.to_datetime(df['date']).
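
As a rough sketch of the parse_dates route, the snippet below reads a small in-memory CSV (the csv_text content is hypothetical and stands in for a real file) and filters it without calling pd.to_datetime:

```python
import io
import pandas as pd

# Hypothetical CSV content standing in for a real file on disk
csv_text = """date,value
2000-06-01,1
2000-06-02,2
2000-06-11,3
"""

# parse_dates asks read_csv to convert the named column while loading
df = pd.read_csv(io.StringIO(csv_text), parse_dates=['date'])
print(df.dtypes)

mask = (df['date'] > '2000-6-1') & (df['date'] <= '2000-6-10')
print(df.loc[mask])
```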

Answer from unutbu on Stack Overflow

2 of 14
138

I feel the best option is to use direct comparisons rather than the loc function:

df = df[(df['date'] > '2000-6-1') & (df['date'] <= '2000-6-10')]

It works for me.

A major issue with using loc with a slice is that the limits may need to be present in the actual index values; if they are not, and the index is not sorted, this results in a KeyError.
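
A small sketch of why direct comparisons are forgiving: the rows below are deliberately out of order and neither bound appears in the data, yet the filter still works. The column names and values are invented for illustration.

```python
import pandas as pd

# Dates deliberately out of order; neither bound used below appears in the data
df = pd.DataFrame({
    'date': pd.to_datetime(['2000-06-09', '2000-05-30', '2000-06-03', '2000-06-15']),
    'value': [1, 2, 3, 4],
})

# Element-wise comparisons do not depend on row order or on exact matches
result = df[(df['date'] > '2000-6-1') & (df['date'] <= '2000-6-10')]
print(result)
```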
