To directly answer the question's original title, "How to delete rows from a pandas DataFrame based on a conditional expression" (which I understand is not necessarily the OP's problem, but could help other users who come across this question), one way to do this is to use the drop method:
df = df.drop(some labels)
df = df.drop(df[<some boolean condition>].index)
Example
To remove all rows where column 'score' is < 50:
df = df.drop(df[df.score < 50].index)
In place version (as pointed out in comments)
df.drop(df[df.score < 50].index, inplace=True)
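As a minimal, self-contained sketch of the two variants above (the sample data and the names to_drop, kept and df_inplace are made up for illustration; only the 'score' column comes from the answer):

```python
import pandas as pd

# Hypothetical sample data; only the 'score' column name comes from the answer.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "score": [10, 50, 75, 49]})

# Index labels of the rows to delete (score below 50).
to_drop = df[df.score < 50].index

# drop returns a new DataFrame; df itself is left unchanged.
kept = df.drop(to_drop)

# In-place variant, applied to a copy so df stays available for comparison.
df_inplace = df.copy()
df_inplace.drop(to_drop, inplace=True)

print(kept)
```

Note that drop works on index labels, so if the index is not unique, every row sharing a label with a matched row is removed as well.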
Multiple conditions
(see Boolean Indexing)
The operators are:
| for or, & for and, and ~ for not. These must be grouped by using parentheses.
To remove all rows where column 'score' is < 50 and > 20:
df = df.drop(df[(df.score < 50) & (df.score > 20)].index)
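A small runnable sketch of combining conditions, using made-up scores. The parentheses matter because & and | bind more tightly than comparisons such as < in Python. The df[~in_range] line is an alternative that is not part of the answer: it keeps the complement of the mask instead of dropping by label.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"score": [10, 20, 35, 49, 50, 80]})

# Rows with 20 < score < 50; each comparison sits in its own parentheses.
in_range = (df.score < 50) & (df.score > 20)
result = df.drop(df[in_range].index)

# Alternative (not from the answer): keep the rows where the mask is False.
same_result = df[~in_range]

# "or" example: drop rows scoring below 20 or above 50.
low_or_high = (df.score < 20) | (df.score > 50)
trimmed = df.drop(df[low_or_high].index)

print(result)
print(trimmed)
```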
Answer from User on Stack Overflow
When you do len(df['column name']) you are just getting one number, namely the number of rows in the DataFrame (i.e., the length of the column itself). If you want to apply len to each element in the column, use df['column name'].map(len). So try
df[df['column name'].map(len) < 2]
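To make the difference concrete, here is a short sketch with made-up string data (the values and the names total_rows, lengths and short are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data; the column name mirrors the placeholder above.
df = pd.DataFrame({"column name": ["a", "bc", "", "def"]})

# len on the column gives a single number: the row count.
total_rows = len(df["column name"])

# map(len) applies len to each element and returns a Series of lengths.
lengths = df["column name"].map(len)

# Keep only the rows whose value is shorter than 2 characters.
short = df[lengths < 2]

print(total_rows)
print(short)
```

If the column may contain missing values, map(len) raises a TypeError on NaN; df['column name'].str.len() is a possible alternative in that case, since it returns NaN for those rows instead.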
My data contains values for dates such as 30 Feb, 31 April and so on. I just want to remove the rows containing such values.
The problematic part of dataframe looks like this:
Year Month Day Value
1989 2 30 1
1989 2 31 2
1989 3 1 5
Here Month = 2 and Day = 30 means 30 Feb (an impossible date), Month = 2 and Day = 31 means 31 Feb (also impossible), and Month = 3 and Day = 1 means 1 March (a valid date). The data extends up to 12/31/2017. Basically, each month of each year has 31 data points, and I just need to eliminate the ones that don't exist on a calendar.
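One possible approach, sketched on the three sample rows above, is to let pandas try to build a date from the Year/Month/Day columns and keep only the rows where that succeeds. This assumes the columns hold integers. The names parts, dates and valid are made up for illustration.

```python
import pandas as pd

# The sample rows from the question.
df = pd.DataFrame({
    "Year": [1989, 1989, 1989],
    "Month": [2, 2, 3],
    "Day": [30, 31, 1],
    "Value": [1, 2, 5],
})

# to_datetime can assemble dates from year/month/day columns;
# the columns are renamed to lowercase to match those unit names explicitly.
parts = df[["Year", "Month", "Day"]].rename(columns=str.lower)

# errors="coerce" turns impossible dates (e.g. 30 Feb) into NaT instead of raising.
dates = pd.to_datetime(parts, errors="coerce")

# Keep only the rows whose date exists on the calendar.
valid = df[dates.notna()]

print(valid)
```

The same mask also handles leap years, since 29 Feb is coerced to NaT only in years where it does not exist.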