The best way to do this in Pandas is to use drop:
df = df.drop('column_name', axis=1)
where 1 is the axis number (0 for rows, 1 for columns).
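To make the axis argument concrete, here is a small sketch on a made-up frame (the labels "a", "b", "x" and "y" are just placeholders, not from the question):

```python
import pandas as pd

# Hypothetical sample frame; the labels are invented for illustration.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4]}, index=["x", "y"])

without_row = df.drop("x", axis=0)  # axis=0: "x" is looked up in the row index
without_col = df.drop("a", axis=1)  # axis=1: "a" is looked up in the column labels

print(without_row.index.tolist())    # ['y']
print(without_col.columns.tolist())  # ['b']
```

Neither call changes df itself; each returns a new DataFrame.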
Or, the drop() method accepts index/columns keywords as an alternative to specifying the axis. So we can now just do:
df = df.drop(columns=['column_nameA', 'column_nameB'])
- This was introduced in v0.21.0 (October 27, 2017)
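As a quick check that the two spellings do the same thing, a minimal sketch (the column "other" is a made-up placeholder):

```python
import pandas as pd

# Hypothetical frame; "other" is only there so something survives the drop.
df = pd.DataFrame({"column_nameA": [1], "column_nameB": [2], "other": [3]})

by_keyword = df.drop(columns=["column_nameA", "column_nameB"])
by_axis = df.drop(["column_nameA", "column_nameB"], axis=1)

print(by_keyword.equals(by_axis))   # True
print(by_keyword.columns.tolist())  # ['other']
```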
To delete the column without having to reassign df you can do:
df.drop('column_name', axis=1, inplace=True)
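One thing to watch with inplace=True is that the call returns None, so its result should not be assigned back to df. A small sketch on a made-up frame:

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({"column_name": [1, 2], "other": [3, 4]})

result = df.drop("column_name", axis=1, inplace=True)

print(result)               # None: the frame was modified in place
print(df.columns.tolist())  # ['other']
```

Writing df = df.drop(..., inplace=True) would therefore replace df with None.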
Finally, to drop by column number instead of by column label, look the labels up through df.columns. For example, to delete the 1st, 2nd and 4th columns:
df = df.drop(df.columns[[0, 1, 3]], axis=1) # df.columns is zero-based pd.Index
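Spelled out step by step, the positional version looks roughly like this (the five single-letter column names are placeholders):

```python
import pandas as pd

# Hypothetical frame with five columns.
df = pd.DataFrame([[1, 2, 3, 4, 5]], columns=["a", "b", "c", "d", "e"])

to_drop = df.columns[[0, 1, 3]]  # translate positions into labels
print(to_drop.tolist())          # ['a', 'b', 'd']

df = df.drop(to_drop, axis=1)
print(df.columns.tolist())       # ['c', 'e']
```

Since the positions are converted to labels before dropping, a frame with duplicate column labels would lose every column that shares one of those labels, not only the ones at the given positions.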
The same works with a list of column labels:
df.drop(['column_nameA', 'column_nameB'], axis=1, inplace=True)
Answer from LondonRob on Stack Overflow
As you've guessed, the right syntax is
del df['column_name']
It's difficult to make del df.column_name work, simply because of syntactic limitations in Python. del df[name] gets translated to df.__delitem__(name) under the covers by Python.
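A short sketch of that translation, on a made-up two-column frame:

```python
import pandas as pd

# Hypothetical frame for illustration.
df = pd.DataFrame({"column_name": [1, 2], "other": [3, 4]})

del df["column_name"]  # Python turns this into df.__delitem__("column_name")
after_del = df.columns.tolist()
print(after_del)       # ['other']

df.__delitem__("other")  # the explicit spelling of the same operation
after_dunder = df.columns.tolist()
print(after_dunder)      # []
```

Unlike drop, del always modifies the frame in place and removes one column per statement.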
df.columns = df.columns.str.replace(r"\(.*\)", "", regex=True) # parentheses contain no useful data; delete them and everything inside
df.columns = df.columns.str.replace(".", "", regex=False) # periods waste space
df.columns = df.columns.str.replace(" ", "") # spaces waste space and complicate code
df.columns = df.columns.str.replace("/", "") # forward slashes waste space and complicate code
df.columns = df.columns.str.replace("Province", "") # there are no provinces in our data; we just call them states anyway
df.columns = df.columns.str.replace("PostalCode", "") # we are calling it zip for space
df.columns = df.columns.str.replace("PotentialCustomer", "customer") # shortening to save space
df.columns = df.columns.str.replace("LineOfBusiness", "industry") # shorten to one word
df.columns = df.columns.str.replace("TotalDealRevenue", "estrev") # replace long bad description with short good one
df.columns = df.columns.str.replace("ActualRevenue", "actrev") # shorten to 6 chars
df.columns = df.columns.str.lower() # all chars lowercase for simplicity
df.drop(columns=['opportunity', 'rowchecksum', 'modifiedon', 'dealname',
'probability', 'createdon', 'owner', 'email', 'payterms', 'project',
'certificateofinsurance', 'commissionform', 'equipmentrep', 'rentalrep', 'accountsterms'])
So the output has all the columns except for the ones that I have dropped. However, when I enter
df.head(5)
I am getting output of all columns including the ones that were supposedly dropped.
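This is most likely the reassignment issue: drop returns a new DataFrame and leaves df untouched unless the result is assigned back or inplace=True is passed. A notebook cell or REPL displays the returned frame, which would explain why the first output looked right. A minimal sketch with three of the column names from the post and made-up values:

```python
import pandas as pd

# Hypothetical one-row frame; only the column names come from the post.
df = pd.DataFrame({"owner": ["a"], "email": ["b"], "customer": ["c"]})

df.drop(columns=["owner", "email"])  # result is discarded, df is unchanged
unchanged = df.columns.tolist()
print(unchanged)                     # ['owner', 'email', 'customer']

df = df.drop(columns=["owner", "email"])  # keep the result
remaining = df.columns.tolist()
print(remaining)                          # ['customer']
```

After the reassignment, df.head(5) should show only the remaining columns.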