The del statement does not delete an instance; it merely deletes a name.
When you do del i, you are deleting just the name i, but the instance may still be bound to some other name, so it won't be garbage-collected.
If you want to release memory, your DataFrames have to be garbage-collected, i.e. you must delete all references to them.
If you created your dataframes dynamically and appended them to a list, then deleting that list will trigger garbage collection.
>>> lst = [pd.DataFrame(), pd.DataFrame(), pd.DataFrame()]
>>> del lst # memory is released
If you created some variables, you have to delete them all.
>>> a, b, c = pd.DataFrame(), pd.DataFrame(), pd.DataFrame()
>>> lst = [a, b, c]
>>> del a, b, c # dfs still in list
>>> del lst # memory release now
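The names-versus-objects point above can be observed directly. Here is a minimal sketch using a plain stand-in class instead of a real DataFrame, with weakref used only to check when the object actually dies (the immediate deallocation on the last del is CPython's reference-counting behaviour):

```python
import weakref

class Big:
    """Stand-in for a large object such as a DataFrame."""

a = Big()
lst = [a]
ref = weakref.ref(a)   # weak reference: does not keep the object alive
del a                  # deletes only the name; lst[0] still holds the object
assert ref() is not None
del lst                # last strong reference gone
assert ref() is None   # CPython frees the object once its refcount hits 0
```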
(Answer from pacholik on Stack Overflow.)
In Python, automatic garbage collection deallocates objects that are no longer referenced (a pandas DataFrame is just another object in this respect). There are different garbage-collection strategies that can be tweaked, though doing so requires significant learning.
You can manually trigger the garbage collection using
import gc
gc.collect()
But frequent calls to garbage collection are discouraged, as collection is a costly operation and may affect performance.
I have a dataframe containing around 2M rows and 6 columns. Based on 3 of those columns I want to delete certain rows. ATM my code looks like this:
df = df.drop(df[(df.X == x) & (df.Y == y) & (df.Z == Z)].index)
Unsurprisingly this isn't really fast, however, I couldn't find a way to do it faster.
PS: It's not that it takes ages, just 1 or 2 seconds, but I have to do it 30-40 times each run, so it adds up.
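One common speed-up here, shown as a sketch on a toy three-row frame standing in for the 2M-row original, is to keep the non-matching rows with a single boolean mask instead of first materialising the matching index and then dropping it:

```python
import pandas as pd

# toy frame standing in for the 2M-row original
df = pd.DataFrame({
    "X": [1, 1, 2],
    "Y": [1, 2, 2],
    "Z": [1, 1, 2],
})
x, y, z = 1, 1, 1

# drop-by-index version from the question
dropped = df.drop(df[(df.X == x) & (df.Y == y) & (df.Z == z)].index)

# equivalent result via boolean indexing: keep everything that does NOT match
kept = df[~((df.X == x) & (df.Y == y) & (df.Z == z))]

assert dropped.equals(kept)
```

This avoids building the intermediate index and the index-alignment work inside drop, which tends to matter when the operation is repeated 30-40 times per run.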
I have several big CSV files. I want to extract the column "item id" from each of them, combine them all, and keep only the unique values.
My code is as follows:
list_df = []
for csv_file in folder:
    df = pd.read_csv(csv_file)
    list_df.append(df['item id'])
df_all_itemNo = pd.concat(list_df, ignore_index=True)
df_all_itemNo = df_all_itemNo.drop_duplicates()
It works when there are only a few CSV files. The problem is that when several big CSVs are read, all of my computer's memory is used up.
From the memory usage graph, I can see that memory keeps increasing. It is never released each time
df = pd.read_csv(csv_file) is executed; the old df stays stuck in memory.
Are there any solutions?
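One possible approach, shown as a sketch (the function name, the column name "item id", and the chunk size are illustrative assumptions): read only the one needed column via usecols, stream each file in chunks via chunksize, and accumulate values in a set, so no full-file DataFrame is ever kept alive between iterations:

```python
import pandas as pd

def unique_item_ids(csv_paths, column="item id", chunksize=100_000):
    """Collect the unique values of one column across many CSVs
    without holding any full file's DataFrame in memory."""
    seen = set()
    for path in csv_paths:
        # usecols limits parsing to the one needed column;
        # chunksize streams the file in pieces instead of loading it whole
        for chunk in pd.read_csv(path, usecols=[column], chunksize=chunksize):
            seen.update(chunk[column])
        # each chunk is dropped as the loop advances, so per-file memory
        # is reclaimed before the next file is opened
    return pd.Series(sorted(seen), name=column)
```

Because only a set of scalar values survives each file, peak memory is bounded by one chunk plus the set of unique ids, rather than by the sum of all files.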