pandas memory_usage - Brave Search

How to estimate how much memory a Pandas' DataFrame will need?

stackoverflow.com › questions › 18089667 › how-to-estimate-how-much-memory-a-pandas-dataframe-will-need

df.memory_usage() will return how many bytes each column occupies:

>>> df.memory_usage()

Row_ID            20906600
Household_ID      20906600
Vehicle           20906600
Calendar_Year     20906600
Model_Year        20906600
...

The values are in units of bytes.

To include the index, pass index=True (this is the default in current pandas versions).

So to get overall memory consumption:

>>> df.memory_usage(index=True).sum()
731731000

As before, the value is in units of bytes.
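A minimal sketch of how the index contributes to the total, using a small made-up frame (the column names `a` and `b` are illustrative only, not from the answer above):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame: two int64 columns of 100 rows each.
df = pd.DataFrame({
    "a": np.arange(100, dtype="int64"),
    "b": np.arange(100, dtype="int64"),
})

with_index = df.memory_usage(index=True).sum()      # columns plus the index
without_index = df.memory_usage(index=False).sum()  # columns only

print(without_index)               # 2 columns * 100 rows * 8 bytes
print(with_index - without_index)  # bytes attributed to the index
```

The exact size reported for the index depends on the index type and the pandas version, so only the column total is predictable here.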

Also, passing deep=True gives a more accurate memory usage report that accounts for the full usage of the contained objects.

With deep=False (the default), the report does not include memory consumed by elements that are not components of the array, such as the Python objects that an object column points to.
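As a rough illustration of the difference, the sketch below compares the shallow and deep reports on a made-up frame with one integer column and one object column of Python strings (the names `n` and `s` are assumptions for the example):

```python
import pandas as pd

# Hypothetical toy frame: an int64 column and an object column of strings.
df = pd.DataFrame({
    "n": pd.Series(range(1000), dtype="int64"),
    "s": pd.Series(["x" * 50] * 1000, dtype=object),
})

shallow = df.memory_usage(index=False)          # deep=False is the default
deep = df.memory_usage(index=False, deep=True)  # also sizes the string objects

print(shallow)
print(deep)
```

The integer column reports the same size either way, while the object column grows under deep=True because the strings themselves are counted rather than only the pointers to them.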

Answer from Oleksiy Syvokon on Stack Overflow

pandas.pydata.org › docs › reference › api › pandas.DataFrame.memory_usage.html

pandas.DataFrame.memory_usage — pandas 3.0.2 documentation

Return the memory usage of each column in bytes.

pandas.pydata.org › pandas-docs › stable › reference › api › pandas.DataFrame.memory_usage.html

pandas.DataFrame.memory_usage — pandas 3.0.1 documentation

Return the memory usage of each column in bytes.

Videos

How to process large dataset with pandas | Avoid out of memory ...

December 12, 2022

Three ways to optimize your Pandas data frame's memory footprint ...

November 8, 2022

Pandas Memory Optimization Tips - YouTube

Stop wasting memory in your Pandas DataFrame! - YouTube

070-Pandas Memory Usage - YouTube

December 1, 2020

How to Reduce Memory Usage and Loading Time of a Pandas DataFrame ...

January 26, 2020

stackoverflow.com › questions › 18089667 › how-to-estimate-how-much-memory-a-pandas-dataframe-will-need

python - How to estimate how much memory a Pandas' DataFrame will need? - Stack Overflow

Here's a comparison of the different methods - sys.getsizeof(df) is simplest.

For this example, df is a dataframe with 814 rows and 11 columns (2 ints, 9 objects), read from a 427 kB shapefile.

sys.getsizeof(df)

>>> import sys
>>> sys.getsizeof(df)  # result in bytes
462456

df.memory_usage()

>>> df.memory_usage()  # lists each column at 8 bytes/row
...

>>> df.memory_usage().sum()  # roughly rows * cols * 8 bytes, plus the index
71712

>>> df.memory_usage(deep=True)  # lists each column's full memory usage

>>> df.memory_usage(deep=True).sum()  # result in bytes
462432

df.info()

Prints dataframe info to stdout. Technically the units are kibibytes (KiB), not kilobytes: as the docstring says, "Memory usage is shown in human-readable units (base-2 representation)." To get bytes, multiply by 1024, e.g. 451.6 KiB ≈ 462,438 bytes.

>>> df.info()
...
memory usage: 70.0+ KB

>>> df.info(memory_usage='deep')
...
memory usage: 451.6 KB
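The comparison above can be reproduced on any frame. The following is a sketch on a small made-up frame, not the shapefile from the answer; the column names are hypothetical and the numbers will differ from those quoted:

```python
import io
import sys

import pandas as pd

# Hypothetical stand-in data: one int column and one object (string) column.
df = pd.DataFrame({
    "id": pd.Series(range(500), dtype="int64"),
    "name": pd.Series([f"name_{i}" for i in range(500)], dtype=object),
})

shallow_total = int(df.memory_usage().sum())        # pointers only for objects
deep_total = int(df.memory_usage(deep=True).sum())  # includes the strings
getsizeof_total = sys.getsizeof(df)                 # deep total plus a small overhead

# df.info() prints to stdout by default; capture it in a buffer instead.
buf = io.StringIO()
df.info(buf=buf, memory_usage="deep")
info_line = buf.getvalue().strip().splitlines()[-1]

print(shallow_total, deep_total, getsizeof_total)
print(f"{deep_total / 1024:.1f} KiB")  # base-2 units, as df.info() reports
print(info_line)
```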

pandas.pydata.org › docs › reference › api › pandas.Series.memory_usage.html

pandas.Series.memory_usage — pandas 3.0.2 documentation

>>> s = pd.Series(["a", "b"])
>>> s.values
<ArrowStringArray>
['a', 'b']
Length: 2, dtype: str
>>> s.memory_usage()
150
>>> s.memory_usage(deep=True)
150
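Series.memory_usage also accepts an index argument. A small sketch on a made-up numeric Series, where the data size is easy to predict:

```python
import numpy as np
import pandas as pd

# Hypothetical Series of ten int64 values: 10 * 8 = 80 bytes of data.
s = pd.Series(np.arange(10, dtype="int64"))

data_only = s.memory_usage(index=False)  # just the values
with_index = s.memory_usage()            # index=True is the default

print(data_only, with_index)
```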

pandas.pydata.org › docs › user_guide › gotchas.html

Frequently Asked Questions (FAQ) — pandas 3.0.2 documentation

A configuration option, display.memory_usage (see the list of options), specifies if the DataFrame memory usage will be displayed when invoking the info() method.
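A minimal sketch of the option in use, assuming a tiny made-up frame; `info_text` is a hypothetical helper that captures the output of info() so the two settings can be compared:

```python
import io

import pandas as pd

df = pd.DataFrame({"a": [1, 2, 3]})


def info_text(frame):
    """Return the text that frame.info() would print (hypothetical helper)."""
    buf = io.StringIO()
    frame.info(buf=buf)
    return buf.getvalue()


default_text = info_text(df)

# Temporarily turn the memory usage line off; the setting is restored on exit.
with pd.option_context("display.memory_usage", False):
    hidden_text = info_text(df)

print(default_text)
print(hidden_text)
```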

w3schools.com › python › pandas › ref_df_memory_usage.asp

Pandas DataFrame memory_usage() Method

The memory_usage() method returns a Series that contains the memory usage of each column....

geeksforgeeks.org › pandas-memory-management

Pandas Memory Management - GeeksforGeeks

April 2, 2025 - Python is a great language for ... analyzing data much easier. The Pandas dataframe.memory_usage() function returns the memory usage of each column in byte...

pythonspeed.com › articles › estimating-pandas-memory

Don’t bother trying to estimate Pandas memory usage

February 1, 2023 - Can an uncompressed Parquet file help us estimate the memory usage? For numeric data, we might actually reasonably expect a 1-to-1 ratio. For strings… not necessarily. Let’s create an uncompressed Parquet with a string column; the strings look like "A B A D E":

    from itertools import product
    import pandas as pd

    alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    data = {
        "strings": [
            " ".join(values) for values in product(*([alphabet] * 5))
        ]
    }
    df = pd.DataFrame(data)
    print("{:.1f}".format(df.memory_usage(deep=True).sum() / (1024 * 1024)))
    df.to_parquet("strings.parquet", compression=None)


geeksforgeeks.org › python › python-pandas-dataframe-memory_usage

Python | Pandas dataframe.memory_usage() - GeeksforGeeks

June 22, 2021 - Python is a great language for ... analyzing data much easier. The Pandas dataframe.memory_usage() function returns the memory usage of each column in bytes....

medium.com › @gautamrajotya › how-to-reduce-memory-usage-in-python-pandas-158427a99001

How to reduce memory usage in Python (Pandas)? | by Gautamrajotya | Medium

October 20, 2022 - The memory_usage() method gives us the total memory being used by each column in the dataframe. It returns a Pandas series which lists the space being taken up by each column in bytes.

pandas.pydata.org › pandas-docs › stable › generated › pandas.Series.memory_usage.html

pandas.Series.memory_usage — pandas 2.2.1 documentation

The page has been moved to this page

pandas.pydata.org › pandas-docs › stable › reference › api › pandas.Series.memory_usage.html

pandas.Series.memory_usage — pandas 3.0.0 documentation

>>> s = pd.Series(["a", "b"])
>>> s.values
<ArrowStringArray>
['a', 'b']
Length: 2, dtype: str
>>> s.memory_usage()
150
>>> s.memory_usage(deep=True)
150

pandas.pydata.org › docs › user_guide › scale.html

Scaling to large datasets — pandas 3.0.2 documentation

Now we’ll implement an out-of-core pandas.Series.value_counts(). The peak memory usage of this workflow is the single largest chunk, plus a small series storing the unique value counts up to this point.
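The idea can be sketched as follows. This is not the code from the pandas guide: `iter_chunks` is a hypothetical stand-in for reading one file or chunk at a time from disk, and the column name `name` is made up for the example.

```python
import pandas as pd


def iter_chunks():
    # Hypothetical stand-in for loading one chunk at a time from disk.
    yield pd.DataFrame({"name": ["a", "b", "a"]})
    yield pd.DataFrame({"name": ["b", "c"]})
    yield pd.DataFrame({"name": ["a"]})


counts = None
for chunk in iter_chunks():
    chunk_counts = chunk["name"].value_counts()
    if counts is None:
        counts = chunk_counts
    else:
        # Align on the union of labels; a label missing on one side counts as 0.
        counts = counts.add(chunk_counts, fill_value=0)

# Addition with fill_value can produce floats, so cast back to integers.
counts = counts.astype("int64")
print(counts.sort_index())
```

Only one chunk and the running totals are held in memory at a time, which is what keeps the peak usage close to the size of the largest chunk.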

w3resource.com › pandas › series › series-memory_usage.php

Pandas Series: memory_usage() function - w3resource

Pandas Series - memory_usage() function: The memory_usage() function is used to return the memory usage of the Series.

w3resource.com › pandas › dataframe › dataframe-memory_usage.php

Pandas DataFrame: memory_usage() function - w3resource

August 19, 2022 - Pandas DataFrame - memory_usage() function: The memory_usage function is used to return the memory usage of each column in bytes.

Analytics Vidhya

analyticsvidhya.com › home › how to reduce memory usage in python (pandas)?

How to reduce memory usage in Python (Pandas)?

April 24, 2021 - The memory_usage() method gives us the total memory being used by each column in the dataframe. It returns a Pandas series which lists the space being taken up by each column in bytes.

educative.io › answers › what-is-memoryusage-in-pandas

What is memory_usage() in pandas?

Data frames are the two-dimensional ... takes up for data processing. The memory_usage() function is used to get the memory taken up by each column in bytes....

pythonspeed.com › articles › pandas-dataframe-series-memory-usage

Measuring the memory usage of a Pandas DataFrame

October 1, 2021 - A bunch of memory is being used by just importing Pandas and its dependencies, but if we focus on the memory usage of creating the series, on the right, we see it’s using… 67MB? Shouldn’t it be using 317MB?

thinhdanggroup.github.io › pandas-memory-optimization

Mastering Memory Optimization for Pandas DataFrames - ThinhDA

August 25, 2024 - The article then explains how to understand DataFrame memory usage using Pandas’ built-in functions. It delves into optimizing data types by downcasting numeric types and converting object types to more efficient ones. Efficient data loading techniques are discussed, including selective column ...
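A rough sketch of the two dtype optimizations mentioned in this snippet, downcasting a numeric column and converting a repetitive object column to a categorical. The frame and its column names are made up for illustration and are not taken from the article:

```python
import numpy as np
import pandas as pd

# Hypothetical data: small integers stored as int64, and a repetitive string column.
df = pd.DataFrame({
    "small_int": np.arange(1000, dtype="int64") % 100,
    "city": pd.Series(["Paris", "Lima", "Oslo", "Pune"] * 250, dtype=object),
})
before = int(df.memory_usage(deep=True).sum())

optimized = df.assign(
    # Pick the smallest integer dtype that can hold the values.
    small_int=pd.to_numeric(df["small_int"], downcast="integer"),
    # Store each distinct string once and keep compact integer codes per row.
    city=df["city"].astype("category"),
)
after = int(optimized.memory_usage(deep=True).sum())

print(optimized.dtypes)
print(before, after)
```

How much this saves depends on the value ranges and on how many distinct strings the column holds; a column of mostly unique strings gains little from a categorical.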

Towards Data Science

towardsdatascience.com › home › latest › pandas – save memory with these simple tricks

Pandas - Save Memory with These Simple Tricks | Towards Data Science

January 24, 2025 - Let’s start with reading the data into a Pandas DataFrame. ... The dataframe has almost 1 million rows and 13 columns. It includes historical prices of cryptocurrencies. ...

df.memory_usage()

Index               80
slug           7538376
symbol         7538376
name           7538376
date           7538376
ranknow        7538376
open           7538376
high           7538376
low            7538376
close          7538376
volume         7538376
market         7538376
close_ratio    7538376
spread         7538376
dtype: int64