pandas describe by group

Pandas dataframe: how to apply describe() to each group and add to new columns?

stackoverflow.com › questions › 33575587 › pandas-dataframe-how-to-apply-describe-to-each-group-and-add-to-new-columns

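There is an even shorter one:

print(df.groupby('name').describe().unstack(1))

Nothing beats a one-liner:

print(df.groupby('name').describe().reset_index().pivot(index='name', values='score', columns='level_1'))

Both lines target the long-format output of older pandas. From pandas 0.20 on, groupby().describe() already returns one row per group, so neither reshape is needed.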

Answer from Andrey Vykhodtsev on Stack Overflow
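For the "add to new columns" part of the question, a minimal sketch on a recent pandas might look like the following. The toy data mirrors the name/score example used elsewhere on this page, and the "score_" prefix is only a naming choice made here for illustration, not something taken from the answer.

```python
import pandas as pd

# Toy data: one numeric column, two groups.
df = pd.DataFrame({
    "name": ["A"] * 5 + ["B"] * 4,
    "score": [1, 2, 3, 4, 5, 2, 4, 6, 8],
})

# One row per group, one column per statistic.
stats = df.groupby("name")["score"].describe()

# Prefix the statistic names (illustrative choice) and attach them
# to every original row by matching on the group key.
with_stats = df.join(stats.add_prefix("score_"), on="name")
print(with_stats.head())
```

Joining on the group key keeps the original row count, so each row carries the statistics of its own group.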

Statology

statology.org › home › pandas: how to use describe() by group

Pandas: How to Use describe() by Group

August 30, 2022 - This tutorial explains how to use the describe() function for each group in a pandas DataFrame, including an example.

Stack Overflow

stackoverflow.com › questions › 33575587 › pandas-dataframe-how-to-apply-describe-to-each-group-and-add-to-new-columns

python - Pandas dataframe: how to apply describe() to each group and add to new columns? - Stack Overflow

Define some data

In[1]:
import pandas as pd
import io

data = """
name score
A      1
A      2
A      3
A      4
A      5
B      2
B      4
B      6
B      8
    """

df = pd.read_csv(io.StringIO(data), delimiter=r'\s+')
print(df)

Out[1]:
  name  score
0    A      1
1    A      2
2    A      3
3    A      4
4    A      5
5    B      2
6    B      4
7    B      6
8    B      8

Solution

A nice approach to this problem uses a generator expression (see footnote) to allow pd.DataFrame() to iterate over the results of groupby, and construct the summary stats dataframe on the fly:

In[2]:
df2 = pd.DataFrame(group.describe().rename(columns={'score':name}).squeeze()
                         for name, group in df.groupby('name'))

print(df2)

Out[2]:
   count  mean       std  min  25%  50%  75%  max
A      5     3  1.581139    1  2.0    3  4.0    5
B      4     5  2.581989    2  3.5    5  6.5    8

Here squeeze() drops the column dimension, converting the one-column group summary DataFrame into a Series.

Footnote: A generator expression has the form (my_function(a) for a in iterator). When the iterator yields two-element tuples, as groupby does, the form becomes (my_function(a, b) for a, b in iterator).
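As a small illustration of the two-element form, the sketch below unpacks each (key, sub-frame) pair that groupby yields. The data and the choice of mean() as the per-group function are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A"] * 5 + ["B"] * 4,
    "score": [1, 2, 3, 4, 5, 2, 4, 6, 8],
})

# groupby yields (key, sub-frame) tuples, so the generator
# expression unpacks two names per iteration.
means = dict(
    (name, float(group["score"].mean()))
    for name, group in df.groupby("name")
)
print(means)
```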

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.groupby.html

pandas.DataFrame.groupby — pandas 3.0.2 documentation

If a list or ndarray of length equal to the number of rows is passed (see the groupby user guide), the values are used as-is to determine the groups. A label or list of labels may be passed to group by the columns in self.
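A short sketch of the two ways of naming groups described here, assuming a small made-up frame: a column label, and an external array with one entry per row whose values become the group keys.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"name": ["A", "A", "B", "B"], "score": [1, 3, 2, 6]})

# Group by a column label of the frame.
by_label = df.groupby("name")["score"].mean()

# Group by an external array with one entry per row;
# the array values themselves are used as the group keys.
halves = np.array(["first", "second", "first", "second"])
by_array = df.groupby(halves)["score"].mean()

print(by_label)
print(by_array)
```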

Apache

spark.apache.org › docs › latest › api › python › reference › pyspark.pandas › api › pyspark.pandas.groupby.DataFrameGroupBy.describe.html

pyspark.pandas.groupby.DataFrameGroupBy.describe — PySpark 4.1.1 documentation

Unlike pandas, the percentiles in pandas-on-Spark are based upon approximate percentile computation because computing percentiles across a large dataset is extremely expensive. ... Summary statistics of the DataFrame provided. ...

>>> df = ps.DataFrame({'a': [1, 1, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})
>>> df
   a  b  c
0  1  4  7
1  1  5  8
2  3  6  9

Describing a DataFrame. By default only numeric fields are returned.

>>> described = df.groupby('a').describe()
>>> described.sort_index()
      b                                             c
  count mean       std  min  25%  50%  75%  max count mean       std  min  25%  50%  75%  max
a
1   2.0  4.5  0.707107  4.0  4.0  4.0  5.0  5.0   2.0  7.5  0.707107  7.0  7.0  7.0  8.0  8.0
3   1.0  6.0       NaN  6.0  6.0  6.0  6.0  6.0   1.0  9.0       NaN  9.0  9.0  9.0  9.0  9.0

Pandas

pandas.pydata.org › docs › reference › api › pandas.core.groupby.DataFrameGroupBy.describe.html

pandas.core.groupby.DataFrameGroupBy.describe — pandas 2.3.3 documentation

Describing a column from a DataFrame by accessing it as an attribute.

GeeksforGeeks

geeksforgeeks.org › pandas › python-pandas-dataframe-groupby

Pandas dataframe.groupby() Method - GeeksforGeeks

Pandas groupby() function is a powerful tool used to split a DataFrame into groups based on one or more columns, allowing for efficient data analysis and aggregation. It follows a "split-apply-combine" strategy, where data is divided into groups, ...

Published July 11, 2025

Stack Overflow

stackoverflow.com › questions › 40346436 › describe-function-with-groupby-pandas-python-3-5-1 › 40346674

Describe Function With Groupby Pandas (Python 3.5.1) - Stack Overflow

Top answer

1 of 2

You can use groupby.describe:

df.groupby('gender').describe()
Out: 
                    age  postTestScore  preTestScore
gender                                              
female count   3.000000       3.000000      3.000000
       mean   53.666667      73.666667     19.333333
       std    18.556221      18.770544     14.571662
       min    36.000000      57.000000      3.000000
       25%    44.000000      63.500000     13.500000
       50%    52.000000      70.000000     24.000000
       75%    62.500000      82.000000     27.500000
       max    73.000000      94.000000     31.000000
male   count   2.000000       2.000000      2.000000
       mean   33.000000      43.500000      3.000000
       std    12.727922      26.162951      1.414214
       min    24.000000      25.000000      2.000000
       25%    28.500000      34.250000      2.500000
       50%    33.000000      43.500000      3.000000
       75%    37.500000      52.750000      3.500000
       max    42.000000      62.000000      4.000000

2 of 2

If you want two separate outputs, you could do the following:

df[df.gender == 'male'].describe()
df[df.gender == 'female'].describe()
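If the number of groups is not known in advance, one possible variation is to build one describe() table per group in a dictionary. This is a sketch on made-up data. Only the gender and age column names are borrowed from the answer above.

```python
import pandas as pd

# Hypothetical data; only the column names follow the answer above.
df = pd.DataFrame({
    "gender": ["female", "male", "female", "male", "female"],
    "age": [36, 24, 52, 42, 73],
})

# One describe() table per group, keyed by the group label.
per_group = {label: group.describe() for label, group in df.groupby("gender")}

print(per_group["male"])
print(per_group["female"])
```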

Pandas

pandas.pydata.org › pandas-docs › version › 1.1 › reference › api › pandas.core.groupby.DataFrameGroupBy.describe.html

pandas.core.groupby.DataFrameGroupBy.describe — pandas 1.1.5 documentation

>>> df.describe(exclude=[object])
       categorical  numeric
count            3      3.0
unique           3      NaN
top              f      NaN
freq             1      NaN
mean           NaN      2.0
std            NaN      1.0
min            NaN      1.0
25%            NaN      1.5
50%            NaN      2.0
75%            NaN      2.5
max            NaN      3.0

Spark By {Examples}

sparkbyexamples.com › home › pandas › pandas get statistics for each group?

Pandas Get Statistics For Each Group? - Spark By {Examples}

June 19, 2025 - This can be done by defining your own custom aggregation function and then passing it to the agg() method within the groupby() operation. In this article, I have explained how to groupby() single and multiple columns and get counts, size, max, ...
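A custom aggregation passed to agg() could be sketched as follows. The value_range helper and the output column names are hypothetical, chosen only to illustrate mixing a user-defined function with built-in ones.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A"] * 5 + ["B"] * 4,
    "score": [1, 2, 3, 4, 5, 2, 4, 6, 8],
})


def value_range(series):
    """Spread between the largest and smallest value in a group."""
    return series.max() - series.min()


# Named aggregation: output column = (input column, function or function name).
summary = df.groupby("name").agg(
    n=("score", "count"),
    mean=("score", "mean"),
    spread=("score", value_range),
)
print(summary)
```

Unlike describe(), this returns only the statistics that were asked for, under names chosen by the caller.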

Pandas

pandas.pydata.org › pandas-docs › version › 0.18 › generated › pandas.core.groupby.DataFrameGroupBy.describe.html

pandas.core.groupby.DataFrameGroupBy.describe — pandas 0.18.1 documentation

DataFrameGroupBy.describe(percentiles=None, include=None, exclude=None)
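The percentiles parameter in this signature replaces the default 25/50/75 cut points. A minimal sketch, assuming the same toy name/score data used on this page and a recent pandas where describe() returns one row per group:

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["A"] * 5 + ["B"] * 4,
    "score": [1, 2, 3, 4, 5, 2, 4, 6, 8],
})

# Ask for the 10th, 50th and 90th percentiles instead of the defaults.
stats = df.groupby("name")["score"].describe(percentiles=[0.1, 0.5, 0.9])
print(stats)
```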

Built In

builtin.com › data-science › pandas-groupby

Pandas Groupby: 5 Methods to Know in Python | Built In

The groupby() function in Pandas splits all the records from a data set into different categories or groups, offering flexibility to analyze the data by these groups. Pandas is a widely used Python library for data analytics projects, but it ...

Pandas

pandas.pydata.org › pandas-docs › version › 0.20 › generated › pandas.core.groupby.DataFrameGroupBy.describe.html

pandas.core.groupby.DataFrameGroupBy.describe — pandas 0.20.3 documentation

DataFrameGroupBy.describe(**kwargs)

Notes

For numeric data, the result’s index will include count, mean, std, min, max as well as lower, 50 and upper percentiles. By default the lower percentile is 25 and the upper percentile is 75.

Delft Stack

delftstack.com › home › howto › python pandas › pandas groupby describe

Pandas Groupby Describe | Delft Stack

January 30, 2023 - So far, we have created a data frame; next, let’s group the data using the groupby() function and see the statistical analysis using describe().

# import pandas
import pandas as pd

# create DataFrame
df = pd.DataFrame(
    {
        "teams": ["A", ...

Pandas How To

pandashowto.com › pandas how to › data transformation › pandas groupby(): complete guide with examples • pandas how to

Pandas Groupby(): Complete Guide With Examples • Pandas How To

December 9, 2025 - The describe() method provides a complete statistical summary for each group: ... This returns count, mean, std, min, 25%, 50%, 75%, and max for each group—all in one operation. To group by multiple columns, pass a list of column names to groupby.
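Grouping by several columns can be sketched like this. The team, position and points columns are invented for the example.

```python
import pandas as pd

# Hypothetical data with two grouping columns.
df = pd.DataFrame({
    "team": ["A", "A", "A", "B", "B", "B"],
    "position": ["G", "G", "F", "G", "F", "F"],
    "points": [10, 14, 8, 6, 12, 16],
})

# Passing a list of column names produces one row per (team, position) pair.
stats = df.groupby(["team", "position"])["points"].describe()
print(stats)
```

The result is indexed by a MultiIndex, so a single group is selected with a tuple such as stats.loc[("A", "G")].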

Earth Data Science

earthdatascience.org › home

Run Calculations and Summary Statistics on Pandas Dataframes | Earth Data Science - Earth Lab

September 15, 2020 - This would run .describe() on the precipitation values for each season as a grouped dataset. In this example, the label_column on which you want to group data is seasons and the value_column that you want to summarize is precip.

# Group data by seasons and summarize precip
precip_by_season = avg_monthly_precip.groupby(["seasons"])[["precip"]].describe()
precip_by_season
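Because the value column is selected with double brackets, the result has two-level columns of the form (value column, statistic). The sketch below uses made-up precipitation values to show two ways of reaching a single statistic. The variable names follow the snippet above, and the numbers are assumptions.

```python
import pandas as pd

# Hypothetical precipitation values; names follow the snippet above.
avg_monthly_precip = pd.DataFrame({
    "seasons": ["Winter", "Winter", "Summer", "Summer"],
    "precip": [1.0, 3.0, 2.0, 6.0],
})

precip_by_season = avg_monthly_precip.groupby(["seasons"])[["precip"]].describe()

# Select one statistic through the two-level column key.
mean_precip = precip_by_season[("precip", "mean")]

# Or drop the outer level to get plain statistic names as columns.
flat = precip_by_season.droplevel(0, axis=1)
print(flat)
```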

Stack Overflow

stackoverflow.com › questions › 64919984 › how-to-use-describe-for-a-pandas-dataframe

python - How to use describe() for a pandas dataframe? - Stack Overflow

This can be achieved using the .groupby() function on the DataFrame and passing the column name to be grouped; in this case 'y'.

Pandas

pandas.pydata.org › docs › user_guide › groupby.html

Group by: split-apply-combine — pandas 3.0.1 documentation

SELECT Column1, Column2, mean(Column3), sum(Column4)
FROM SomeTable
GROUP BY Column1, Column2

We aim to make operations like this natural and easy to express using pandas.
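One way this query might be expressed in pandas is sketched below. The table contents and the output column names mean_Column3 and sum_Column4 are assumptions made for the example.

```python
import pandas as pd

# Hypothetical stand-in for SomeTable.
some_table = pd.DataFrame({
    "Column1": ["x", "x", "y", "y"],
    "Column2": ["p", "p", "p", "q"],
    "Column3": [1.0, 3.0, 5.0, 7.0],
    "Column4": [10, 20, 30, 40],
})

# GROUP BY Column1, Column2 with one aggregate per output column.
# as_index=False keeps the keys as ordinary columns, as SQL would.
result = some_table.groupby(["Column1", "Column2"], as_index=False).agg(
    mean_Column3=("Column3", "mean"),
    sum_Column4=("Column4", "sum"),
)
print(result)
```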

Towards Data Science

towardsdatascience.com › home › latest › all about pandas groupby explained with 25 examples

All About Pandas Groupby Explained with 25 Examples | Towards Data Science

January 29, 2025 - The groupby is one of the most frequently used Pandas functions in data analysis. It is used for grouping the data points (i.e. rows) based on the distinct values in the given column or columns.

GitHub

github.com › pola-rs › polars › issues › 8066

`describe` method for expressions · Issue #8066 · pola-rs/polars

April 7, 2023 - Would be great to have describe method for expressions. Use case for example is to do something like df.groupby("x").agg(pl.col("y").describe())

Author wangkev

GitHub

github.com › pandas-dev › pandas › issues › 4792

describe on a groupby · Issue #4792 · pandas-dev/pandas

September 9, 2013 -

df = pd.DataFrame({'PRICE': {pd.Timestamp('2011-01-06 10:59:05', tz=None): 24990,
                             pd.Timestamp('2011-01-06 12:43:33', tz=None): 25499,
                             pd.Timestamp('2011-01-06 12:54:09', tz=None): 25499},
                   'VOLUME': {pd.Timestamp('2011-01-06 10:59:05', tz=None): 1500000000,
                              pd.Timestamp('2011-01-06 12:43:33', tz=None): 5000000000,
                              pd.Timestamp('2011-01-06 12:54:09', tz=None): 100000000}})

In [2]: g = df.groupby('PRICE')

In [3]: df.groupby('PRICE').describe()  # expected .unstack(1)
Out[3]:
             PRICE        VOLUME
PRICE
24990 count      1  1.000000e+00
      mean   24990  1.500000e+09
      std      NaN           NaN
      min    24990  1.500000e+09
      25%    24990  1.500000e+09
      50%    24990  1.500000e+09
      75%    24990  1.500000e+09
      max    24990  1.500000e+09
25499 count      2  2.000000e+00
      mean   25499  2.550000e+09
      std        0  3.464823e+09
      min    25499  1.000000e+08
      25%    25499  1.325000e+09
      50%    25499  2.550000e+09
      75%    25499  3.775000e+09
      max    25499  5.000000e+09

Author hayd