Brave Search

GroupBy pandas DataFrame and select most common value

stackoverflow.com › questions › 15222754 › groupby-pandas-dataframe-and-select-most-common-value

Pandas >= 0.16

`pd.Series.mode` is available!

Use groupby, GroupBy.agg, and apply the pd.Series.mode function to each group:

source.groupby(['Country','City'])['Short name'].agg(pd.Series.mode)

Country  City            
Russia   Sankt-Petersburg    Spb
USA      New-York             NY
Name: Short name, dtype: object
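The answer never shows how `source` is built. Here is a minimal sketch, assuming `source` holds the first four rows of the `source2` table printed further down in this answer. That reconstruction is an assumption, not something the answer states.

```python
import pandas as pd

# Assumed reconstruction of `source`: the first four rows of the
# `source2` table shown later in the answer.
source = pd.DataFrame({
    'Country': ['USA', 'USA', 'Russia', 'USA'],
    'City': ['New-York', 'New-York', 'Sankt-Petersburg', 'New-York'],
    'Short name': ['NY', 'New', 'Spb', 'NY'],
})

# Most common 'Short name' within each (Country, City) group.
most_common = source.groupby(['Country', 'City'])['Short name'].agg(pd.Series.mode)
print(most_common)
```

With this data every group has exactly one mode, so each cell of the result is a plain string.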

If you need the result as a DataFrame, use:

source.groupby(['Country','City'])['Short name'].agg(pd.Series.mode).to_frame()

                         Short name
Country City                       
Russia  Sankt-Petersburg        Spb
USA     New-York                 NY

Series.mode always returns a Series, even when there is only one mode, which makes it work well with agg and apply when reconstructing the groupby output. In the timing below it also ran about twice as fast as the value_counts approach from the accepted answer.

# Accepted answer.
%timeit source.groupby(['Country','City']).agg(lambda x:x.value_counts().index[0])
# Proposed in this post.
%timeit source.groupby(['Country','City'])['Short name'].agg(pd.Series.mode)

5.56 ms ± 343 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
2.76 ms ± 387 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Dealing with Multiple Modes

Series.mode also does a good job when there are multiple modes:

# DataFrame.append was removed in pandas 2.0; pd.concat does the same job.
source2 = pd.concat(
    [source,
     pd.DataFrame([{'Country': 'USA', 'City': 'New-York', 'Short name': 'New'}])],
    ignore_index=True)

# Now `source2` has two modes for the
# ("USA", "New-York") group: "NY" and "New".
source2

  Country              City Short name
0     USA          New-York         NY
1     USA          New-York        New
2  Russia  Sankt-Petersburg        Spb
3     USA          New-York         NY
4     USA          New-York        New

source2.groupby(['Country','City'])['Short name'].agg(pd.Series.mode)

Country  City            
Russia   Sankt-Petersburg          Spb
USA      New-York            [NY, New]
Name: Short name, dtype: object

Or, if you want a separate row for each mode, you can use GroupBy.apply:

source2.groupby(['Country','City'])['Short name'].apply(pd.Series.mode)

Country  City               
Russia   Sankt-Petersburg  0    Spb
USA      New-York          0     NY
                           1    New
Name: Short name, dtype: object

If you don't care which mode is returned, as long as it is one of them, use a lambda that calls mode and takes the first result:

source2.groupby(['Country','City'])['Short name'].agg(
    lambda x: pd.Series.mode(x)[0])

Country  City            
Russia   Sankt-Petersburg    Spb
USA      New-York             NY
Name: Short name, dtype: object
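One case the lambda above does not cover is a group whose values are all missing: `Series.mode` drops NaN by default, so it returns an empty Series and indexing `[0]` fails. Here is a hedged sketch of a more defensive helper. The name `first_mode` and the sample data are hypothetical and not part of the original answer.

```python
import numpy as np
import pandas as pd

def first_mode(values: pd.Series):
    """Return the first mode of `values`, or NaN if there is none."""
    modes = values.mode()  # NaN is dropped by default (dropna=True)
    return modes.iat[0] if not modes.empty else np.nan

# Hypothetical data: group 'b' contains only missing values.
df = pd.DataFrame({
    'key': ['a', 'a', 'a', 'b', 'b'],
    'value': ['x', 'y', 'x', None, None],
})

result = df.groupby('key')['value'].agg(first_mode)
print(result)
```

Group `a` yields its most common value, while group `b` yields a missing value instead of raising.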

Alternatives to (not) consider

You can also use statistics.mode from the Python standard library, but...

source.groupby(['Country','City'])['Short name'].apply(statistics.mode)

Country  City            
Russia   Sankt-Petersburg    Spb
USA      New-York             NY
Name: Short name, dtype: object

...on Python versions before 3.8 it does not cope with multiple modes; a StatisticsError is raised. The docs for those versions say:

If data is empty, or if there is not exactly one most common value, StatisticsError is raised.

You can see for yourself on Python < 3.8:

statistics.mode([1, 2])
# ---------------------------------------------------------------------------
# StatisticsError                           Traceback (most recent call last)
# ...
# StatisticsError: no unique mode; found 2 equally common values

Since Python 3.8, statistics.mode returns the first mode encountered instead, so a tie is resolved silently rather than reported.
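If you want the standard library to report ties explicitly, `statistics.multimode` (added in Python 3.8) returns every most-common value. A small sketch, assuming Python 3.8 or newer; the sample list is made up for illustration:

```python
import statistics

# Hypothetical tied data: 'NY' and 'New' both appear twice.
names = ['NY', 'New', 'NY', 'New']

# multimode returns every most-common value, in order of first appearance.
all_modes = statistics.multimode(names)

# On Python 3.8+, mode returns the first of the tied values instead of raising.
first = statistics.mode(names)

print(all_modes, first)
```

Checking `len(all_modes) > 1` is one way to detect a tie before deciding which value to keep.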

Answer from coldspeed95 on Stack Overflow

Answer 2 of 14 (score: 203)

You can use value_counts() to get a count series, and get the first row:

source.groupby(['Country','City']).agg(lambda x: x.value_counts().index[0])

In case you are wondering about performing other agg functions in the .agg(), try this.

# Let's add a new col, "account"
source['account'] = [1, 2, 3, 3]

source.groupby(['Country','City']).agg(
    mod=('Short name', lambda x: x.value_counts().index[0]),
    avg=('account', 'mean'))
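For reference, here is a self-contained version of the named aggregation above. It assumes the same four-row `source` reconstructed from the `source2` table in the first answer. That data is an assumption, since neither answer prints `source` itself.

```python
import pandas as pd

# Assumed contents of `source`, plus the 'account' column added above.
source = pd.DataFrame({
    'Country': ['USA', 'USA', 'Russia', 'USA'],
    'City': ['New-York', 'New-York', 'Sankt-Petersburg', 'New-York'],
    'Short name': ['NY', 'New', 'Spb', 'NY'],
    'account': [1, 2, 3, 3],
})

# Named aggregation: one output column per (input column, function) pair.
summary = source.groupby(['Country', 'City']).agg(
    mod=('Short name', lambda x: x.value_counts().index[0]),
    avg=('account', 'mean'))
print(summary)
```

Each keyword becomes an output column, so the most common value and the mean sit side by side in one DataFrame.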

Statology

statology.org › home › pandas: how to calculate mode in a groupby object

Pandas: How to Calculate Mode in a GroupBy Object

March 14, 2022

import pandas as pd

#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
                   'points': [10, 10, 12, 15, 19, 23, 20, 20, 26]})

#view DataFrame
print(df)

  team  points
0    A      10
1    A      10
2    A      12
3    A      15
4    B      19
5    B      23
6    C      20
7    C      20
8    C      26

We can use the following syntax to calculate the mode points value for each team:

#calculate mode points value for each team
df.groupby(['team'])['points'].agg(pd.Series.mode)

team
A          10
B    [19, 23]
C          20
Name: points, dtype: object

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.groupby.html

pandas.DataFrame.groupby — pandas 3.0.2 documentation

A groupby operation involves some combination of splitting the object, applying a function, and combining the results.

GitHub

github.com › pandas-dev › pandas › issues › 11562

'mode' not recognized by df.groupby().agg(), but pd.Series.mode works · Issue #11562 · pandas-dev/pandas

November 9, 2015 - This works:

df = pd.DataFrame({'A': [1, 2, 1, 2, 1, 2, 3], 'B': [1, 1, 1, 2, 2, 2, 2]})
df.groupby('B').agg(pd.Series.mode)

but this doesn't:

df.groupby('B').agg('mode')
...
AttributeError: Cannot access callable attribute 'mode' of 'Dat...

Author patricksurry

Mode

mode.com › python-tutorial › pandas-groupby-and-python-lambda-functions

Pandas .groupby(), Lambda Function, & Pivot Table Tutorial | Python Analysis Tutorial - Mode

May 23, 2016 - The .groupby() function allows us to group records into buckets by categorical values, such as carrier, origin, and destination in this dataset.

Pandas

pandas.pydata.org › docs › user_guide › groupby.html

Group by: split-apply-combine — pandas 3.0.2 documentation

The name GroupBy should be quite familiar to those who have used a SQL-based tool (or itertools), in which you can write code like:

SELECT Column1, Column2, mean(Column3), sum(Column4)
FROM SomeTable
GROUP BY Column1, Column2

We aim to make operations like this natural and easy to express using pandas...

Medium

medium.com › data-science › 4-all-time-useful-use-cases-of-pandas-group-by-77aae706322b

4 All Time Useful Use-cases Of Pandas Group By | TDS Archive

March 25, 2023 - That’s where pandas function — groupby() — is useful, which groups the data based on values from a categorical or non-numerical column and helps you to analyse the data by these newly formed groups.

Analytics Vidhya

analyticsvidhya.com › home › pandas groupby for data aggregation

Pandas Groupby for Data Aggregation

April 4, 2025 - It allows you to split data into groups based on specific criteria, apply functions to each group, and combine the results. This is particularly useful for aggregating, transforming, or filtering data efficiently.

Spark By {Examples}

sparkbyexamples.com › home › pandas › pandas groupby() and count() with examples

Pandas groupby() and count() with Examples

July 3, 2025 - Pandas groupby().count() is used to group columns and count the number of occurrences of each unique value in a specific column or combination of columns.

Pandas

pandas.pydata.org › docs › reference › groupby.html

GroupBy — pandas 3.0.2 documentation

pandas.api.typing.DataFrameGroupBy and pandas.api.typing.SeriesGroupBy instances are returned by groupby calls pandas.DataFrame.groupby() and pandas.Series.groupby() respectively · DataFrameGroupBy.__iter__()

GitHub

github.com › pandas-dev › pandas › issues › 19254

Groupby.mode() - feature request · Issue #19254 · pandas-dev/pandas

January 15, 2018 - Feature Request Hi, I use pandas a lot in my projects and I got stuck on a problem of running the "mode" function (most common element) on a huge groupby object. There are a few solutions available using aggregate and scipy.stats.mode, b...

Author j-musial

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.drop.html

pandas.DataFrame.drop — pandas 3.0.2 documentation

Drop specified labels from rows or columns · Remove rows or columns by specifying label names and corresponding axis, or by directly specifying index or column names. When using a multi-index, labels on different levels can be removed by specifying the level.

AskPython

askpython.com › home › pandas: conditionally grouping values

Pandas: Conditionally Grouping Values - AskPython

August 6, 2022

# We split the dataset by column 'Branch'.
# Rows having the same Branch will be in the same group.
groupby = df.groupby('Branch', axis=0)

# We apply the accumulator function that we want.

Stack Overflow

stackoverflow.com › questions › 73178297 › how-to-groupby-transform-to-find-mode-in-dataframe

python - How to groupby().transform() to find mode in dataframe? - Stack Overflow

index	Class	Grade	majority_vote
0	High	A	A
1	High	A	A
2	High	B	A
3	Medium	A	x
4	Medium	B	x
5	Medium	C	x
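One way to produce a column like `majority_vote` is `groupby().transform()` with a function that returns a single value per group. The sketch below assumes, from the table alone, that a group without a unique mode should get the placeholder `'x'`. The helper name `majority_or_x` is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Class': ['High', 'High', 'High', 'Medium', 'Medium', 'Medium'],
    'Grade': ['A', 'A', 'B', 'A', 'B', 'C'],
})

def majority_or_x(grades: pd.Series) -> str:
    """Return the unique mode of a group, or 'x' when the modes are tied."""
    modes = grades.mode()
    return modes.iat[0] if len(modes) == 1 else 'x'

# transform broadcasts the per-group scalar back onto every row of the group.
df['majority_vote'] = df.groupby('Class')['Grade'].transform(majority_or_x)
print(df)
```

Unlike `agg`, `transform` keeps the original row index, so the result can be assigned straight back as a new column.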
