pandas get row with max value in column

Find row where values for column is maximal in a pandas DataFrame

stackoverflow.com › questions › 10202570 › find-row-where-values-for-column-is-maximal-in-a-pandas-dataframe

Use the pandas idxmax function. It's straightforward:

>>> import pandas
>>> import numpy as np
>>> df = pandas.DataFrame(np.random.randn(5,3),columns=['A','B','C'])
>>> df
          A         B         C
0  1.232853 -1.979459 -0.573626
1  0.140767  0.394940  1.068890
2  0.742023  1.343977 -0.579745
3  2.125299 -0.649328 -0.211692
4 -0.187253  1.908618 -1.862934
>>> df['A'].idxmax()
3
>>> df['B'].idxmax()
4
>>> df['C'].idxmax()
1

Alternatively, you could use numpy.argmax, as in numpy.argmax(df['A']). In current pandas it returns the integer position of the maximum rather than the label, which coincides with the idxmax result here only because the index is the default 0 to n-1 range. In cursory observations it appears at least as fast as idxmax.
idxmax() returns index labels, not integer positions.
Example: if you have string values as your index labels, like rows 'a' through 'e', you might want to know that the max occurs in the fourth row, not just that its label is 'd'.
If you want the integer position of that label within the Index, you have to get it manually (which can be tricky now that duplicate row labels are allowed).
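
The label-versus-position distinction can be sketched as follows. This is a minimal illustration, not part of the original answer: the frame, its labels 'a' through 'e', and the variable names are made up, and it assumes a reasonably recent pandas.

```python
import pandas as pd

# Hypothetical data: five rows labelled 'a'..'e', with the maximum in the fourth row.
df = pd.DataFrame({"A": [0.1, 0.7, 0.3, 2.1, -0.2]}, index=list("abcde"))

label = df["A"].idxmax()                      # index label of the maximum
position = int(df["A"].to_numpy().argmax())   # integer position of the maximum

# With a unique index, a label can also be converted to its position.
position_from_label = df.index.get_loc(label)

print(label, position, position_from_label)
```

Going through `to_numpy()` sidesteps the index entirely, so the result is a position regardless of what the labels look like.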

HISTORICAL NOTES:

idxmax() used to be called argmax() prior to 0.11.
The label-returning argmax was deprecated prior to 1.0.0 and removed in 1.0.0; in current pandas, Series.argmax returns the integer position instead.
As of pandas 0.16, argmax still existed and performed the same function as idxmax (though it appeared to run more slowly).
The original argmax function returned the integer position within the index of the row location of the maximum element.
pandas then moved to using row labels instead of integer indices. Positional integer indices used to be very common, more common than labels, especially in applications where duplicate row labels are common.

For example, consider this toy DataFrame with a duplicate row label:

In [19]: dfrm
Out[19]: 
          A         B         C
a  0.143693  0.653810  0.586007
b  0.623582  0.312903  0.919076
c  0.165438  0.889809  0.000967
d  0.308245  0.787776  0.571195
e  0.870068  0.935626  0.606911
f  0.037602  0.855193  0.728495
g  0.605366  0.338105  0.696460
h  0.000000  0.090814  0.963927
i  0.688343  0.188468  0.352213
i  0.879000  0.105039  0.900260

In [20]: dfrm['A'].idxmax()
Out[20]: 'i'

In [21]: dfrm.loc[dfrm['A'].idxmax()]  # .ix instead of .loc in older versions of pandas
Out[21]: 
          A         B         C
i  0.688343  0.188468  0.352213
i  0.879000  0.105039  0.900260

So here a naive use of idxmax is not sufficient, whereas the old form of argmax would correctly provide the positional location of the max row (in this case, position 9).
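
One way to sidestep the ambiguity is to work with positions rather than labels. The following is a sketch under assumptions, not the original author's code: the three-row frame is a cut-down, hypothetical version of the toy DataFrame, and the variable names are illustrative.

```python
import pandas as pd

# Hypothetical frame with a duplicated row label 'i'.
dfrm = pd.DataFrame(
    {"A": [0.143693, 0.688343, 0.879000], "B": [0.653810, 0.188468, 0.105039]},
    index=["a", "i", "i"],
)

label = dfrm["A"].idxmax()     # 'i', which matches two rows
by_label = dfrm.loc[[label]]   # both rows labelled 'i'

pos = int(dfrm["A"].to_numpy().argmax())   # position of the maximum, unambiguous
by_position = dfrm.iloc[pos]               # exactly the row holding the maximum

# A cheap guard before trusting label-based lookups.
print(dfrm.index.is_unique)
```

Checking `index.is_unique` up front makes the duplicate-label case fail loudly instead of silently returning extra rows.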

This is exactly one of those nasty kinds of bug-prone behaviors in dynamically typed languages that makes this sort of thing so unfortunate, and worth beating a dead horse over. If you are writing systems code and your system suddenly gets used on some data sets that are not cleaned properly before being joined, it's very easy to end up with duplicate row labels, especially string labels like a CUSIP or SEDOL identifier for financial assets. You can't easily use the type system to help you out, and you may not be able to enforce uniqueness on the index without running into unexpectedly missing data.

So you're left hoping that your unit tests covered everything (they didn't, or more likely no one wrote any tests). Otherwise (most likely) you're just left waiting to see if you happen to smack into this error at runtime. In that case you probably have to go drop many hours' worth of work from the database you were outputting results to, bang your head against the wall in IPython trying to manually reproduce the problem, finally figure out that it's because idxmax can only report the label of the max row, be disappointed that no standard function automatically gets the positions of the max row for you, write a buggy implementation yourself, edit the code, and pray you don't run into the problem again.

Answer from ely on Stack Overflow

DataScience Made Simple

datasciencemadesimple.com › home › select row with maximum and minimum value in python pandas

select row with maximum and minimum value in python pandas - DataScience Made Simple

November 15, 2019 - df['Score'].idxmax() returns the index of the row where the column "Score" has its maximum value. ... So let's extract the entire row where the score is minimum, i.e. get all the details of the student with the minimum score: df.loc[df['Score'].idxmin()] ...
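
A small runnable version of that pattern might look like this. The student names and scores are invented for illustration; only the "Score" column name comes from the snippet above.

```python
import pandas as pd

# Hypothetical student scores; names and values are illustrative only.
df = pd.DataFrame(
    {"Name": ["Alisa", "Bobby", "Cathrine"], "Score": [62, 47, 55]}
)

top = df.loc[df["Score"].idxmax()]     # row (as a Series) with the highest score
bottom = df.loc[df["Score"].idxmin()]  # row with the lowest score

print(top["Name"], bottom["Name"])
```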

Stack Overflow

stackoverflow.com › questions › 10202570 › find-row-where-values-for-column-is-maximal-in-a-pandas-dataframe

python - Find row where values for column is maximal in a pandas DataFrame - Stack Overflow

Top answer

1 of 15

388

2 of 15

103

You might also try idxmax:

In [5]: df = pandas.DataFrame(np.random.randn(10,3),columns=['A','B','C'])

In [6]: df
Out[6]: 
          A         B         C
0  2.001289  0.482561  1.579985
1 -0.991646 -0.387835  1.320236
2  0.143826 -1.096889  1.486508
3 -0.193056 -0.499020  1.536540
4 -2.083647 -3.074591  0.175772
5 -0.186138 -1.949731  0.287432
6 -0.480790 -1.771560 -0.930234
7  0.227383 -0.278253  2.102004
8 -0.002592  1.434192 -1.624915
9  0.404911 -2.167599 -0.452900

In [7]: df.idxmax()
Out[7]: 
A    0
B    8
C    7

e.g.

In [8]: df.loc[df['A'].idxmax()]
Out[8]: 
A    2.001289
B    0.482561
C    1.579985
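
As a possible extension of the same idea, the per-column labels from df.idxmax() can be fed back into .loc to pull one full row per column. This is a sketch with made-up data and a default integer index; the names `labels` and `rows` are illustrative.

```python
import pandas as pd

# Hypothetical data with a default integer index.
df = pd.DataFrame(
    {"A": [2.0, -1.0, 0.1], "B": [0.5, -0.4, 1.4], "C": [1.6, 2.1, 1.5]}
)

labels = df.idxmax()             # Series: column name -> label of that column's max
rows = df.loc[labels.tolist()]   # one full row per column, in column order
rows.index = labels.index        # relabel each row by the column it maximises

print(labels.to_dict())
```

Here `rows.loc["B"]` is the complete row in which column B reaches its maximum.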

Medium

medium.com › nerd-for-tech › pandas-find-column-with-min-max-value-for-each-row-in-dataframe-a2f2d2b2ea7a

pandas: Find column with min/max value for each row in dataframe | by José Fernando Costa | Nerd For Tech | Medium

May 8, 2021 - The answer is the idxmin function. As per the documentation, this function returns the index or column name of the cell with the minimum value, depending on the axis specified.
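
The effect of the axis argument can be sketched on a tiny frame. The data and labels below are hypothetical.

```python
import pandas as pd

# Hypothetical frame with string row labels.
df = pd.DataFrame({"x": [3, 1, 5], "y": [2, 4, 0]}, index=["r0", "r1", "r2"])

row_of_min = df.idxmin()        # axis=0 (default): row label of each column's minimum
col_of_min = df.idxmin(axis=1)  # axis=1: column label of each row's minimum

print(row_of_min.to_dict())
print(col_of_min.to_dict())
```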

TutorialsPoint

tutorialspoint.com › article › python-pandas-find-the-maximum-value-of-a-column-and-return-its-corresponding-row-values

Python Pandas – Find the maximum value of a column and return its corresponding row values

March 26, 2026 - To find the maximum value of a column and return its corresponding row values in Pandas, we can use df.loc[df[col].idxmax()]. This method first finds the index of the maximum value using idxmax(), then uses loc[] to retrieve the entire row. ...

Stack Exchange

datascience.stackexchange.com › questions › 116165 › select-rows-containing-max-value-basing-on-another-duplicated-rows-of-a-group

pandas - select rows containing max value basing on another duplicated rows of a group - Data Science Stack Exchange

Top answer

1 of 1

2

This is more of a programming than a data science question, and would therefore be better suited for stackoverflow, but this can be achieved relatively easily using a combination of sorting and grouping:

(
    df
    # sort such that the first row within each group is the one you want
    .sort_values(["A", "B", "C"], ascending=[True, False, False])
    # group based on column A
    .groupby("A")
    # select the first row within each group
    .first()
    # reset the index such that A is a column instead of the index
    .reset_index()
)

Which gives the following result:

A	B	C
1	3	85
2	3	70
3	4	90
4	5	85
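
The question's input data is not shown here, so the following sketch runs the same sort-then-group chain on a small invented frame. It assumes the goal is the largest B within each A, with ties broken by the largest C.

```python
import pandas as pd

# Hypothetical input; the original question's data is not reproduced here.
df = pd.DataFrame(
    {
        "A": [1, 1, 1, 2, 2],
        "B": [3, 3, 2, 3, 1],
        "C": [85, 60, 99, 70, 95],
    }
)

result = (
    df
    # within each A, put the largest B (then the largest C) first
    .sort_values(["A", "B", "C"], ascending=[True, False, False])
    .groupby("A")
    # take the first non-null value of each column per group
    .first()
    .reset_index()
)

print(result)
```

Note that `.first()` picks the first non-null value column by column, so on data with missing values `.head(1)` may be the safer way to keep whole rows.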

IncludeHelp

includehelp.com › python › pandas-dataframe-select-row-by-max-value-in-group.aspx

Python - Pandas dataframe select row by max value in group

November 24, 2022 - To select the row with the max value in each group, group by the relevant columns and use the idxmax() method, which returns the index labels. Let us understand with the help of an example · # Importing pandas package import pandas as pd # Importing numpy package import numpy as np # Creating a ...
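
Since the snippet's example is cut off, here is a minimal sketch of the groupby-plus-idxmax pattern on invented data. The column names `team`, `player`, and `score` are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical data: two teams, two players each.
df = pd.DataFrame(
    {
        "team": ["x", "x", "y", "y"],
        "player": ["p1", "p2", "p3", "p4"],
        "score": [10, 30, 25, 5],
    }
)

best_idx = df.groupby("team")["score"].idxmax()  # label of the max row per team
best_rows = df.loc[best_idx]                     # full rows for those labels

print(best_rows)
```

This relies on the index labels being unique; with duplicate labels, `.loc` would return more rows than groups.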

Skytowner

skytowner.com › explore › getting_column_label_of_max_value_in_each_row_in_pandas_datafrme

Getting column label of max value in each row in Pandas DataFrme

To get the column label of the max value in each row of a Pandas DataFrame, use the idxmax(axis=1) method.

LeetCode

leetcode.com › problems › nth-highest-salary

Nth Highest Salary - LeetCode

Can you solve this real interview question? Nth Highest Salary - Table: Employee

+-------------+------+
| Column Name | Type |
+-------------+------+
| id          | int  |
| salary      | int  |
+-------------+------+

id is the primary key (column with unique values) for this table.

GeeksforGeeks

geeksforgeeks.org › python › select-row-with-maximum-and-minimum-value-in-pandas-dataframe

Select row with maximum and minimum value in Pandas dataframe - GeeksforGeeks

July 11, 2025 -
# creating dataframe using DataFrame constructor
df = pd.DataFrame(dict1)
# the result shows max on
# Driver, Points, Age columns.
print(df.max())
...
# creating dataframe using DataFrame constructor
df = pd.DataFrame(dict1)
# Who scored more points ?
print(df[df.Points == df.Points.max()])
...
# creating dataframe using DataFrame constructor
df = pd.DataFrame(dict1)
# what is the maximum age ?
print(df.Age.max())
...
# creating dataframe using DataFrame constructor
df = pd.DataFrame(dict1)
# Which row has maximum age |
# who is the oldest driver ?

GeeksforGeeks

geeksforgeeks.org › python › find-maximum-values-position-in-columns-and-rows-of-a-dataframe-in-pandas

Find maximum values & position in columns and rows of a Dataframe in Pandas - GeeksforGeeks

July 15, 2025 - If the input is a Dataframe, then ... this method. To find the maximum value of each column, call the max() method on the Dataframe object without taking any argument....

ProjectPro

projectpro.io › recipes › find-largest-value-in-pandas-dataframe

How to find the largest value in a Pandas DataFrame? -

May 11, 2022 - This recipe helps you find the largest value in a Pandas DataFrame

reddit.com › r/learnpython › pandas: get the max value of a group only if the value satisfies given conditions

r/learnpython on Reddit: Pandas: Get the max value of a group ONLY if the value satisfies given conditions

September 20, 2022 -

I have a huge dataset.

The data is grouped by col, row, year, no, potveg, and total. I am trying to get the maximum value of the 'total' column in each specific group year ONLY if its 'possible' value is TRUE. If the max 'total' value is FALSE, then get the second max value, and so on.

If all the values of the 'possible' column in a specific year group = False, then I want to pick the max out of the False so that I don't skip any years.

i.e., for the dataset below:

col	row	year	no	potveg	total	possible
                        						
-125	42.5	2015	1	9	697.3	FALSE
-125	42.5	2015	2	13	535.2	TRUE
-125	42.5	2015	3	15	82.3	TRUE
-125	42.5	2016	1	9	907.8	TRUE
-125	42.5	2016	2	13	137.6	FALSE
-125	42.5	2016	3	15	268.4	TRUE
-125	42.5	2017	1	9	961.9	FALSE
-125	42.5	2017	2	13	74.2	TRUE
-125	42.5	2017	3	15	248	TRUE
-125	42.5	2018	1	9	937.9	TRUE
-125	42.5	2018	2	13	575.6	TRUE
-125	42.5	2018	3	15	215.5	FALSE
-135	70.5	2015	1	8	697.3	FALSE
-135	70.5	2015	2	10	535.2	TRUE
-135	70.5	2015	3	19	82.3	TRUE
-135	70.5	2016	1	8	907.8	TRUE
-135	70.5	2016	2	10	137.6	FALSE
-135	70.5	2016	3	19	268.4	TRUE
-135	70.5	2017	1	8	961.9	FALSE
-135	70.5	2017	2	10	74.2	TRUE
-135	70.5	2017	3	19	248	TRUE
-135	70.5	2018	1	8	937.9	TRUE
-135	70.5	2018	2	10	575.6	TRUE
-135	70.5	2018	3	19	215.5	FALSE
-135	70.5	2019	1	8	937.9	FALSE
-135	70.5	2019	2	10	575.6	FALSE
-135	70.5	2019	3	19	215.5	FALSE

The output would be:

col	row	year	no	potveg	total	possible
-125	42.5	2015	2	13	535.2	TRUE
-125	42.5	2016	1	9	907.8	TRUE
-125	42.5	2017	3	15	248	TRUE
-125	42.5	2018	1	9	937.9	TRUE
-135	70.5	2015	2	10	535.2	TRUE
-135	70.5	2016	1	8	907.8	TRUE
-135	70.5	2017	3	19	248	TRUE
-135	70.5	2018	1	8	937.9	TRUE
-135	70.5	2019	1	8	937.9	FALSE

I have tried

# Separate out the true and false possibilities by grouping by ['col','row','year','possible']
#  and getting the idxmax for column total. At the end, we sort the result on possible in descending order.
#  This  puts all idxmax values (now in total) with True in possible first.

idx = df.groupby(['col','row','year','possible'], as_index=False)['total']\
    .idxmax().sort_values('possible', ascending=False)['total']

#we then apply a second groupby, this time only on['col', 'row', 'year'] and simply get the first.

result = df.iloc[idx].groupby(['col', 'row', 'year']).first()
orig_index = df.set_index(['col', 'row', 'year']).index.drop_duplicates()

#re-establishing the original order by using df.reindex based on the original df 
# with index there set to ['col','row','year'] and getting rid of the duplicates first.

result_reordered = result.reindex(orig_index)

But I am still getting some years where the max value is not picked resulting in duplicates.

Top answer

1 of 3

2

Can't you just sort by the group columns, then Possible, then Total, and keep the last record for each group? Sorting by the group columns keeps your groups together. Sorting by Possible puts False before True in each group, so if you have True values, keeping the last will ensure you pick a True. Then sorting by Total ensures the last value is the largest of the Trues, or if all Falses, the largest of the Falses. (Edit: typo)
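
The sort-and-keep-last idea could be sketched like this. It is one reading of the answer, not the answerer's own code: it assumes the groups are defined by col, row, and year, and it uses only two of the groups from the question's table (with the potveg column left out for brevity).

```python
import pandas as pd

# Two groups taken from the question's data: one with True rows, one all False.
df = pd.DataFrame(
    {
        "col": [-125] * 3 + [-135] * 3,
        "row": [42.5] * 3 + [70.5] * 3,
        "year": [2015] * 3 + [2019] * 3,
        "no": [1, 2, 3, 1, 2, 3],
        "total": [697.3, 535.2, 82.3, 937.9, 575.6, 215.5],
        "possible": [False, True, True, False, False, False],
    }
)

keys = ["col", "row", "year"]  # assumed grouping columns

result = (
    # False sorts before True, so True rows end up last within each group,
    # and within each truth value the largest total comes last.
    df.sort_values(keys + ["possible", "total"])
    .drop_duplicates(subset=keys, keep="last")
)

print(result)
```

For the 2015 group this keeps the row with total 535.2 (the largest True), and for the all-False 2019 group it keeps 937.9, matching the expected output in the question.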

2 of 3

2

You could put that logic into a function and run it on the groups. If "possible" has any true values, include only the true rows.

>>> def max_possible(group):
...     if group['possible'].any():
...         total = group[ group['possible'] ]['total'].idxmax()
...     else:
...         total = group['total'].idxmax()
...     return group.loc[total]
>>> df.groupby(['col', 'row', 'year'], sort=False).apply(max_possible)
                 col   row  year  no  potveg  total  possible
col  row  year
-125 42.5 2015  -125  42.5  2015   2      13  535.2      True
          2016  -125  42.5  2016   1       9  907.8      True
          2017  -125  42.5  2017   3      15  248.0      True
          2018  -125  42.5  2018   1       9  937.9      True
-135 70.5 2015  -135  70.5  2015   2      10  535.2      True
          2016  -135  70.5  2016   1       8  907.8      True
          2017  -135  70.5  2017   3      19  248.0      True
          2018  -135  70.5  2018   1       8  937.9      True
          2019  -135  70.5  2019   1       8  937.9     False

Stack Exchange

datascience.stackexchange.com › questions › 96408 › how-do-i-find-pairwise-maximum-of-multiple-rows-in-a-column-using-python

pandas - How do I find pairwise maximum of multiple rows in a column using python? - Data Science Stack Exchange

Top answer

1 of 2

2

This would be another approach a little bit shorter:

df.assign(resultado = lambda x: x.rolling(2).max())

EDIT:

For your comment try:

def idx(x):
    return x.index.values[np.argmax(x.values)]

df.rolling(2).agg(['max', idx])

will return both, the pairwise maximum and the index that corresponds to that value.

2 of 2

1

You can create a new column like the one you have but "shifted" one position down, and then compute the maximum of these two columns:

import pandas as pd
import numpy as np

data = np.random.randint(0, 50, size=20)
df = pd.DataFrame(data, columns=['values'])
df['prev'] = df['values'].shift(1)
df['max'] = df[['values', 'prev']].max(axis=1)

The result is

+----+----------+--------+-------+
|    |   values |   prev |   max |
|----+----------+--------+-------|
|  0 |       17 |    nan |    17 |
|  1 |       32 |     17 |    32 |
|  2 |        3 |     32 |    32 |
|  3 |        4 |      3 |     4 |
|  4 |        4 |      4 |     4 |
|  5 |       17 |      4 |    17 |
|  6 |       12 |     17 |    17 |
...

You can then remove the first row if you don't need it.
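
For comparison, the shift-based approach and a rolling window can be run side by side on a short Series. The four values below are illustrative, and the variable names are assumptions.

```python
import pandas as pd

s = pd.Series([2, 25, 1, 24], name="values")  # illustrative values

# Rolling window of size 2: max of each value and the one before it.
pairwise = s.rolling(2).max()

# Shift-based variant: row-wise max of the column and its shifted copy.
shifted = pd.concat([s, s.shift(1)], axis=1).max(axis=1)

print(pairwise.tolist())
print(shifted.tolist())
```

The two differ only in the first row: the rolling version yields NaN there because the window is incomplete, while the row-wise max skips the missing previous value and returns the value itself.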

Dataquest Community

community.dataquest.io › q&a › dq courses

Pandas return row with the maximum value of a column - DQ Courses - Dataquest Community

January 30, 2020 - I’m trying to figure out how to return the row of a pandas dataframe with the maximum value in a certain column. I know that to find the maximum value in a column I use: df['columnName'].max() But I’m having a hard ti…

Saturn Cloud

saturncloud.io › blog › how-to-select-row-with-max-value-in-column-from-pandas-groupby-groups

How to Select Row with Max Value in Column from Pandas groupby Groups | Saturn Cloud Blog

October 19, 2023 - Another method to find the row with the maximum value in a column from Pandas groupby() groups is using groupby() and apply(). We can define a function that returns the row with the maximum value in a column and apply it to each group. def ...
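
Because the snippet's function is cut off, the sketch below shows one plausible shape for it rather than the blog's actual code. The helper name `top_row`, the column names, and the data are all hypothetical.

```python
import pandas as pd

# Hypothetical data: two teams, two players each.
df = pd.DataFrame(
    {
        "team": ["x", "x", "y", "y"],
        "player": ["p1", "p2", "p3", "p4"],
        "score": [10, 30, 25, 5],
    }
)


def top_row(group: pd.DataFrame) -> pd.Series:
    """Return the row of `group` with the largest score (hypothetical helper)."""
    return group.loc[group["score"].idxmax()]


# Selecting the non-key columns explicitly keeps the grouping column
# out of what the helper receives.
best = df.groupby("team")[["player", "score"]].apply(top_row)

print(best)
```

The result is indexed by team, with one row per group.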

Quora

quora.com › How-can-I-find-the-maximum-values-in-a-row-with-Pandas-Idxmx-gives-only-the-first-maximum-value-but-in-the-case-of-two-or-more-values-tied-at-maximum-I-want-to-be-able-to-see-all-the-indexes

How can I find the maximum values in a row with Pandas? Idxmx gives only the first maximum value, but in the case of two or more values "...

Answer (1 of 2): Logical indexing is your friend. First, let's define a dataframe with more than one max value:
A = np.arange(10).repeat(2).astype(float)
a = pd.DataFrame(A, columns=['one'])
Now, we use logical indexing to find all maxima:
a.loc[a['one'] == a['one'].max()] ...
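
A self-contained version of that answer, with the imports it needs, might read as follows. The names `all_max` and `first_only` are added here for illustration.

```python
import numpy as np
import pandas as pd

# Each of 0..9 appears twice, so the maximum 9.0 occurs in two rows.
a = pd.DataFrame(np.arange(10).repeat(2).astype(float), columns=["one"])

all_max = a.loc[a["one"] == a["one"].max()]  # every row tied at the maximum
first_only = a["one"].idxmax()               # idxmax reports only the first tie

print(all_max.index.tolist(), first_only)
```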

Arab Psychology

scales.arabpsychology.com › stats › how-can-i-use-pandas-to-find-the-maximum-value-in-each-row-of-a-data-frame

How Can I Use Pandas To Find The Maximum Value In Each Row Of A Data Frame?

June 24, 2024 - One of its useful features is the ability to find the maximum value in each row of a data frame. This can be achieved by using the “max” function along with the “axis=1” parameter, which specifies that the operation should be performed ...

Data Science Parichay

datascienceparichay.com › home › blog › pandas – get rows with maximum and minimum column values

Pandas - Get Rows with Maximum and Minimum Column Values - Data Science Parichay

November 9, 2022 - You can use the pandas loc[] property along with the idxmax() and idxmin() functions to get the row with maximum and minimum column values.

Quora

quora.com › How-do-you-find-the-max-of-multiple-columns-for-a-particular-row-and-save-the-header-of-the-max-column-into-a-new-column-Python-3-x-pandas-dataframe-development

How to find the max of multiple columns for a particular row and save the header of the max column into a new column (Python 3.x, pandas, dataframe, development) - Quora

How do you find the max of multiple columns for a particular row and save the header of the max column into a new column (Python 3.x, pandas, dataframe, development)? ... Use DataFrame.idxmax (returns column label of the first maximum per row) or numpy operations when ties need custom handling. Below are concise patterns and options. 1) Simple: column name of the maximum value per row
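
The simple pattern can be sketched as below, on invented data. The new column names `max_value` and `max_col` are assumptions.

```python
import pandas as pd

# Hypothetical data; the last row has a tie between A and B.
df = pd.DataFrame({"A": [1, 9, 4], "B": [7, 2, 4], "C": [3, 5, 1]})
cols = ["A", "B", "C"]

df["max_value"] = df[cols].max(axis=1)    # row-wise maximum
df["max_col"] = df[cols].idxmax(axis=1)   # header of the first column holding it

print(df)
```

Restricting both calls to `cols` keeps the newly added columns from being included in later comparisons.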

Intellipaat

intellipaat.com › home › blog › python pandas add a column for row-wise max value of selected columns

Python Pandas add a column for row-wise max value of selected columns - Intellipaat

February 3, 2026 - But you can still fill them with a default value before calculating the max value: df['Max_Value'] = df[['A', 'B', 'C']].fillna(0).max(axis=1) Alternatively, pass skipna=False if you want NaN whenever any value in the row is NaN (with the default, skipna=True, the result is NaN only when all values in the row are NaN): df['Max_Value'] = df[['A', 'B', 'C']].max(axis=1, skipna=False) ... Adding a column for the row-wise maximum value in a Pandas DataFrame is easy and direct.
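
The three NaN-handling options can be compared on a small frame. The data is hypothetical: one complete row, one partially missing row, and one row that is entirely NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {
        "A": [1.0, np.nan, np.nan],
        "B": [5.0, 2.0, np.nan],
        "C": [3.0, np.nan, np.nan],
    }
)
cols = ["A", "B", "C"]

default = df[cols].max(axis=1)               # skips NaN; NaN only if the whole row is NaN
filled = df[cols].fillna(0).max(axis=1)      # treats missing values as 0
strict = df[cols].max(axis=1, skipna=False)  # NaN as soon as any value is missing

print(pd.DataFrame({"default": default, "filled": filled, "strict": strict}))
```

Filling with 0 is only a sensible default when the real values cannot be negative.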