Brave Search

Find row where values for column is maximal in a pandas DataFrame

stackoverflow.com › questions › 10202570 › find-row-where-values-for-column-is-maximal-in-a-pandas-dataframe

Use the pandas idxmax function. It's straightforward:

>>> import pandas
>>> import numpy as np
>>> df = pandas.DataFrame(np.random.randn(5,3),columns=['A','B','C'])
>>> df
          A         B         C
0  1.232853 -1.979459 -0.573626
1  0.140767  0.394940  1.068890
2  0.742023  1.343977 -0.579745
3  2.125299 -0.649328 -0.211692
4 -0.187253  1.908618 -1.862934
>>> df['A'].idxmax()
3
>>> df['B'].idxmax()
4
>>> df['C'].idxmax()
1

Alternatively you could also use numpy.argmax, such as numpy.argmax(df['A']). In current pandas it returns the integer position of the maximum rather than the label, which coincides with the label here because the index is the default 0 through 4, and it appears at least as fast as idxmax in cursory observations.
idxmax() returns index labels, not integer positions.
Example: if you have string values as your index labels, like rows 'a' through 'e', you might want to know that the max occurs in row 4 (not row 'd').
If you want the integer position of that label within the Index, you have to get it manually (which can be tricky now that duplicate row labels are allowed).
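
To make the label/position distinction concrete, here is a minimal sketch. The frame and its 'a' through 'e' labels are made up for illustration, and the lookup through Index.get_loc assumes the index has no duplicate labels.

```python
import pandas as pd

# Hypothetical data with string row labels 'a' through 'e'.
df = pd.DataFrame({"A": [0.1, 0.7, 0.3, 0.9, 0.2]}, index=list("abcde"))

label = df["A"].idxmax()                 # row label of the maximum
position = df["A"].to_numpy().argmax()   # zero-based position of the maximum

# With a unique index, Index.get_loc maps the label back to its position.
position_from_label = df.index.get_loc(label)

print(label, position, position_from_label)
```

Here the label is 'd' while the zero-based position is 3, i.e. the fourth row.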

HISTORICAL NOTES:

idxmax() used to be called argmax() prior to 0.11.
The label-returning argmax was deprecated prior to 1.0.0; as of 1.0.0, Series.argmax returns the integer position of the maximum instead.
As of pandas 0.16, argmax still existed and performed the same function as idxmax (though it appeared to run more slowly).
Originally, the argmax function returned the integer position within the index of the row holding the maximum element.
pandas then moved to using row labels instead of integer indices. Positional integer indices used to be very common, more common than labels, especially in applications where duplicate row labels are common.

For example, consider this toy DataFrame with a duplicate row label:

In [19]: dfrm
Out[19]: 
          A         B         C
a  0.143693  0.653810  0.586007
b  0.623582  0.312903  0.919076
c  0.165438  0.889809  0.000967
d  0.308245  0.787776  0.571195
e  0.870068  0.935626  0.606911
f  0.037602  0.855193  0.728495
g  0.605366  0.338105  0.696460
h  0.000000  0.090814  0.963927
i  0.688343  0.188468  0.352213
i  0.879000  0.105039  0.900260

In [20]: dfrm['A'].idxmax()
Out[20]: 'i'

In [21]: dfrm.loc[dfrm['A'].idxmax()]  # .ix instead of .loc in older versions of pandas
Out[21]: 
          A         B         C
i  0.688343  0.188468  0.352213
i  0.879000  0.105039  0.900260

So here a naive use of idxmax is not sufficient, whereas the old form of argmax would correctly provide the positional location of the max row (in this case, position 9).
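
One way to get the positional answer by hand is to take the argmax of the underlying NumPy array and feed it to iloc. The sketch below rebuilds only column A of the toy frame; it is an illustration of that workaround, not a built-in pandas recipe.

```python
import numpy as np
import pandas as pd

# Column A of the toy frame above, including the duplicated label 'i'.
dfrm = pd.DataFrame(
    {"A": [0.143693, 0.623582, 0.165438, 0.308245, 0.870068,
           0.037602, 0.605366, 0.000000, 0.688343, 0.879000]},
    index=list("abcdefghii"),
)

label = dfrm["A"].idxmax()      # 'i', which two rows share
by_label = dfrm.loc[label]      # DataFrame holding both 'i' rows

pos = int(np.argmax(dfrm["A"].to_numpy()))  # unambiguous integer position
by_position = dfrm.iloc[pos]                # Series for the single max row

print(pos, by_position["A"])
```

The label lookup returns two rows, while the positional lookup returns only the row at position 9.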

This is exactly one of those nasty kinds of bug-prone behaviors in dynamically typed languages that makes this sort of thing so unfortunate, and worth beating a dead horse over. If you are writing systems code and your system suddenly gets used on some data sets that are not cleaned properly before being joined, it's very easy to end up with duplicate row labels, especially string labels like a CUSIP or SEDOL identifier for financial assets. You can't easily use the type system to help you out, and you may not be able to enforce uniqueness on the index without running into unexpectedly missing data.

So you're left with hoping that your unit tests covered everything (they didn't, or more likely no one wrote any tests) -- otherwise (most likely) you're just left waiting to see if you happen to smack into this error at runtime, in which case you probably have to go drop many hours worth of work from the database you were outputting results to, bang your head against the wall in IPython trying to manually reproduce the problem, finally figuring out that it's because idxmax can only report the label of the max row, and then being disappointed that no standard function automatically gets the positions of the max row for you, writing a buggy implementation yourself, editing the code, and praying you don't run into the problem again.

Answer from ely on Stack Overflow

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.max.html

pandas.DataFrame.max — pandas 3.0.2 documentation

Return the maximum over the requested axis. ... Return the index of the minimum over the requested axis.

Stack Overflow

stackoverflow.com › questions › 10202570 › find-row-where-values-for-column-is-maximal-in-a-pandas-dataframe

python - Find row where values for column is maximal in a pandas DataFrame - Stack Overflow

Top answer

1 of 15

388

2 of 15

103

You might also try idxmax:

In [5]: df = pandas.DataFrame(np.random.randn(10,3),columns=['A','B','C'])

In [6]: df
Out[6]: 
          A         B         C
0  2.001289  0.482561  1.579985
1 -0.991646 -0.387835  1.320236
2  0.143826 -1.096889  1.486508
3 -0.193056 -0.499020  1.536540
4 -2.083647 -3.074591  0.175772
5 -0.186138 -1.949731  0.287432
6 -0.480790 -1.771560 -0.930234
7  0.227383 -0.278253  2.102004
8 -0.002592  1.434192 -1.624915
9  0.404911 -2.167599 -0.452900

In [7]: df.idxmax()
Out[7]: 
A    0
B    8
C    7

e.g.

In [8]: df.loc[df['A'].idxmax()]
Out[8]: 
A    2.001289
B    0.482561
C    1.579985

Videos

00:53

YouTube

How To Find The Max Value Within A Pandas Column - YouTube

November 16, 2024

04:44

YouTube

Find the Min and Max Values for Rows and Columns - Pandas - YouTube

October 20, 2020

03:43

YouTube

Pandas Index Max | pd.DataFrame.idxmax() - YouTube

September 12, 2020

11:19

YouTube

Highlighting the Maximum Value of each Column in Pandas - YouTube

w3schools.com › python › pandas › ref_df_max.asp

Pandas DataFrame max() Method

import pandas as pd data = [[10, ... maximum value of each column. By specifying the column axis (axis='columns'), the max() method searches column-wise and returns the maximum value for each row....

Spark By {Examples}

sparkbyexamples.com › home › pandas › pandas find row values for column maximal

Pandas Find Row Values for Column Maximal - Spark By {Examples}

November 25, 2024 - In Pandas, you can find the row values for the maximum value in a specific column using the idxmax() function along with the column selection. You can

Stack Overflow

stackoverflow.com › questions › 44300989 › get-max-value-from-row-of-a-dataframe-in-python

pandas - Get max value from row of a dataframe in python - Stack Overflow

Top answer

1 of 2

88

Use max with axis=1:

row_max = df.max(axis=1)  # keep df intact instead of overwriting it
print(row_max)
0    2.0
1    3.2
2    8.8
3    7.8
dtype: float64

And if you need a new column:

df['max_value'] = df.max(axis=1)
print(df)
     a    b     c  max_value
0  1.2  2.0  0.10        2.0
1  2.1  1.1  3.20        3.2
2  0.2  1.9  8.80        8.8
3  3.3  7.8  0.12        7.8

2 of 2

6

You could use numpy

df.assign(max_value=df.values.max(1))

     a    b     c  max_value
0  1.2  2.0  0.10        2.0
1  2.1  1.1  3.20        3.2
2  0.2  1.9  8.80        8.8
3  3.3  7.8  0.12        7.8
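
As a quick check that the two answers agree, the sketch below rebuilds the a/b/c frame shown above and computes the row maxima both ways. It uses to_numpy() in place of .values, and the idxmax(axis=1) line is an addition that reports which column holds each row's maximum.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1.2, 2.1, 0.2, 3.3],
                   "b": [2.0, 1.1, 1.9, 7.8],
                   "c": [0.10, 3.20, 8.80, 0.12]})

row_max_pandas = df.max(axis=1)            # Series aligned to df.index
row_max_numpy = df.to_numpy().max(axis=1)  # plain ndarray, no index

# Name of the column holding each row's maximum.
row_argmax = df.idxmax(axis=1)

print(row_max_pandas.tolist(), row_argmax.tolist())
```

The pandas version keeps the index, which matters when the result is assigned back to a frame whose rows have been reordered or filtered.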

Statology

statology.org › home › pandas: how to find the max value in each row

Pandas: How to Find the Max Value in Each Row

February 16, 2023 - This particular syntax creates a new column called max that contains the max value in each row of the DataFrame. The following example shows how to use this syntax in practice. ... import pandas as pd import numpy as np #create DataFrame df ...

Dataquest Community

community.dataquest.io › q&a › dq courses

Pandas return row with the maximum value of a column - DQ Courses - Dataquest Community

Top answer

1 of 5

1

Hi @charlesd, You can try the following: df.loc[df['column_name'].idxmax()]. idxmax() returns the index label of the row with the highest value, and loc then returns the row with that label (iloc only works when the labels happen to equal the integer positions). WARNING: If there are multiple max values this method will only return the first row w…

2 of 5

0

Thanks! I’m getting a ‘Series’ object has no attribute ‘idmax’ error. Hmm, so when I was manipulating the dataset did I change this column to a series? I’ve checked and it says that the type of the column is ‘float64’.

Statology

statology.org › home › pandas: return row with max value in particular column

Pandas: Return Row with Max Value in Particular Column

July 11, 2022 - The row in index position 6 contained the max value in the points column, so a value of 6 was returned. Related: How to Use idxmax() Function in Pandas (With Examples)

Stack Abuse

stackabuse.com › how-to-get-the-max-element-of-a-pandas-dataframe-rows-columns-entire-dataframe

How to Get the Max Element of a Pandas DataFrame - Rows, Columns, Entire DataFrame

July 19, 2022 - The default value for the axis argument is 0. If the axis equals to 0, the max() method will find the max element of each column. On the other hand, if the axis equals to 1, the max() will find the max element of each row.

Data Science Dojo

discuss.datasciencedojo.com › python

What is the method to find the maximum value in each row of a dataframe? - Python - Data Science Dojo Discussions

February 22, 2023 - I wanted to get the name of a column, and the only method I know for this is idxmax(), which is used in Pandas to find the column with the maximum value in each row of the DataFrame, and then count the occurrences of each column to ...

GeeksforGeeks

geeksforgeeks.org › python › find-maximum-values-position-in-columns-and-rows-of-a-dataframe-in-pandas

Find maximum values & position in columns and rows of a Dataframe in Pandas - GeeksforGeeks

July 15, 2025 - Pandas dataframe.idxmax() method returns the index of the first occurrence of maximum over the requested axis. While finding the index of the maximum value across any index, all NA/null values are excluded.
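
The two behaviours described in that snippet, first occurrence on ties and skipping of missing values, can be seen in a small sketch. The frame and its labels are invented for illustration.

```python
import numpy as np
import pandas as pd

# Made-up frame: column 'x' has a tie for the maximum, 'y' has a missing value.
df = pd.DataFrame(
    {"x": [5.0, 9.0, 9.0, 1.0], "y": [np.nan, 2.0, 7.0, 3.0]},
    index=["r0", "r1", "r2", "r3"],
)

per_column = df.idxmax()  # label of the maximum in each column

print(per_column["x"], per_column["y"])
```

Column x reports 'r1', the first of the two tied rows, and column y reports 'r2' because the NaN in 'r0' is ignored.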

Pandas

pandas.pydata.org › pandas-docs › stable › generated › pandas.DataFrame.max.html

pandas.DataFrame.max — pandas 2.2.2 documentation

The page has been moved to this page

Stack Overflow

stackoverflow.com › questions › 16424493 › pandas-setting-no-of-max-rows

python - Pandas: Setting no. of max rows - Stack Overflow

Top answer

1 of 10

607

Set display.max_rows:

pd.set_option('display.max_rows', 500)

For older versions of pandas (<=0.11.0) you need to change both display.height and display.max_rows.

pd.set_option('display.height', 500)
pd.set_option('display.max_rows', 500)

See also pd.describe_option('display').

You can set an option only temporarily for this one time like this:

from IPython.display import display
with pd.option_context('display.max_rows', 100, 'display.max_columns', 10):
    display(df) #need display to show the dataframe when using with in jupyter
    #some pandas stuff

You can also reset an option back to its default value like this:

pd.reset_option('display.max_rows')

And reset all of them back:

pd.reset_option('all')
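
To see that the temporary form really is temporary, the following sketch reads the option before, inside, and after an option_context block. The variable names are illustrative only.

```python
import pandas as pd

before = pd.get_option("display.max_rows")

with pd.option_context("display.max_rows", 100):
    inside = pd.get_option("display.max_rows")  # 100 only inside the block

after = pd.get_option("display.max_rows")       # restored on exit

print(before, inside, after)
```

This makes option_context a reasonable choice inside library code, where changing a global display setting permanently would surprise the caller.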

2 of 10

88

pd.set_option('display.max_rows', 500)
df

Does not work in Jupyter!
Instead use:

pd.set_option('display.max_rows', 500)
df.head(500)

Stack Overflow

stackoverflow.com › questions › 15741759 › find-maximum-value-of-a-column-and-return-the-corresponding-row-values-using-pan

python - Find maximum value of a column and return the corresponding row values using Pandas - Stack Overflow

Top answer

1 of 12

246

Assuming df has a unique index, this gives the row with the maximum value:

In [34]: df.loc[df['Value'].idxmax()]
Out[34]: 
Country        US
Place      Kansas
Value         894
Name: 7

Note that idxmax returns index labels. So if the DataFrame has duplicates in the index, the label may not uniquely identify the row, so df.loc may return more than one row.

Therefore, if df does not have a unique index, you must make the index unique before proceeding as above. Depending on the DataFrame, sometimes you can use stack or set_index to make the index unique. Or, you can simply reset the index (so the rows become renumbered, starting at 0):

df = df.reset_index()
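
A minimal sketch of the reset_index route, using a made-up frame in which the label 7 appears twice. By default reset_index keeps the old labels in a column named 'index', so the original identifier is still available after the lookup.

```python
import pandas as pd

# Hypothetical frame with a duplicated index label (7 appears twice).
df = pd.DataFrame(
    {"Country": ["US", "US", "UK"],
     "Place": ["Kansas", "Texas", "London"],
     "Value": [894, 120, 300]},
    index=[7, 7, 9],
)

unique = df.reset_index()  # old labels move to an 'index' column
row = unique.loc[unique["Value"].idxmax()]

print(row["Place"], row["index"])
```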

2 of 12

129

df[df['Value']==df['Value'].max()]

This returns every row whose Value equals the column maximum, so ties yield more than one row.
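
The difference between the boolean mask and idxmax shows up when the maximum is tied. The sketch below uses invented data with two rows sharing the top Value.

```python
import pandas as pd

# Made-up data in which two rows share the maximum Value.
df = pd.DataFrame({"Place": ["Kansas", "Texas", "Ohio"],
                   "Value": [894, 894, 120]})

all_max = df[df["Value"] == df["Value"].max()]  # every tied row
first_max = df.loc[df["Value"].idxmax()]        # only the first one

print(all_max["Place"].tolist(), first_max["Place"])
```

Which one is appropriate depends on whether downstream code expects exactly one row or can handle several.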

Python Examples

pythonexamples.org › pandas-dataframe-maximum-max

Pandas DataFrame - Maximum Value - max() - Examples

To find the maximum value of a Pandas DataFrame, you can use the pandas.DataFrame.max() method. Using max(), you can find the maximum value along an axis: row-wise or column-wise, or the maximum of the entire DataFrame.

Arab Psychology

scales.arabpsychology.com › psychological scales › how do i find the max value in a row in pandas?

How Do I Find The Max Value In A Row In Pandas?

September 2, 2024 - To find the maximum value in a row in Pandas, you can use the .max() method with axis=1. For example, if you have a dataframe named df, df.max(axis=1) returns the maximum value in each row of the dataframe, whereas plain df.max() returns the maximum of each column.

thisPointer

thispointer.com › home › pandas › pandas: find maximum values & position in columns or rows of a dataframe

Pandas: Find maximum values & position in columns or rows of a Dataframe - thisPointer

November 7, 2019 -

# get the column name of max values in every row
maxValueIndexObj = dfObj.idxmax(axis=1)
print("Max values of row are at following columns :")
print(maxValueIndexObj)
...
It’s a series containing the row index labels as index and column names as values where the maximum value exists in that row.
...
import pandas as pd
import numpy as np

def main():
    # List of Tuples (np.nan replaces np.NaN, which NumPy 2.0 removed)
    matrix = [(22, 16, 23), (33, np.nan, 11), (44, 34, 11), (55, 35, np.nan), (66, 36, 13)]
    # Create a DataFrame object
    dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
    print('Original Dataframe Contents :')
    p

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.idxmax.html

pandas.DataFrame.idxmax — pandas 3.0.2 documentation

This method is the DataFrame version of ndarray.argmax. ... Consider a dataset containing food consumption in Argentina.

>>> df = pd.DataFrame(
...     {
...         "consumption": [10.51, 103.11, 55.48],
...         "co2_emissions": [37.2, 19.66, 1712],
...     },
...     index=["Pork", "Wheat Products", "Beef"],
... )
>>> df
                consumption  co2_emissions
Pork                  10.51          37.20
Wheat Products       103.11          19.66
Beef                  55.48        1712.00

By default, it returns the index for the maximum value in each column.

GeeksforGeeks

geeksforgeeks.org › select-row-with-maximum-and-minimum-value-in-pandas-dataframe

Select row with maximum and minimum value in Pandas dataframe - GeeksforGeeks

September 7, 2022 - We often need to do certain operations on both rows and columns while handling the data. Let's see how to sort rows in pandas DataFrame. Code #1: Sorting rows by Sc ... Let's see how we can select rows with maximum and minimum values in Pandas Dataframe with the help of different examples using Python.

Stack Overflow

stackoverflow.com › questions › 41108859 › python-pandas-find-the-maximum-for-each-row-in-a-dataframe-column-containing-a

Python Pandas: Find the maximum for each row in a dataframe column containing a numpy array - Stack Overflow

Top answer

1 of 2

4

I would just forget the 'max_val_idx' column. I don't think it saves time and actually is more of a pain for syntax. Sample data:

df = pd.DataFrame({ 'x': range(3) }).applymap( lambda x: np.random.randn(3) )

                                                   x
0  [-1.17106202376, -1.61211460669, 0.0198122724315]
1    [0.806819945736, 1.49139051675, -0.21434675401]
2  [-0.427272615966, 0.0939459129359, 0.496474566...

You could extract the max like this:

df.applymap( lambda x: x.max() )

          x  
0  0.019812
1  1.491391
2  0.496475

But generally speaking, life is easier if you have one number per cell. If each cell has an array of length 3, you could rearrange like this:

for i, v in enumerate(list('abc')): df[v] = df.x.map( lambda x: x[i] )
df = df[list('abc')]

          a         b         c
0 -1.171062 -1.612115  0.019812
1  0.806820  1.491391 -0.214347
2 -0.427273  0.093946  0.496475

And then do a standard pandas operation:

df.apply( max, axis=1 )

0    0.019812
1    1.491391
2    0.496475
dtype: float64

Admittedly, this is not much easier than above, but overall the data will be much easier to work with in this form.
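
The rearrangement into one number per cell can also be done in a single step by stacking the per-cell arrays. This is a sketch with made-up values, assuming every cell in column 'x' holds an array of the same length; the column names 'a', 'b', 'c' follow the answer above.

```python
import numpy as np
import pandas as pd

# Hypothetical column 'x' whose cells each hold a length-3 array.
df = pd.DataFrame({"x": [np.array([-1.2, -1.6, 0.02]),
                         np.array([0.8, 1.5, -0.2]),
                         np.array([-0.4, 0.09, 0.5])]})

# Stack the per-cell arrays into one 2-D block, one column per element.
wide = pd.DataFrame(np.vstack(df["x"].tolist()),
                    index=df.index, columns=list("abc"))

row_max = wide.max(axis=1)

print(row_max.tolist())
```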

2 of 2

2

I don't know how the speed of this will compare, since I'm constructing a 2D matrix of all the rows, but here's a possible solution:

>>> np.choose(df['max_val_idx'], np.array(df['values'].tolist()).T)
0   -0.611351
1   -0.990448
2   -1.012000
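
For comparison, the same selection can be written with integer array indexing, which some readers find easier to follow than np.choose. The sketch below uses invented data with the column names 'values' and 'max_val_idx' from the answer, and assumes all arrays have equal length.

```python
import numpy as np
import pandas as pd

# Hypothetical data shaped like the question: an array per row plus the
# position of the element to pick from it.
df = pd.DataFrame({
    "values": [np.array([0.5, -0.6, 0.1]),
               np.array([-0.9, 0.3, 0.2]),
               np.array([0.4, 0.7, -1.0])],
    "max_val_idx": [0, 1, 1],
})

matrix = np.array(df["values"].tolist())  # shape (n_rows, 3)
idx = df["max_val_idx"].to_numpy()

picked = matrix[np.arange(len(df)), idx]  # one element per row
via_choose = np.choose(idx, matrix.T)     # the answer's approach

print(picked.tolist())
```

Both forms pick matrix[i, idx[i]] for each row i, so they return the same values.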