Brave Search

Can I use idxmax() instead of argmax() in all cases?

stackoverflow.com › questions › 47596390 › can-i-use-idxmax-instead-of-argmax-in-all-cases

There is a difference. pd.DataFrame and pd.Series objects have an index whose labels need not be consecutive integers. When you don't specify an index during creation, the labels default to [0 ... n), so labels and integer positions coincide. That is why people often confuse the two.

Consider this parabola

import pandas as pd
import numpy as np

data = pd.Series(16 - np.arange(-4,5) ** 2)

0     0
1     7
2    12
3    15
4    16
5    15
6    12
7     7
8     0
dtype: int64

The labels are set to [0 ... 9), because we didn't specify them. In this case, both data.argmax() and data.idxmax() result in 4, because that's the integer position and label for 16.

However, if we filter out the odd values, the index isn't consecutive anymore:

filtered = data[data % 2 == 0]

0     0
2    12
4    16
6    12
8     0
dtype: int64

Here, filtered.argmax() returns 2 whereas filtered.idxmax() returns 4.

This is particularly relevant when you want to look up values in data using entries extracted from filtered. For example, data.loc[4] uses the label returned by filtered.idxmax() to retrieve the maximum value from the unfiltered series.

Answer from Herbert on Stack Overflow
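The difference described in the answer above can be checked with a short script. This is a minimal sketch, assuming pandas 1.0 or later, where Series.argmax() returns the integer position; the variable names position, label, max_by_position and max_by_label are illustrative and not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same parabola as in the answer: labels default to 0..8.
data = pd.Series(16 - np.arange(-4, 5) ** 2)

# Keep only the even values; the surviving labels are 0, 2, 4, 6, 8.
filtered = data[data % 2 == 0]

position = filtered.argmax()  # integer position inside `filtered` (assumes pandas >= 1.0)
label = filtered.idxmax()     # index label, still valid for `data`

print(position, label)

# A position pairs with .iloc, a label pairs with .loc.
max_by_position = filtered.iloc[position]
max_by_label = data.loc[label]
print(max_by_position, max_by_label)
```

Using the position with .loc (or the label with .iloc) would silently pick a different row here, which is the confusion the answer warns about.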

reddit.com › r/learnpython › [pandas] idxmax() vs argmax()

r/learnpython on Reddit: [pandas] idxmax() vs argmax()

November 19, 2018

The current behavior of ‘Series.argmax’ is deprecated, use ‘idxmax’. This is what I found written in the documentation. Can you tell me what the differences between these two are? Where should I use one or the other?

What are the advantages of one over the other?

Top answer

1 of 1

2

idxmax returns the index label of the maximum value. argmax does too, but this was a consequence of changing the pandas.Series object to no longer subclass numpy.ndarray. It used to just call numpy.ndarray.argmax, which returns the positional index, but when that no longer made sense it became an alias for idxmax. The reason why it is deprecated is because you now have a situation where numpy.ndarray.argmax and pandas.Series.argmax return two different things, which just introduces confusion, especially when that thing is already done by pandas.Series.idxmax. If you really need the positional index of the maximum, use np.argmax(my_series.values), in all other cases use my_series.idxmax().
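The two lookups recommended in the answer above can be compared side by side. This is a small sketch with made-up data; my_series and its labels are hypothetical, and to_numpy() is used here in place of .values as an assumption about current pandas style.

```python
import numpy as np
import pandas as pd

# Hypothetical series with string labels, so position and label cannot be confused.
my_series = pd.Series([3.0, 9.5, 1.2], index=["x", "y", "z"])

# Positional index of the maximum, computed on the underlying NumPy array.
position = int(np.argmax(my_series.to_numpy()))

# Index label of the maximum.
label = my_series.idxmax()

print(position, label)

# With a unique index, the two can be converted into each other.
label_from_position = my_series.index[position]
position_from_label = my_series.index.get_loc(label)
```

The conversion via index.get_loc returns a plain integer only when the index labels are unique.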

Videos

08:37

YouTube

Understanding and using idxmin/idxmax in Pandas - YouTube

May 6, 2024

10:27

YouTube

Pandas library session 13 - IdxMax - IdxMin - ArgMax - ArgMin - ...

How to Find Index of Max Value in a Series or DataFrame | pandas ...

July 19, 2023

03:43

YouTube

Pandas Index Max | pd.DataFrame.idxmax() - YouTube

September 12, 2020

Xarray

docs.xarray.dev › en › stable › generated › xarray.DataArray.idxmax.html

xarray.DataArray.idxmax

In comparison to argmax(), this returns the coordinate label while argmax() returns the index. ... dim (Hashable, optional) – Dimension over which to apply idxmax.

Pandas

pandas.pydata.org › pandas-docs › version › 0.25 › reference › api › pandas.Series.argmax.html

pandas.Series.argmax — pandas 0.25.3 documentation

The current behaviour of ‘Series.argmax’ is deprecated, use ‘idxmax’ instead. The behavior of ‘argmax’ will be corrected to return the positional maximum in the future.

GitHub

github.com › pandas-dev › pandas › issues › 16830

API: .argmax should be positional, not an alias for idxmax · Issue #16830 · pandas-dev/pandas

July 5, 2017 - In #6214 it was reported that argmax changed (I think since Series didn't subclass ndarray anymore). It's sometimes useful to get the index position of the max, so it'd be nice if .argmax did that, leaving idxmax to always be label-based...

Author TomAugspurger

Pandas

pandas.pydata.org › docs › reference › api › pandas.Series.idxmax.html

pandas.Series.idxmax — pandas 3.0.1 documentation

This method is the Series version of ndarray.argmax. This method returns the label of the maximum, while ndarray.argmax returns the position.

Pandas

pandas.pydata.org › docs › reference › api › pandas.Series.argmax.html

pandas.Series.argmax — pandas 3.0.2 documentation

numpy.ndarray.argmax: Equivalent method for numpy arrays.
Series.idxmax: Return index label of the maximum values.
Series.idxmin: Return index label of the minimum values.

Examples

Consider dataset containing cereal calories

>>> s = pd.Series( ...

GitHub

github.com › JasonKessler › agefromname › issues › 3

Pandas argmax() is deprecated - replace with idxmax() · Issue #3 · JasonKessler/agefromname

January 24, 2018 - Currently when you call agefromname.argmax('Jason', 'm'), you'll get a FutureWarning: 'argmax' is deprecated. Use 'idxmax' instead. The behavior of 'argmax' will be corrected to return the positional maximum in the future. Use 'series.va...

Author ychennay

w3resource

w3resource.com › pandas › series › series-idxmax.php

Pandas Series: idxmax() function - w3resource

September 15, 2022 - The idxmax() function is used to get the row label of the maximum value. If multiple values equal the maximum, the first row label with that value is returned. ... Returns: Index Label of the maximum value.

Pandas

pandas.pydata.org › docs › reference › api › pandas.DataFrame.idxmax.html

pandas.DataFrame.idxmax — pandas 3.0.2 documentation

This method is the DataFrame version of ndarray.argmax. ...

Consider a dataset containing food consumption in Argentina.

>>> df = pd.DataFrame(
...     {
...         "consumption": [10.51, 103.11, 55.48],
...         "co2_emissions": [37.2, 19.66, 1712],
...     },
...     index=["Pork", "Wheat Products", "Beef"],
... )
>>> df
                consumption  co2_emissions
Pork                  10.51         37.20
Wheat Products       103.11         19.66
Beef                  55.48       1712.00

By default, it returns the index for the maximum value in each column.

>>> df.idxmax...
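The documentation snippet above is cut off before the call itself. The following sketch reuses the same data to show what DataFrame.idxmax returns per column and, as an additional illustration, per row; the names per_column and per_row are chosen here for the example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "consumption": [10.51, 103.11, 55.48],
        "co2_emissions": [37.2, 19.66, 1712],
    },
    index=["Pork", "Wheat Products", "Beef"],
)

# Default (axis=0): for each column, the row label holding the maximum.
per_column = df.idxmax()
print(per_column)

# axis="columns": for each row, the column label holding the maximum.
per_row = df.idxmax(axis="columns")
print(per_row)
```

In both cases the result is a Series of labels, not of positions or values.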

GitHub

github.com › pandas-dev › pandas › issues › 51276

BUG: pd.Series idxmax raises ValueError instead of returning <NA> when all values are <NA> · Issue #51276 · pandas-dev/pandas

February 9, 2023 - According to documentation, when pd.Series contains all NaN values, calling idxmax with skipna=True should return NaN. However, in this case it raises "ValueError: attempt to get argmax of an empty sequence" instead. This issue only happens when I used convert_dtypes() on the Series before calling idxmax.

Author hchau630

Xarray

docs.xarray.dev › en › latest › generated › xarray.Dataset.idxmax.html

xarray.Dataset.idxmax

In comparison to argmax(), this returns the coordinate label while argmax() returns the index. ... dim (str, optional) – Dimension over which to apply idxmax.

Stack Overflow

stackoverflow.com › questions › 53059843 › series-max-and-idxmax

python - Series. max and idxmax - Stack Overflow

Top answer

1 of 2

2

What about a custom function? Something like

import numpy as np
import pandas as pd

s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])

def Max_Argmax(series):  # takes as input your series
    values = series.values  # store numeric values
    indexes = series.index  # store index labels
    argmax = np.argmax(values)  # integer position of the max
    return values[argmax], indexes[argmax]  # return max and corresponding label

# use names that do not shadow the built-in max
max_value, max_label = Max_Argmax(s)

I run it on my PC and I get:

>>> s
a   -1.854440
b    0.302282
c   -0.630175
d   -1.012799
e    0.239437
dtype: float64

>>> max_value
0.3022819091746019

>>> max_label
'b'

Hope it helps!

2 of 2

2

As Jon Clements mentioned:

In [3]: s = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])
In [4]: x, y = s.agg(['max', 'idxmax'])
In [5]: x
Out[5]: 1.6339096862287581
In [6]: y
Out[6]: 'b'
In [7]: s
Out[7]: a    1.245039
        b    1.633910
        c    0.619384
        d    0.369604
        e    1.009942
        dtype: float64

In response to asking for a tuple:

def max_and_index(series):
    """Return a tuple of (max, idxmax) from a pandas.Series"""
    x, y = series.agg(['max', 'idxmax'])
    return x, y

t = max_and_index(s)
print(t)
(1.6339096862287581, 'b')
print(type(t))
<class 'tuple'>

Even smaller:

def max_and_idxmax(series):
    """Return a tuple of (max, idxmax) from a pandas.Series"""
    return series.max(), series.idxmax()

If you need speed, use the NumPy-based method above (Max_Argmax):

import pandas as pd
import numpy as np


def max_and_index(series):
    x, y = series.agg(['max', 'idxmax'])
    return x, y

def max_and_idxmax(series):
    return series.max(), series.idxmax()

def np_max_and_argmax(series):
    return np.max(series.values), np.argmax(series.values)

def Max_Argmax(series):
    v = series.values
    i = series.index
    arg = np.argmax(v)
    return v[arg], i[arg]


a = []
for i in range(2,9,1):
    a.append(pd.Series(np.random.randint(0, 100, size=10**i)))
    print('{}\t{:>11,}'.format(i-2, 10**i))

# 0            100
# 1          1,000
# 2         10,000
# 3        100,000
# 4      1,000,000
# 5     10,000,000
# 6    100,000,000

idx = 5
%%timeit -n 2 -r 10
max_and_index(a[idx])
# 144 ms ± 5.45 ms per loop (mean ± std. dev. of 10 runs, 2 loops each)

%%timeit -n 2 -r 10
max_and_idxmax(a[idx])
# 143 ms ± 5.14 ms per loop (mean ± std. dev. of 10 runs, 2 loops each)

%%timeit -n 2 -r 10
Max_Argmax(a[idx])
# 9.89 ms ± 1.13 ms per loop (mean ± std. dev. of 10 runs, 2 loops each)

%%timeit -n 2 -r 10
np_max_and_argmax(a[idx])
# 24.5 ms ± 1.74 ms per loop (mean ± std. dev. of 10 runs, 2 loops each)
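The timings above rely on IPython's %%timeit magic. For readers outside a notebook, the comparison can be approximated with the standard-library timeit module. This is a rough sketch, not a reproduction of the original benchmark: the series size, the repeat count and the helper name max_argmax_numpy are assumptions, and no timings are claimed since they depend on the machine and library versions.

```python
import timeit

import numpy as np
import pandas as pd


def max_and_idxmax(series):
    """Pure pandas: maximum value and its index label."""
    return series.max(), series.idxmax()


def max_argmax_numpy(series):
    """NumPy-based variant: find the position once, then look up value and label."""
    values = series.to_numpy()
    pos = np.argmax(values)
    return values[pos], series.index[pos]


# Assumed test data: 100,000 random integers in [0, 100).
rng = np.random.default_rng(0)
s = pd.Series(rng.integers(0, 100, size=10**5))

# Both variants should agree before their speed is compared.
results_match = max_and_idxmax(s) == max_argmax_numpy(s)
print("results match:", results_match)

for func in (max_and_idxmax, max_argmax_numpy):
    seconds = timeit.timeit(lambda: func(s), number=20)
    print(f"{func.__name__}: {seconds / 20 * 1e3:.3f} ms per call")
```

Note that np_max_and_argmax in the answer returns the integer position rather than the label, so it matches the other functions only when the index is the default 0..n-1 range.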

Dask

docs.dask.org › en › stable › generated › dask.dataframe.DataFrame.idxmax.html

dask.dataframe.DataFrame.idxmax — Dask documentation

This method is the DataFrame version of ndarray.argmax. ...

Consider a dataset containing food consumption in Argentina.

>>> df = pd.DataFrame(
...     {
...         "consumption": [10.51, 103.11, 55.48],
...         "co2_emissions": [37.2, 19.66, 1712],
...     },
...     index=["Pork", "Wheat Products", "Beef"],
... )
>>> df
                consumption  co2_emissions
Pork                  10.51         37.20
Wheat Products       103.11         19.66
Beef                  55.48       1712.00

By default, it returns the index for the maximum value in each column.

>>> df.idxmax...

NumPy

numpy.org › doc › stable › reference › generated › numpy.argmax.html

numpy.argmax — NumPy v2.4 Manual

>>> b = np.arange(6)
>>> b[1] = 5
>>> b
array([0, 5, 2, 3, 4, 5])
>>> np.argmax(b)  # Only the first occurrence is returned.
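The first-occurrence rule in the NumPy example above also applies to pandas. The sketch below wraps the same array in a Series with hypothetical letter labels and, as one possible workaround, collects every label that ties for the maximum; the names s and all_max_labels are illustrative.

```python
import numpy as np
import pandas as pd

b = np.arange(6)
b[1] = 5  # the maximum 5 now appears at positions 1 and 5

# Hypothetical labels "a".."f" for the same values.
s = pd.Series(b, index=list("abcdef"))

first_position = int(np.argmax(b))  # first occurrence only
first_label = s.idxmax()            # first occurrence only

# All labels whose value equals the maximum.
all_max_labels = s.index[s == s.max()].tolist()

print(first_position, first_label, all_max_labels)
```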

GeeksforGeeks

geeksforgeeks.org › pandas › python-pandas-series-argmax

Python | Pandas Series.argmax() - GeeksforGeeks

February 27, 2019 - Returns : idxmax : Index of maximum of values. Example #1: Use Series.argmax() function to return the row label of the maximum value in the given series object

Stack Overflow

stackoverflow.com › questions › 48646431 › argmax-or-idxmax-is-not-providing-right-index-where-the-maximum-value-is-loc

python - argmax() or idxmax() is not providing right index where the maximum value is located - Stack Overflow

I have a subset dataframe and am trying to find the index at which the volume column has its maximum. In this case it should be index 1428, but argmax or idxmax gives 1431 combine1 ...

Stack Overflow

stackoverflow.com › questions › 60083433 › what´s-the-difference-between-idxmax-and-max-inside-a-groupby-pandas

python - What´s the difference between idxmax() and max() inside a groupby pandas - Stack Overflow

Top answer

1 of 1

5

max() simply returns the maximum value.

idxmax() returns the index of the (first occurrence of the) maximum value, not the maximum value itself.
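Inside a groupby, the distinction above looks as follows. This is a minimal sketch with invented data; the column names team and score and the index labels 10 to 13 are assumptions made for the example.

```python
import pandas as pd

# Hypothetical data with a non-default index so that labels are easy to spot.
df = pd.DataFrame(
    {"team": ["a", "a", "b", "b"], "score": [1, 5, 7, 3]},
    index=[10, 11, 12, 13],
)

# max(): the largest score within each team.
max_per_team = df.groupby("team")["score"].max()

# idxmax(): the index label of the row holding that largest score.
idx_per_team = df.groupby("team")["score"].idxmax()

# The labels can be fed to .loc to recover the complete winning rows.
best_rows = df.loc[idx_per_team]

print(max_per_team)
print(idx_per_team)
print(best_rows)
```

Passing the idxmax() result to .loc is a common way to keep the other columns of the row with the group maximum, which max() alone does not provide.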