As pointed out above, it gives "Down" arbitrarily, but not randomly. On the same machine with the same Pandas version, running the above code should always yield the same result (although this is not guaranteed by the docs; see the comments below).

Let's reproduce what's happening.

Given this series:

import pandas as pd

abc = pd.Series(list("abcdefghijklmnoppqq"))

The value_counts implementation boils down to this:

import numpy as np
import pandas._libs.hashtable as htable

keys, counts = htable.value_count_object(np.asarray(abc), True)
result = pd.Series(counts, index=keys)

result:

g    1
e    1
f    1
h    1
o    1
d    1
b    1
q    2
j    1
k    1
i    1
p    2
n    1
l    1
c    1
m    1
a    1
dtype: int64

The order of the result is determined by the hash table implementation, and it is the same for every call.
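You do not actually have to call into the private hashtable module to see an unsorted result: passing sort=False to value_counts skips the final sort. Note this is only illustrative — depending on your pandas version, the unsorted result may come back in raw hash-table order or in order of first appearance:

```python
import pandas as pd

abc = pd.Series(list("abcdefghijklmnoppqq"))

# sort=False skips the final quicksort; the resulting order is
# version-dependent (hash-table order vs. order of first appearance)
unsorted = abc.value_counts(sort=False)
print(unsorted)
```

Either way, the counts themselves are the same: 17 distinct letters, with "p" and "q" each appearing twice.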

For more detail on the hashing, you could look into the implementation of value_count_object, which calls build_count_table_object, which in turn uses the khash implementation.

After computing the table, the value_counts implementation sorts the results with quicksort. This sort is not stable, and with this specially constructed example it reorders "p" and "q":

result.sort_values(ascending=False)

q    2
p    2
a    1
e    1
f    1
h    1
o    1
d    1
b    1
j    1
m    1
k    1
i    1
n    1
l    1
c    1
g    1
dtype: int64

Thus there are potentially two factors for the ordering: first the hashing, and second the non-stable sort.

The displayed top value is then just the first entry of the sorted list, in this case, "q".
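A quick sanity check (not part of the original answer): since describe() derives top from the same sorted value_counts result, comparing it against the first index entry of value_counts() should agree regardless of which tied value wins on a given machine:

```python
import pandas as pd

abc = pd.Series(list("abcdefghijklmnoppqq"))

# describe()'s top is just the first entry of the sorted counts
top_from_counts = abc.value_counts().index[0]
print(top_from_counts, abc.describe().top)
```

Whichever of "p" or "q" your machine produces, the two values match.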

On my machine, quicksort becomes non-stable at 17 entries, which is why I chose the example above.
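The tie-swapping can be reproduced outside pandas with NumPy, which offers the same sort kinds. This sketch mimics the structure above — 17 counts with two ties of 2; the positions 7 and 11 are hypothetical stand-ins for where "q" and "p" happen to land in the hash-table order:

```python
import numpy as np

# 17 counts of 1, plus two tied counts of 2 at arbitrary positions
counts = np.ones(17, dtype=np.int64)
counts[7] = 2   # stand-in for "q"
counts[11] = 2  # stand-in for "p"

# sort descending by negating; quicksort is pandas' default kind
qs = np.argsort(-counts, kind="quicksort")
st = np.argsort(-counts, kind="stable")

print(qs[:2])  # some order of 7 and 11 -- quicksort may swap the tie
print(st[:2])  # [7 11] -- the stable kind preserves the original order
```

Both sorts put the two tied positions first; only the stable kind guarantees they stay in their original relative order.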

We can test the non-stable sort with this direct comparison:

pd.Series(list("abcdefghijklmnoppqq")).describe().top
'q'

pd.Series(list(               "ppqq")).describe().top
'p'
Answer from w-m on Stack Overflow

August 9, 2021 - Note: If there are missing values in any columns, pandas will automatically exclude these values when calculating the descriptive statistics. To calculate descriptive statistics for every column in the DataFrame, we can use the include=’all’ argument: #generate descriptive statistics for all columns df.describe(include='all') team points assists rebounds count 8 8.000000 8.00000 8.000000 unique 3 NaN NaN NaN top B NaN NaN NaN freq 3 NaN NaN NaN mean NaN 20.250000 7.75000 8.375000 std NaN 6.158618 2.54951 2.559994 min NaN 12.000000 4.00000 5.000000 25% NaN 14.750000 6.50000 6.000000 50% NaN 21.000000 8.00000 8.500000 75% NaN 25.000000 9.00000 10.250000 max NaN 29.000000 12.00000 12.000000