The std reported by describe() is the corrected sample standard deviation (normalized by N-1).
You can convince yourself of this by taking a simple Series and applying the formula by hand:
In [11]: s = pd.Series([1, 2])
In [12]: s.std()
Out[12]: 0.70710678118654757
In [13]: from math import sqrt
....: sqrt(0.5)
Out[13]: 0.7071067811865476
and the formula for corrected sample standard deviation:
In [14]: sqrt(1./(len(s)-1) * ((s - s.mean()) ** 2).sum())
Out[14]: 0.7071067811865476
Answer from Andy Hayden on Stack Overflow
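Written out, the expression evaluated in In [14] is the usual corrected sample standard deviation. The notation here is an assumption for illustration and not part of the original answer: N stands for len(s), x_i for the values of s, and \bar{x} for s.mean().

```latex
s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} \left(x_i - \bar{x}\right)^2}
% For the Series [1, 2]: N = 2, \bar{x} = 1.5, so
% s = \sqrt{\frac{1}{1}\left(0.25 + 0.25\right)} = \sqrt{0.5} \approx 0.7071
```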
DataFrame.describe() calls Series.std() to get the standard deviation. As the documentation for Series.std() says:
"Return unbiased standard deviation over requested axis.
Normalized by N-1 by default. This can be changed using the ddof argument."
So the standard deviation returned by describe() is, in fact, the "corrected sample standard deviation".
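As a rough check of the ddof behaviour quoted above, the sketch below compares the default (N-1) result with ddof=0 on the same two-element Series, and with NumPy's np.std, which defaults to ddof=0. The variable names are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series([1, 2])

# pandas default: ddof=1, i.e. divide by N - 1 (corrected sample std)
sample_std = s.std()

# ddof=0 divides by N instead (population std)
population_std = s.std(ddof=0)

# NumPy's np.std uses ddof=0 unless told otherwise
numpy_std = np.std(s.to_numpy())

# describe() reports the same value as the default Series.std()
describe_std = pd.DataFrame({"x": [1, 2]}).describe().loc["std", "x"]

print(sample_std, population_std, numpy_std, describe_std)
```

If the population figure is what you want, computing s.std(ddof=0) separately is probably the simplest route, since describe() itself reports the N-1 version.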
The describe() method returns a DataFrame whose index holds the statistic names (count, std, max, ...), so you can use .loc to select just the rows you want:
df.describe().loc[['count','max']]
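A minimal, self-contained sketch of that selection follows. The DataFrame and its column names "a" and "b" are made up for illustration.

```python
import pandas as pd

# Hypothetical data, just to have something to describe
df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10.0, 20.0, 30.0, 40.0]})

summary = df.describe()

# A list of labels selects several statistic rows at once
subset = summary.loc[["count", "max"]]
print(subset)

# A (row, column) pair selects a single statistic for one column
std_a = summary.loc["std", "a"]
print(std_a)
```

Passing a list to .loc keeps the result as a DataFrame, while a (row, column) pair returns a scalar.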
For a Series, describe() returns a Series, so you can select just the entries you want by label:
In [6]: s = pd.Series(np.random.rand(10))
In [7]: s
Out[7]:
0 0.302041
1 0.353838
2 0.421416
3 0.174497
4 0.600932
5 0.871461
6 0.116874
7 0.233738
8 0.859147
9 0.145515
dtype: float64
In [8]: s.describe()
Out[8]:
count 10.000000
mean 0.407946
std 0.280562
min 0.116874
25% 0.189307
50% 0.327940
75% 0.556053
max 0.871461
dtype: float64
In [9]: s.describe()[['count','mean']]
Out[9]:
count 10.000000
mean 0.407946
dtype: float64