How can I retrieve the k highest values in a data frame in pandas?
For example, given the DataFrame:
               b         d         e
Utah    1.624345 -0.611756 -0.528172
Ohio   -1.072969  0.865408 -2.301539
Texas   1.744812 -0.761207  0.319039
Oregon -0.249370  1.462108 -2.060141
Generated with:
import numpy as np
import pandas as pd
np.random.seed(1)
frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])
print(frame)
The 3 highest values in the data frame are 1.744812, 1.624345 and 1.462108.
The max() method returns a Series with the maximum value of each column. With axis='columns', max() instead scans across the columns of each row and returns the maximum value for each row.
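To make the two directions concrete, here is a minimal sketch. The frame is rebuilt from the rounded values printed in the question (an assumption, so that the example does not depend on the random seed), and the variable names are only illustrative.

```python
import pandas as pd

# Values copied from the example frame printed in the question.
frame = pd.DataFrame(
    {
        "b": [1.624345, -1.072969, 1.744812, -0.249370],
        "d": [-0.611756, 0.865408, -0.761207, 1.462108],
        "e": [-0.528172, -2.301539, 0.319039, -2.060141],
    },
    index=["Utah", "Ohio", "Texas", "Oregon"],
)

col_max = frame.max()                # one maximum per column (index: b, d, e)
row_max = frame.max(axis="columns")  # one maximum per row (index: Utah, Ohio, ...)
overall = frame.max().max()          # single largest value in the whole frame

print(col_max)
print(row_max)
print(overall)
```

Note that max() gives only one value per row or column, so on its own it cannot return the k largest cells of the whole frame.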
Use head(n) to get the first n rows of a DataFrame. It takes one optional argument, n (the number of rows to return from the top). The default is n = 5, so head() returns the first 5 rows when n is not passed.
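A short sketch of head() in use, again assuming a frame rebuilt from the values printed in the question. Combining it with sort_values is one possible way to get the top rows by a single column; that pairing is an illustration, not something the text above prescribes.

```python
import pandas as pd

# Values copied from the example frame printed in the question.
frame = pd.DataFrame(
    {
        "b": [1.624345, -1.072969, 1.744812, -0.249370],
        "d": [-0.611756, 0.865408, -0.761207, 1.462108],
        "e": [-0.528172, -2.301539, 0.319039, -2.060141],
    },
    index=["Utah", "Ohio", "Texas", "Oregon"],
)

first_two = frame.head(2)   # first two rows in their existing order
default_head = frame.head() # n defaults to 5; the frame has only 4 rows, so all are returned

# Top 2 rows ranked by column 'b': sort descending, then take the head.
top_two_by_b = frame.sort_values("b", ascending=False).head(2)

print(first_two)
print(top_two_by_b)
```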
Use Python's built-in min() and max() to find the smallest and largest values in your data. Call them with a single iterable or with two or more positional arguments.
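Both calling styles can be sketched with plain Python values. The list below reuses column 'b' from the question; heapq.nlargest is a standard-library function added here only to show how the same idea extends to the k largest items.

```python
import heapq

# Column 'b' of the example frame, as a plain Python list.
values = [1.624345, -1.072969, 1.744812, -0.249370]

largest = max(values)    # single iterable argument
smallest = min(values)

# Several positional arguments instead of one iterable.
largest_of_args = max(1.624345, -1.072969, 1.744812)

# Standard-library way to get the k largest items of an iterable.
top_three = heapq.nlargest(3, values)

print(largest, smallest, largest_of_args, top_three)
```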
DataFrames consist of rows, columns, and data. To get the most frequent value in a column, select the column with df['col_name'] and call mode(), which returns a Series holding the most frequent value (or several values if there is a tie).
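A small sketch of mode() follows. The 'state' column and its contents are made up for illustration; value_counts() is shown alongside because mode() returns the value itself, not how often it occurs.

```python
import pandas as pd

# Hypothetical data, invented for this example.
df = pd.DataFrame({"state": ["Utah", "Ohio", "Utah", "Texas", "Ohio", "Utah"]})

most_common = df["state"].mode()     # Series; has several entries when values tie
counts = df["state"].value_counts()  # occurrences per value, most frequent first

print(most_common.tolist())
print(counts)
```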
You can use pandas.DataFrame.stack + pandas.Series.nlargest, e.g.:
In [183]: frame.stack().nlargest(3)
Out[183]:
Texas   b    1.744812
Utah    b    1.624345
Oregon  d    1.462108
dtype: float64
or, if you only need the values without the row and column labels:
In [184]: frame.stack().nlargest(3).reset_index(drop=True)
Out[184]:
0    1.744812
1    1.624345
2    1.462108
dtype: float64
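As a self-contained sketch of the same approach, the snippet below wraps it in a small helper. The name top_k is hypothetical (it is not part of pandas), and the frame is rebuilt from the rounded values printed in the question rather than from the random seed.

```python
import pandas as pd

# Values copied from the example frame printed in the question.
frame = pd.DataFrame(
    {
        "b": [1.624345, -1.072969, 1.744812, -0.249370],
        "d": [-0.611756, 0.865408, -0.761207, 1.462108],
        "e": [-0.528172, -2.301539, 0.319039, -2.060141],
    },
    index=["Utah", "Ohio", "Texas", "Oregon"],
)


def top_k(df, k):
    """Return the k largest cells of df as a Series indexed by (row, column)."""
    # stack() reshapes the frame into one long Series with a (row, column)
    # MultiIndex; nlargest(k) then picks the k largest entries, largest first.
    return df.stack().nlargest(k)


top3 = top_k(frame, 3)
print(top3.tolist())        # just the values
print(top3.index.tolist())  # (row label, column label) for each value
```

Keeping the MultiIndex is useful when you also need to know where each value came from; reset_index(drop=True) discards that information.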