Get top n values per category in pandas retaining all columns

Tags: python, pandas

After some transformations I got the following dataframe. How do I get the top n records per group, grouping by one column (here short_name) and ranking by another (here frequency)? I read this post, but the problem with both solutions is that they drop the other columns (product_id here): they retain only the grouped column, and I need to keep them all.

short_name          product_id    frequency
Yoghurt y cereales  975009684     32
Yoghurt y cereales  975009685     21
Yoghurt y cereales  975009700     16
Yoghurt y Cereales  21097         16
Yoghurt Bebible     21329         68
Yoghurt Bebible     21328         67
Yoghurt Bebible     21500         31


asked Jun 23 '17 18:06

Alberto Bonsanto

1 Answer

I'd try the nlargest method, applied to each group:

In [5]: df.groupby('short_name', as_index=False).apply(lambda x: x.nlargest(2, 'frequency'))
Out[5]:
             short_name  product_id  frequency
0 4     Yoghurt Bebible       21329         68
  5     Yoghurt Bebible       21328         67
1 3  Yoghurt y Cereales       21097         16
2 0  Yoghurt y cereales   975009684         32
  1  Yoghurt y cereales   975009685         21
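
As an alternative sketch (not part of the original answer), the same result can usually be reached without apply: sort by frequency in descending order, then take the first n rows of each group with head. The helper name top_n_per_group is hypothetical, and the sample data is copied from the question. All columns are retained and the original row index is kept.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "short_name": [
        "Yoghurt y cereales", "Yoghurt y cereales", "Yoghurt y cereales",
        "Yoghurt y Cereales", "Yoghurt Bebible", "Yoghurt Bebible",
        "Yoghurt Bebible",
    ],
    "product_id": [975009684, 975009685, 975009700, 21097, 21329, 21328, 21500],
    "frequency": [32, 21, 16, 16, 68, 67, 31],
})


def top_n_per_group(frame, group_col, value_col, n):
    """Hypothetical helper: the n rows with the largest value_col per group."""
    # Put the largest values first, then keep the first n rows of each group.
    ordered = frame.sort_values(value_col, ascending=False)
    return ordered.groupby(group_col, sort=False).head(n)


top2 = top_n_per_group(df, "short_name", "frequency", 2)
print(top2)
```

Note that "Yoghurt y Cereales" and "Yoghurt y cereales" differ in case, so both approaches treat them as separate groups. If they are meant to be one category, the names would need to be normalized (for example with str.lower()) before grouping.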

answered Sep 18 '22 13:09

MaxU - stop WAR against UA
