
Python: Removing Rows on Count condition

Tags: python, indexing, pandas, dataframe, counter

I have a problem filtering a pandas dataframe.


    city
    NYC
    NYC
    NYC
    NYC
    SYD
    SYD
    SEL
    SEL
    ...

    df.city.value_counts()

I would like to remove the rows for cities that appear fewer than 4 times, which would be SYD and SEL in this example.

How can I do this without manually dropping them city by city?


asked Apr 09 '18 14:04

Devin Lee

1 Answer

One way is groupby with filter, keeping only the cities that have more than 3 rows:


    df.groupby('city').filter(lambda x: len(x) > 3)
    Out[1743]:
      city
    0  NYC
    1  NYC
    2  NYC
    3  NYC
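To try the filter approach end to end, here is a minimal sketch. The DataFrame construction is an assumption modeled on the sample in the question (the original data is elided), and min_count is a hypothetical name introduced only for illustration.

```python
import pandas as pd

# Assumed sample data modeled on the question: NYC x4, SYD x2, SEL x2
df = pd.DataFrame({"city": ["NYC"] * 4 + ["SYD"] * 2 + ["SEL"] * 2})

min_count = 4  # hypothetical threshold name; equivalent to len(x) > 3

# filter() evaluates the function once per group and keeps
# every row of the groups for which it returns True
filtered = df.groupby("city").filter(lambda g: len(g) >= min_count)

print(filtered)
```

Note that filter keeps the original row index, so the surviving rows are still labeled 0 through 3 here.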

Solution two uses transform to build a boolean mask aligned with the original rows:


    # .copy() avoids a SettingWithCopyWarning if you later modify sub_df
    sub_df = df[df.groupby('city').city.transform('count') > 3].copy()
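The transform approach can be checked the same way. The sketch below again assumes sample data modeled on the question. It also shows a variant that is not part of the answer above: looking up each row's frequency from value_counts(), which the question already computes. The names counts_per_row, freq, and alt_df are illustrative only.

```python
import pandas as pd

# Assumed sample data modeled on the question: NYC x4, SYD x2, SEL x2
df = pd.DataFrame({"city": ["NYC"] * 4 + ["SYD"] * 2 + ["SEL"] * 2})

# transform returns one value per original row (the size of that row's group),
# so the comparison yields a boolean mask with the same index as df
counts_per_row = df.groupby("city")["city"].transform("count")
sub_df = df[counts_per_row > 3].copy()

# Variant (an alternative, not from the answer): map each city to its
# frequency from value_counts() and filter on that
freq = df["city"].value_counts()
alt_df = df[df["city"].map(freq) > 3].copy()

print(sub_df)
print(alt_df)
```

One design point worth knowing: 'count' ignores missing values in the counted column, so if the column can contain NaN and those rows should count toward the total, 'size' may be the better aggregation.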


answered Sep 21 '22 15:09

BENY
