
Python: Removing Rows on Count condition

I have a problem filtering a pandas dataframe.

city
NYC
NYC
NYC
NYC
SYD
SYD
SEL
SEL
...

df.city.value_counts()

I would like to remove the rows for cities that appear fewer than 4 times, which would be SYD and SEL in this example.

What would be the way to do so without manually dropping them city by city?

Devin Lee asked Apr 09 '18 14:04



1 Answer

Here you go, using filter:

df.groupby('city').filter(lambda x : len(x)>3)
Out[1743]:
   city
0  NYC
1  NYC
2  NYC
3  NYC
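To try the filter approach end to end, here is a minimal self-contained sketch. The frame is rebuilt by hand from the sample values shown in the question, and the names `min_count` and `kept` are illustrative assumptions, not part of the original post.

```python
import pandas as pd

# Sample data rebuilt from the question: NYC x4, SYD x2, SEL x2.
df = pd.DataFrame({"city": ["NYC"] * 4 + ["SYD"] * 2 + ["SEL"] * 2})

# Hypothetical name: keep only cities that appear at least this many times.
min_count = 4

# filter() calls the lambda once per group and keeps every row of the
# groups for which it returns True; the original index is preserved.
kept = df.groupby("city").filter(lambda g: len(g) >= min_count)

print(kept)
```

Writing the condition as `len(g) >= min_count` is equivalent to the `len(x) > 3` used above, but keeps the threshold from the question ("fewer than 4") visible in one place.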

Solution two, using transform:

sub_df = df[df.groupby('city').city.transform('count')>3].copy()  # .copy() avoids a SettingWithCopyWarning if you later modify sub_df
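The transform version can be easier to follow when split into steps. The sketch below assumes the same hand-built sample frame as in the question; the names `counts`, `mask`, `flag` and `alt` are illustrative. The last line shows a `value_counts` plus `map` variant, which is not from the answer above and is included only as an alternative that avoids `groupby`.

```python
import pandas as pd

# Sample data rebuilt from the question: NYC x4, SYD x2, SEL x2.
df = pd.DataFrame({"city": ["NYC"] * 4 + ["SYD"] * 2 + ["SEL"] * 2})

# transform('count') returns a Series aligned with df's index, where
# each row holds the size of the group that row belongs to.
counts = df.groupby("city")["city"].transform("count")

# Boolean mask: True for rows whose city appears more than 3 times.
mask = counts > 3

# .copy() makes sub_df independent of df, so adding a column is safe.
sub_df = df[mask].copy()
sub_df["flag"] = True  # illustrative column; df itself is unchanged

# Alternative (an assumption, not the answer's method): look up each
# row's city in the value_counts() table and compare to the threshold.
alt = df[df["city"].map(df["city"].value_counts()) > 3]

print(sub_df)
```

Unlike filter, which calls a Python lambda per group, the transform approach builds a single boolean mask, which tends to scale better when there are many distinct cities.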
BENY answered Sep 21 '22 15:09