How do I filter a pandas DataFrame based on value counts?

I'm working in Python with a pandas DataFrame of video games, each with a genre. I'm trying to remove any video game whose genre appears fewer than some number of times in the DataFrame, but I have no clue how to go about this. I did find a Stack Overflow question that seems to be related, but I can't decipher the solution at all (possibly because I've never heard of R and my memory of functional programming is rusty at best).

Help?

How do you filter a DataFrame for certain values?

Use query() to filter by column value in a pandas DataFrame. DataFrame.query() filters rows using an expression over the column values and returns a new DataFrame. To update the existing DataFrame instead, pass inplace=True.
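As a minimal sketch of the query() usage described above: the `games` frame, its column names, and the `cutoff` variable are made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical sample data; titles, genres and years are illustrative only.
games = pd.DataFrame({
    "title": ["Doom", "Quake", "Myst", "Halo"],
    "genre": ["Shooter", "Shooter", "Puzzle", "Shooter"],
    "year": [1993, 1996, 1993, 2001],
})

# query() evaluates the expression against the column names and
# returns a new DataFrame; `games` itself is left unchanged.
shooters = games.query("genre == 'Shooter'")

# Local Python variables can be referenced with the @ prefix.
cutoff = 1995
recent = games.query("year > @cutoff")

print(shooters)
print(recent)
```

Note that query() filters on the values in each row; on its own it does not know how often a value occurs, so it does not directly answer the count-based question.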

How do I filter data based on conditions in pandas?

Filter rows by condition: you can use df[df["Courses"] == 'Spark'] to filter rows by a condition in a pandas DataFrame. Note that this expression returns a new DataFrame containing the selected rows.
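A short sketch of boolean-mask filtering, assuming a small made-up frame with "Courses" and "Fee" columns (the "Fee" column and all values are illustrative assumptions):

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Spark", "Pandas"],
    "Fee": [20000, 25000, 22000, 30000],
})

# The comparison produces a boolean Series aligned with df's index.
mask = df["Courses"] == "Spark"

# Indexing with the mask keeps only the rows where it is True.
spark_rows = df[mask]

# Conditions combine with & (and) and | (or); each one needs parentheses.
cheap_spark = df[(df["Courses"] == "Spark") & (df["Fee"] < 21000)]

print(spark_rows)
print(cheap_spark)
```

The count-based answers further down use the same mechanism: they build a boolean Series from per-group counts and index the DataFrame with it.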

How do you use value_counts in pandas?

The syntax for value_counts on a pandas DataFrame is simple: type the name of the DataFrame followed by .value_counts(). When called on a DataFrame, it counts the number of rows for each unique combination of values across the columns. To count the values of a single column, call it on that column instead, for example df["A"].value_counts().
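To make the difference between the Series and DataFrame forms concrete, here is a small sketch on made-up data (the `games` frame and its columns are assumptions). DataFrame.value_counts() needs a reasonably recent pandas, 1.1 or later as far as I know.

```python
import pandas as pd

# Hypothetical sample data for illustration.
games = pd.DataFrame({
    "genre": ["Shooter", "Shooter", "Puzzle", "Shooter", "Puzzle", "Racing"],
    "platform": ["PC", "PC", "PC", "Console", "PC", "Console"],
})

# On a single column (a Series): one count per distinct value,
# indexed by the value itself.
genre_counts = games["genre"].value_counts()

# On the whole DataFrame: one count per distinct row, i.e. per
# (genre, platform) combination, indexed by a MultiIndex.
row_counts = games.value_counts()

print(genre_counts)
print(row_counts)
```

For filtering by genre frequency, the per-column form is the relevant one, since it maps each genre to the number of rows that carry it.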

Use groupby filter:

    In [11]: df = pd.DataFrame([[1, 2], [1, 4], [5, 6]], columns=['A', 'B'])

    In [12]: df
    Out[12]:
       A  B
    0  1  2
    1  1  4
    2  5  6

    In [13]: df.groupby("A").filter(lambda x: len(x) > 1)
    Out[13]:
       A  B
    0  1  2
    1  1  4

I recommend reading the split-apply-combine section of the docs.
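Applied to the video game case from the question, the groupby filter approach might look like the sketch below. The `games` data, the "genre" column name, and the `min_count` threshold are all assumptions, since the question does not show the actual DataFrame.

```python
import pandas as pd

# Hypothetical sample data; the real column names may differ.
games = pd.DataFrame({
    "title": ["Doom", "Quake", "Myst", "Halo", "Tetris", "Gran Turismo"],
    "genre": ["Shooter", "Shooter", "Puzzle", "Shooter", "Puzzle", "Racing"],
})

min_count = 2  # assumed threshold: keep genres with at least this many games

# filter() calls the function once per group (each group is a sub-DataFrame)
# and keeps every row of the groups for which it returns True.
frequent = games.groupby("genre").filter(lambda g: len(g) >= min_count)

print(frequent)
```

Here "Racing" occurs only once, so "Gran Turismo" is dropped while the Shooter and Puzzle rows are kept.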

For better performance, use GroupBy.transform with 'size' to get the count per group as a Series of the same length as the original df, which can then be used to filter by boolean indexing:

df1 = df[df.groupby("A")['A'].transform('size') > 1]

Or use Series.map with Series.value_counts:

df1 = df[df['A'].map(df['A'].value_counts()) > 1]
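A sketch of both vectorized variants on the video game case, again with made-up data and an assumed `min_count` threshold. It also checks that the two approaches select the same rows.

```python
import pandas as pd

# Hypothetical sample data; the real column names may differ.
games = pd.DataFrame({
    "title": ["Doom", "Quake", "Myst", "Halo", "Tetris", "Gran Turismo"],
    "genre": ["Shooter", "Shooter", "Puzzle", "Shooter", "Puzzle", "Racing"],
})

min_count = 2  # assumed threshold: keep genres with at least this many games

# Variant 1: transform('size') broadcasts each group's size back onto
# its rows, giving a Series aligned with `games`.
sizes = games.groupby("genre")["genre"].transform("size")
by_transform = games[sizes >= min_count]

# Variant 2: value_counts() maps genre -> count, and map() looks up
# that count for every row.
counts = games["genre"].value_counts()
by_map = games[games["genre"].map(counts) >= min_count]

print(by_transform)
print(by_transform.equals(by_map))
```

Both variants avoid calling a Python function per group, which is the likely reason they scale better than filter with a lambda when there are many distinct genres.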

Tags:

python

pandas

dataframe

filtering

uchuujin

2 Answers

Andy Hayden

jezrael
