I have a data frame in pandas and would like to get all the values of a certain column that appear more than X times. I know this should be easy but somehow I am not getting anywhere with my current attempts.
Here is an example:
>>> df2 = pd.DataFrame([{"uid": 0, "mi":1}, {"uid": 0, "mi":2}, {"uid": 0, "mi":1}, {"uid": 0, "mi":1}])
>>> df2
   mi  uid
0   1    0
1   2    0
2   1    0
3   1    0
Now suppose I want to get all values from column "mi" that appear more than 2 times. The result should be
>>> <fancy query>
array([1])
I have tried a couple of things with groupby and count, but I always end up with a Series of the values and their respective counts, and I don't know how to extract the values whose count is more than X from that:
>>> df2.groupby('mi').mi.count() > 2
mi
1     True
2    False
dtype: bool
But how can I use this now to get the values of mi that are true?
Any hints appreciated :)
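A boolean Series like the one above can be used as a mask on the counts it was derived from. Here is a minimal sketch, assuming the same df2 as in the question; the names counts and frequent are only illustrative:

```python
import pandas as pd

df2 = pd.DataFrame([{"uid": 0, "mi": 1}, {"uid": 0, "mi": 2},
                    {"uid": 0, "mi": 1}, {"uid": 0, "mi": 1}])

# Count rows per distinct value of "mi"; the index holds the values.
counts = df2.groupby("mi").mi.count()

# Keep only the entries whose count exceeds 2, then read the values
# back off the index as a NumPy array.
frequent = counts[counts > 2].index.to_numpy()
```

The key point is that the values you want live in the index of the counts Series, so filtering the Series and taking .index gives them back.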
To count the number of times each value appears, use the pandas value_counts() method. The mode() method can then be used to get the most frequently occurring value.
How do you count the number of occurrences in a data frame? To count occurrences in, for example, a single column of a data frame, use the pandas value_counts() method. For example, df['condition'].value_counts() gives the frequency of each unique value in the column "condition".
value_counts() returns a Series containing counts of unique values. The resulting object is in descending order, so the first element is the most frequently occurring value.
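To make the value_counts() and mode() behaviour concrete, here is a small sketch. The sample data is made up for illustration; only the column name "condition" comes from the text above.

```python
import pandas as pd

# Hypothetical sample data for the "condition" column.
df = pd.DataFrame({"condition": ["a", "b", "a", "c", "a", "b"]})

# Frequency of each unique value, sorted by count in descending order.
freq = df["condition"].value_counts()

# Because of the descending order, the first index entry is the most
# frequent value.
most_common = freq.index[0]

# mode() returns a Series rather than a scalar, since ties are possible.
modes = df["condition"].mode()
```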
Or how about this:
Create the table:
>>> import pandas as pd
>>> df2 = pd.DataFrame([{"uid": 0, "mi":1}, {"uid": 0, "mi":2}, {"uid": 0, "mi":1}, {"uid": 0, "mi":1}])
Get the count of each value:
>>> vc = df2.mi.value_counts()
>>> print(vc)
1    3
2    1
Print out those that occur more than 2 times:
>>> print(vc[vc > 2].index[0])
1
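Note that .index[0] picks out only the first matching value. If more than one value can exceed the threshold, a sketch along these lines returns all of them as an array, which is closer to the array([1]) asked for in the question. The data here is hypothetical, extended so that two values qualify:

```python
import pandas as pd

# Hypothetical data: 1 and 2 each appear three times, 3 appears once.
df = pd.DataFrame({"mi": [1, 2, 1, 2, 1, 2, 3]})

vc = df.mi.value_counts()

# All values whose count exceeds 2, as a NumPy array.
frequent = vc[vc > 2].index.to_numpy()

# Optionally, keep only the rows whose "mi" is one of those values.
rows = df[df.mi.isin(frequent)]
```

The order of tied values in the result is not something I would rely on, so sort the array if the order matters.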
I use this:
df2.mi.value_counts().reset_index(name="count").query("count > 5")["index"]
The part before query() gives me a data frame with two columns: index and count. The query() filters on count and then we pull out the values.
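The column names that reset_index() produces here depend on the pandas version. As far as I know, in pandas 2.x value_counts() names the index after the original column, so the first column comes out as "mi" rather than "index". Below is a sketch of the same idea that sidesteps this by naming the index explicitly with rename_axis(). The label "value" is my own choice, and the threshold is set to 2 to match the question:

```python
import pandas as pd

df2 = pd.DataFrame([{"uid": 0, "mi": 1}, {"uid": 0, "mi": 2},
                    {"uid": 0, "mi": 1}, {"uid": 0, "mi": 1}])

counts = (
    df2.mi.value_counts()
       .rename_axis("value")        # give the index a fixed, known name
       .reset_index(name="count")   # two columns: "value" and "count"
)

# Filter on the count column, then pull out the matching values.
frequent = counts.query("count > 2")["value"]
```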