I have a sample list like this:
Category| Item
--------|-------
Animal | Fish
Animal | Cat
... |
Food | Fish
Food | Cake
... |
etc...
I want to take a random sample of 10 items out of each category, so that the remaining dataframe just has those records.
I've tried df.sample()
but it just gives me samples across the board.
I can do this through df.iterrows()
but I am hoping there is a simpler solution.
You have to tell pandas you want to group by category with the groupby
method.
df.groupby('category')['item'].apply(lambda s: s.sample(10))
If a category has fewer than ten items, s.sample(10) raises an error. If you don't want to sample with replacement, cap the sample size at the group size:
df.groupby('category')['item'].apply(lambda s: s.sample(min(len(s), 10)))
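The snippets above return only the item column. Since the question asks for a dataframe that keeps just the sampled records, here is a rough sketch of one way to keep whole rows. It uses a different technique from the one above: shuffle all rows once, then take the first n rows of each group. The helper name sample_per_group and the toy data are made up for illustration, and the lowercase column names 'category' and 'item' are assumed to match the code above.

```python
import pandas as pd


def sample_per_group(df, by, n, random_state=None):
    """Return up to n randomly chosen rows per group (hypothetical helper)."""
    # Shuffle every row once; frac=1 keeps all rows, only in random order.
    shuffled = df.sample(frac=1, random_state=random_state)
    # head(n) keeps the first n rows of each group, or the whole group
    # if it has fewer than n rows, so nothing is sampled with replacement.
    return shuffled.groupby(by).head(n)


# Toy data: 15 "Animal" rows and only 4 "Food" rows.
df = pd.DataFrame({
    "category": ["Animal"] * 15 + ["Food"] * 4,
    "item": [f"animal_{i}" for i in range(15)] + [f"food_{i}" for i in range(4)],
})

sampled = sample_per_group(df, "category", 10, random_state=0)
print(sampled["category"].value_counts())
```

The result keeps every original column and the original index labels, so sampled.sort_index() restores the original row order if that matters.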