 

Sample each group after pandas groupby

I know this must have been answered somewhere, but I just could not find it.

Problem: Sample each group after groupby operation.

    import pandas as pd

    df = pd.DataFrame({'a': [1,2,3,4,5,6,7],
                       'b': [1,1,1,0,0,0,0]})

    grouped = df.groupby('b')

    # now sample from each group, e.g., I want 30% of each group
gongzhitaao asked Apr 03 '16 19:04



1 Answer

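Apply a lambda to each group and call sample with the frac parameter. Each group then contributes round(frac * group size) rows, which is one row per group here: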

    In [2]: df = pd.DataFrame({'a': [1,2,3,4,5,6,7],
                               'b': [1,1,1,0,0,0,0]})
            grouped = df.groupby('b')
            grouped.apply(lambda x: x.sample(frac=0.3))

    Out[2]:
         a  b
    b
    0 6  7  0
    1 2  3  1
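As a possible alternative, newer pandas releases (1.1 and later, as far as I know) let you call sample directly on the groupby object, which avoids the lambda and keeps the original row index instead of adding a b level. This is only a sketch under that version assumption; the random_state argument is added here purely so the draw is repeatable and is not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5, 6, 7],
                   'b': [1, 1, 1, 0, 0, 0, 0]})

# Sample 30% of the rows within each group of 'b'.
# Assumes pandas >= 1.1, where the groupby object has its own sample method.
# random_state is an illustrative addition to make the result reproducible.
sampled = df.groupby('b').sample(frac=0.3, random_state=0)

# Count how many rows were drawn from each group.
rows_per_group = sampled['b'].value_counts().to_dict()
print(sampled)
print(rows_per_group)
```

With groups of 3 and 4 rows, a frac of 0.3 rounds to one row per group, so the result has two rows in total. For a fixed number of rows per group, n=... can be passed in place of frac.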
EdChum answered Sep 20 '22 12:09