
Value_counts on multiple columns with groupby

I need some help with Pandas.

I have the following dataframe:

import pandas as pd

df = pd.DataFrame({'1Country': ['FR', 'FR', 'GER', 'GER', 'IT', 'IT', 'FR', 'GER', 'IT'],
                   '2City': ['Paris', 'Paris', 'Berlin', 'Berlin', 'Rome', 'Rome', 'Paris', 'Berlin', 'Rome'],
                   'F1': ['A', 'B', 'C', 'B', 'B', 'C', 'A', 'B', 'C'],
                   'F2': ['B', 'C', 'A', 'A', 'B', 'C', 'A', 'B', 'C'],
                   'F3': ['C', 'A', 'B', 'C', 'C', 'C', 'A', 'B', 'C']})


I am trying to group by the first two columns, 1Country and 2City, and run value_counts on columns F1 and F2. So far I have only managed to do the groupby and value_counts on one column at a time, with:

df.groupby(['1Country','2City'])['F1'].apply(pd.Series.value_counts)

How can I do value_counts on multiple columns and get a dataframe as a result?

amongo asked Dec 10 '22 05:12



2 Answers

You could use agg, something along these lines:

df.groupby(['1Country','2City']).agg({i:'value_counts' for i in df.columns[2:]})

               F1   F2   F3
FR  Paris  A  2.0  1.0  2.0
           B  1.0  1.0  NaN
           C  NaN  1.0  1.0
GER Berlin A  NaN  2.0  NaN
           B  2.0  1.0  2.0
           C  1.0  NaN  1.0
IT  Rome   B  1.0  1.0  NaN
           C  2.0  2.0  3.0
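
If agg with 'value_counts' does not behave like this in your pandas version, a reshape-based sketch should give the same table. It assumes the df from the question; the names id_cols, long, 'column' and 'value' are only illustrative and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({'1Country': ['FR', 'FR', 'GER', 'GER', 'IT', 'IT', 'FR', 'GER', 'IT'],
                   '2City': ['Paris', 'Paris', 'Berlin', 'Berlin', 'Rome', 'Rome', 'Paris', 'Berlin', 'Rome'],
                   'F1': ['A', 'B', 'C', 'B', 'B', 'C', 'A', 'B', 'C'],
                   'F2': ['B', 'C', 'A', 'A', 'B', 'C', 'A', 'B', 'C'],
                   'F3': ['C', 'A', 'B', 'C', 'C', 'C', 'A', 'B', 'C']})

id_cols = ['1Country', '2City']  # illustrative name for the grouping columns

# Long format: one row per (country, city, source column, value).
long = df.melt(id_vars=id_cols, var_name='column', value_name='value')

# Count every (country, city, value, source column) combination,
# then pivot the source column back out into F1 / F2 / F3 columns.
counts = (
    long.groupby(id_cols + ['value', 'column'])
        .size()
        .unstack('column', fill_value=0)
)

print(counts)
```

With fill_value=0 the missing combinations show up as integer zeros; drop that argument if you would rather keep NaN as in the output above.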
sacuL answered Dec 12 '22 18:12



You can pass a dict to agg as follows:

df.groupby(['1Country', '2City']).agg({'F1': 'value_counts', 'F2': 'value_counts'})
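
Another way to sketch this for just F1 and F2, in case the dict form is not accepted by your pandas version, is to count each column separately with SeriesGroupBy.value_counts and line the results up with pd.concat. The df is the one from the question; the variable names keys, grouped and counts are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'1Country': ['FR', 'FR', 'GER', 'GER', 'IT', 'IT', 'FR', 'GER', 'IT'],
                   '2City': ['Paris', 'Paris', 'Berlin', 'Berlin', 'Rome', 'Rome', 'Paris', 'Berlin', 'Rome'],
                   'F1': ['A', 'B', 'C', 'B', 'B', 'C', 'A', 'B', 'C'],
                   'F2': ['B', 'C', 'A', 'A', 'B', 'C', 'A', 'B', 'C'],
                   'F3': ['C', 'A', 'B', 'C', 'C', 'C', 'A', 'B', 'C']})

keys = ['1Country', '2City']
grouped = df.groupby(keys)

# One value_counts Series per column, keyed by column name; concat along
# axis=1 aligns them on (country, city, value) and leaves NaN where a value
# never occurs in that column for that group.
counts = pd.concat(
    {col: grouped[col].value_counts() for col in ['F1', 'F2']},
    axis=1,
)

print(counts)
```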
Silenced Temporarily answered Dec 12 '22 17:12
