
Pandas GroupBy frequency of values

I have this set of sample data:

STATE   CAPSULES     LIQUID         TABLETS  
Alabama NaN          Prescription   OTC
Georgia Prescription NaN            OTC
Texas   OTC          OTC            NaN
Texas   Prescription NaN            NaN
Florida NaN          Prescription   OTC
Georgia OTC          Prescription   Prescription
Texas   Prescription NaN            OTC
Alabama NaN          OTC            OTC
Georgia OTC          NaN            NaN
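
For anyone who wants to reproduce this, the sample above can be built as a DataFrame roughly like this. This is a minimal sketch that assumes the `NaN` entries are real missing values (`np.nan`) rather than the literal string "NaN"; the variable name `df` matches the one used in the rest of the thread.

```python
import numpy as np
import pandas as pd

# Sample data from the question; NaN is assumed to be a true missing value.
df = pd.DataFrame({
    "STATE": ["Alabama", "Georgia", "Texas", "Texas", "Florida",
              "Georgia", "Texas", "Alabama", "Georgia"],
    "CAPSULES": [np.nan, "Prescription", "OTC", "Prescription", np.nan,
                 "OTC", "Prescription", np.nan, "OTC"],
    "LIQUID": ["Prescription", np.nan, "OTC", np.nan, "Prescription",
               "Prescription", np.nan, "OTC", np.nan],
    "TABLETS": ["OTC", "OTC", np.nan, np.nan, "OTC",
                "Prescription", "OTC", "OTC", np.nan],
})

print(df)
```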

I have tried multiple groupby configurations to get the following ideal result:

State    capsules_OTC  capsules_prescription  liquid_OTC  liquid_prescription  tablets_OTC  tablets_prescription
Alabama  0             0                      0           0                    0            0
Florida  0             0                      0           0                    0            0
Georgia  1             1                      1           1                    1            1
Texas    1             2                      2           2                    2            2

For example, I tried this:

df.groupby(['STATE','CAPSULES'])

to try to get at least the first column wrangled, but no dice. Perhaps this is not such an easy answer, but I figure I am missing something simple with groupby and perhaps count() or some other apply function?

John Taylor asked Oct 28 '20 00:10



1 Answer

Use pd.get_dummies with groupby and sum:

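import pandas as pd
# One-hot encode each category column (NaN gets no indicator column), then sum the indicators per state.
pd.get_dummies(df, columns=['CAPSULES', 'LIQUID', 'TABLETS'])\
  .groupby('STATE', as_index=False).sum()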

Output:

     STATE  CAPSULES_OTC  CAPSULES_Prescription  LIQUID_OTC  LIQUID_Prescription  TABLETS_OTC  TABLETS_Prescription
0  Alabama             0                      0           1                    1            2                     0
1  Florida             0                      0           0                    1            1                     0
2  Georgia             2                      1           0                    1            1                     1
3    Texas             1                      2           1                    0            1                     0
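
As a cross-check, the same counts can also be reached by reshaping to long format and cross-tabulating. This is only a sketch of an alternative route, not part of the approach above: the column names `form` and `status` and the variables `long_df`, `counts`, and `dummies` are made up for illustration, and the sample data is rebuilt on the assumption that `NaN` means a true missing value.

```python
import numpy as np
import pandas as pd

# Sample data from the question (NaN assumed to be a real missing value).
df = pd.DataFrame({
    "STATE": ["Alabama", "Georgia", "Texas", "Texas", "Florida",
              "Georgia", "Texas", "Alabama", "Georgia"],
    "CAPSULES": [np.nan, "Prescription", "OTC", "Prescription", np.nan,
                 "OTC", "Prescription", np.nan, "OTC"],
    "LIQUID": ["Prescription", np.nan, "OTC", np.nan, "Prescription",
               "Prescription", np.nan, "OTC", np.nan],
    "TABLETS": ["OTC", "OTC", np.nan, np.nan, "OTC",
                "Prescription", "OTC", "OTC", np.nan],
})

# Long format: one row per (STATE, form, status); missing values are dropped.
long_df = (
    df.melt(id_vars="STATE", var_name="form", value_name="status")
      .dropna(subset=["status"])
)

# Count each (form, status) pair per state, then flatten the column MultiIndex
# into names such as CAPSULES_OTC.
counts = pd.crosstab(long_df["STATE"], [long_df["form"], long_df["status"]])
counts.columns = [f"{form}_{status}" for form, status in counts.columns]
counts = counts.reset_index()

# The get_dummies + groupby + sum approach, for comparison.
dummies = (
    pd.get_dummies(df, columns=["CAPSULES", "LIQUID", "TABLETS"])
      .groupby("STATE", as_index=False)
      .sum()
)

print(counts)
```

One difference worth noting: `crosstab` only produces columns for combinations that actually occur, and a state whose rows are all `NaN` would disappear from `counts`, whereas it would still show up as a row of zeros with `get_dummies`. With this sample data the two results agree.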
Scott Boston answered Sep 29 '22 11:09