Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to get value counts for multiple columns at once in Pandas DataFrame?

Given a Pandas DataFrame that has multiple columns with categorical values (0 or 1), is it possible to conveniently get the value_counts for every column at the same time?

For example, suppose I generate a DataFrame as follows:

import numpy as np import pandas as pd np.random.seed(0) df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd')) 

I can get a DataFrame like this:

   a  b  c  d 0  0  1  1  0 1  1  1  1  1 2  1  1  1  0 3  0  1  0  0 4  0  0  0  1 5  0  1  1  0 6  0  1  1  1 7  1  0  1  0 8  1  0  1  1 9  0  1  1  0 

How do I conveniently get the value counts for every column and obtain the following conveniently?

   a  b  c  d 0  6  3  2  6 1  4  7  8  4 

My current solution is:

pieces = [] for col in df.columns:     tmp_series = df[col].value_counts()     tmp_series.name = col     pieces.append(tmp_series) df_value_counts = pd.concat(pieces, axis=1) 

But there must be a simpler way, like stacking, pivoting, or groupby?

like image 216
Xin Avatar asked Sep 15 '15 15:09

Xin


People also ask

How do I count values in multiple columns in pandas?

In order to get the count of unique values on multiple columns use pandas DataFrame. drop_duplicates() which drop duplicate rows from pandas DataFrame. This eliminates duplicates and return DataFrame with unique rows.

How do I see all value counts in pandas?

value_counts() function returns object containing counts of unique values. The resulting object will be in descending order so that the first element is the most frequently-occurring element. Excludes NA values by default.

How do I count the number of specific values in a Pandas column?

We can count by using the value_counts() method. This function is used to count the values present in the entire dataframe and also count values in a particular column.

How do I count the number of values in a Pandas DataFrame?

To count the number of occurrences in e.g. a column in a dataframe you can use Pandas value_counts() method. For example, if you type df['condition']. value_counts() you will get the frequency of each unique value in the column “condition”.


2 Answers

Just call apply and pass pd.Series.value_counts:

In [212]: df = pd.DataFrame(np.random.randint(0, 2, (10, 4)), columns=list('abcd')) df.apply(pd.Series.value_counts) Out[212]:    a  b  c  d 0  4  6  4  3 1  6  4  6  7 
like image 165
EdChum Avatar answered Sep 18 '22 18:09

EdChum


There is actually a fairly interesting and advanced way of doing this problem with crosstab and melt

df = pd.DataFrame({'a': ['table', 'chair', 'chair', 'lamp', 'bed'],                    'b': ['lamp', 'candle', 'chair', 'lamp', 'bed'],                    'c': ['mirror', 'mirror', 'mirror', 'mirror', 'mirror']})  df         a       b       c 0  table    lamp  mirror 1  chair  candle  mirror 2  chair   chair  mirror 3   lamp    lamp  mirror 4    bed     bed  mirror 

We can first melt the DataFrame

df1 = df.melt(var_name='columns', value_name='index') df1     columns   index 0        a   table 1        a   chair 2        a   chair 3        a    lamp 4        a     bed 5        b    lamp 6        b  candle 7        b   chair 8        b    lamp 9        b     bed 10       c  mirror 11       c  mirror 12       c  mirror 13       c  mirror 14       c  mirror 

And then use the crosstab function to count the values for each column. This preserves the data type as ints which wouldn't be the case for the currently selected answer:

pd.crosstab(index=df1['index'], columns=df1['columns'])  columns  a  b  c index            bed      1  1  0 candle   0  1  0 chair    2  1  0 lamp     1  2  0 mirror   0  0  5 table    1  0  0 

Or in one line, which expands the column names to parameter names with ** (this is advanced)

pd.crosstab(**df.melt(var_name='columns', value_name='index')) 

Also, value_counts is now a top-level function. So you can simplify the currently selected answer to the following:

df.apply(pd.value_counts) 
like image 39
Ted Petrou Avatar answered Sep 18 '22 18:09

Ted Petrou