Explanation about pandas value_counts function

Question

Can someone please explain what the line

result = data.apply(pd.value_counts).fillna(0)

does here?

import pandas as pd
from pandas import Series, DataFrame

data = DataFrame({'Qu1': [1, 3, 4, 3, 4],
                  'Qu2': [2, 3, 1, 2, 3],
                  'Qu3': [1, 5, 2, 4, 4]})

result = data.apply(pd.value_counts).fillna(0)

In [26]: data
Out[26]:
   Qu1  Qu2  Qu3
0    1    2    1
1    3    3    5
2    4    1    2
3    3    2    4
4    4    3    4

In [27]: result
Out[27]:
   Qu1  Qu2  Qu3
1    1    1    1
2    0    2    1
3    2    2    0
4    2    0    2
5    0    0    1

Andy Hayden · Accepted Answer

I think the easiest way to understand what's going on is to break it down.

On each column, value_counts simply counts the number of occurrences of each value in the Series (e.g. 4 appears twice in the Qu1 column):

In [11]: pd.value_counts(data.Qu1)
Out[11]:
4    2
3    2
1    1
dtype: int64
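As far as I know, newer pandas releases deprecate the top-level pd.value_counts in favour of the Series method, so here is a minimal sketch of the same per-column count written that way. The frame is the one from the question; sort_index is only there to make the printed order stable, since the order of tied counts is not something to rely on.

```python
import pandas as pd

data = pd.DataFrame({'Qu1': [1, 3, 4, 3, 4],
                     'Qu2': [2, 3, 1, 2, 3],
                     'Qu3': [1, 5, 2, 4, 4]})

# Count how often each distinct value occurs in a single column.
counts = data['Qu1'].value_counts()

# The index holds the distinct values, the data holds their counts.
# Qu1 contains one 1, two 3s and two 4s.
print(counts.sort_index())
```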

When you do an apply, each column's result is realigned with the others on the union of their indexes. Since every value between 1 and 5 appears somewhere in the frame, each result ends up aligned with range(1, 6):

In [12]: pd.value_counts(data.Qu1).reindex(range(1, 6))
Out[12]:
1     1
2   NaN
3     2
4     2
5   NaN
dtype: float64

You want to count the values you didn't see as 0 rather than NaN, hence the fillna:

In [13]: pd.value_counts(data.Qu1).reindex(range(1, 6)).fillna(0)
Out[13]:
1    1
2    0
3    2
4    2
5    0
dtype: float64
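Note that the dtype becomes float64 because NaN cannot be stored in an integer column. If you are doing the reindex by hand, a small sketch (assuming you already know the full range of values, here 1 to 5) is to pass fill_value to reindex, which fills the gaps directly and keeps integer counts:

```python
import pandas as pd

qu1 = pd.Series([1, 3, 4, 3, 4], name='Qu1')

# Reindex onto the full set of possible values and fill the gaps with 0
# in one step, so no NaN is ever introduced.
counts = qu1.value_counts().reindex(range(1, 6), fill_value=0)

print(counts)
print(counts.dtype)  # still an integer dtype, not float64
```

This only applies to the manual version; with data.apply(...) the alignment happens inside pandas, so fillna(0) is still the way to go there.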

When you do the apply, it effectively concatenates the results of doing this for each column:

In [14]: pd.concat((pd.value_counts(data[col]).reindex(range(1, 6)).fillna(0)
                       for col in data.columns),
                   axis=1, keys=data.columns)
Out[14]:
   Qu1  Qu2  Qu3
1    1    1    1
2    0    2    1
3    2    2    0
4    2    0    2
5    0    0    1
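To convince yourself that the two spellings agree, here is a rough check that builds the table both ways and compares them. It uses the Series.value_counts method inside a lambda rather than pd.value_counts, which is my substitution and not what the question wrote; the names via_apply and via_concat are just for illustration.

```python
import pandas as pd

data = pd.DataFrame({'Qu1': [1, 3, 4, 3, 4],
                     'Qu2': [2, 3, 1, 2, 3],
                     'Qu3': [1, 5, 2, 4, 4]})

# One-liner: count each column, let pandas align the results, fill gaps.
via_apply = data.apply(lambda col: col.value_counts()).fillna(0)

# Manual version: count, align and fill each column, then glue them together.
via_concat = pd.concat(
    [data[col].value_counts().reindex(range(1, 6)).fillna(0)
     for col in data.columns],
    axis=1, keys=data.columns)

# Compare on a sorted index so row order cannot affect the result.
same = (via_apply.sort_index().to_numpy()
        == via_concat.sort_index().to_numpy()).all()
print(same)
```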

U2EF1 · Answer

From the docs, value_counts produces a histogram of non-null values. Looking just at column Qu1 of result, we can tell that there is one 1, zero 2's, two 3's, two 4's, and zero 5's in the original column data.Qu1.
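The "non-null" part is easy to miss, so here is a small sketch of it. The Series s below is made up for illustration (the question's data has no missing values); by default value_counts skips NaN, and dropna=False asks it to count them as well.

```python
import numpy as np
import pandas as pd

# Hypothetical column with one missing answer.
s = pd.Series([1, 3, np.nan, 3, 4])

# Default: NaN is ignored, so the counts cover only the 4 non-null entries.
non_null = s.value_counts()

# dropna=False: NaN gets its own row, so the counts cover all 5 entries.
with_null = s.value_counts(dropna=False)

print(non_null.sum(), with_null.sum())
```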
