How would I count all unique values of a dataframe in python without double counting?

Question

Let's suppose I have a pandas DataFrame that looks something like this:

Factor_1    Factor_2    Factor_3   Factor_4   Factor_5
   A           B           A          Nan       Nan
   B           D           F          A         Nan
   F           A           D          B          A

I have 5 columns that hold different factors. I would like to count how many times each of these factors appears in the DataFrame, but without double counting within a row: if a value appears more than once in the same row, it should count only once for that row. For example, if one row has A, B, C, A, A, only 1 A would be counted. The expected output would be this:

Factor   Count
  A        3
  B        3
  D        2
  F        2
 Nan       2

I used some code I was helped with:

df.stack(dropna=False).value_counts(dropna=False)

I was using an if statement to drop the double counts, but I would like to know if there is a practical and simple way to do this, like the code above, without an if, because what I am doing is not efficient.

Shubham Sharma · Accepted Answer

You can use Series.unique + Series.value_counts:

s = pd.Series(np.hstack(df.T.apply(pd.Series.unique))).value_counts(dropna=False)

B      3
A      3
F      2
D      2
NaN    2
dtype: int64
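
For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (the construction of `df` is my assumption about how the data is stored, with missing cells as `np.nan`) and uses a slight variation on the one-liner above: a list comprehension over the rows instead of `df.T.apply`, so that `np.hstack` always receives a plain list of 1-D arrays. The names `per_row_unique` and `counts` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; missing cells are assumed to be np.nan.
df = pd.DataFrame(
    {
        "Factor_1": ["A", "B", "F"],
        "Factor_2": ["B", "D", "A"],
        "Factor_3": ["A", "F", "D"],
        "Factor_4": [np.nan, "A", "B"],
        "Factor_5": [np.nan, np.nan, "A"],
    }
)

# Step 1: reduce each row to its distinct values (NaN is kept, once per row).
per_row_unique = [pd.unique(row) for row in df.to_numpy()]

# Step 2: flatten the per-row arrays and count how many rows contain each value.
counts = pd.Series(np.hstack(per_row_unique)).value_counts(dropna=False)

print(counts)
```

The counts should match the output shown above (A and B in 3 rows; D, F and NaN in 2 rows), although the order of tied values may differ. Dropping `dropna=False` would leave NaN out of the result.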
