Counting most common combination of values in dataframe column

Tags: python, pandas

I have a DataFrame in the following form:

ID Product
1   A
1   B
2   A 
3   A
3   C 
3   D 
4   A
4   B

I would like to count the most common combinations of two values from the Product column, grouped by ID. For this example, the expected result would be:

Combination Count
A-B          2
A-C          1
A-D          1
C-D          1

Is this output possible with pandas?

Alex T asked Sep 19 '19 19:09


1 Answer

Use itertools.combinations to build the pairs within each ID, then explode and value_counts to count them:

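import itertools

# Build all 2-item combinations per ID, flatten them with explode (pandas >= 0.25),
# join each pair into an "A-B" label, and count the labels.
(df.groupby('ID').Product.agg(lambda x: list(itertools.combinations(x,2)))
                 .explode().str.join('-').value_counts())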

Out[611]:
A-B    2
C-D    1
A-D    1
A-C    1
Name: Product, dtype: int64

Or:

import itertools

# Same idea, but each pair is joined inside the lambda, so no .str.join step is needed.
(df.groupby('ID').Product.agg(lambda x: list(map('-'.join, itertools.combinations(x,2))))
                 .explode().value_counts())

Out[597]:
A-B    2
C-D    1
A-D    1
A-C    1
Name: Product, dtype: int64
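
For anyone who wants to run this end to end, here is a minimal, self-contained sketch built on the sample data from the question. It is one possible variation rather than a definitive implementation. It rests on two assumptions that are not stated in the question: a pair is unordered (so B-A is counted as A-B), and a product repeated within one ID is counted once. The helper name pair_labels is made up for illustration. It also reshapes the result into the Combination / Count table the question asks for.

```python
import itertools

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [1, 1, 2, 3, 3, 3, 4, 4],
    "Product": ["A", "B", "A", "A", "C", "D", "A", "B"],
})


def pair_labels(products):
    """Return 'X-Y' labels for every unordered pair of products in one ID.

    Assumption: duplicates within an ID are ignored and each pair is sorted,
    so the same two products always produce the same label.
    """
    unique_sorted = sorted(set(products))
    return ["-".join(pair) for pair in itertools.combinations(unique_sorted, 2)]


# One list of labels per ID, flattened to one label per row.
# IDs with a single product yield an empty list, which explode turns into NaN.
pairs = df.groupby("ID")["Product"].agg(pair_labels).explode().dropna()

# Count the labels and shape the result as a two-column table.
counts = (
    pairs.value_counts()
         .rename_axis("Combination")
         .reset_index(name="Count")
)
print(counts)
```

The most frequent combination ends up in the first row because value_counts sorts by count in descending order. The order of rows with equal counts is not guaranteed.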
Andy L. answered Nov 11 '22 06:11