
Combination of values in pandas data frame

This is my pandas dataframe:

       Item          Support_Count
0      BREAD              4
1      MILK               4
2      DIAPER             4
3      BEER               3

How can I generate all possible unique combinations of 2 and 3 items from the first column, 'Item'?

Example (2-item sets): (BREAD, MILK), (BREAD, DIAPER), (BREAD, BEER), (MILK, DIAPER), etc.

Example (3-item sets): (BREAD, MILK, DIAPER), (BREAD, MILK, BEER), (MILK, DIAPER, BEER), etc.

data_person asked Mar 27 '16 01:03




1 Answer

You can use combinations from the standard-library itertools module:

import itertools
list(itertools.combinations(df['Item'], 2))

[('BREAD', 'MILK'),
 ('BREAD', 'DIAPER'),
 ('BREAD', 'BEER'),
 ('MILK', 'DIAPER'),
 ('MILK', 'BEER'),
 ('DIAPER', 'BEER')]

list(itertools.combinations(df['Item'], 3))

[('BREAD', 'MILK', 'DIAPER'),
 ('BREAD', 'MILK', 'BEER'),
 ('BREAD', 'DIAPER', 'BEER'),
 ('MILK', 'DIAPER', 'BEER')]
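
Putting the two calls together, here is a minimal self-contained sketch. The frame is rebuilt from the question. The helper name item_sets and the column names Item_1 and Item_2 are made up for illustration. The drop_duplicates() call is an assumption about what "unique" should mean if 'Item' ever contained repeated values; it changes nothing for the data shown.

```python
import itertools
import pandas as pd

# Rebuild the frame from the question.
df = pd.DataFrame({
    "Item": ["BREAD", "MILK", "DIAPER", "BEER"],
    "Support_Count": [4, 4, 4, 3],
})

def item_sets(items, sizes=(2, 3)):
    """Map each requested size k to the list of k-item combinations.

    Hypothetical helper, not part of pandas or itertools.
    """
    # Assumption: repeated item names should count only once.
    unique_items = pd.Series(items).drop_duplicates()
    return {k: list(itertools.combinations(unique_items, k)) for k in sizes}

sets_by_size = item_sets(df["Item"])

# Optionally put the pairs back into a DataFrame, one item per column.
pairs_df = pd.DataFrame(sets_by_size[2], columns=["Item_1", "Item_2"])

print(len(sets_by_size[2]), len(sets_by_size[3]))
```

Order within a tuple follows the row order of the column, so ('BREAD', 'MILK') appears but ('MILK', 'BREAD') does not.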

Note: The number of combinations grows very quickly with the number of items, so generating every possible combination may not be practical for larger data sets. If you are mining frequent item sets, I recommend looking at implementations of the Apriori algorithm if you haven't already done so.
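
To give a rough sense of that growth, the counts can be computed without building the combinations at all. This is only an illustration: the function name n_item_sets and the sample item counts are arbitrary choices, and math.comb needs Python 3.8 or later.

```python
import math

def n_item_sets(n_items, k):
    """Number of k-item combinations that can be drawn from n_items items."""
    return math.comb(n_items, k)

# Counts of 2-item and 3-item sets for a few catalogue sizes.
growth = {n: (n_item_sets(n, 2), n_item_sets(n, 3)) for n in (4, 20, 100, 1000)}

for n, (pairs, triples) in growth.items():
    print(f"{n} items: {pairs} pairs, {triples} triples")
```

With the 4 items from the question this gives 6 pairs and 4 triples, matching the lists above; with 1000 items there are already 166167000 triples.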

ayhan answered Oct 29 '22 17:10