
Identify all combinations of six variables in R

I have a data frame with 6 variables and 250 observations that looks as follows:

   id    Var1    Var2    Var3    Var4    Var5    Var6

   1     yes     yes     yes     no      yes     yes
   2     no      no      yes     yes     no      yes
   ...
   250   no      yes     yes     yes     yes     yes

I want to identify all combinations of variables present in the data. For example, I know there are 20 observations with "yes" for each variable.

I am doing a peer-grouping analysis and want to group the observations based on these yes/no variables. The 20 observations with "yes" for every variable will be group #1, another 20 observations with Var1 = yes and Var2 through Var6 = no will be group #2, and so on.

I attempted to use count in plyr as follows:

> count(dataframe[,-1])

This did not work. Any suggestions will be great!

Rymatt830 asked Feb 09 '23 18:02


1 Answer

You can use either interaction or paste(..., sep="_") to build the combinations, but then you need to do something with them: split the ids by combination (which preserves the identity of each observation), tabulate the combinations with table, or both.

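 # ids (column 1) grouped by each observed combination of the remaining columns
 int_grps <- split( dataframe[,1], interaction( dataframe[,-1], drop=TRUE) )

 # number of observations in each observed combination
 int_counts <- table( interaction( dataframe[,-1], drop=TRUE ) )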
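
To make the behaviour of split and table concrete, here is a minimal sketch on a made-up stand-in for the real data. The six rows, the three variables, and the name combo are assumptions for illustration only; the real data frame has 250 rows and six variables.

```r
# Hypothetical toy data: 6 rows and 3 yes/no variables instead of 250 and 6.
dataframe <- data.frame(
  id   = 1:6,
  Var1 = c("yes", "no",  "yes", "no",  "yes", "yes"),
  Var2 = c("yes", "no",  "yes", "no",  "no",  "yes"),
  Var3 = c("yes", "yes", "yes", "yes", "no",  "yes"),
  stringsAsFactors = TRUE
)

# One factor level per combination that actually occurs in the data;
# drop = TRUE discards combinations with no observations.
combo <- interaction(dataframe[, -1], drop = TRUE)

# ids of the observations that fall into each combination
int_grps <- split(dataframe[, 1], combo)

# number of observations per combination
int_counts <- table(combo)

print(int_grps)
print(int_counts)
```

The level names are the variable values joined with ".", for example "yes.yes.yes", so a single group can be looked up with int_grps[["yes.yes.yes"]].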

If you only wanted to enumerate the combinations that exist, the code could be:

 names(table(interaction( dataframe[,-1], drop=TRUE)) )
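
The paste( ..., sep="_") route mentioned at the top can also produce a numeric group label per row, which is what the question's "group #1, group #2" wording suggests. The following is only a sketch under assumptions: the data frame toy, the columns key and group, and the choice to number groups in sorted key order are all made up for illustration.

```r
# Hypothetical toy data with two yes/no variables.
toy <- data.frame(
  id   = 1:5,
  Var1 = c("yes", "no", "yes", "no",  "yes"),
  Var2 = c("yes", "no", "yes", "yes", "yes")
)

# Paste the yes/no columns together row by row, e.g. "yes_yes".
toy$key <- do.call(paste, c(toy[, -1], sep = "_"))

# Combinations that actually exist in the data.
existing <- sort(unique(toy$key))

# Integer group label: the position of each row's key in `existing`.
# The numbering is arbitrary (sorted key order), not ordered by group size.
toy$group <- match(toy$key, existing)

print(existing)
print(toy)
```

If the groups should instead be numbered by size, the labels could be reordered using the counts from table(toy$key).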
IRTFM answered Feb 12 '23 09:02