I have a data frame with 6 variables and 250 observations that looks as follows:
id Var1 Var2 Var3 Var4 Var5 Var6
1 yes yes yes no yes yes
2 no no yes yes no yes
...
250 no yes yes yes yes yes
I want to identify all combinations of variables present in the data. For example, I know there are 20 observations with "yes" for each variable.
I am doing a peer grouping analysis and want to group the observations based on these yes/no variables. The 20 observations with "yes" for every variable will be group #1, another 20 observations with Var1=yes and Var2 through Var6=no will be group #2, and so on.
I attempted to use count in plyr as follows:
> count(dataframe[,-1])
This did not work. Any suggestions will be great!
You can use either interaction or paste( ..., sep="_") to make the combinations, but then you need to do something with them. Either split them into separate categories (which will preserve identities) or tabulate them with table (or both).
int_grps <- split( dataframe[,1], interaction( dataframe[,-1], drop=TRUE) )
int_counts <- table( interaction( dataframe[,-1], drop=TRUE ) )
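For the paste route mentioned above, a minimal sketch might look like the following. The toy data frame and the names key, paste_grps and paste_counts are made up for illustration; only three Var columns are used to keep it short.

```r
# Toy stand-in for the asker's data (values invented for illustration)
dataframe <- data.frame(
  id   = 1:5,
  Var1 = c("yes", "no", "yes", "no",  "yes"),
  Var2 = c("yes", "no", "yes", "yes", "yes"),
  Var3 = c("yes", "yes", "yes", "yes", "no")
)

# Build one key per row by pasting the yes/no columns together
key <- do.call(paste, c(dataframe[, -1], sep = "_"))

# ids belonging to each combination that occurs in the data
paste_grps <- split(dataframe$id, key)

# number of rows per combination
paste_counts <- table(key)
```

Because the keys are built only from rows that exist, there are no empty combinations to drop here, which is what drop=TRUE takes care of in the interaction version.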
If you only wanted to enumerate the combinations that exist, the code could be:
names(table(interaction( dataframe[,-1], drop=TRUE)) )
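If the goal is a group number per observation (group #1, group #2, ...), one possible sketch is to convert the interaction factor to integer codes. The toy data and the column names combo and group are assumptions for illustration, and the numbering simply follows the factor's level order rather than any order you might prefer.

```r
# Toy stand-in for the asker's data (values invented for illustration)
dataframe <- data.frame(
  id   = 1:5,
  Var1 = c("yes", "no", "yes", "no",  "yes"),
  Var2 = c("yes", "no", "yes", "yes", "yes"),
  Var3 = c("yes", "yes", "yes", "yes", "no")
)

# One factor level per combination that actually occurs
combo <- interaction(dataframe[, -1], drop = TRUE)

# Keep the readable label (e.g. "yes.yes.yes") and an integer group id
dataframe$combo <- as.character(combo)
dataframe$group <- as.integer(combo)

# Group sizes, indexed by group id
group_sizes <- table(dataframe$group)
```

Rows with identical yes/no patterns end up with the same group value, so dataframe can then be subset or summarised by group.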