Using table() to create 3 variable frequency table in R

Tags: r, frequency

I'm new to R and seeking some help. I understand the following problem is fairly simple, and I have looked for similar questions, but none gives quite the answer I'm looking for. Any help would be appreciated.

The problem:

Producing a frequency table using the table() function for three variables with data in the format:

    Var1  Var2  Var3
1   0     1     0
2   0     1     0
3   1     1     1
4   0     0     1

where 0 = "No" and 1 = "Yes".

The final table should be in the following format, with both variables and values labelled:

           Var3
           Yes   No
Var1  Yes   1     0
      No    1     2
Var2  Yes   1     2
      No    1     0

What I have tried so far:

Using the following code I'm able to produce a two-variable table, with labels for the variables but not for the values (i.e. No and Yes).

table(data$Var1, data$Var3, dnn = c("Var1", "Var3"))

It looks like this:

      Var3
Var1  0  1
   0  2  1
   1  0  1

To label the row and column values (0 = No and 1 = Yes), I understand that row.names and responseName can be used. However, the following attempt to label the row names gives an "all arguments must have the same length" error:

> table(data$Var1, data$Var2, dnn = c("Var1", "Var2"), row.names = c("No", "Yes"))

I have also tried using ftable(). However, the table produced by the code below does not have the desired shape, so the frequencies are not the ones the problem asks for. The issue with labelling row and column values persists.

> ftable(data$Var1, data$Var2, data$Var3, dnn = c("Var1", "Var2", "Var3"))
      Var3  0  1
Var1 Var2             
0     0     0  1
      1     2  0
1     0     0  0
      1     0  1

Any help in using table() to produce a table of the shape desired would be greatly appreciated.

Alison Bennett asked May 25 '15 05:05



2 Answers

You could try tabular() from the tables package, after changing the labels as shown by @thelatemail:

library(tables)
data[] <- lapply(data, factor, levels=1:0, labels=c('Yes', 'No'))
tabular(Var1+Var2~Var3, data=data)

 #         Var3   
 #         Yes  No
 #Var1  Yes 1    0 
 #      No  1    2 
 #Var2  Yes 1    2 
 #      No  1    0 

data

data <- structure(list(Var1 = c(0L, 0L, 1L, 0L), Var2 = c(1L, 1L, 1L, 
0L), Var3 = c(0L, 0L, 1L, 1L)), .Names = c("Var1", "Var2", "Var3"
), class = "data.frame", row.names = c("1", "2", "3", "4"))
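
A base-R route to the same shape is also possible. The following is only a minimal sketch, assuming it is acceptable to build one two-way table() per row variable and stack them with rbind(); the names `labelled`, `t1`, `t2`, and `stacked` are hypothetical and not from the original post.

```r
# Recreate the example data so the sketch is self-contained
data <- data.frame(Var1 = c(0L, 0L, 1L, 0L),
                   Var2 = c(1L, 1L, 1L, 0L),
                   Var3 = c(0L, 0L, 1L, 1L))

# Work on a labelled copy: 1 -> "Yes", 0 -> "No", with "Yes" as the first level
labelled <- data
labelled[] <- lapply(data, factor, levels = 1:0, labels = c("Yes", "No"))

# One two-way table per row variable, each crossed with Var3
t1 <- table(Var1 = labelled$Var1, Var3 = labelled$Var3)
t2 <- table(Var2 = labelled$Var2, Var3 = labelled$Var3)

# Stack the two tables; the result is a plain matrix, so prefix the row names
# with the variable they came from (the columns are the levels of Var3)
stacked <- rbind(t1, t2)
rownames(stacked) <- paste(rep(c("Var1", "Var2"), each = 2), rownames(stacked))
stacked
```

The stacked matrix holds the same counts as the tabular() output above, but it loses the "Var3" column header, so this is a trade-off rather than a drop-in replacement.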
akrun answered Dec 10 '22 04:12



The easiest way is probably to use the reshape2 package. First, you will need to convert your numeric columns to factors so that they are treated as categories rather than as numbers.

data$Var1 <- as.factor(data$Var1)
data$Var2 <- as.factor(data$Var2)
data$Var3 <- as.factor(data$Var3)
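
If the table should show Yes/No rather than 0/1, the labels can be attached during the same conversion by using factor() with explicit levels and labels. A minimal sketch, assuming a copy named `labelled` (a hypothetical name) so that `data` itself is left untouched:

```r
# Recreate the example data so the sketch is self-contained
data <- data.frame(Var1 = c(0L, 0L, 1L, 0L),
                   Var2 = c(1L, 1L, 1L, 0L),
                   Var3 = c(0L, 0L, 1L, 1L))

# Convert every column to a factor, mapping 1 -> "Yes" and 0 -> "No"
labelled <- data
labelled[] <- lapply(data, factor, levels = c(1, 0), labels = c("Yes", "No"))

# table() takes its row and column labels from the factor levels;
# dnn names the two dimensions
tab <- table(labelled$Var1, labelled$Var3, dnn = c("Var1", "Var3"))
tab
```

Because the value labels come from the factor levels, no row.names argument is needed; table() treats any extra unnamed-by-design argument as another variable to cross-tabulate.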

Then you can simply call table(data) to get the counts. If you really want the format you specified, convert the result to a data.frame and reshape it as required:

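library(reshape2)
# Long form: one row per Var1/Var2/Var3 combination, with the count in Freq
df <- as.data.frame(table(data))
# Spread Var3 across the columns; naming value.var avoids dcast guessing it
dcast(df, Var1 + Var2 ~ Var3, value.var = "Freq")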

This gives the following output:

  Var1 Var2 0 1
1    0    0 0 1
2    0    1 2 0
3    1    0 0 0
4    1    1 0 1

EDIT: You can just use ftable() on the data frame once all of its columns are factors:

> ftable(data)
          Var3 0 1
Var1 Var2         
0    0         0 1
     1         2 0
1    0         0 0
     1         0 1
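
The row and column layout of ftable() can also be stated explicitly through its row.vars and col.vars arguments. A minimal sketch, assuming Yes/No-labelled factors in a copy called `labelled` (a hypothetical name):

```r
# Recreate the example data so the sketch is self-contained
data <- data.frame(Var1 = c(0L, 0L, 1L, 0L),
                   Var2 = c(1L, 1L, 1L, 0L),
                   Var3 = c(0L, 0L, 1L, 1L))

# Label the values: 1 -> "Yes", 0 -> "No"
labelled <- data
labelled[] <- lapply(data, factor, levels = 1:0, labels = c("Yes", "No"))

# Var1 and Var2 index the rows, Var3 the columns
ft <- ftable(labelled, row.vars = c("Var1", "Var2"), col.vars = "Var3")
ft
```

This still crosses Var1 with Var2 in the rows, so it labels the values but does not give the stacked two-way layout asked for in the question.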
chappers answered Dec 10 '22 03:12
