
How to subset a flat contingency table in R without losing row & column names?

I'm using ftable to create a flat contingency table. However, when I subset the contingency table, R removes the row and column names. Is there a way to subset the table such that the row and column names remain in the subsetted table? Here's an example:

# Create fake data
Group1 = sample(LETTERS[1:3], 20, replace=TRUE)
Group2 = sample(letters[1:3], 20, replace=TRUE)
Year = sample(c("2010","2011","2012"), 20, replace=TRUE)
df1 = data.frame(Group1, Group2, Year)

# Create flat contingency table with column margin
table1 = ftable(addmargins(table(df1$Group1, df1$Group2, df1$Year), margin=3))

# Select rows with sum greater than 2
table2 = table1[table1[ ,4] > 2, ]

> table1
     2010 2011 2012 Sum

A a     0    1    2   3
  b     2    1    0   3
  c     0    0    0   0
B a     0    1    1   2
  b     2    0    0   2
  c     1    0    1   2
C a     0    1    0   1
  b     1    0    2   3
  c     3    0    1   4

> table2
     [,1] [,2] [,3] [,4]
[1,]    0    1    2    3
[2,]    2    1    0    3
[3,]    1    0    2    3
[4,]    3    0    1    4

Notice how R has converted the subsetted table to a matrix, stripping out the column names and both levels of row names. How can I keep the ftable structure in the subsetted table?

eipi10 asked Mar 27 '12 18:03



2 Answers

Consider working with a data.frame of frequencies instead. It is easier to filter than an ftable, because the grouping variables stay as ordinary columns when you subset rows. Here is a way to build one using the reshape package.

# cast the data into a data.frame
library(reshape)
df1$Freq <- 1
df2 <- cast(df1, Group1 + Group2 ~ Year, fun = sum, value = "Freq")
df2
#   Group1 Group2 2010 2011 2012
# 1      A      a    0    0    1
# 2      A      b    1    1    3
# 3      A      c    0    0    1
# 4      B      a    1    2    0
# 5      B      b    1    1    0
# 6      B      c    0    0    1
# 7      C      a    2    0    1
# 8      C      b    2    0    0
# 9      C      c    0    0    2

# add a column for the `Sum` of frequencies over the years
df2 <- within(df2, Sum <- `2010` + `2011` + `2012`)
df2
#   Group1 Group2 2010 2011 2012 Sum
# 1      A      a    0    0    1   1
# 2      A      b    1    1    3   5
# 3      A      c    0    0    1   1
# 4      B      a    1    2    0   3
# 5      B      b    1    1    0   2
# 6      B      c    0    0    1   1
# 7      C      a    2    0    1   3
# 8      C      b    2    0    0   2
# 9      C      c    0    0    2   2

df2[df2$Sum > 2, ]
#   Group1 Group2 2010 2011 2012 Sum
# 2      A      b    1    1    3   5
# 4      B      a    1    2    0   3
# 7      C      a    2    0    1   3
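
A similar labelled table can also be built in base R without an extra package. The following is only a sketch, not part of the answer above: it assumes the same kind of fake data as in the question (with a seed added for reproducibility), and the names combo, counts and big are introduced here purely for illustration.

```r
# Fake data as in the question; the seed and explicit factor levels are assumptions
set.seed(42)
Group1 <- factor(sample(LETTERS[1:3], 20, replace = TRUE), levels = LETTERS[1:3])
Group2 <- factor(sample(letters[1:3], 20, replace = TRUE), levels = letters[1:3])
Year   <- factor(sample(c("2010", "2011", "2012"), 20, replace = TRUE),
                 levels = c("2010", "2011", "2012"))
df1 <- data.frame(Group1, Group2, Year)

# Collapse the two grouping variables into one factor so that every row of the
# table carries a single label such as "A a"; lex.order = TRUE keeps Group1
# as the slowest-varying part, matching the ftable row order.
combo <- interaction(df1$Group1, df1$Group2, sep = " ", lex.order = TRUE)

# Plain integer matrix of counts with row and column names
counts <- unclass(table(combo, df1$Year))
counts <- cbind(counts, Sum = rowSums(counts))

# Ordinary matrix subsetting keeps the dimnames; drop = FALSE keeps a matrix
# even if only one row qualifies
big <- counts[counts[, "Sum"] > 2, , drop = FALSE]
big
```

Because the two grouping variables are merged into one label, the result is an ordinary matrix rather than an ftable, but the row and column names survive the subsetting.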
flodel answered Sep 20 '22 06:09


The result will no longer be an ftable object, because some of the combinations are missing.

But you can have a matrix instead, with row and column names.

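# Build one label per ftable row (or column) by pasting together the levels
# stored in the "row.vars" (or "col.vars") attribute.
ftable_names <- function(x, which = "row.vars") {
  # Only tested in dimensions 1 and 2
  labels <- Reduce(
    function(u, v) t(outer(as.vector(u), as.vector(v), paste)),
    attr(x, which),
    ""
  )
  trimws(as.vector(labels))  # drop the leading space left by the "" seed
}
m <- unclass(table1)  # plain matrix indexing, independent of any ftable method
i <- m[, 4] > 2
table2 <- m[i, , drop = FALSE]
rownames(table2) <- ftable_names(table1, "row.vars")[i]
colnames(table2) <- ftable_names(table1, "col.vars")
table2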

#      2010  2011  2012  Sum
# A a     1     2     0    3
# A c     0     0     3    3
# B c     0     3     0    3
# C a     3     1     1    5
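
To see where the labels come from, it may help to look at the attributes an ftable carries. The sketch below is an illustration under stated assumptions rather than the code of the answer above: it uses seeded fake data, handles exactly two row variables, and the names tab, rv, cv, labels, m and kept are introduced here for the example.

```r
# Fake data as in the question; the seed and explicit factor levels are assumptions
set.seed(1)
Group1 <- factor(sample(LETTERS[1:3], 20, replace = TRUE), levels = LETTERS[1:3])
Group2 <- factor(sample(letters[1:3], 20, replace = TRUE), levels = letters[1:3])
Year   <- factor(sample(c("2010", "2011", "2012"), 20, replace = TRUE),
                 levels = c("2010", "2011", "2012"))

tab <- ftable(addmargins(table(Group1, Group2, Year), margin = 3))

# An ftable is a matrix of counts; its labels live in two list attributes
rv <- attr(tab, "row.vars")  # list(Group1 = c("A","B","C"), Group2 = c("a","b","c"))
cv <- attr(tab, "col.vars")  # list(Year = c("2010","2011","2012","Sum"))

# The last row variable varies fastest, so repeat the levels accordingly
labels <- paste(rep(rv[[1]], each  = length(rv[[2]])),
                rep(rv[[2]], times = length(rv[[1]])))

# Attach the labels as ordinary dimnames, then subset like any matrix
m <- unclass(tab)
dimnames(m) <- list(labels, cv[[1]])
kept <- m[m[, "Sum"] > 2, , drop = FALSE]
kept
```

Storing the labels as dimnames means that any later row or column subset keeps them automatically, at the cost of losing the nested row layout that ftable prints.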
Vincent Zoonekynd answered Sep 22 '22 06:09