
Frequency table from multiple columns and multiple rows in R

Tags: r, frequency

I am trying to get a frequency table from this dataframe:

tmp2 <- structure(list(a1 = c(1L, 0L, 0L), a2 = c(1L, 0L, 1L),
                       a3 = c(0L, 1L, 0L), b1 = c(1L, 0L, 1L),
                       b2 = c(1L, 0L, 0L), b3 = c(0L, 1L, 1L)),
                       .Names = c("a1", "a2", "a3", "b1", "b2", "b3"),
                       class = "data.frame", row.names = c(NA, -3L))


# In my case the data comes from a semicolon-separated file:
# tmp2 <- read.csv("tmp2.csv", sep=";")
> tmp2
  a1 a2 a3 b1 b2 b3
1  1  1  0  1  1  0
2  0  0  1  0  0  1
3  0  1  0  1  0  1

I tried to build the frequency table as follows:

table(tmp2[,1:3], tmp2[,4:6])

But I get:

Error in sort.list(y) : 'x' must be atomic for 'sort.list'
Have you called 'sort' on a list?

Expected output:

[Image: expected output table]

Note: the result does not have to be a square matrix. For instance, I should be able to add b4 and b5 while keeping only a1, a2 and a3.

asked Apr 13 '16 at 10:04 by S12000



2 Answers

An option:

matrix(colSums(tmp2[,rep(1:3,3)] & tmp2[,rep(4:6,each=3)]),
       ncol=3,nrow=3,
       dimnames=list(colnames(tmp2)[1:3],colnames(tmp2)[4:6]))
#   b1 b2 b3
#a1  1  1  0
#a2  2  1  1
#a3  0  0  1

If you have a different number of a and b columns, you can try:

acols<-1:3 #state the indices of the a columns
bcols<-4:6 #same for b; if you add a column this should be 4:7
matrix(colSums(tmp2[,rep(acols,length(bcols))] & tmp2[,rep(bcols,each=length(acols))]),
           ncol=length(bcols),nrow=length(acols),
           dimnames=list(colnames(tmp2)[acols],colnames(tmp2)[bcols]))
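
As a side note that is not part of the answer above: for strictly 0/1 data, the same co-occurrence counts can be obtained as a matrix product, since entry [i, j] of t(A) %*% B sums a_i * b_j over the rows. The sketch below uses base R's crossprod() for that. Selecting columns by the name prefixes "a" and "b" is an assumption about the naming scheme, and the b4 column is hypothetical, added only to show that the result need not be square.

```r
# Same data as in the question.
tmp2 <- data.frame(a1 = c(1L, 0L, 0L), a2 = c(1L, 0L, 1L),
                   a3 = c(0L, 1L, 0L), b1 = c(1L, 0L, 1L),
                   b2 = c(1L, 0L, 0L), b3 = c(0L, 1L, 1L))

# Assumption: the two column groups can be told apart by name prefix.
acols <- grep("^a", names(tmp2))
bcols <- grep("^b", names(tmp2))

# crossprod(A, B) computes t(A) %*% B; with 0/1 values each entry
# counts the rows where both the a column and the b column are 1.
co <- crossprod(as.matrix(tmp2[acols]), as.matrix(tmp2[bcols]))
co
#    b1 b2 b3
# a1  1  1  0
# a2  2  1  1
# a3  0  0  1

# Hypothetical extra column b4, to show a non-square (3 x 4) result.
tmp3 <- cbind(tmp2, b4 = c(1L, 1L, 0L))
co4 <- crossprod(as.matrix(tmp3[grep("^a", names(tmp3))]),
                 as.matrix(tmp3[grep("^b", names(tmp3))]))
dim(co4)
```

This only works when every cell is exactly 0 or 1; with other values the products would no longer be counts, and the logical `&` version above would be the safer choice.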
answered Sep 30 '22 at 23:09 by nicola


Here's a possible solution:

aIdxs <- 1:3 # indices of the a columns
bIdxs <- 4:6 # indices of the b columns; use 4:7 only once a b4 column exists

# init matrix
m <- matrix(0,
            nrow = length(aIdxs), ncol=length(bIdxs),
            dimnames = list(colnames(tmp2)[aIdxs],colnames(tmp2)[bIdxs]))

# create all combinations of a's and b's column indexes
idxs <- expand.grid(aIdxs,bIdxs)

# for each row and each combination, add 1 to the matrix
# if both the a column and the b column are 1
for(r in 1:nrow(tmp2)){
  m <- m + matrix(apply(idxs,1,function(x){ all(tmp2[r,x]==1) }),
                  nrow=length(aIdxs), byrow=FALSE)
}
> m
   b1 b2 b3
a1  1  1  0
a2  2  1  1
a3  0  0  1
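
The same per-pair counting can also be written without the explicit loop. The following is only a sketch of that idea using nested sapply() calls, not the code from the answer above; it assumes the same aIdxs and bIdxs as in the example.

```r
# Same data as in the question.
tmp2 <- data.frame(a1 = c(1L, 0L, 0L), a2 = c(1L, 0L, 1L),
                   a3 = c(0L, 1L, 0L), b1 = c(1L, 0L, 1L),
                   b2 = c(1L, 0L, 0L), b3 = c(0L, 1L, 1L))

aIdxs <- 1:3
bIdxs <- 4:6

# The outer sapply walks the b columns and the inner one the a columns;
# each cell is the number of rows where both columns equal 1.
m <- sapply(tmp2[bIdxs], function(b) {
  sapply(tmp2[aIdxs], function(a) sum(a == 1 & b == 1))
})
m
#    b1 b2 b3
# a1  1  1  0
# a2  2  1  1
# a3  0  0  1
```

Because sapply() keeps the column names, the row and column labels come out without an explicit dimnames argument.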
answered Sep 30 '22 at 21:09 by digEmAll