Getting both column counts and proportions in the same table in R

Tags:

r

Is there a function that will give me both counts and column/overall percents in the same table? I have looked at both tables and reshape2 and don't see an option for doing it. Here is a small example:

data setup

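n <- 100
x <- sample(letters[1:3], n, TRUE)
y <- sample(letters[1:3], n, TRUE)
# keep x and y as factors (no longer the default since R 4.0.0), which tabular() expects
d <- data.frame(x=x, y=y, stringsAsFactors=TRUE)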

With tables

This is very clunky as it requires me to unlist and recombine.

> library(tables)
> (t1 <- tabular(x~y*(n=length), d))

   a  b  c 
 x n  n  n 
 a 13 14 11
 b  8 11 13
 c 10 12  8
> prop.table(matrix(unlist(t1),3,3), 1)
          [,1]      [,2]      [,3]
[1,] 0.3421053 0.3684211 0.2894737
[2,] 0.2500000 0.3437500 0.4062500
[3,] 0.3333333 0.4000000 0.2666667

With Reshape2

This is a little easier, but still not in one step.

> library(reshape2)
> (t2 <- acast(d, x~y, length))
Using y as value column: use value_var to override.
   a  b  c
a 13 14 11
b  8 11 13
c 10 12  8
> (t3 <- prop.table(t2,1))
          a         b         c
a 0.3421053 0.3684211 0.2894737
b 0.2500000 0.3437500 0.4062500
c 0.3333333 0.4000000 0.2666667

Desired output

What I really want is output that looks something like this:

> structure(list(
+     a = data.frame(n=t2[,1], pct=t3[,1]),
+     b = data.frame(n=t2[,2], pct=t3[,2]),
+     c = data.frame(n=t2[,3], pct=t3[,3])), 
+   class = 'data.frame',
+   row.names = letters[1:3])
  a.n     a.pct b.n     b.pct c.n     c.pct
a  13 0.3421053  14 0.3684211  11 0.2894737
b   8 0.2500000  11 0.3437500  13 0.4062500
c  10 0.3333333  12 0.4000000   8 0.2666667

Is there a way to do this easily with R?

Andrew Redd asked Feb 24 '12 21:02

1 Answer

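Here is one approach. You still need a second step, but it comes before the tabular command, so the result is still a tabular object. The extra column z weights each row by one over the number of rows that share its x value, so summing z within a cell gives that cell's row proportion.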

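library(tables)

n <- 100
x <- sample(letters[1:3], n, TRUE)
y <- sample(letters[1:3], n, TRUE)
d <- data.frame(x=x, y=y, stringsAsFactors=TRUE)
# weight each row by 1 / (number of rows with the same x)
d$z <- 1/ave( rep(1,n), d$x, FUN=sum )

(t1 <- tabular(x~y*Heading()*z*((n=length) + (p=sum)), d))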
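To see why the weighting works, here is a minimal base R sketch (not part of the original answer) that checks the sum of z per cell against prop.table, and then lays the counts and proportions side by side in the shape the question asks for. The seed and the names counts, p_from_z, p_direct and wide are arbitrary choices made for illustration, and the final object is a plain data frame rather than a tabular object.

```r
set.seed(1)  # arbitrary seed, only so the check is reproducible
n <- 100
d <- data.frame(x = sample(letters[1:3], n, TRUE),
                y = sample(letters[1:3], n, TRUE))

# Same weight as in the answer: 1 / (number of rows sharing this row's x)
d$z <- 1 / ave(rep(1, n), d$x, FUN = sum)

counts   <- table(d$x, d$y)                    # cell counts
p_direct <- prop.table(counts, 1)              # row proportions, computed directly
p_from_z <- tapply(d$z, list(d$x, d$y), sum)   # sum of weights per cell
p_from_z[is.na(p_from_z)] <- 0                 # an empty cell contributes nothing

# The two routes to the row proportions should agree
all.equal(unclass(p_from_z), unclass(p_direct), check.attributes = FALSE)

# Interleave count and proportion columns: a.n, a.pct, b.n, b.pct, ...
wide <- do.call(cbind, lapply(colnames(counts), function(k) {
  setNames(data.frame(counts[, k], p_direct[, k]),
           paste0(k, c(".n", ".pct")))
}))
wide
```

Passing 2 instead of 1 to prop.table, and grouping by d$y in ave, would give column proportions under the same idea.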

Greg Snow answered Sep 28 '22 08:09