Good ways to code complex tabulations in R?

Does anyone have any good thoughts on how to code complex tabulations in R?

I am afraid I might be a little vague on this, but I want to set up a script that creates a bunch of tables comparable in complexity to those in the Statistical Abstract of the United States.

e.g.: http://www.census.gov/compendia/statab/tables/09s0015.pdf

And I would like to avoid a whole bunch of rbind and cbind statements.

In SAS, I have heard, there is a table creation specification language; I was wondering if there was something of similar power for R?

Thanks!

forkandwait asked Dec 05 '25 17:12


2 Answers

It looks like you want to apply a number of different calculations to some data, grouping it by one field (in the example, by state)?

There are many ways to do this. See this related question.

You could use Hadley Wickham's reshape package (see reshape homepage). For instance, suppose you wanted the mean, sum, and count of some data grouped by a value (the example is meaningless, but it uses R's built-in airquality data):

> library(reshape)
> names(airquality) <- tolower(names(airquality))
> # melt the data to just include month and temp
> aqm <- melt(airquality, id="month", measure="temp", na.rm=TRUE)
> # cast by month with the various relevant functions
> cast(aqm, month ~ ., function(x) c(mean(x),sum(x),length(x)))
  month X1   X2 X3
1     5 66 2032 31
2     6 79 2373 30
3     7 84 2601 31
4     8 84 2603 31
5     9 77 2307 30

Or you can use the by() function, where the index represents the states. In your case, rather than applying a single function (e.g. mean), you can apply your own function that does several things at once, depending on your needs: for instance, function(x) { c(mean(x), length(x)) }. Then call do.call("rbind", ...) (for instance) on the output to stack the per-group results into one table.
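As a rough sketch of the by() approach, here is the same airquality summary as above using base R only. The names aq, per_month and tab are just illustrative, and month stands in for state:

```r
# Built-in dataset; no extra packages needed.
aq <- datasets::airquality

# Apply one function per month that returns several named statistics.
per_month <- by(aq, aq$Month, function(d) {
  c(mean = mean(d$Temp), sum = sum(d$Temp), n = length(d$Temp))
})

# by() returns one element per group; stack them into a matrix
# with one row per month and one column per statistic.
tab <- do.call(rbind, per_month)
print(round(tab, 1))
```

Naming the elements of the returned vector gives the columns readable headers instead of X1, X2, X3.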

Also, you might give some consideration to using a reporting package such as Sweave (with xtable) or Jeffrey Horner's brew package. There is a great post on the learnr blog about creating repetitive reports that shows how to use it.

Shane answered Dec 07 '25 06:12


Another option is the plyr package.

library(plyr)
names(airquality) <- tolower(names(airquality))
ddply(airquality, "month", function(x){
    with(x, c(meantemp = mean(temp), maxtemp = max(temp), nonsense = max(temp) - min(solar.r)))
})
Thierry answered Dec 07 '25 05:12


