data.table vs plyr regression output

Question

The data.table package is very helpful in terms of speed. But I am having trouble actually using the output from a linear regression. Is there an easy way to get the data.table output to be as pretty/useful as that from the plyr package? Below is an example. Thank you!

library('data.table');
library('plyr');

REG <- data.table(ID=c(rep('Frank',5),rep('Tony',5),rep('Ed',5)), y=rnorm(15), x=rnorm(15), z=rnorm(15));
REG;

ddply(REG, .(ID), function(x) coef(lm(y ~ x + z, data=x)));

REG[, coef(lm(y ~ x + z)), by=ID];

The data.table coefficient estimates are output in a single column whereas the plyr/ddply coefficient estimates are output in multiple and nicely labeled columns.

I know I can run the regression three times with data.table but that seems really inefficient. I could be wrong, though.

REG[, Intercept=coef(lm(y ~ x + z))[1],
      x        =coef(lm(y ~ x + z))[2],
      z        =coef(lm(y ~ x + z))[3], by=ID];

IRTFM · Accepted Answer

Try this:

> REG[, as.list(coef(lm(y ~ x + z))), by=ID];
        ID (Intercept)           x         z
[1,] Frank  -0.2928611  0.07215896  1.835106
[2,]  Tony   0.9120795 -1.11153056  2.041260
[3,]    Ed   1.0498359  5.77131778 -1.253741

I have the nagging feeling that this question was asked less than a week ago, but I don't think I arrived at this approach when I tried it and I don't remember than any answer was this compact.

Oh, there it is .. on r-help. Matthew can comment on the rightfulness of this if he wants. I guess the message is that functions returning lists will not have dimensions dropped. The interesting thing was the using list(coef(lm(...)) did not succeed in the manner we hoped.

data.table vs plyr regression output

Tags:

r

data.table

user1491868

1 Answers

IRTFM

Recent Activity

Donate For Us

data.table vs plyr regression output

Tags:

r

data.table

user1491868

1 Answers

IRTFM

Related questions

Recent Activity

Donate For Us