Linear Regression and group by in R

I want to do a linear regression in R using the lm() function. My data is an annual time series with one field for year (22 years) and another for state (50 states). I want to fit a regression for each state so that at the end I have a vector of lm results. I can imagine writing a for loop over the states, fitting the regression inside the loop, and adding the result of each regression to a vector. That does not seem very R-like, however. In SAS I would use a 'by' statement and in SQL I would use a 'group by'. What's the R way of doing this?
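For reference, the loop described above might look like the following sketch. The data frame `d`, its column names, the three states, and the year range are all made-up stand-ins for the real data, which is not shown in the question.

```r
# Hypothetical example data: 3 states x 22 years (names and values are made up)
set.seed(1)
d <- data.frame(
  state    = rep(c("NY", "CA", "TX"), each = 22),
  year     = rep(1988:2009, times = 3),
  response = rnorm(66)
)

states <- as.character(unique(d$state))

# lm objects are lists, so the results are stored in a named list
# rather than in an atomic vector
models <- vector("list", length(states))
names(models) <- states

for (s in states) {
  # Fit the model on the rows belonging to one state
  models[[s]] <- lm(response ~ year, data = d[d$state == s, ])
}

# Each element is an ordinary lm fit
coef(models[["NY"]])
```

This works, but the subsetting and bookkeeping are done by hand, which is the part a group-by style approach takes over.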

JD Long asked Jul 23 '09 04:07



1 Answer

Here's an approach using the plyr package:

# Example data: each state gets one observation per year (years 1 to 10)
d <- data.frame(
  state = rep(c('NY', 'CA'), each = 10),
  year = rep(1:10, times = 2),
  response = rnorm(20)
)

library(plyr)

# Break up d by state, then fit the specified model to each piece and
# return a list
models <- dlply(d, "state", function(df)
  lm(response ~ year, data = df))

# Apply coef to each model and return a data frame
ldply(models, coef)

# Print the summary of each model
l_ply(models, summary, .print = TRUE)
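If plyr is not available, the same split-apply-combine idea can be sketched in base R with split() and lapply(). This is only a minimal sketch under the same assumptions as above: the data frame `d` and its columns are made-up example data, and `coefs` is a name introduced here for illustration.

```r
# Made-up example data: 2 states x 10 years
set.seed(1)
d <- data.frame(
  state    = rep(c("NY", "CA"), each = 10),
  year     = rep(1:10, times = 2),
  response = rnorm(20)
)

# split() breaks d into a named list of per-state data frames;
# lapply() fits the same model to each piece
models <- lapply(split(d, d$state), function(df) lm(response ~ year, data = df))

# Collect the coefficients into a matrix with one row per state
coefs <- t(sapply(models, coef))
coefs

# Print the summary of each model
for (m in models) print(summary(m))
```

The result is a named list of lm fits, one per state, so any function that accepts a single lm object can be applied to every element with lapply() or sapply().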
hadley answered Nov 19 '22 10:11