I'm struggling with what seems like it should be a simple extension of a previous question I'd asked here.
I'm trying to aggregate over (a) a range of dates and (b) a factor variable. Sample data might be:
Brand Day Rev RVP
A 1 2535.00 195.00
B 1 1785.45 43.55
C 1 1730.87 32.66
A 2 920.00 230.00
B 2 248.22 48.99
C 3 16466.00 189.00
A 1 2535.00 195.00
B 3 1785.45 43.55
C 3 1730.87 32.66
A 4 920.00 230.00
B 5 248.22 48.99
C 4 16466.00 189.00
Thanks to helpful advice, I've figured out how to find the mean revenue for brands over a period of days using data.table:
new_df<-df[,(mean(Rev)), by=list(Brand,Day)]
Now, I'd like to create a new table where there is a column listing the coefficient estimate from an OLS regression of Rev by Day for each brand. I tried to do this as follows:
new_df2<-df[,(lm(Rev~Day)), by=list(Brand)]
That doesn't seem quite right. Thoughts? I'm sure it's something obvious I've missed.
You have a couple of options.
You could save the whole model object as a list within the data.table
models <- df[, list(model = list(lm(Rev ~ Day))),by = Brand]
models
Brand model
1: A <lm>
2: B <lm>
3: C <lm>
# look at the models
models[,print(model[[1]]),by= Brand]
Call:
lm(formula = Rev ~ Day)
Coefficients:
(Intercept) Day
2804.2 -538.3
Call:
lm(formula = Rev ~ Day)
Coefficients:
(Intercept) Day
1741.5 -263.5
Call:
lm(formula = Rev ~ Day)
Coefficients:
(Intercept) Day
-3698 4653
You can save the coefficients
models[, {coefs <- coef(model[[1]])
list(coefs = coefs, name = names(coefs))}, by = Brand]
## Brand coefs name
## 1: A 2804.1667 (Intercept)
## 2: A -538.3333 Day
## 3: B 1741.5291 (Intercept)
## 4: B -263.5251 Day
## 5: C -3697.8621 (Intercept)
## 6: C 4653.1989 Day
Or you can just extract the list of models
models[,model]
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With