How do I get an AIC
from an lm_robust
object (package estimatr
)? I'm using lm_robust
because I want to use a robust estimator for calculating the SE
. Unlike the lm
function, AIC
is not provided when you run the summary function and running the AIC
function on a lm_robust
object produces an error. Below is a toy example of the kind of model I'm trying to run.
library(estimatr)
fake_data<-data.frame(outcome=rnorm(100,3.65,1),
pred1=rnorm(100,15,7),
pred2=as.factor(sample(1:5, 100, replace = T)))
mod1<-lm_robust(outcome~pred1+pred2,data=fake_data)
AIC(mod1)
here is what the error message looks like:
> AIC(mod1)
Error in UseMethod("logLik") :
no applicable method for 'logLik' applied to an object of class "lm_robust"
If you have to do it with lm_robust, you may choose to calculate it by yourself as below, AIC = 2*k + n [Ln ( 2 (pi) RSS/n ) + 1] # n : Number of observation # k : All variables including all distinct factors and constant # RSS : Residual Sum of Square
Obviously the easiest way to get the AIC of a linear model is by using the lm function and then the AIC command. However, this simple method can be very time consuming, as actually with the lm function R computes a lot of stuff which isn't strictly needed to calculate the AIC of the model.
For linear models with unknown scale (i.e., for lm and aov ), -2log L is computed from the deviance and uses a different additive constant to logLik and hence AIC.
I'm using lm_robust because I want to use a robust estimator for calculating the SE. Unlike the lm function, AIC is not provided when you run the summary function and running the AIC function on a lm_robust object produces an error. Below is a toy example of the kind of model I'm trying to run.
If you have to do it with lm_robust
, you may choose to calculate it by yourself as below,
The formula of AIC
,
AIC = 2*k + n [Ln( 2(pi) RSS/n ) + 1]
# n : Number of observation
# k : All variables including all distinct factors and constant
# RSS : Residual Sum of Square
If we apply it to R
for your case,
# Note that, I take k=7 since you have, 5 factors + 1 continuous and 1 constant
AIC_calculated <- 2*7 + 100* (log( 2*pi* (1-mod1$r.squared)*mod1$tss/100 ) + 1)
[1] 332.2865
which is same with both lm
and glm
outputs.
mod2<-lm(outcome~pred1+pred2,data=fake_data)
> AIC(mod2)
[1] 332.2865
And finally, of course, you can put this calculation into a function to call whenever you want by just giving lm_robust
model inside it without having to set the N
and k
parameters for any given data like,
myAIC <- function(data) {
2*(data$k+1) + data$N * (log(2*pi* (1-data$r.squared)*data$tss/data$N ) + 1)
}
> myAIC(mod1)
[1] 332.2865
Note: Results may be shown different in your computer because of the seeding differences when running the sample()
function in dataframe.
Here's a workaround
mod1 = lm_robust(outcome ~ pred1 + pred2, data = fake_data)
#Create any fitted model using 'lm' as a placeholder
mod2 = with(list(x = rnorm(10), y = rnorm(10)), lm(y ~ x))
#Copy values in `mod2` from `mod1`
mod2[names(mod2)] = mod1[names(mod2)]
#Calculate residuals in `mod2`
mod2$residuals = mod2$fitted.values - fake_data$outcome
AIC(mod2)
#[1] 326.6092
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With