When using the glm function in R, one can use functions like addNA or log inside the formula argument. Let's say we have a data frame Data with 4 columns: Class and var1, which are factors, and var2 and var3, which are numeric variables, and we fit:
Model <- glm(data = Data,
             formula = Class ~ addNA(var1) + var2 + log(var3),
             family = binomial)
In the glm output, variable 1 will now be called addNA(var1) (e.g. in Model$xlevels), while variable 3 will be called log(var3).
Is it possible to retrieve a list from the glm output that indicates that var1, var2 and var3 were extracted from the dataframe, without addNA(var1) or log(var3) appearing in the variable names?
More generally, once the call to glm has been made, is it possible to infer which columns glm extracted from the input data frame, before any transformations, cross terms, etc. were generated inside the glm function?
This works:
all.vars(formula(Model)[-2])
## [1] "var1" "var2" "var3"
The [-2] indexing removes the response variable from the formula. However, you may be disappointed that the internally stored model frame does not have the original variables, but the transformed variables ...
names(model.frame(Model))
## [1] "Class" "addNA(var1)" "var2" "log(var3)"
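To see why the [-2] indexing drops the response, it may help to inspect a formula on its own. The following is a minimal sketch that uses only the formula from the question (no data are needed); the object names f, rhs, predictors and crossed are illustrative and not part of the original answer, and the second formula with y, a and b is made up to show a cross term.

```r
# Formula from the question; it can be inspected without any data.
f <- Class ~ addNA(var1) + var2 + log(var3)

# A two-sided formula is a call with three elements:
# the `~` operator, the response, and the right-hand side.
n_parts  <- length(f)
response <- f[[2]]

# Dropping element 2 leaves a one-sided formula with only the predictors.
rhs <- f[-2]

# all.vars() returns variable names only; function names such as
# addNA and log are not included.
predictors <- all.vars(rhs)
print(predictors)

# Hypothetical formula with a cross term and a repeated variable:
# each variable is reported once.
crossed <- all.vars((y ~ a * log(b) + I(a^2))[-2])
print(crossed)
```

Because all.vars() works on the unevaluated expression, interactions and transformations should not affect the result, only which names appear in the formula.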
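If you want the raw names, then all.vars(getCall(Model)$formula) should work. Note that this version also returns the response, Class, because the response is not dropped from the formula.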
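For anyone who wants to try this end to end, here is a small reproducible sketch. The data are simulated purely for illustration (the original question does not provide Data), so the factor levels, sample size and distributions are all assumptions; only the column names and the model formula come from the question.

```r
set.seed(1)
n <- 200

# Simulated stand-in for the data frame in the question (assumed values).
Data <- data.frame(
  Class = factor(sample(c("a", "b"), n, replace = TRUE)),
  var1  = factor(sample(c("x", "y", NA), n, replace = TRUE)),
  var2  = rnorm(n),
  var3  = rexp(n) + 0.1   # kept positive so that log(var3) is defined
)

Model <- glm(Class ~ addNA(var1) + var2 + log(var3),
             family = binomial, data = Data)

# Predictor columns as named in Data, response dropped.
predictors <- all.vars(formula(Model)[-2])

# All columns referenced in the call's formula, response included.
all_names <- all.vars(getCall(Model)$formula)

# The stored model frame uses the transformed names instead.
frame_names <- names(model.frame(Model))

# The raw names can be used to index the original data frame.
raw <- Data[, predictors]

print(predictors)
print(all_names)
print(frame_names)
```

One caveat: getCall(Model)$formula holds whatever expression was written in the call. If the formula had been stored in a variable first and passed as glm(f, ...), that element would be the symbol f rather than the formula itself, so formula(Model) is probably the safer starting point.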