I would like to get the same results in Julia as using lmer function from lme4 library in R. Please find below and example on a build-in mtcars dataset using R
library(lme4)
data<-mtcars
data$vs<-as.factor(data$vs)
data$am<-as.factor(data$am)
data$gear<-as.factor(data$gear)
str(data)
model <- lmer(mpg ~ cyl:gear + hp:am + (1|gear:am), data = data)
I have found the lmm()
function from MixedModels package for Julia which should be capable of running the same results, however I don't know how to rewrite the formula from the first argument of lmer()
function using lmm()
. Especially the interaction (:) operator.
I would be grateful for an answer with short example.
The correspondence between R and Julia models here does not seem to be exact. There are also issues of different numerical algorithms. But my attempt to recreate the same model using MixedModels
is as follows:
using RDatasets
using MixedModels
mtcars = dataset("datasets","mtcars")
mtcars[:AM] = PooledDataArray(mtcars[:AM])
mtcars[:Gear] = PooledDataArray(mtcars[:Gear])
mtcars[:GearAM] = PooledDataArray(collect(zip(mtcars[:Gear],mtcars[:AM])))
m = fit!(lmm(MPG ~ 1 + Gear + AM + Cyl + HP + (1|GearAM),mtcars))
Creating the mixed effect column manually is klunky - perhaps there is a better way. Note the differences in naming the coefficients between R and Julia. There are 6 fixed-effect coefficients in both.
The solution in Julia seems different (on my machine) but achieves a better log-likelihood. The random-effect is expected to be weak, as its variables are already present in the fixed-effects (it only accounts for dependence between Gear and AM), and there are only 32 data points.
Hope this helps and if you come up with a better understanding, it would be nice to add it in another answer or comment.
If you use REML=true
and @formula(MPG ~ 1 + Cyl & Gear + HP & AM + (1 | Gear & AM)
you can reproduce exactly.
Support for (1 | Gear & AM)
requires version 3 or higher of MixedModels.jl which as of now (August 2020) is unreleased, so to reproduce this you'll need to install it with pkg"add MixedModels#master"
. This was run on Julia 1.5.0:
using RDatasets, MixedModels
mtcars = RDatasets.dataset("datasets","mtcars")
categorical!(mtcars, [:AM, :Gear])
m = fit(MixedModel,
@formula(MPG ~ Cyl & Gear + HP & AM + (1 | Gear & AM)),
mtcars,
REML=true)
Produces this output:
Linear mixed model fit by REML
MPG ~ 1 + Cyl & Gear + HP & AM + (1 | Gear & AM)
REML criterion at convergence: 168.11435769585046
Variance components:
Column Variance Std.Dev.
Gear & AM (Intercept) 16.913543 4.1126078
Residual 7.988539 2.8264004
Number of obs: 32; levels of grouping factors: 4
Fixed-effects parameters:
──────────────────────────────────────────────────────
Coef. Std. Error z Pr(>|z|)
──────────────────────────────────────────────────────
(Intercept) 34.3507 3.50385 9.80 <1e-21
Cyl & Gear: 3 -1.08837 0.781649 -1.39 0.1638
Cyl & Gear: 4 -1.56833 0.875625 -1.79 0.0733
Cyl & Gear: 5 -0.167413 1.42839 -0.12 0.9067
HP & AM: 0 -0.0412826 0.0215381 -1.92 0.0553
HP & AM: 1 -0.0613247 0.0298067 -2.06 0.0396
──────────────────────────────────────────────────────
Compared with the R code you posted:
Linear mixed model fit by REML ['lmerMod']
Formula: mpg ~ cyl:gear + hp:am + (1 | gear:am)
Data: data
REML criterion at convergence: 168.1
Scaled residuals:
Min 1Q Median 3Q Max
-1.42280 -0.54922 -0.07898 0.54892 2.09217
Random effects:
Groups Name Variance Std.Dev.
gear:am (Intercept) 16.914 4.113
Residual 7.989 2.826
Number of obs: 32, groups: gear:am, 4
Fixed effects:
Estimate Std. Error t value
(Intercept) 34.35066 3.50385 9.804
cyl:gear3 -1.08837 0.78165 -1.392
cyl:gear4 -1.56833 0.87562 -1.791
cyl:gear5 -0.16741 1.42839 -0.117
hp:am0 -0.04128 0.02154 -1.917
hp:am1 -0.06132 0.02981 -2.057
You can also control which variables are treated as categorical by providing contrasts for them:
m2 = fit(MixedModel,
@formula(MPG ~ Cyl & Gear + HP & AM + (1 | Gear & AM)),
mtcars,
REML=true,
contrasts = Dict(:AM => DummyCoding(), :Gear => DummyCoding()))
There are two main differences in specifying the regression formula between R and Julia (with StatsModels.jl).
&
to create an interaction between terms, but in R you use :
.@formula
macro. In R, anything and everything is potentially a macro via the magic of "nonstandard evaluation".So, the equivalent Julia formula for your model is
@formula(mpg ~ cyl&gear + hp&am + (1|gear&am))
For using categorical!
to convert columns in a DataFrame to categorical, see the DataFrames.jl docs
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With