
Logistic Regression with GLM

I am trying to port some of my R code to Julia, but I have a problem with the GLM package. The dataset is grouped by age, and each group i contains m_i individuals, of whom N_i are sick. I want to estimate the probability of being sick as a function of age, which is a typical logistic regression problem. In R the code would look like:

fit <- glm(cbind(N, m - N) ~ age, family = binomial, data = heart)

In Julia I tried the following function call, but it does not work:

glm(@formula((N, m-N) ~ age), df, Binomial(), LogitLink())

Any ideas? The dataset can be found here: http://stat.ethz.ch/Teaching/Datasets/heart.dat

Thank you.

Hamlet asked Apr 28 '26 06:04


1 Answer

You have to construct a binary variable sick with one row per individual: for each age group, N ones (sick) and m - N zeros (not sick). I achieve this below by creating a separate DataFrame for each age group and then running vcat on them.
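To make the expansion step concrete, here is a minimal sketch on a made-up two-group table. The names toy, groups and toy_flat and the numbers in them are hypothetical and are not taken from the heart dataset; the sketch only assumes that the DataFrames package is installed.

```julia
using DataFrames

# Hypothetical toy data (NOT the heart dataset): two age groups.
toy = DataFrame(age = [30, 50], m = [3, 4], N = [1, 3])

# One small DataFrame per age group: N ones (sick), then m - N zeros (not sick).
# fill repeats the group's age so that both columns have length m.
groups = [DataFrame(age  = fill(r.age, r.m),
                    sick = [ones(Int, r.N); zeros(Int, r.m - r.N)])
          for r in eachrow(toy)]

# Stack the per-group frames into one individual-level table.
toy_flat = reduce(vcat, groups)

# Sanity checks: one row per individual, and the ones add up to the sick counts.
@assert nrow(toy_flat) == sum(toy.m)
@assert sum(toy_flat.sick) == sum(toy.N)
```

The two assertions are a cheap way to confirm that no individuals were lost or duplicated before fitting the model.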

Here is the code that does the job, assuming that you have read your data into a data frame named heart (I squashed the creation of heart_flat into one line, but you can pull out the comprehension to inspect what it creates at each step):

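using DataFrames, GLM

# Expand each age group into one row per individual:
# N ones (sick) followed by m - N zeros (not sick).
heart_flat = vcat([DataFrame(age=row[:age],
                             sick=[ones(Int, row[:N]);
                                   zeros(Int, row[:m]-row[:N])])
                   for row in eachrow(heart)]...)

# Fit the logistic regression on the individual-level data.
glm(@formula(sick ~ age), heart_flat, Binomial(), LogitLink())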

It produces the same estimates as those in R.
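A short derivation, added as a sketch, of why the expanded 0/1 data should give the same coefficients as the grouped fit. The symbols \beta_0, \beta_1, p_i and a_i (the age of group i) are notation introduced here for illustration; m_i and N_i are as in the question.

```latex
% Logit model for the probability of being sick in age group i (age a_i):
p_i = \frac{1}{1 + \exp\!\big(-(\beta_0 + \beta_1 a_i)\big)}

% Grouped (binomial) log-likelihood, as in the R call with cbind(N, m - N):
\ell_{\text{grouped}}(\beta)
  = \sum_i \left[ \log\binom{m_i}{N_i} + N_i \log p_i + (m_i - N_i)\log(1 - p_i) \right]

% Individual-level (Bernoulli) log-likelihood on the expanded data:
\ell_{\text{flat}}(\beta)
  = \sum_i \left[ N_i \log p_i + (m_i - N_i)\log(1 - p_i) \right]

% The two differ only by a term that does not depend on \beta:
\ell_{\text{grouped}}(\beta) - \ell_{\text{flat}}(\beta) = \sum_i \log\binom{m_i}{N_i}
```

Because the difference is constant in \beta, both log-likelihoods are maximized by the same coefficients. Quantities that depend on that constant or on the number of rows, such as the reported log-likelihood or deviance, need not match between the two fits.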

Bogumił Kamiński answered Apr 30 '26 06:04


