
Logistic Regression Error in R

Tags:

r

I'm creating a model with several thousand variables, all of which have a majority of values equal to NA. I am able to successfully run logistic regression on some variables but not others.

Here's my code to build the formula from the large number of variables:

model_vars <- names(dataset[100:4000])
vars <- paste("DP ~", paste(model_vars, collapse = " + "))

This builds a formula string with the dependent variable on the left and the independent variables joined by "+". I then pass it to the glm function:

glm(vars, data = training, family = binomial)

Here is the error I get when certain variables are included:

Error in family$linkfun(mustart) : 
Argument mu must be a nonempty numeric vector

I cannot figure out why this is occurring, or why the regression works for certain variables and not others. I can't see any pattern in the variables that cause the error. Could someone explain why this error shows up?

greeny asked Oct 15 '25 11:10



2 Answers

Data must be numeric (no NA, Inf, NaN, True, False etc.)!

I had the error:

Error in family$linkfun(mustart) : 
  Argument mu must be a nonempty numeric vector

when using logistic regression with glm(), like:

glm(y~x,data=df, family='binomial')

after subsetting and standardizing data frames in a loop.

It turned out that some of the subsetted and standardized data frames contained NA values, which caused the error.
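A minimal sketch of how one might check for this before fitting. It assumes the default na.action (na.omit), under which glm() drops every row that has an NA in any variable used by the model, so a formula with many mostly-NA variables can leave no rows at all. The data frame `df` and its columns are made up for illustration.

```r
# Toy data (hypothetical): every row has an NA in at least one predictor
df <- data.frame(
  y  = c(0, 1, 0, 1),
  x1 = c(1.2, NA, 0.7, NA),
  x2 = c(NA, 3.1, NA, 2.2)
)

# Rows that would survive na.omit if all columns were in the model
n_complete <- sum(complete.cases(df))

# Number of NAs per column, to spot the variables responsible
na_counts <- colSums(is.na(df))

# Rows that would survive if only y and x1 were in the model
n_y_x1 <- sum(complete.cases(df[, c("y", "x1")]))

print(n_complete)
print(na_counts)
print(n_y_x1)
```

If `n_complete` is zero for the full set of variables but positive for a subset, that would be consistent with the regression working for some variables and failing for others.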

Peter answered Oct 17 '25 02:10



For others hitting that cryptic error message: perhaps the data frame is empty?

This reproduces the message:

d <- data.frame(x = c(NA), y = c(NA))
d <- d[complete.cases(d), ]  # drops the only row, leaving an empty data frame
m <- glm(y ~ ., d, family = 'binomial')

Error in family$linkfun(mustart) : Argument mu must be a nonempty numeric vector
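One way to get a clearer message is to check for an empty model frame before calling glm(). The wrapper below is only a sketch: `safe_logit` is a hypothetical helper name, not part of R, and it assumes rows with NA are dropped via na.omit.

```r
# Hypothetical helper: fit a binomial glm only if some complete rows remain
safe_logit <- function(formula, data) {
  formula <- as.formula(formula)  # also accepts a formula given as a string
  # Build the model frame the same way glm would, dropping rows with NA
  mf <- model.frame(formula, data = data, na.action = na.omit)
  if (nrow(mf) == 0) {
    stop("No complete cases left for this formula; check the NAs in: ",
         paste(all.vars(formula), collapse = ", "))
  }
  glm(formula, data = data, family = binomial)
}

# Usage: each row has an NA in a different column, so no row is complete
d2 <- data.frame(x = c(1, NA), y = c(NA, 1))
res <- tryCatch(safe_logit(y ~ x, d2), error = function(e) conditionMessage(e))
print(res)
```

Here the call stops with the custom message instead of the error from family$linkfun.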

Chris answered Oct 17 '25 00:10
