I'm creating a model with several thousand variables, all of which have a majority of values equal to NA. I am able to successfully run logistic regression on some variables but not others.
Here's my code to build the formula from the large number of variables:
model_vars <- names(dataset[100:4000])
vars<- paste("DP ~ ", paste(model_vars, collapse= " + "))
This builds a formula string with the dependent variable on the left and the independent variables on the right, separated by "+". I then run this through the glm function:
glm(vars, data = training, family = binomial)
Here is the error I get when certain variables are included:
Error in family$linkfun(mustart) :
Argument mu must be a nonempty numeric vector
I cannot figure out why this is occurring, or why the regression works for certain variables and not others. I can't see any pattern in the variables that cause the error. Could someone clarify why this error shows up?
The data must be numeric, with no NA, Inf, NaN, TRUE, FALSE, etc.
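A quick way to check for this, as a minimal sketch: the toy data frame `df` and its column names below are made up for illustration. It relies on `is.finite()`, which returns FALSE for NA, NaN, Inf and -Inf, to list the columns that contain at least one such value.

```r
# Toy data frame (hypothetical) with one clean column and three problem columns
df <- data.frame(
  a = c(1, 2, 3),
  b = c(1, NA, 3),
  c = c(1, Inf, 3),
  d = c(NaN, 2, 3)
)

# Flag every column that holds at least one NA, NaN, Inf or -Inf
has_bad <- vapply(df, function(col) any(!is.finite(col)), logical(1))

# Names of the columns that need attention before fitting
bad_cols <- names(df)[has_bad]
print(bad_cols)
```

On a real data set, the same two lines can be run on just the columns that go into the model.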
I had the error:
Error in family$linkfun(mustart) :
Argument mu must be a nonempty numeric vector
when using logistic regression with glm(), like:
glm(y~x,data=df, family='binomial')
after subsetting and standardizing data frames in a loop.
It turned out that some of the subsetted and standardized data frames contained NA values, which caused the error.
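One plausible reason the fit works for some variables and not others: with the default `na.action` of `na.omit`, any row that has an NA in any variable of the formula is dropped, so adding one more sparse variable can remove the last remaining complete rows. The sketch below illustrates this on made-up data; `toy`, `x1` and `x2` are hypothetical names, and `DP` simply mirrors the response name used in the question.

```r
set.seed(1)
n <- 20
toy <- data.frame(DP = rbinom(n, 1, 0.5))

# x1 is missing in the first half of the rows, x2 in the second half
toy$x1 <- c(rep(NA, 10), rnorm(10))
toy$x2 <- c(rnorm(10), rep(NA, 10))

# Rows usable for DP ~ x1 versus DP ~ x1 + x2
n_x1   <- sum(complete.cases(toy[, c("DP", "x1")]))
n_both <- sum(complete.cases(toy[, c("DP", "x1", "x2")]))

# The model frame that a fit of DP ~ x1 + x2 would work on after NA removal
mf <- model.frame(DP ~ x1 + x2, data = toy, na.action = na.omit)

print(c(n_x1 = n_x1, n_both = n_both, model_frame_rows = nrow(mf)))
```

Each variable on its own leaves rows to fit on, but together they leave none, which is the situation that leads to the error above.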
For others who run into that cryptic error message: perhaps the data frame is empty?
This reproduces the message:
d <- data.frame(x = c(NA), y = c(NA))
d <- d[complete.cases(d), ]
m <- glm(y ~ ., d, family = 'binomial')
Error in family$linkfun(mustart) : Argument mu must be a nonempty numeric vector
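To get a readable message instead, one option is to check the model frame before calling glm(). This is only a sketch: `fit_logit` is a hypothetical helper, not part of any package, and it assumes the default behaviour of dropping rows with missing values.

```r
# Hypothetical helper: stop early with a clear message when no rows survive NA removal
fit_logit <- function(formula, data) {
  mf <- model.frame(formula, data = data, na.action = na.omit)
  if (nrow(mf) == 0) {
    stop("no complete cases left for this formula")
  }
  glm(formula, data = data, family = binomial)
}

# Same all-NA data as in the reproduction above
d <- data.frame(x = NA, y = NA)

# Capture the error message rather than aborting the script
msg <- tryCatch(
  {
    fit_logit(y ~ x, d)
    "fit succeeded"
  },
  error = function(e) conditionMessage(e)
)
print(msg)
```

Inside a loop over subsets, the same check makes it easy to skip or log the subsets that end up empty.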