I am not able to fix the error below for the following logistic regression:
training=(IBM$Serial<625)
data=IBM[!training,]
stock.direction <- data$Direction
training_model=glm(stock.direction~data$lag2,data=data,family=binomial)
###Error### ---- Error in eval(family$initialize) : y values must be 0 <= y <= 1
A few rows from the data I am using:
X Date Open High Low Close Adj.Close Volume Return lag1 lag2 lag3 Direction Serial
1 28-11-2012 190.979996 192.039993 189.270004 191.979996 165.107727 3603600 0.004010855 0.004010855 -0.001198021 -0.006354834 Up 1
2 29-11-2012 192.75 192.899994 190.199997 191.529999 164.720734 4077900 0.00114865 0.00114865 -0.004020279 -0.009502386 Up 2
3 30-11-2012 191.75 192 189.5 190.070007 163.465073 4936400 0.003630178 0.003630178 -0.001894039 -0.005576956 Up 3
4 03-12-2012 190.759995 191.300003 188.360001 189.479996 162.957703 3349600 0.001213907 0.001213907 -0.002480478 -0.001636046 Up 4
glm is asking for y values between 0 and 1 because categorical columns in your data, such as Direction (which you use as the response), are of type character. You need to convert them to type factor with as.factor(data$Direction). So: glm(Direction ~ lag2, data=...)
You don't need to declare stock.direction separately.
You can check the class of a variable with class(variable); if it is character, convert it to a factor and store it as a column in the same data frame. It should work then.
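As a rough sketch of the fix described above: the data frame `toy` and its values are made up for illustration and only stand in for the IBM data from the question; the column names `lag2` and `Direction` are taken from the question.

```r
# Toy data standing in for the IBM data frame (values are invented for illustration).
set.seed(1)
toy <- data.frame(
  lag2 = rnorm(40),
  Direction = sample(c("Up", "Down"), 40, replace = TRUE),
  stringsAsFactors = FALSE
)

class(toy$Direction)   # "character"

# Convert the response to a factor. Levels are sorted alphabetically,
# so "Down" is the first level and is treated as the failure class.
toy$Direction <- as.factor(toy$Direction)
levels(toy$Direction)

# Refer to columns by name and let `data =` supply them.
fit <- glm(Direction ~ lag2, data = toy, family = binomial)
coef(fit)
```

Using bare column names in the formula, rather than `data$lag2`, also keeps `predict()` usable on new data later.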
I was getting the same error, "Error in eval(family$initialize) : y values must be 0 <= y <= 1", and solved it by adding stringsAsFactors=T to the read.csv function.
BEFORE: gene.train = read.csv("gene.train.csv", header=T) # error
AFTER: gene.train = read.csv("gene.train.csv", header=T, stringsAsFactors=T) # no error
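To see what the extra argument changes, here is a small sketch. The CSV content is hypothetical and inlined through the `text =` argument so that it runs without a file. Since R 4.0.0 the default is stringsAsFactors = FALSE, which is why the argument has to be set explicitly.

```r
# Hypothetical two-column CSV, inlined so the example needs no file on disk.
csv <- "lag2,Direction\n0.1,Up\n-0.2,Down\n0.3,Up\n-0.1,Down"

as_chr <- read.csv(text = csv, header = TRUE, stringsAsFactors = FALSE)
as_fct <- read.csv(text = csv, header = TRUE, stringsAsFactors = TRUE)

class(as_chr$Direction)   # "character"
class(as_fct$Direction)   # "factor"
```

Writing TRUE instead of T is slightly safer, because T is an ordinary variable that can be reassigned.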