
Logistic Regression on factor: Error in eval(family$initialize) : y values must be 0 <= y <= 1

I am unable to fix the error produced by the logistic regression below:

training=(IBM$Serial<625)
data=IBM[!training,]

stock.direction <- data$Direction
training_model=glm(stock.direction~data$lag2,data=data,family=binomial)

This produces the error:

Error in eval(family$initialize) : y values must be 0 <= y <= 1

A few rows from the data I am using:

X  Date        Open        High        Low         Close       Adj.Close   Volume   Return       lag1         lag2          lag3          Direction  Serial
1  28-11-2012  190.979996  192.039993  189.270004  191.979996  165.107727  3603600  0.004010855  0.004010855  -0.001198021  -0.006354834  Up         1
2  29-11-2012  192.75      192.899994  190.199997  191.529999  164.720734  4077900  0.00114865   0.00114865   -0.004020279  -0.009502386  Up         2
3  30-11-2012  191.75      192         189.5       190.070007  163.465073  4936400  0.003630178  0.003630178  -0.001894039  -0.005576956  Up         3
4  03-12-2012  190.759995  191.300003  188.360001  189.479996  162.957703  3349600  0.001213907  0.001213907  -0.002480478  -0.001636046  Up         4
Akhil Doppalapudi asked Nov 29 '17 06:11



2 Answers

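The error asks for y values between 0 and 1 because the response, Direction, is stored as type 'character'. With family=binomial, glm() accepts a factor, a numeric vector of values between 0 and 1, or a two-column matrix of successes and failures as the response; a character vector is none of these, so the check 0 <= y <= 1 fails. Convert the column with data$Direction <- as.factor(data$Direction), then fit glm(Direction ~ lag2, data = data, family = binomial). There is no need to declare stock.direction separately, and the formula can name the columns directly instead of using data$lag2.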

You can check the class of any variable with class(variable). If it is character, convert it to a factor and store the result as a new column in the same data frame. The model should then fit.
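
The check-and-convert step described above might look like the following sketch. The data frame toy and its values are made up for illustration and only stand in for the asker's data; the column names lag2 and Direction are taken from the question.

```r
# Toy data standing in for the asker's data frame (values are made up).
set.seed(1)
toy <- data.frame(
  lag2 = rnorm(40),
  Direction = rep(c("Up", "Down"), times = 20),
  stringsAsFactors = FALSE
)

class(toy$Direction)  # "character"

# Fitting on the character response fails; capture the message instead of stopping.
msg <- tryCatch(
  {
    glm(Direction ~ lag2, data = toy, family = binomial)
    "no error"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Convert the response to a factor. The first level ("Down") is treated as
# failure, so the model estimates the probability of "Up".
toy$Direction <- factor(toy$Direction, levels = c("Down", "Up"))

fit <- glm(Direction ~ lag2, data = toy, family = binomial)
summary(fit)$coefficients
```

Setting the levels explicitly is a design choice rather than a requirement: as.factor() would order them alphabetically, which happens to give the same result here, but spelling them out makes it obvious which outcome the fitted probabilities refer to.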

Nidhi Garg answered Oct 15 '22 11:10



I was getting the same error, "Error in eval(family$initialize) : y values must be 0 <= y <= 1", and solved it by adding stringsAsFactors=T to the read.csv function.

BEFORE : gene.train = read.csv("gene.train.csv", header=T) # error

AFTER : gene.train = read.csv("gene.train.csv", header=T, stringsAsFactors=T) # no error.
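
To see what the stringsAsFactors argument changes, here is a small self-contained sketch. It writes a hypothetical CSV to a temporary file (the contents are invented, not the gene.train.csv file above) and reads it both ways. Note that since R 4.0.0 the default for read.csv is stringsAsFactors = FALSE, so text columns arrive as character unless you ask otherwise.

```r
# Write a tiny, made-up CSV to a temporary file.
tmp <- tempfile(fileext = ".csv")
writeLines(
  c("lag2,Direction",
    "0.10,Up",
    "-0.20,Down",
    "0.05,Down",
    "-0.30,Up"),
  tmp
)

# Read the same file with and without conversion of text columns to factors.
as_chr <- read.csv(tmp, header = TRUE, stringsAsFactors = FALSE)
as_fct <- read.csv(tmp, header = TRUE, stringsAsFactors = TRUE)

class(as_chr$Direction)   # "character"
class(as_fct$Direction)   # "factor"
levels(as_fct$Direction)  # "Down" "Up"

unlink(tmp)  # remove the temporary file
```

Only the second data frame has a response column that glm() with family=binomial can use directly.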

binmosa answered Oct 15 '22 11:10
