
Caret Neural Network Error: "missing values in resampled performance measures"

I have seen other people report this error, but I have not found a satisfactory answer. Can anyone offer some insight into my problem?

I have some car auction data which I am trying to model to predict the Hammer.Price.

> str(myTrain)
'data.frame':   34375 obs. of  9 variables:
 $ Grade          : int  4 4 4 4 2 3 4 3 3 4 ...
 $ Mileage        : num  150850 113961 71834 57770 43161 ...
 $ Hammer.Price   : num  750 450 1600 4650 4800 ...
 $ New.Price      : num  15051 13795 15051 14475 14475 ...
 $ Year.Introduced: int  1996 1996 1996 1996 1996 1996 1996 1996 1996 1996 ...
 $ Engine.Size    : num  1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 1.6 ...
 $ Doors          : int  3 3 3 3 3 3 3 3 3 3 ...
 $ Age            : int  3771 4775 3802 2402 2463 3528 3315 3193 4075 4988 ...
 $ Days.from.Sale : int  1778 1890 2183 1939 1876 1477 1526 1812 1813 1472 ...

myTrain contains a random 70% of the data and myTest the other 30%. I train the model with:

myModel <- train(Hammer.Price ~ ., data = myTrain, method = "nnet")

This results in the following warning:

Warning message: In nominalTrainWorkflow(x = x, y = y, wts = weights, info = trainInfo, : There were missing values in resampled performance measures.

When I then predict on the test set, every predicted value is equal to 1.

myTestPred <- predict(myModel, myTest)

I have previously used this data to train an MLP neural network in SPSS Modeler, but I cannot reproduce those results in R. I have tried some of the other neural network packages available through caret and always get the same result.

Does anyone understand this better than me?

Matthew Jackson asked Jul 28 '15 07:07



2 Answers

Does scaling the data before calling train fix the problem? I have hit the same warning with glmnet and nnet when the variables were not all scaled before fitting the model. Anecdotally, it also helps to make all of your variables numeric.
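As a rough sketch of the scaling step, caret's preProcess function can center and scale the predictors. The toyX data frame below is a made-up stand-in for a few of the numeric columns in myTrain, not the asker's data:

```r
library(caret)

# Hypothetical stand-in for a few of the numeric predictors in myTrain.
set.seed(1)
toyX <- data.frame(
  Mileage     = runif(100, 10000, 150000),
  New.Price   = runif(100, 8000, 20000),
  Engine.Size = sample(c(1.2, 1.6, 2.0), 100, replace = TRUE)
)

# Learn the centering and scaling constants from the training predictors.
scaler <- preProcess(toyX, method = c("center", "scale"))

# Apply those same constants to a data set (training or test).
scaledX <- predict(scaler, toyX)

# Each column should now have mean ~0 and standard deviation ~1.
round(colMeans(scaledX), 10)
apply(scaledX, 2, sd)
```

Passing preProcess = c("center", "scale") directly to train should have the same effect on the predictors, with the constants re-estimated inside each resample. Either way the outcome, Hammer.Price, is left on its original scale.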

You can also try making your resampling explicit e.g. using

myControl <- trainControl(method = "repeatedcv", repeats=5, number = 10)

and then passing this to train:

myModel <- train(Hammer.Price ~ .,
    data = myTrain,
    method = "nnet",
    trControl = myControl)

Without the data it is sometimes difficult to spot the error, sorry.

Achekroud answered Nov 14 '22 22:11



Your target variable, Hammer.Price, is numeric. The help page for the nnet function shows that the default is linout = FALSE, which gives logistic output units, so the fitted values are confined to the range 0 to 1. That is why every prediction comes out as 1. When modelling a numeric target you have to tell nnet that you are doing so, and linout is the parameter you need. Setting linout = TRUE, which train passes through to nnet, should stop the warning.
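A minimal sketch of that call, assuming a small synthetic data set (toyTrain is invented for illustration and only borrows column names from the question). The 5-fold cross-validation and the centering and scaling are illustrative choices, not requirements:

```r
library(caret)

# Hypothetical data: a numeric price driven by two numeric predictors.
set.seed(42)
n <- 300
toyTrain <- data.frame(
  Mileage = runif(n, 10000, 150000),
  Age     = runif(n, 500, 5000)
)
toyTrain$Hammer.Price <- 8000 - 0.03 * toyTrain$Mileage -
  0.5 * toyTrain$Age + rnorm(n, sd = 200)

# linout and trace are not train() arguments; train forwards them to nnet().
linModel <- train(
  Hammer.Price ~ .,
  data       = toyTrain,
  method     = "nnet",
  preProcess = c("center", "scale"),
  trControl  = trainControl(method = "cv", number = 5),
  linout     = TRUE,   # linear output unit for a numeric target
  trace      = FALSE   # suppress nnet's iteration log
)

# Predictions are no longer squashed into the 0 to 1 range.
toyPred <- predict(linModel, toyTrain)
range(toyPred)
```

If the fit on real data is still poor, nnet's maxit argument and caret's tuning grid for size and decay are the next things to look at. Both are passed to train in the same way.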

Samuel-Rosa answered Nov 14 '22 23:11
