Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cross Validation in Keras

I'm implementing a Multilayer Perceptron in Keras and using scikit-learn to perform cross-validation. For this, I was inspired by the code found in the issue Cross Validation in Keras

from sklearn.cross_validation import StratifiedKFold

def load_data():
    # load your data using this function

def create model():
    # create your model using this function

def train_and_evaluate__model(model, data[train], labels[train], data[test], labels[test)):
    # fit and evaluate here.

if __name__ == "__main__":
    X, Y = load_model()
    kFold = StratifiedKFold(n_splits=10)
    for train, test in kFold.split(X, Y):
        model = None
        model = create_model()
        train_evaluate(model, X[train], Y[train], X[test], Y[test])

In my studies on neural networks, I learned that the knowledge representation of the neural network is in the synaptic weights and during the network tracing process, the weights that are updated to thereby reduce the network error rate and improve its performance. (In my case, I'm using Supervised Learning)

For better training and assessment of neural network performance, a common method of being used is cross-validation that returns partitions of the data set for training and evaluation of the model.

My doubt is...

In this code snippet:

for train, test in kFold.split(X, Y):
    model = None
    model = create_model()
    train_evaluate(model, X[train], Y[train], X[test], Y[test])

We define, train and evaluate a new neural net for each of the generated partitions?

If my goal is to fine-tune the network for the entire dataset, why is it not correct to define a single neural network and train it with the generated partitions?

That is, why is this piece of code like this?

for train, test in kFold.split(X, Y):
    model = None
    model = create_model()
    train_evaluate(model, X[train], Y[train], X[test], Y[test])

and not so?

model = None
model = create_model()
for train, test in kFold.split(X, Y):
    train_evaluate(model, X[train], Y[train], X[test], Y[test])

Is my understanding of how the code works wrong? Or my theory?

like image 810
Alisson Hayasi Avatar asked Jan 03 '18 21:01

Alisson Hayasi


1 Answers

If my goal is to fine-tune the network for the entire dataset

It is not clear what you mean by "fine-tune", or even what exactly is your purpose for performing cross-validation (CV); in general, CV serves one of the following purposes:

  • Model selection (choose the values of hyperparameters)
  • Model assessment

Since you don't define any search grid for hyperparameter selection in your code, it would seem that you are using CV in order to get the expected performance of your model (error, accuracy etc).

Anyway, for whatever reason you are using CV, the first snippet is the correct one; your second snippet

model = None
model = create_model()
for train, test in kFold.split(X, Y):
    train_evaluate(model, X[train], Y[train], X[test], Y[test])

will train your model sequentially over the different partitions (i.e. train on partition #1, then continue training on partition #2 etc), which essentially is just training on your whole data set, and it is certainly not cross-validation...

That said, a final step after the CV which is often only implied (and frequently missed by beginners) is that, after you are satisfied with your chosen hyperparameters and/or model performance as given by your CV procedure, you go back and train again your model, this time with the entire available data.

like image 105
desertnaut Avatar answered Sep 19 '22 14:09

desertnaut