I'm trying to perform parameters tuning for a neural network built with keras. This is my code with a comment on the line that causes the error:
from sklearn.cross_validation import StratifiedKFold, cross_val_score
from sklearn import grid_search
from sklearn.metrics import classification_report
import multiprocessing
from keras.models import Sequential
from keras.layers import Dense
from sklearn.preprocessing import LabelEncoder
from keras.utils import np_utils
from keras.wrappers.scikit_learn import KerasClassifier
import numpy as np
def tuning(X_train,Y_train,X_test,Y_test):
in_size=X_train.shape[1]
num_cores=multiprocessing.cpu_count()
model = Sequential()
model.add(Dense(in_size, input_dim=in_size, init='uniform', activation='relu'))
model.add(Dense(8, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
batch_size = [10, 20, 40, 60, 80, 100]
epochs = [10,20]
param_grid = dict(batch_size=batch_size, nb_epoch=epochs)
k_model = KerasClassifier(build_fn=model, verbose=0)
clf = grid_search.GridSearchCV(estimator=k_model, param_grid=param_grid, cv=StratifiedKFold(Y_train, n_folds=10, shuffle=True, random_state=1234),
scoring="accuracy", verbose=100, n_jobs=num_cores)
clf.fit(X_train, Y_train) #ERROR HERE
print("Best parameters set found on development set:")
print()
print(clf.best_params_)
print()
print("Grid scores on development set:")
print()
for params, mean_score, scores in clf.grid_scores_:
print("%0.3f (+/-%0.03f) for %r"
% (mean_score, scores.std() * 2, params))
print()
print("Detailed classification report:")
print()
print("The model is trained on the full development set.")
print("The scores are computed on the full evaluation set.")
print()
y_true, y_pred = Y_test, clf.predict(X_test)
print(classification_report(y_true, y_pred))
print()
And this is the errors report:
clf.fit(X_train, Y_train)
File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 804, in fit
return self._fit(X, y, ParameterGrid(self.param_grid))
File "/usr/local/lib/python2.7/dist-packages/sklearn/grid_search.py", line 553, in _fit
for parameters in parameter_iterable
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 800, in __call__
while self.dispatch_one_batch(iterator):
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 658, in dispatch_one_batch
self._dispatch(tasks)
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 566, in _dispatch
job = ImmediateComputeBatch(batch)
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 180, in __init__
self.results = batch()
File "/usr/local/lib/python2.7/dist-packages/sklearn/externals/joblib/parallel.py", line 72, in __call__
return [func(*args, **kwargs) for func, args, kwargs in self.items]
File "/usr/local/lib/python2.7/dist-packages/sklearn/cross_validation.py", line 1531, in _fit_and_score
estimator.fit(X_train, y_train, **fit_params)
File "/usr/local/lib/python2.7/dist-packages/keras/wrappers/scikit_learn.py", line 135, in fit
**self.filter_sk_params(self.build_fn.__call__))
TypeError: __call__() takes at least 2 arguments (1 given)
Am I missing something? The grid search goes well with random forests, svm and logistic regression. I only have problems with Neural Networks.
By default, the grid search will only use one thread. By setting the n_jobs argument in the GridSearchCV constructor to -1, the process will use all cores on your machine. However, sometimes this may interfere with the main neural network training process.
The first one is the same as other conventional Machine Learning algorithms. The hyperparameters to tune are the number of neurons, activation function, optimizer, learning rate, batch size, and epochs. The second step is to tune the number of layers. This is what other conventional algorithms do not have.
The only difference between both the approaches is in grid search we define the combinations and do training of the model whereas in RandomizedSearchCV the model selects the combinations randomly. Both are very effective ways of tuning the parameters that increase the model generalizability.
Here the error indicates that the build_fn
needs to have 2 arguments as indicated from the # of parameters from param_grid
.
So you need to explicitly define an new function and use that as build_fn=make_model
def make_model(batch_size, nb_epoch):
model = Sequential()
model.add(Dense(in_size, input_dim=in_size, init='uniform', activation='relu'))
model.add(Dense(8, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
return model
Also check keras/examples/mnist_sklearn_wrapper.py
where GridSearchCV
is used for hyper-parameter search.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With