I am implementing an MLP in Keras, and tweaking the hyperparameters. One object of experimentation is the learning rate. There are two schedules I'm trying to use, both outlined in this tutorial. One is specifically defined using learning rate / epochs, and one uses a separately-defined step decay function. The necessary code is below.
The error is 'The output of the "schedule" function should be float'. I specifically cast the learning rate as a float, so I'm not sure where I'm going wrong?
EDIT: the original code was not an MWE, I apologize. To reproduce this error, you can save the data snippets below (as train_ex.csv and test_ex.csv) and run this code.
import numpy as np
import sys, argparse, keras, string
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.callbacks import LearningRateScheduler, EarlyStopping, History
from keras.optimizers import SGD
from keras.constraints import maxnorm

def load_data(data_file, test_file):
    dataset = np.loadtxt(data_file, delimiter=",")
    # split into input (X) and output (Y) variables
    X = dataset[:, 0:(dataset.shape[1]-2)]
    Y = dataset[:, dataset.shape[1]-1]
    Y = Y - 1
    testset = np.loadtxt(test_file, delimiter=",")
    X_test = testset[:, 0:(testset.shape[1]-2)]
    Y_test = testset[:, testset.shape[1]-1]
    Y_test = Y_test - 1
    return (X, Y, X_test, Y_test)

def mlp_keras(data_file, test_file, save_file, num_layers, num_units_per_layer, learning_rate_, epochs_, batch_size_):
    history = History()
    seed = 7
    np.random.seed(seed)
    X, y_binary, X_test, ytest = load_data(data_file, test_file)
    d1 = True

    ### create model
    model = Sequential()
    model.add(Dense(num_units_per_layer[0], input_dim=X.shape[1], init='uniform', activation='relu', W_constraint=maxnorm(3)))
    model.add(Dropout(0.2))
    model.add(Dense(num_units_per_layer[1], init='uniform', activation='relu', W_constraint=maxnorm(3)))  # W_constraint for dropout
    model.add(Dropout(0.2))
    model.add(Dense(1, init='uniform', activation='sigmoid'))

    def step_decay(epoch):
        drop_every = 10
        decay_rate = (learning_rate_*np.power(0.5, np.floor((1+drop_every)/drop_every))).astype('float32')
        return decay_rate

    earlyStopping = EarlyStopping(monitor='val_loss', patience=2)
    sgd = SGD(lr=0.0, momentum=0.8, decay=0.0, nesterov=False)
    model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])

    if d1 == True:
        lrate = LearningRateScheduler(step_decay)
    else:
        lrate = (learning_rate_/epochs_).astype('float32')
    callbacks_list = [lrate, earlyStopping]

    ## Fit the model
    hist = model.fit(X, y_binary, validation_data=(X_test, ytest), nb_epoch=epochs_, batch_size=batch_size_, callbacks=callbacks_list)  # 48 batch_size, 2 epochs
    scores = model.evaluate(X, y_binary)
    print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

if __name__ == '__main__':
    m1 = mlp_keras('train_ex.csv', 'test_ex.csv', 'res1.csv', 2, [100, 100], 0.001, 10, 10)
Error message:

  File "/user/pkgs/anaconda2/lib/python2.7/site-packages/keras/callbacks.py", line 435, in on_epoch_begin
    assert type(lr) == float, 'The output of the "schedule" function should be float.'
AssertionError: The output of the "schedule" function should be float.
Data snippet (train_ex.csv):
1,21,38,33,20,8,8,6,4,0,1,1,1,2,1,1,0,2,1,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
1,19,29,26,28,13,6,7,3,2,4,4,3,2,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
1,22,21,22,15,11,12,9,4,6,4,5,4,2,1,0,4,1,0,0,1,2,2,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
1,18,24,14,17,6,14,10,5,7,4,2,4,1,4,2,0,3,4,1,3,3,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
Data snippet (test_ex.csv):
1,16,30,40,44,8,7,1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
1,19,32,16,18,32,5,7,4,6,1,1,0,2,1,0,1,0,1,0,2,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
1,29,55,21,11,6,6,7,8,5,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
1,23,18,11,16,10,7,5,7,9,3,7,8,5,3,4,0,3,3,3,0,1,0,0,0,1,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2
EDIT 2:
Based on @sascha's comments, I've tried modifying it a bit (the relevant section is below). Same error.

def step_decay(epoch):
    drop_every = 10
    decay_rate = (learning_rate_*np.power(0.5, np.floor((1+drop_every)/drop_every))).astype('float32')
    return decay_rate

def step_exp_decay(epoch):
    return (learning_rate_/epochs_).astype('float32')

earlyStopping = EarlyStopping(monitor='val_loss', patience=2)
sgd = SGD(lr=0.0, momentum=0.8, decay=0.0, nesterov=False)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])

if d1 == True:
    lrate = LearningRateScheduler(step_decay)
else:
    lrate = LearningRateScheduler(step_exp_decay)
You can also check out the ReduceLROnPlateau callback, which reduces the learning rate by a pre-defined factor when a monitored value has not improved for a certain number of epochs. For example, halving the learning rate when the validation accuracy has not improved for five epochs looks like this:

from keras.callbacks import ReduceLROnPlateau

learning_rate_reduction = ReduceLROnPlateau(monitor='val_acc',
                                            patience=5,
                                            verbose=1,
                                            factor=0.5,
                                            min_lr=0.0001)

model.fit_generator(..., callbacks=[learning_rate_reduction], ...)
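Each time the plateau criterion triggers, the current learning rate is multiplied by factor (here 0.5), and min_lr puts a floor under how far it can be reduced, so the schedule adapts to training progress instead of following a fixed timetable.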
First of all: I misinterpreted your code earlier, so my previous comments are obsolete. Sorry!
The error message leads us to the real problem here.
You define your scheduler like this:

def step_decay(epoch):
    drop_every = 10
    decay_rate = (learning_rate_*np.power(0.5, np.floor((1+drop_every)/drop_every))).astype('float32')
    return decay_rate
Check the type it returns: it's <class 'numpy.float32'> (try it yourself with Python's type() function). Somehow Keras does not do a very general check for these types and expects <class 'float'> (Python's native float).
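A quick illustrative check, independent of Keras, makes the mismatch obvious:

import numpy as np

lr = np.float32(0.001)      # the kind of value step_decay returns
print(type(lr))             # <class 'numpy.float32'>
print(type(lr) == float)    # False -- exactly the failing assertion
print(type(lr.item()))      # <class 'float'>, a native Python float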
Just convert your numpy float to a native Python float.

Replace:

decay_rate = (learning_rate_*np.power(0.5, np.floor((1+drop_every)/drop_every))).astype('float32')

with:

decay_rate = (learning_rate_*np.power(0.5, np.floor((1+drop_every)/drop_every))).astype('float32').item()
Read the docs of numpy.ndarray.item (especially the notes on why this behaviour exists).
The blog author does not have this problem because he does not use numpy inside his scheduler; he uses Python's math functions, which return native floats.
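For comparison, here is a minimal sketch of a math-based step-decay scheduler in the tutorial's style; the values of initial_lr, drop and drop_every are placeholders, and note that the exponent uses epoch (the question's version uses drop_every there, which keeps the rate constant across epochs):

import math

def step_decay(epoch):
    initial_lr = 0.001   # placeholder starting learning rate
    drop = 0.5           # multiply the rate by this factor at each step
    drop_every = 10      # number of epochs between drops
    # math.pow and math.floor return native Python floats,
    # so the result satisfies Keras' type(lr) == float check
    return initial_lr * math.pow(drop, math.floor((1 + epoch) / drop_every))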