 

How can I tune neural network architecture using KerasTuner?

I'm trying to use KerasTuner to automatically tune the neural network architecture, i.e., the number of hidden layers and the number of nodes in each hidden layer. Currently, the neural network architecture is defined using one parameter NN_LAYER_SIZES. For example,

NN_LAYER_SIZES = [128, 128, 128, 128]

indicates the NN has 4 hidden layers and each hidden layer has 128 nodes.

KerasTuner has the following hyperparameter types (https://keras.io/api/keras_tuner/hyperparameters/):

  • Int
  • Float
  • Boolean
  • Choice

It seems that none of these hyperparameter types fits my use case, so I wrote the following code to search over the number of hidden layers and the number of nodes. However, the result is not recognized as a hyperparameter.

number_of_hidden_layer = hp.Int("layer_number", min_value=2, max_value=5, step=1)
number_of_nodes = hp.Int("node_number", min_value=4, max_value=8, step=1)
NN_LAYER_SIZES = [2**number_of_nodes for _ in range(number_of_hidden_layer)]

Any suggestions on how to make it right?

CathyQian asked Nov 05 '22 23:11



1 Answer

One option is to treat the number of layers as a hyperparameter and loop over it when building your model. Calling hp.Int inside the model-building function is what lets KerasTuner register the value. That way the tuner can try different numbers of layers combined with different numbers of nodes:

import tensorflow as tf
import keras_tuner as kt

def model_builder(hp):
  model = tf.keras.Sequential()
  model.add(tf.keras.layers.Flatten(input_shape=(28, 28)))

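  # hp.Int is called inside the builder, so the tuner registers both values.
  units = hp.Int('units', min_value=32, max_value=512, step=32)  # nodes per hidden layer
  layers = hp.Int('layers', min_value=2, max_value=5, step=1)    # number of hidden layers

  # Stack `layers` hidden layers that all share the same width.
  for _ in range(layers):
    model.add(tf.keras.layers.Dense(units=units, activation='relu'))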

  model.add(tf.keras.layers.Dense(10))

  model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                metrics=['accuracy'])
  return model

(img_train, label_train), (_, _) = tf.keras.datasets.fashion_mnist.load_data()
img_train = img_train.astype('float32') / 255.0

tuner = kt.Hyperband(model_builder,
                     objective='val_accuracy',
                     max_epochs=10,
                     factor=3)

tuner.search(img_train, label_train, epochs=50, validation_split=0.2)
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

model = tuner.hypermodel.build(best_hps)
history = model.fit(img_train, label_train, epochs=50, validation_split=0.2)
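
The builder above gives every hidden layer the same width. If you want each layer to have its own width, which is closer to what NN_LAYER_SIZES can express, one possible sketch is to register a separate hp.Int per layer inside the loop. The helper name layer_sizes, the hyperparameter names node_exp_0, node_exp_1, and so on, and the FixedHP class are all made up for illustration. FixedHP is only a stand-in so the snippet runs without KerasTuner installed; in a real builder you would pass the hp object that the tuner supplies.

```python
def layer_sizes(hp, min_layers=2, max_layers=5, min_exp=4, max_exp=8):
    """Return a list shaped like NN_LAYER_SIZES, one power-of-two width per layer.

    `hp` is assumed to expose Int(name, min_value=..., max_value=..., step=...)
    and to return a plain int for the current trial.
    """
    n_layers = hp.Int("layer_number", min_value=min_layers,
                      max_value=max_layers, step=1)
    sizes = []
    for i in range(n_layers):
        # One hyperparameter per layer (hypothetical names node_exp_0, node_exp_1, ...).
        exp = hp.Int(f"node_exp_{i}", min_value=min_exp,
                     max_value=max_exp, step=1)
        sizes.append(2 ** exp)
    return sizes


class FixedHP:
    """Hypothetical stand-in for the tuner's hp object, for demonstration only.

    Returns a preset value when one was given, otherwise falls back to min_value.
    """

    def __init__(self, values=None):
        self.values = dict(values or {})

    def Int(self, name, min_value, max_value, step=1):
        return self.values.setdefault(name, min_value)


# Three layers; the third layer has no preset, so it falls back to 2**4.
hp = FixedHP({"layer_number": 3, "node_exp_0": 7, "node_exp_1": 6})
NN_LAYER_SIZES = layer_sizes(hp)
print(NN_LAYER_SIZES)  # [128, 64, 16]
```

Inside model_builder you could then loop over layer_sizes(hp) and add one Dense layer per entry instead of reusing a single units value.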
AloneTogether answered Nov 12 '22 12:11
