Consider this simple graph + session definition. Suppose I want to tune hyperparameters (learning rate and dropout keep probability) with a random search. What is the recommended way to implement it?
graph = tf.Graph()
with graph.as_default():
    # Placeholders
    data = tf.placeholder(tf.float32, shape=(None, img_h, img_w, num_channels), name='data')
    labels = ...
    dropout_keep_prob = tf.placeholder(tf.float32, name='keep_prob')
    learning_rate = tf.placeholder(tf.float32, name='learning_rate')
    # model architecture...

with tf.Session(graph=graph) as session:
    tf.initialize_all_variables().run()
    for step in range(num_steps):
        offset = (step * batch_size) % (train_images.shape[0] - batch_size)
        # Generate a minibatch.
        batch_data = train_images[offset:(offset + batch_size), :]
        # ...
        feed_train = {data: batch_data,
                      # ...
                      learning_rate: 0.001,
                      keep_prob: 0.7
                      }
I tried putting everything inside a function:
def run_model(learning_rate, keep_prob):
    graph = tf.Graph()
    with graph.as_default():
        # graph here...
    with tf.Session(graph=graph) as session:
        tf.initialize_all_variables().run()
        # session here...
But I ran into scope issues (I am not very familiar with scopes in Python/TensorFlow). Is there a best practice to achieve this?
The scikit-learn open-source Python machine learning library provides tools to tune model hyperparameters: RandomizedSearchCV for random search and GridSearchCV for grid search.
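For instance, a minimal sketch of RandomizedSearchCV; the estimator, toy data, and parameter ranges below are illustrative assumptions, not taken from the question:

import numpy as np
from scipy.stats import loguniform, uniform
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import RandomizedSearchCV

X = np.random.rand(100, 4)                 # toy features
y = np.random.randint(0, 2, size=100)      # toy binary labels

search = RandomizedSearchCV(
    SGDClassifier(learning_rate='constant'),
    param_distributions={
        'alpha': loguniform(1e-5, 1e-1),   # regularization strength, sampled in log space
        'eta0': uniform(0.01, 0.5),        # initial learning rate, sampled uniformly
    },
    n_iter=10,   # number of random configurations to try
    cv=3,        # 3-fold cross-validation per configuration
)
search.fit(X, y)
print(search.best_params_, search.best_score_)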
Hyperopt ends up a bit slower than random search, but note the significantly lower number of iterations it takes to reach the optimum. It also tends to reach a relatively better score on the test set, which is why you might want to use Hyperopt. Do keep in mind, however, that Hyperopt does not always end up on top.
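A minimal Hyperopt sketch, assuming a run_model(learning_rate, keep_prob)-style function (like the one defined later in this thread) that returns a scalar loss to minimize:

import numpy as np
from hyperopt import fmin, tpe, hp

space = {
    'learning_rate': hp.loguniform('learning_rate', np.log(1e-5), np.log(1e-1)),
    'keep_prob': hp.uniform('keep_prob', 0.2, 0.8),
}

def objective(params):
    # Hyperopt minimizes the returned value, so return a loss (e.g. validation error).
    return run_model(params['learning_rate'], params['keep_prob'])

best = fmin(fn=objective, space=space, algo=tpe.suggest, max_evals=50)
print(best)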
Typical hyperparameters to tune are the number of neurons, the activation function, the optimizer, the learning rate, the batch size, and the number of epochs. A second step is to tune the number of layers, a degree of freedom that most conventional algorithms do not have.
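Random search extends naturally to the number of layers; a sketch, where the ranges and option lists are illustrative assumptions:

import random

def sample_architecture():
    num_layers = random.randint(1, 4)  # tune depth alongside the other hyperparameters
    return {
        'neurons_per_layer': [random.choice([32, 64, 128]) for _ in range(num_layers)],
        'activation': random.choice(['relu', 'tanh']),
        'optimizer': random.choice(['sgd', 'adam']),
        'learning_rate': 10 ** random.uniform(-5, -1),  # sampled in log space
        'batch_size': random.choice([32, 64, 128]),
        'epochs': random.randint(5, 50),
    }

print(sample_architecture())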
I implemented random search over hyper-parameters in a similar way, and things worked out fine. Basically, I wrote a function that generates random hyper-parameters outside of the graph and session. I wrapped the graph and session into a function, as you did, and passed in the generated hyper-parameters. See the code below for a better illustration.
import numpy as np

def generate_random_hyperparams(lr_min, lr_max, kp_min, kp_max):
    '''Generate a random learning rate and keep probability.'''
    # Random search through log space for the learning rate
    random_learning_rate = 10**np.random.uniform(lr_min, lr_max)
    random_keep_prob = np.random.uniform(kp_min, kp_max)
    return random_learning_rate, random_keep_prob
I suspect the scope issue you are running into (since you didn't provide the exact error message, I can only speculate) is caused by some careless naming. I would modify how you name the variables in your run_model function.
def run_model(random_learning_rate, random_keep_prob):
    # Note that the arguments are named differently from the placeholders in the graph
    graph = tf.Graph()
    with graph.as_default():
        # graph here...
        learning_rate = tf.placeholder(tf.float32, name='learning_rate')
        keep_prob = tf.placeholder(tf.float32, name='keep_prob')
        # other operations...
    with tf.Session(graph=graph) as session:
        tf.initialize_all_variables().run()
        # session here...
        feed_train = {data: batch_data,
                      # placeholder tensors as dict keys, Python value variables as dict values
                      learning_rate: random_learning_rate,
                      keep_prob: random_keep_prob
                      }
        # evaluate performance with random_learning_rate and random_keep_prob
        performance = session.run([...], feed_dict=feed_train)
    return performance
Remember to use different Python variable names for the tf.placeholder tensors and the variables carrying the actual Python values.
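If you would rather not keep Python references to the placeholders at all, a small sketch of an alternative: look them up by their graph names instead (the ':0' suffix refers to the first output tensor of the placeholder op):

lr_tensor = graph.get_tensor_by_name('learning_rate:0')
kp_tensor = graph.get_tensor_by_name('keep_prob:0')
feed_train = {lr_tensor: random_learning_rate, kp_tensor: random_keep_prob}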
The usage of the above snippets would be something like:
performance_records = {}
for i in range(10):  # random search over the hyper-parameter space 10 times
    random_learning_rate, random_keep_prob = generate_random_hyperparams(-5, -1, 0.2, 0.8)
    performance = run_model(random_learning_rate, random_keep_prob)
    performance_records[(random_learning_rate, random_keep_prob)] = performance
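Once the loop finishes, you can pick the winning configuration from the records; a small sketch, assuming a higher performance value is better (e.g. validation accuracy):

best_params = max(performance_records, key=performance_records.get)
best_learning_rate, best_keep_prob = best_params
print(best_params, performance_records[best_params])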