
How to print progress when training a DNNClassifier in tensorflow r0.9 (skflow)?

I can't get a DNNClassifier to print its progress while training, i.e., the loss and the validation score. As I understand it, the loss can be printed using the config parameter that DNNClassifier inherits from BaseEstimator, but when I passed a RunConfig object, the classifier didn't print anything.

from tensorflow.contrib import learn
from tensorflow.contrib.learn.python.learn.estimators import run_config

config = run_config.RunConfig(verbose=1)
classifier = learn.DNNClassifier(hidden_units=[10, 20, 10],
                                 n_classes=3,
                                 config=config)
classifier.fit(X_train, y_train, steps=1000)

Am I missing something? I checked how RunConfig handles the verbose parameter, and it seems that it only checks whether the value is greater than 1, which doesn't match the documentation:

verbose: Controls the verbosity, possible values: 0: the algorithm and debug information is muted. 1: trainer prints the progress. 2: log device placement is printed.

As for the validation score, I thought that using monitors.ValidationMonitor would be enough, but when I tried it, the classifier didn't print anything. Nothing happens when I set early_stopping_rounds either. I searched for documentation and for comments in the source code, but I couldn't find any for monitors.

Ismael asked Jun 13 '16 18:06



2 Answers

Adding these lines before the call to fit shows the progress:

import logging
logging.getLogger().setLevel(logging.INFO)

Sample output:

INFO:tensorflow:global_step/sec: 0
INFO:tensorflow:Training steps [0,1000000)
INFO:tensorflow:Step 1: loss = 10.5043
INFO:tensorflow:training step 100, loss = 10.45380 (0.223 sec/batch).
INFO:tensorflow:Step 101: loss = 10.5623
INFO:tensorflow:training step 200, loss = 10.46701 (0.220 sec/batch).
INFO:tensorflow:Step 201: loss = 10.3885
INFO:tensorflow:training step 300, loss = 10.36501 (0.232 sec/batch).
INFO:tensorflow:Step 301: loss = 10.3441
INFO:tensorflow:training step 400, loss = 10.44571 (0.220 sec/batch).
INFO:tensorflow:Step 401: loss = 10.396
INFO:tensorflow:global_step/sec: 3.95
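
Why raising the root logger's level is enough can be sketched with the standard library alone. This is a minimal illustration, not TensorFlow's actual logging setup: the logger named "tensorflow" below is a stand-in chosen to match the prefix in the sample output, and the assumption is that the library's logger leaves its own level unset, so it inherits the root logger's level.

```python
import io
import logging

# Capture log output in memory so the effect is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.WARNING)  # Python's default level for the root logger

# Stand-in for the library's logger (hypothetical; only the name is borrowed).
lib_logger = logging.getLogger("tensorflow")
lib_logger.setLevel(logging.NOTSET)  # assumption: level is inherited from root

lib_logger.info("Step 1: loss = 10.5043")    # dropped: INFO is below WARNING

root.setLevel(logging.INFO)                  # same call as in the answer
lib_logger.info("Step 101: loss = 10.5623")  # now passes the level check

captured = stream.getvalue()
print(captured, end="")
```

Only the second message reaches the handler, which would be consistent with the classifier staying silent until the level is lowered to INFO.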
user3701366 answered Oct 23 '22 03:10



Add these lines before training:

import tensorflow as tf
tf.logging.set_verbosity(tf.logging.INFO)
Roby answered Oct 23 '22 03:10
