Can I log training loss via a hook with a LinearRegressor?

I'm quite new to TensorFlow. I'm using TF 1.8 for a 'simple' linear regression. The output of the exercise is the set of linear weights that best fit the data, rather than a prediction model, so I would like to track and log the current minimum loss during training, along with the corresponding values of the weights.

I'm trying to use a LinearRegressor:

tf.logging.set_verbosity(tf.logging.INFO)

model = tf.estimator.LinearRegressor(
    feature_columns = make_feature_cols(),
    model_dir = TRAINING_OUTDIR
)

# --------------------------------------------v
logger = tf.train.LoggingTensorHook({"loss": ???}, every_n_iter=10)
trainHooks = [logger]

model.train(
    input_fn = make_train_input_fn(df, num_epochs = nEpochs),
    hooks = trainHooks
)

The model doesn't seem to contain a variable for the loss.

Can I use the LoggingTensorHook somehow? If so, how do I get a reference to the loss tensor?

I also tried implementing my own hook. Examples suggest requesting the loss inside before_run by returning a SessionRunArgs, but there I run into the same question: how do I reference the loss tensor?
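Roughly what I have in mind is below; it's only a sketch, and the tensor name in begin() is exactly the part I don't know:

class LossLoggerHook(tf.train.SessionRunHook):
    def begin(self):
        # ??? - what name does LinearRegressor give its loss tensor?
        self._loss = tf.get_default_graph().get_tensor_by_name("???")

    def before_run(self, run_context):
        # ask the session to also evaluate the loss on this step
        return tf.train.SessionRunArgs(self._loss)

    def after_run(self, run_context, run_values):
        tf.logging.info("loss = %s", run_values.results)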

Thanks!!


1 Answer

I agree with @jdehesa that the loss is not directly available without writing a custom model_fn. However, with LoggingTensorHook you can get the current feature estimates at every step and compute any loss or other training metric yourself. I suggest passing a formatter to handle the tensor values available to the hook. In the example below I use LoggingTensorHook with a custom formatter to output the feature estimates and the current MSE loss.

import numpy as np
import tensorflow as tf
tf.logging.set_verbosity(tf.logging.INFO)

"""prepare inputs - generate sample data"""
num_features = 5
features = ['f' + str(i) for i in range(num_features)]
X = np.random.randint(-1000, 1000, (10000, num_features))
a = np.random.randint(2, 30, size=num_features) / 10   # true weights
b = np.random.randint(-99, 99) / 10                    # true bias
y = np.matmul(X, a) + b
noise = np.random.randn(*X.shape)
X = X + (noise * 1)   # add gaussian noise to the features
X.shape, y.shape, a, b
# >> ((10000, 5), (10000,), array([2.1, 2. , 1.7, 0.5, 0.9]), 1.8)

""" create model """
feature_cols = [tf.feature_column.numeric_column(k) for k in features]
X_dict = {features[i]: X[:, i] for i in range(num_features)}

TRAINING_OUTDIR = '.'
model = tf.estimator.LinearRegressor(
    model_dir = TRAINING_OUTDIR,
    feature_columns = feature_cols)

input_fn = tf.estimator.inputs.numpy_input_fn(
    X_dict, y, batch_size=512, num_epochs=50, shuffle=True,
    queue_capacity=1000, num_threads=1)

input_fn_predict = tf.estimator.inputs.numpy_input_fn(
    X_dict, batch_size=X.shape[0], shuffle=False)
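# (input_fn_predict isn't used during training; it would feed
#  model.predict(input_fn=input_fn_predict) after training, if needed.)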

"""create hook and formatter"""
feature_var_names = [f"linear/linear_model/{f}/weights" for f in features]
hook_vars_list = ['global_step', 'linear/linear_model/bias_weights'] + feature_var_names

def hooks_formatter(tensor_dict):
    step = tensor_dict['global_step']
    # current weight estimates, one scalar per feature column
    a_hat = [tensor_dict[feat][0][0] for feat in feature_var_names]
    b_hat = tensor_dict['linear/linear_model/bias_weights'][0]
    # recompute predictions over the full dataset and the MSE loss
    y_pred = np.dot(X, np.array(a_hat).T) + b_hat
    mse_loss = np.mean((y - y_pred)**2)   # MSE
    line = f"step:{step}; MSE_loss: {mse_loss:.4f}; bias:{b_hat:.3f};"
    for f, w in zip(features, a_hat):
        line += f" {f}:{w:.3f};"
    return line
hook1 = tf.train.LoggingTensorHook(hook_vars_list, every_n_iter=10, formatter=hooks_formatter)

"""train"""
model.train(input_fn=input_fn, steps=100, hooks=[hook1])
>>>
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Saving checkpoints for 1 into ./model.ckpt.
INFO:tensorflow:step:1; MSE_loss: 3183865.8670; bias:0.200; f0:0.200; f1:0.200; f2:0.200; f3:0.200; f4:0.200;
INFO:tensorflow:loss = 1924836100.0, step = 1
INFO:tensorflow:step:11; MSE_loss: 1023556.4537; bias:0.359; f0:0.936; f1:0.944; f2:0.903; f3:0.521; f4:0.802;
INFO:tensorflow:step:21; MSE_loss: 468665.2052; bias:0.269; f0:1.294; f1:1.276; f2:1.202; f3:0.437; f4:0.857;
INFO:tensorflow:step:31; MSE_loss: 232310.3535; bias:0.292; f0:1.513; f1:1.491; f2:1.379; f3:0.528; f4:0.893;
INFO:tensorflow:step:41; MSE_loss: 118843.3051; bias:0.278; f0:1.671; f1:1.633; f2:1.491; f3:0.472; f4:0.898;
INFO:tensorflow:step:51; MSE_loss: 62416.4437; bias:0.272; f0:1.782; f1:1.735; f2:1.563; f3:0.505; f4:0.903;
INFO:tensorflow:step:61; MSE_loss: 32799.2320; bias:0.277; f0:1.865; f1:1.808; f2:1.611; f3:0.487; f4:0.899;
INFO:tensorflow:step:71; MSE_loss: 17619.6118; bias:0.270; f0:1.924; f1:1.861; f2:1.641; f3:0.510; f4:0.904;
INFO:tensorflow:step:81; MSE_loss: 9423.0092; bias:0.283; f0:1.970; f1:1.899; f2:1.661; f3:0.494; f4:0.900;
INFO:tensorflow:step:91; MSE_loss: 5062.2780; bias:0.285; f0:2.003; f1:1.927; f2:1.675; f3:0.503; f4:0.901;
INFO:tensorflow:Saving checkpoints for 100 into ./model.ckpt.
INFO:tensorflow:Loss for final step: 1693422.1.
<tensorflow.python.estimator.canned.linear.LinearRegressor at 0x7f90a590f240>