 

Design patterns for tensorflow models

Tags:

tensorflow

I created several simple models, mostly based on some tutorials. From what I've done, I feel that models are quite hard to reuse, and I feel that I need to create some structure with classes to encapsulate the models.

What are the 'standard' ways of structuring tensorflow models? Are there any coding conventions/best practices for this?

asked Apr 24 '17 by Konstantin Solomatov




1 Answer

Throughout the TensorFlow examples and tutorials, the prominent pattern for structuring model code is to split the model into three functions (a minimal sketch of such a module follows the list):

  • inference(inputs, ...) which builds the model
  • loss(logits, ...) which adds the loss on top of the logits
  • train(loss, ...) which adds training ops
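
A minimal sketch of what such a mymodel module could look like (the architecture and hyperparameters here are made up; TensorFlow 1.x API):

import tensorflow as tf

def inference(inputs, num_classes):
    # Build the forward pass and return the logits.
    hidden = tf.layers.dense(inputs, 128, activation=tf.nn.relu, name='hidden')
    return tf.layers.dense(hidden, num_classes, name='logits')

def loss(logits, labels):
    # Add the loss ops on top of the logits.
    return tf.losses.softmax_cross_entropy(onehot_labels=labels, logits=logits)

def train(loss, learning_rate=0.01):
    # Add the ops that perform one training step.
    return tf.train.GradientDescentOptimizer(learning_rate).minimize(loss)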

When creating a model for training, your code would look something like this:

inputs = tf.placeholder(...)
logits = mymodel.inference(inputs, ...)
loss = mymodel.loss(logits, ...)
train = mymodel.train(loss, ...)
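
To actually run training, you would then execute the train op repeatedly in a session. A sketch, assuming the elided arguments include a labels placeholder and that input_batch and label_batch come from your own input pipeline:

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for step in range(num_steps):  # num_steps is up to you
        sess.run(train, feed_dict={inputs: input_batch, labels: label_batch})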

This pattern is used in the CIFAR-10 tutorial, for example (code, tutorial).

One thing you might stumble over is that you cannot share (Python) variables between the inference and the loss functions. That is not a big issue, though, since TensorFlow provides Graph collections for exactly this use case, which also makes for a cleaner design (it forces you to group things logically). One major use case for this is regularization:
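
The mechanism itself is simple: a collection is a named list of graph elements that any function can append to and read from. A tiny illustration (the collection name and some_tensor here are made up):

some_tensor = tf.constant(0.5)  # stand-in for any tensor you want to share
# In one function, stash the tensor under a collection name...
tf.add_to_collection('my_penalties', some_tensor)
# ...and in another function, retrieve everything stored under that name.
penalties = tf.get_collection('my_penalties')  # -> [some_tensor]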

If you are using the layers module (e.g. tf.layers.conv2d), you already have what you need: all regularization penalties are added (source) to the collection tf.GraphKeys.REGULARIZATION_LOSSES by default. For example, when you do this:

conv1 = tf.layers.conv2d(
    inputs,
    filters=96,
    kernel_size=11,
    strides=4,
    activation=tf.nn.relu,
    kernel_initializer=tf.truncated_normal_initializer(stddev=0.01),
    kernel_regularizer=tf.contrib.layers.l2_regularizer(scale=0.0005),  # scale is required; 0.0005 is just an example value
    name='conv1')

Your loss could look like this then:

def loss(logits, labels):
    softmax_loss = tf.losses.softmax_cross_entropy(
        onehot_labels=labels,
        logits=logits)

    regularization_loss = tf.add_n(tf.get_collection(
        tf.GraphKeys.REGULARIZATION_LOSSES))

    return tf.add(softmax_loss, regularization_loss)

If you are not using the layers module, you would have to populate the collection manually (just like in the linked source snippet). Basically you want to add the penalties to the collection using tf.add_to_collection:

tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, reg_penalty)
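
For example, reg_penalty could be an L2 penalty on a manually created weight variable (a sketch; the variable shape and the 0.01 weighting factor are made up):

weights = tf.get_variable('weights', shape=[784, 10])
reg_penalty = 0.01 * tf.nn.l2_loss(weights)  # sum of squares / 2, scaled by a hand-picked factor
tf.add_to_collection(tf.GraphKeys.REGULARIZATION_LOSSES, reg_penalty)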

With this in place, you can compute the loss including the regularization penalties just like above.

answered Oct 22 '22 by thertweck