 

How to write to TensorBoard in TensorFlow 2

I'm quite familiar with TensorFlow 1.x and I'm considering switching to TensorFlow 2 for an upcoming project. I'm having some trouble understanding how to write scalars to TensorBoard logs with eager execution, using a custom training loop.

Problem description

In tf1 you would create some summary ops (one op for each thing you wanted to store), merge them into a single op, run that merged op inside a session, and then write the result to a file using a FileWriter object. Assuming sess is our tf.Session(), an example of how this worked can be seen below:

# While defining our computation graph, define summary ops:
# ... some ops ...
tf.summary.scalar('scalar_1', scalar_1)
# ... some more ops ...
tf.summary.scalar('scalar_2', scalar_2)
# ... etc.

# Merge all these summaries into a single op:
merged = tf.summary.merge_all()

# Define a FileWriter (i.e. an object that writes summaries to files):
writer = tf.summary.FileWriter(log_dir, sess.graph)

# Inside the training loop run the op and write the results to a file:
for i in range(num_iters):
    summary, ... = sess.run([merged, ...], ...)
    writer.add_summary(summary, i)

The problem is that sessions don't exist anymore in tf2 and I would prefer not to disable eager execution to make this work. The official documentation is written for tf1 and all references I can find suggest using the TensorBoard Keras callback. However, as far as I know, this only works if you train the model through model.fit(...) and not through a custom training loop.
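For reference, the callback-based workflow I mean looks roughly like this (a minimal sketch; model is assumed to be a compiled Keras model and x_train, y_train are placeholder data):

# assumed: model is a compiled tf.keras model; x_train, y_train are placeholders
tb_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
model.fit(x_train, y_train, epochs=10, callbacks=[tb_callback])

This covers the model.fit() case, but not a loop where I compute and apply the gradients myself.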

What I've tried

  • The tf1 version of the tf.summary functions, outside of a session. Obviously any combination of these fails, as FileWriters, merge ops, etc. don't even exist in tf2.
  • This Medium post states that there has been a "cleanup" in some TensorFlow APIs, including tf.summary. It suggests importing from tensorflow.python.ops.summary_ops_v2, which doesn't seem to work. This implies using record_summaries_every_n_global_steps; more on this later.
  • A series of other posts (1, 2, 3) suggest using tf.contrib.summary and tf.contrib.FileWriter. However, tf.contrib has been removed from the core TensorFlow repository and build process.
  • A TensorFlow v2 showcase from the official repo, which again uses the tf.contrib summaries along with the record_summaries_every_n_global_steps mentioned previously. I couldn't make this work either (even without the contrib library).

tl;dr

My questions are:

  • Is there a way to properly use tf.summary in TensorFlow 2?
  • If not, is there another way to write TensorBoard logs in TensorFlow 2, when using a custom training loop (not model.fit())?
Asked Jul 10 '19 by Javier


People also ask

Is TensorBoard included in TensorFlow?

TensorBoard is a built-in tool for providing measurements and visualizations in TensorFlow. Common machine learning experiment metrics, such as accuracy and loss, can be tracked and displayed in TensorBoard. TensorBoard is compatible with TensorFlow 1 and 2 code.

Can I use TensorBoard without TensorFlow?

Note: having TensorFlow installed is not a prerequisite to running TensorBoard. Although it is a product of the TensorFlow ecosystem, TensorBoard can also be used with PyTorch.
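As a minimal sketch of that claim (assuming PyTorch is installed; the log directory name here is arbitrary):

from torch.utils.tensorboard import SummaryWriter

# write a single scalar to a TensorBoard log without TensorFlow installed
writer = SummaryWriter(log_dir='runs/example')
writer.add_scalar('loss', 0.25, global_step=1)
writer.close()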


1 Answer

Yes, there is a simpler and more elegant way to use summaries in TensorFlow v2.

First, create a file writer that stores the logs (e.g. in a directory named log_dir):

writer = tf.summary.create_file_writer(log_dir)

Anywhere you want to write something to the log file (e.g. a scalar), use your good old tf.summary.scalar inside a context created by the writer. Suppose you want to store the value of scalar_1 at step i:

with writer.as_default():
    tf.summary.scalar('scalar_1', scalar_1, step=i)

You can open as many of these contexts as you like inside or outside of your training loop.

Example:

import tensorflow as tf

# create the file writer object
writer = tf.summary.create_file_writer(log_dir)

for i, (x, y) in enumerate(train_set):
    # forward pass, recording operations for differentiation
    with tf.GradientTape() as tape:
        y_ = model(x)
        loss = loss_func(y, y_)

    # backward pass and parameter update
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))

    # write the loss value for this step
    with writer.as_default():
        tf.summary.scalar('training loss', loss, step=i+1)
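Since the context can also be entered once outside the loop (as noted above), an equivalent variant is to wrap the whole loop in a single context; this sketch assumes the same model, loss_func, optimizer and train_set as in the example:

with writer.as_default():
    for i, (x, y) in enumerate(train_set):
        # ... training step as above ...
        tf.summary.scalar('training loss', loss, step=i+1)

Once logs have been written, you can inspect them by pointing TensorBoard at the directory: tensorboard --logdir log_dir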
Answered Oct 11 '22 by Djib2011