I am currently working on a slightly bigger TensorFlow project and tried to visualize certain variables of my network as usual, i.e. with this workflow:
tf.summary.scalar('loss', loss)
summary_op = tf.summary.merge_all()
writer = tf.summary.FileWriter('PATH')
and then, inside the training loop, adding the summaries:
s = sess.run(summary_op)
writer.add_summary(s, epoch)
Usually this does the job for me. But this time, only the graph showed up, and when I inspected the event file, I found it to be empty. By coincidence, I found somebody suggesting to call writer.flush() after adding the summary, as a sixth step. This resolved my problem.
As a consequence, the logical follow-up question is: when and how do I have to use FileWriter.flush() to make TensorFlow work correctly?
You can call flush whenever you want, really. This is probably clear to you, but just in case (and for other readers): FileWriter does not immediately write the given summaries to disk. Writing to disk is relatively slow, and if you create summaries very frequently (e.g. on every batch), doing so could hurt your performance, so FileWriter keeps a buffer of events that gets written only "every once in a while" (and, finally, when the writer is closed). However, this means that TensorBoard will not see the written summaries immediately. flush is there to force the FileWriter to write whatever it holds in memory to disk.
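The buffering behaviour just described can be sketched in plain Python. This toy BufferedEventWriter is a hypothetical stand-in for FileWriter, not TensorFlow code; it only illustrates why the event file stays empty until flush() is called:

```python
import os
import tempfile

class BufferedEventWriter:
    """Toy stand-in for tf.summary.FileWriter: buffers events in
    memory and only writes them to disk on flush() or close()."""

    def __init__(self, path):
        self.path = path
        self.buffer = []          # pending events, not yet on disk

    def add_summary(self, summary, step):
        # Appending to a list is cheap; no disk I/O happens here.
        self.buffer.append(f"{step}\t{summary}\n")

    def flush(self):
        # Force everything buffered so far onto disk.
        with open(self.path, "a") as f:
            f.writelines(self.buffer)
        self.buffer.clear()

    def close(self):
        self.flush()

path = os.path.join(tempfile.mkdtemp(), "events.out")
writer = BufferedEventWriter(path)
writer.add_summary("loss=0.5", step=1)

# Like the empty event file in the question: nothing on disk yet.
print(os.path.exists(path))   # False

writer.flush()
print(open(path).read())      # "1\tloss=0.5"
```

The real FileWriter does the same kind of deferral, just with a background thread and a proper event-file format.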
If you are producing summaries at a low frequency (e.g. every 100 batches), it is probably fine to just call flush after add_summary. If you produce summaries constantly but are not satisfied with FileWriter's syncing frequency, you can flush, for example, once every ten calls, or something like that. You could also flush on every single iteration; it may not make a big difference, but it will not bring much benefit either. Of course, any potential performance impact will depend on your problem and your infrastructure (logging scalars is not the same as logging images, a local SSD drive is not the same as network storage, and big, slow batches of hundreds of elements are not the same as tiny, fast batches of just a few).
In general, though, it is rarely a significant performance factor. For simple scenarios, the suggestion you got of adding flush after add_summary (one flush: if you have several calls to add_summary, do not flush after each of them, only after the last one) is most likely good enough.
EDIT: tf.summary.FileWriter actually offers a constructor parameter, flush_secs, which defines how frequently the writer automatically flushes pending events to disk. By default it is two minutes (120 seconds). There is also max_queue, which defines the size of the internal event queue (the buffer of events).
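The flush_secs mechanism can be imitated in plain Python with a time-based check. This is a simplified sketch of the idea, not how FileWriter is actually implemented (the real writer flushes from a background thread); the class name and the injectable clock are made up for illustration:

```python
import time

class TimedFlushWriter:
    """Toy writer that auto-flushes buffered events to a list
    (standing in for "disk") once more than flush_secs have
    passed, mimicking FileWriter's flush_secs parameter."""

    def __init__(self, flush_secs=120, clock=time.monotonic):
        self.flush_secs = flush_secs
        self.clock = clock              # injectable, so tests need not wait
        self.buffer = []
        self.disk = []
        self.last_flush = self.clock()

    def add_summary(self, summary, step):
        self.buffer.append((step, summary))
        if self.clock() - self.last_flush >= self.flush_secs:
            self.flush()

    def flush(self):
        self.disk.extend(self.buffer)
        self.buffer.clear()
        self.last_flush = self.clock()

# Demo with a fake clock so we don't actually wait two minutes.
now = [0.0]
w = TimedFlushWriter(flush_secs=120, clock=lambda: now[0])

w.add_summary("loss=0.9", step=1)   # t=0: buffered only
now[0] = 130.0                      # 130 seconds later...
w.add_summary("loss=0.4", step=2)   # triggers the auto-flush
print(len(w.disk))                  # 2
```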