
When do I have to use TensorFlow's FileWriter.flush() method?

I am currently working on a slightly bigger TensorFlow project and tried to visualize certain variables of my network as usual, i.e. with this workflow:

  1. declare which variables I want to track via tf.summary.scalar('loss', loss)
  2. collect them via summary_op = tf.summary.merge_all()
  3. declare my writer as writer = tf.summary.FileWriter('PATH') and add the graph
  4. evaluate the summary operation during my training iterations via s = sess.run(summary_op)
  5. finally add it to my writer via writer.add_summary(s, epoch)

Usually this does the job for me. This time, however, only the graph showed up, and when I inspected the event file, I found it to be empty. By chance, I found somebody suggesting to call writer.flush() after adding the summary, as a 6th step. This resolved my problem.
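For context, here is a minimal sketch of the whole workflow including that flush() call (TF 1.x graph-mode API; loss, num_epochs and 'PATH' are placeholders standing in for my actual setup):

    import tensorflow as tf  # TF 1.x graph-mode API

    # ... build the model; `loss` is assumed to be a scalar tensor of the network
    tf.summary.scalar('loss', loss)               # step 1
    summary_op = tf.summary.merge_all()           # step 2

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        writer = tf.summary.FileWriter('PATH', sess.graph)  # step 3

        for epoch in range(num_epochs):
            # ... run the training ops here ...
            s = sess.run(summary_op)              # step 4
            writer.add_summary(s, epoch)          # step 5
            writer.flush()                        # step 6: force buffered events to disk

        writer.close()                            # close() flushes as well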

As a consequence, the logical follow-up question is: when and how do I have to use FileWriter.flush() to make TensorFlow work correctly?

DocDriven, asked Sep 24 '18


1 Answer

You can call flush whenever you want, really. This is probably clear to you, but just in case (and for other readers): FileWriter does not immediately write the given summaries to disk. Writing to disk is relatively slow, and if you create summaries very frequently (e.g. on every batch), doing so could hurt your performance, so FileWriter keeps a buffer of events that only gets written "every once in a while" (and finally when it is closed). However, this means that TensorBoard will not see the written summaries immediately. flush is there to force the FileWriter to write whatever it has in memory to disk.

If you produce summaries with low frequency (e.g. every 100 batches), it is probably fine to just call flush after add_summary. If you produce summaries constantly but are not satisfied with the syncing frequency of FileWriter, you can flush, for example, every ten iterations, or something like that. You can also flush on every single iteration; it may not make much of a difference, but it will not give you much benefit either. Of course, any potential performance impact depends on your problem and your infrastructure (logging scalars is not the same as logging images, a local SSD is not the same as network storage, and big, slow batches of hundreds of elements are not the same as tiny, fast batches of just a few).
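As a rough illustration, a periodic flush could look something like this (just a sketch; summary_op, writer, num_steps and the flush interval are assumed names from the question's setup, not anything prescribed by TensorFlow):

    FLUSH_EVERY = 10  # flush roughly every ten summaries

    for step in range(num_steps):
        # ... run the training ops ...
        s = sess.run(summary_op)
        writer.add_summary(s, step)
        if step % FLUSH_EVERY == 0:
            writer.flush()

    writer.flush()  # make sure the tail of the buffer also reaches disk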

In general, though, it is rarely a significant performance factor. For simple scenarios, the suggestion you got of adding flush after add_summary (one flush: if you have several calls to add_summary, do not flush after each of them, only after the last one) is most likely good enough.

EDIT: tf.summary.FileWriter actually offers a constructor parameter, flush_secs, that defines how frequently the writer automatically flushes pending events to disk. By default it is two minutes. There is also max_queue, which defines the size of the internal event queue (the buffer of events).
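For example (the concrete values are only illustrative; flush_secs and max_queue are the actual FileWriter arguments, with defaults of 120 seconds and 10 events):

    writer = tf.summary.FileWriter(
        'PATH',
        graph=sess.graph,
        max_queue=10,    # keep at most 10 pending events before writing
        flush_secs=30    # auto-flush every 30 seconds instead of the default 120
    )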

jdehesa, answered Oct 14 '22