
Tensorboard histograms to matplotlib

I would like to "dump" the TensorBoard histograms and plot them with matplotlib, so that I get plots better suited to a scientific paper.

I managed to hack my way through the summary file using tf.train.summary_iterator and dump the histogram I wanted (a tensorflow.core.framework.summary_pb2.HistogramProto object). By doing that and reimplementing what the front-end code does with the data (https://github.com/tensorflow/tensorboard/blob/c2fe054231fe77f3a5b05dbc519f713d2e738d1c/tensorboard/plugins/histogram/tf_histogram_dashboard/histogramCore.ts#L104), I got something similar to the TensorBoard plots (same trends), but not exactly the same plot.

Can anyone shed some light on this?

Thanks

Tiago Freitas Pereira asked Jan 31 '18 17:01


2 Answers

To plot a TensorBoard histogram with matplotlib, I do the following:

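import numpy as np
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

def load_histograms(path, step_count):  # function name is illustrative
    # Keep up to step_count histogram events per tag
    event_acc = EventAccumulator(path, size_guidance={
        'histograms': step_count,
    })
    event_acc.Reload()
    tags = event_acc.Tags()
    result = {}
    for hist in tags['histograms']:
        histograms = event_acc.Histograms(hist)
        # One row per step: repeat each bucket limit by its count
        # (stacking into one array assumes every step has the same total count)
        result[hist] = np.array([np.repeat(np.array(h.histogram_value.bucket_limit), np.array(h.histogram_value.bucket).astype(int)) for h in histograms])
    return result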

h.histogram_value.bucket_limit gives me the values (the upper limit of each bucket) and h.histogram_value.bucket gives the count for each of them. So when I repeat the values accordingly (np.repeat(...)), I get a huge array of the expected size. This array can now be plotted with the default matplotlib logic.
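As a minimal sketch of that last step, assuming made-up bucket limits and counts rather than data from a real event file, the expansion and the plot could look roughly like this. The helper names expand_buckets and plot_expanded are hypothetical, not part of TensorBoard or matplotlib:

```python
import numpy as np


def expand_buckets(bucket_limit, bucket):
    """Repeat each bucket limit by its count (counts are truncated to int)."""
    limits = np.asarray(bucket_limit, dtype=float)
    counts = np.asarray(bucket).astype(int)
    return np.repeat(limits, counts)


def plot_expanded(values, bins=30):
    """Draw the expanded values with matplotlib's default histogram."""
    # Imported here so that expand_buckets itself only needs NumPy.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(values, bins=bins)
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    return fig


# Made-up buckets for illustration only.
limits = [-1.0, -0.5, 0.0, 0.5, 1.0]
counts = [1.0, 3.0, 5.0, 3.0, 1.0]
values = expand_buckets(limits, counts)
```

Calling plot_expanded(values) and then saving or showing the returned figure gives the plot. Note that every sample ends up sitting exactly on a bucket limit, so the bins chosen in ax.hist affect how the result looks.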

khuesmann answered Jan 03 '23 06:01


@khuesmann's solution is a good one, but it only lets you retrieve the accumulated histogram, not the histogram per step, which is the one actually shown in TensorBoard.

If you want the distribution per step: as far as I understand, TensorBoard usually compresses the histograms to reduce the memory needed to store the data. Imagine storing a 2D histogram over 4 million steps; the memory use grows quickly. These compressed histograms can be accessed like this:

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

n2n = EventAccumulator(PATH)
n2n.Reload()

# Check the tags under histograms and choose the one you want
n2n.Tags()

# This will give you the list used by TensorBoard
# of the compressed histograms by step and wall time
n2n.CompressedHistograms(HISTOGRAM_TAG)

The only problem is that it compresses each histogram to nine percentiles (in basis points: 0, 668, 1587, 3085, 5000, 6915, 8413, 9332, 10000), which correspond to (-Inf, -1.5, -1, -0.5, 0, 0.5, 1, 1.5, Inf) standard deviations of a normal distribution. Check the code here.
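To see where those basis points come from, here is a small check using only the standard library. It evaluates the normal CDF at each of the standard deviations listed above and scales the result to basis points (1 basis point = 0.01%). The function name normal_cdf_bps is made up for this sketch:

```python
import math


def normal_cdf_bps(z):
    """Standard normal CDF at z, expressed in basis points."""
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return round(10000 * cdf)


std_devs = [-math.inf, -1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, math.inf]
bps = [normal_cdf_bps(z) for z in std_devs]
print(bps)
```

This reproduces the nine basis-point values quoted above, so each stored value is the percentile at that many standard deviations from the mean of a normal distribution.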

I haven't read much, but it wouldn't be hard to reconstruct the temporal histograms that tensorboard shows. If I find a way to do it, I will post it here.
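In the meantime, here is a rough sketch of something simpler: plotting the compressed percentiles as bands over the steps. This is not a reconstruction of the full per-step histograms. It assumes that each event has step and compressed_histogram_values fields, and that each value has basis_point and value fields. The namedtuples below are local stand-ins for that assumed layout, filled with made-up numbers, and the function names are hypothetical:

```python
from collections import namedtuple

import numpy as np

# Local stand-ins; the field names are an assumption about the real events.
CompressedHistogramValue = namedtuple(
    "CompressedHistogramValue", ["basis_point", "value"])
CompressedHistogramEvent = namedtuple(
    "CompressedHistogramEvent",
    ["wall_time", "step", "compressed_histogram_values"])


def to_percentile_table(events):
    """Return steps, basis points, and values shaped (n_steps, n_percentiles)."""
    steps = np.array([e.step for e in events])
    basis_points = np.array(
        [v.basis_point for v in events[0].compressed_histogram_values])
    values = np.array(
        [[v.value for v in e.compressed_histogram_values] for e in events])
    return steps, basis_points, values


def plot_bands(steps, values):
    """Shade the area between symmetric percentile pairs and draw the median."""
    # Imported here so that to_percentile_table only needs NumPy.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    n = values.shape[1]
    for i in range(n // 2):  # outermost pair first, then inward
        ax.fill_between(steps, values[:, i], values[:, n - 1 - i],
                        alpha=0.2, color="C0")
    ax.plot(steps, values[:, n // 2], color="C0")  # middle percentile
    ax.set_xlabel("step")
    ax.set_ylabel("value")
    return fig


# Made-up events for illustration only.
BPS = (0, 668, 1587, 3085, 5000, 6915, 8413, 9332, 10000)
events = [
    CompressedHistogramEvent(
        0.0, step,
        [CompressedHistogramValue(bp, (bp / 10000 - 0.5) * (step + 1))
         for bp in BPS])
    for step in range(3)
]
steps, basis_points, values = to_percentile_table(events)
```

With real data, the events list would be the return value of n2n.CompressedHistograms(HISTOGRAM_TAG), provided its entries really have the fields assumed here. plot_bands(steps, values) then returns a figure you can save or show.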

Mike W answered Jan 03 '23 05:01