How to use TensorBoard with Keras in Python for visualizing embeddings

I'm reading the book Deep Learning with Python, which uses Keras. In chapter 7, it shows how to use TensorBoard to monitor the progress of the training phase, with this example:

import keras
from keras import layers
from keras.datasets import imdb
from keras.preprocessing import sequence

max_features = 2000  # number of words to consider as features
max_len = 500  # cut off reviews after this many words
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=max_features)
x_train = sequence.pad_sequences(x_train, maxlen=max_len)
x_test = sequence.pad_sequences(x_test, maxlen=max_len)

model = keras.models.Sequential()
model.add(layers.Embedding(max_features, 128, input_length=max_len, name='embed'))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.MaxPooling1D(5))
model.add(layers.Conv1D(32, 7, activation='relu'))
model.add(layers.GlobalMaxPooling1D())
model.add(layers.Dense(1))
model.summary()

model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

callbacks = [
    keras.callbacks.TensorBoard(
        log_dir='my_log_dir',
        histogram_freq=1,
        embeddings_freq=1,
    )
]
history = model.fit(x_train, y_train, epochs=20, batch_size=128, validation_split=0.2, callbacks=callbacks)

Apparently, the Keras library has changed since the book was written, because this code raises an exception:

ValueError: To visualize embeddings, embeddings_data must be provided.

This happens after the first epoch finishes, when the callbacks run for the first time (the first time the TensorBoard callback runs). I know that what is missing is TensorBoard's embeddings_data parameter, but I don't know what I should assign to it.

Does anyone have a working example for this?

Here are the versions I'm using:

Python: 3.6.5
Keras: 2.2.0
TensorFlow: 1.9.0

[UPDATE]

To test a possible solution, I tried this:

import numpy as np

callbacks = [
    keras.callbacks.TensorBoard(
        log_dir='my_log_dir',
        histogram_freq=1,
        embeddings_freq=1,
        # A single dummy "sample" containing the token ids 0..max_len-1:
        embeddings_data=np.arange(0, max_len).reshape((1, max_len)),
    )
]
history = model.fit(x_train, y_train, epochs=20, batch_size=128, validation_split=0.2, callbacks=callbacks)

This is the only way I could populate embeddings_data without raising an error, but it does not help either: the PROJECTOR tab of TensorBoard is still empty:

[Screenshot: the PROJECTOR tab in TensorBoard is empty]

Any help is appreciated.

asked Sep 02 '18 by Mehran




2 Answers

I'm also reading the book "Deep Learning with Python", which uses Keras. Here is my solution to this question. First, I tried this code:

callbacks = [keras.callbacks.TensorBoard(
    log_dir='my_log_dir',
    histogram_freq=1,
    embeddings_freq=1,
    embeddings_data=x_train,
)]
history = model.fit(x_train, y_train, epochs=2, batch_size=128, validation_split=0.2, callbacks=callbacks)

But this raises a ResourceExhaustedError.

Because there are 25,000 samples in x_train, embedding all of them is too much for my old notebook. So next I tried embedding only the first 100 samples of x_train, and that works.

The code and the result are shown here.

callbacks = [keras.callbacks.TensorBoard(
    log_dir='my_log_dir',
    histogram_freq=1,
    embeddings_freq=1,
    # Only the first 100 training samples, which fits in memory:
    embeddings_data=x_train[:100],
)]
history = model.fit(x_train, y_train, epochs=2, batch_size=128, validation_split=0.2, callbacks=callbacks)
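
You can then launch TensorBoard against the log directory with tensorboard --logdir=my_log_dir and open the PROJECTOR tab to see the embedded points: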

[Screenshot: Projector view of the 100 embedded samples]

Note that in the Projector, "Points: 100" means there are 100 samples, and "Dimension: 64000" means the embedding vector for one sample has length 64000. There are 500 words in one sample (max_len = 500) and a 128-dimensional vector for each word, so 500 * 128 = 64000.
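
As a quick sanity check of that arithmetic, here is a small sketch. It assumes the trained model and the padded x_train from the question; the 'embed' name comes from the Embedding layer defined there:

import numpy as np

# Assumes the trained model and the padded x_train from the question.
embed_weights = model.get_layer('embed').get_weights()[0]
print(embed_weights.shape)   # (2000, 128): one 128-dim vector per word index

# Fancy-index the weight matrix with the 500 token ids of one sample:
sample_vectors = embed_weights[x_train[0]]
print(sample_vectors.shape)  # (500, 128)
print(sample_vectors.size)   # 64000 == 500 * 128, the "Dimension" in the Projector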

answered Sep 28 '22 by ttigong


Yes, that is correct: you need to provide what to embed for the visualisation using the embeddings_data argument:

import numpy as np

callbacks = [
    keras.callbacks.TensorBoard(
        log_dir='my_log_dir',
        histogram_freq=1,
        embeddings_freq=1,
        # Placeholder values: pass the actual input samples you want embedded.
        embeddings_data=np.array([3,4,2,5,2,...]),
    )
]

embeddings_data: data to be embedded at layers specified in embeddings_layer_names. Numpy array (if the model has a single input) or list of Numpy arrays (if the model has multiple inputs).

Have a look at the documentation for updated information on what those arguments are.
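
For reference, here is a minimal sketch of the list form for a hypothetical model with two inputs (the shapes below are made up for illustration; a single-input model like the one in the question takes just one array):

import numpy as np

# Hypothetical two-input model: one array per input, with matching row counts.
embeddings_data = [
    np.random.randint(0, 2000, size=(100, 500)),  # samples for the first input
    np.random.randint(0, 2000, size=(100, 500)),  # samples for the second input
]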

answered Sep 28 '22 by nuric