How do I use the Embedding Projector included in Tensorboard?
I can't find any documentation for it. There are some references to it here, but there's no step-by-step example/tutorial on how to use it.
To visualize word embeddings, we can use common dimensionality-reduction techniques such as PCA and t-SNE. To map words to their vector representations in embedding space, a pre-trained word embedding such as GloVe can be used.
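The PCA projection the projector performs can be sketched in plain NumPy. The word list and 4-dimensional vectors below are made-up stand-ins (real GloVe vectors have 50–300 dimensions):

```python
import numpy as np

# Stand-in "word vectors" (hypothetical; not real GloVe values)
words = ["king", "queen", "man", "woman"]
vectors = np.array([
    [0.8, 0.3, 0.1, 0.9],
    [0.7, 0.9, 0.2, 0.8],
    [0.9, 0.2, 0.1, 0.1],
    [0.8, 0.8, 0.2, 0.0],
])

# PCA by hand: center the data, then project onto the top-2
# right singular vectors (the directions of largest variance)
centered = vectors - vectors.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ vt[:2].T  # shape (4, 2): one 2-D point per word

for word, point in zip(words, projected):
    print(word, point)
```

The projector does the same kind of reduction interactively, letting you switch between PCA, t-SNE, and custom linear projections.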
TensorFlow Projector is a visual tool that lets the user interact with and analyze high-dimensional data (e.g. embeddings) and their metadata by projecting them into a 3D space in the browser.
As far as I am aware, this is the only documentation about embedding visualization on the TensorFlow website. The code snippet there may not be very instructive for first-time users, so here is an example usage:
```python
import os
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

LOG_DIR = 'logs'

mnist = input_data.read_data_sets('MNIST_data')
images = tf.Variable(mnist.test.images, name='images')

with tf.Session() as sess:
    saver = tf.train.Saver([images])
    sess.run(images.initializer)
    saver.save(sess, os.path.join(LOG_DIR, 'images.ckpt'))
```
Here we first create a TensorFlow variable (images) and then save it using tf.train.Saver. After executing the code, we can launch TensorBoard by issuing the tensorboard --logdir=logs command and opening localhost:6006 in a browser.
However, this visualization is not very helpful because we do not see the different classes to which each data point belongs. To distinguish one class from another, one should provide some metadata:
```python
import os
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
from tensorflow.contrib.tensorboard.plugins import projector

LOG_DIR = 'logs'
metadata = os.path.join(LOG_DIR, 'metadata.tsv')

mnist = input_data.read_data_sets('MNIST_data')
images = tf.Variable(mnist.test.images, name='images')

# Write one label per line; the projector matches rows to data points by order.
with open(metadata, 'w') as metadata_file:
    for row in mnist.test.labels:
        metadata_file.write('%d\n' % row)

with tf.Session() as sess:
    saver = tf.train.Saver([images])
    sess.run(images.initializer)
    saver.save(sess, os.path.join(LOG_DIR, 'images.ckpt'))

config = projector.ProjectorConfig()
# One can add multiple embeddings.
embedding = config.embeddings.add()
embedding.tensor_name = images.name
# Link this tensor to its metadata file (e.g. labels).
embedding.metadata_path = metadata
# Saves a config file that TensorBoard will read during startup.
projector.visualize_embeddings(tf.summary.FileWriter(LOG_DIR), config)
```
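The metadata file above has a single column, so no header row is needed. If you want more than one metadata column, TensorBoard expects the first line of the TSV to be a header. A minimal sketch (the Label/Index columns here are just illustrative):

```python
import csv
import io

# Hypothetical two-column metadata: with multiple columns,
# the first row must name the columns.
rows = [("Label", "Index"), ("7", "0"), ("2", "1"), ("1", "2")]

buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
writer.writerows(rows)
tsv = buf.getvalue()
print(tsv)
```

In practice you would write this to metadata.tsv in LOG_DIR instead of an in-memory buffer; the projector then lets you color or label points by any of the columns.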
Which gives us:
Sadly, I cannot find more comprehensive documentation. Below I have collected all related resources:
You can now use the Embedding Projector easily in Colab with PyTorch's SummaryWriter:
```python
import numpy as np
import tensorflow as tf
import tensorboard as tb
tf.io.gfile = tb.compat.tensorflow_stub.io.gfile

from torch.utils.tensorboard import SummaryWriter

vectors = np.array([[0, 0, 1], [0, 1, 0], [1, 0, 0], [1, 1, 1]])
metadata = ['001', '010', '100', '111']  # labels

writer = SummaryWriter()
writer.add_embedding(vectors, metadata)
writer.close()
```

```python
%load_ext tensorboard
%tensorboard --logdir=runs
```
The %tensorboard magic now works properly again.