
Linking Tensorboard Embedding Metadata to checkpoint

I'm using the tflearn wrapper over tensorflow to build a model, and would like to add metadata (labels) to the resultant embedding visualization. Is there a way to link a metadata.tsv file to a saved checkpoint after the fact of running it?

I've created a projector_config.pbtxt file in the logdir of the checkpoint summaries, with the metadata.tsv being in the same folder. The config looks like this:

embeddings {
  tensor_name: "Embedding/W"
  metadata_path: "C:/tmp/tflearn_logs/shallow_lstm/"
}

and was created using the code from the docs - https://www.tensorflow.org/how_tos/embedding_viz/

I've commented out the tf.Session part in the hope of creating the metadata link without needing to do it inside a Session object, but I'm not sure that's possible.

import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector
#with tf.Session() as sess:
config = projector.ProjectorConfig()
# One can add multiple embeddings.
embedding = config.embeddings.add()
embedding.tensor_name = 'Embedding/W'
# Link this tensor to its metadata file (e.g. labels).
embedding.metadata_path = 'C:/tmp/tflearn_logs/shallow_lstm/'
# Saves a config file that TensorBoard will read during startup.
projector.visualize_embeddings(tf.summary.FileWriter('/tmp/tflearn_logs/shallow_lstm/'), config)

Below is a snap of the current embedding visualization. Note the empty metadata. Is there a way to directly attach the desired metafile to this embedding?

[Screenshot: Embedding Visualization with empty metadata]

asked Oct 29 '22 by ponderinghydrogen

2 Answers

I had the same problem and it is solved now :)

Essentially, all you need to do is follow these 3 steps:

  1. Save a model checkpoint; suppose the checkpoint's directory is ckp_dir.
  2. Place projector_config.pbtxt and metadata.tsv in ckp_dir.
  3. Run tensorboard --logdir=ckp_dir and click the Embeddings tab.

The content of projector_config.pbtxt is:

    embeddings {
      tensor_name: "embedding_name"
      metadata_path: "metadata.tsv"
    }

This is the key to linking the embedding to metadata.tsv. In a tf.Session(), we usually fetch the embedding's value with sess.run('embedding_name:0'), but in projector_config.pbtxt we write just tensor_name: "embedding_name", without the :0 suffix.
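
If you are unsure what to put in tensor_name, you can list the variable names stored in the checkpoint. A minimal sketch, assuming TensorFlow 1.x and a hypothetical checkpoint prefix ckp_dir/model.ckpt:

    import tensorflow as tf

    # List the variables stored in a checkpoint so you can pick the right
    # tensor_name (note: no ':0' suffix in projector_config.pbtxt).
    reader = tf.train.NewCheckpointReader('ckp_dir/model.ckpt')  # hypothetical checkpoint prefix
    for name, shape in reader.get_variable_to_shape_map().items():
        print(name, shape)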

Generally, we can specify the checkpoint path and metadata_path in projector_config.pbtxt so that the checkpoint, projector_config.pbtxt, and metadata.tsv can live in different directories, but I find that more complicated. I just solved it as described above.
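
Putting the three steps together, here is a minimal end-to-end sketch. It assumes TensorFlow 1.x, a toy embedding variable named embedding_name, and made-up labels; adapt the names and paths to your model:

    import os
    import tensorflow as tf

    ckp_dir = '/tmp/ckp_dir'  # assumed checkpoint/log directory
    if not os.path.exists(ckp_dir):
        os.makedirs(ckp_dir)

    # Step 1: save a checkpoint containing the embedding variable.
    embedding_var = tf.get_variable('embedding_name', shape=[3, 2])  # toy 3x2 embedding
    saver = tf.train.Saver()
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        saver.save(sess, os.path.join(ckp_dir, 'model.ckpt'))

    # Step 2: place metadata.tsv and projector_config.pbtxt in ckp_dir.
    # One label per embedding row; a single-column metadata file needs no header.
    with open(os.path.join(ckp_dir, 'metadata.tsv'), 'w') as f:
        f.write('label_a\nlabel_b\nlabel_c\n')

    with open(os.path.join(ckp_dir, 'projector_config.pbtxt'), 'w') as f:
        f.write('embeddings {\n'
                '  tensor_name: "embedding_name"\n'
                '  metadata_path: "metadata.tsv"\n'
                '}\n')

    # Step 3: run `tensorboard --logdir=/tmp/ckp_dir` and open the Embeddings tab.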

The result is shown here.

answered Nov 07 '22 by aevil3

Try this with your projector_config.pbtxt:

embeddings {
  tensor_name: "Embedding/W"
  metadata_path: "$LOGDIR/metadata.tsv"
}

Make sure your $LOGDIR is the same path you use to call tensorboard --logdir=$LOGDIR on your terminal; that is, it should be relative to your current directory (so it probably shouldn't include C:/..). Also include the filename in the metadata_path.
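
Applied to the snippet from the question, that means pointing metadata_path at the file itself rather than at the directory. A minimal sketch, assuming the metadata file is called metadata.tsv and the log directory is given as a relative path:

import os
import tensorflow as tf
from tensorflow.contrib.tensorboard.plugins import projector

# Use the same (relative) path here that you pass to `tensorboard --logdir=...`.
logdir = 'tflearn_logs/shallow_lstm'

config = projector.ProjectorConfig()
embedding = config.embeddings.add()
embedding.tensor_name = 'Embedding/W'
# Point at the metadata file itself, not just the folder that contains it.
embedding.metadata_path = os.path.join(logdir, 'metadata.tsv')
projector.visualize_embeddings(tf.summary.FileWriter(logdir), config)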

Let me know if this works for you, too.


I stumbled upon the same problem trying to display words instead of indices for the word2vec tutorial. To achieve that, your projector_config.pbtxt should look like this:

embeddings {
  tensor_name: "w_in"
  metadata_path: "$LOGDIR/vocab.txt"
}

You might also want to modify the save_vocab function in the code linked above since, as is, it converts unicode to hex.
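
For reference, a minimal sketch of writing such a vocabulary file as plain UTF-8 text, one word per line (assuming vocab_words holds the words in the same order as the rows of the embedding):

# -*- coding: utf-8 -*-
import codecs

vocab_words = [u'the', u'of', u'and']  # assumed: ordered like the embedding rows

# Write readable UTF-8 words instead of hex-escaped bytes.
with codecs.open('vocab.txt', 'w', encoding='utf-8') as f:
    for word in vocab_words:
        f.write(word + u'\n')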

answered Nov 07 '22 by user30987