Is there a decent workaround to saving checkpoints in local drive when using TPU in Tensorflow?

A follow up to this question:

How to save a Tensorflow Checkpoint file from Google Colaboratory in when using TPU mode?

where the official way of saving a checkpoint when using a TensorFlow TPU is to use Google Cloud Storage.

I am wondering whether there is a workaround for those who do not wish to use GCS. Perhaps, for each variable, call .eval(), save the resulting value, and then on restore assign each saved value back to the corresponding variable as its 'init' value.

A major issue I foresee though is saving and loading the parameters for the optimizers.
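The workaround described above could be sketched roughly as follows. This is a hedged, minimal illustration using a toy TF 1.x-style graph (the style used with TPUs at the time): the variable names, file path, and two-session structure are made up for the example, and it does not address the optimizer-state issue.

```python
# Sketch: pull variable values off the device into host memory with
# sess.run()/.eval(), save them with numpy, and later assign them back.
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

# Toy graph standing in for a TPU-trained model (hypothetical names).
w = tf.get_variable("w", initializer=tf.constant([1.0, 2.0]))
b = tf.get_variable("b", initializer=tf.constant(0.5))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # 1) Evaluate each variable and stash the values locally.
    values = {v.op.name: sess.run(v) for v in tf.global_variables()}
    np.savez("/tmp/checkpoint_workaround.npz", **values)

# 2) Later: with the graph rebuilt, load the saved arrays back in.
saved = np.load("/tmp/checkpoint_workaround.npz")
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for v in tf.global_variables():
        v.load(saved[v.op.name], sess)  # assign the saved value back
    restored = sess.run(w)
```

Note this only round-trips the model variables; optimizer slot variables (e.g. Adam moments) would need the same treatment to resume training exactly.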

For Keras, the weights do seem to be saved from the TPU to local storage:

https://colab.research.google.com/github/tensorflow/tpu/blob/master/tools/colab/shakespeare_with_tpu_and_keras.ipynb

INFO:tensorflow:Copying TPU weights to the CPU

So I imagine that there's a general workaround too, without using keras.

asked Oct 26 '18 by SantoshGupta7

1 Answer

Take a look at THIS CODE from Keras.

If I understood correctly, the weights are not saved directly from the TPU; instead they are synced to the CPU and then saved to Colab storage.
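That sync-then-save pattern can be sketched with the ordinary Keras weights API. This is a minimal illustration on a toy CPU model (the architecture and file path are made up); on a TPU, calling save_weights() is what triggers the "Copying TPU weights to the CPU" step before the file is written locally.

```python
# Sketch: save weights to local storage, then reload them into a
# fresh model of the same architecture.
import tensorflow as tf

model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(8,))])
# On a TPU, this first copies the weights to the CPU, then writes locally.
model.save_weights("/tmp/local_weights.h5")

# Reload into a new model with the same architecture.
restored = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(8,))])
restored.load_weights("/tmp/local_weights.h5")
```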

EDIT

Also see: this answer.

answered Nov 01 '22 by alex