 

Google Colab puts a '^C' in the process

I'm running code from this tutorial to try out the TensorFlow Object Detection API. All of the code works: if you run every cell, each one executes, and at the end my images are classified.

But one cell doesn't behave as it should. It runs, but not the way it is supposed to.

When I train my model with !python legacy/train.py --logtostderr --train_dir=training/ --pipeline_config_path=training/ssd_mobilenet_v1_pets.config, TensorFlow starts up and training begins, but it only runs for 3 or 4 steps, sometimes 20, 21, or 23, and then Google Colab puts a ^C in the process.

I can never finish training because Google Colab kills my process. Does anyone know what's happening?

I have already tried both GPU and TPU instances.

[...]
INFO:tensorflow:Restoring parameters from training/model.ckpt-0
I1022 20:41:48.368024 139794549495680 tf_logging.py:115] Restoring parameters from training/model.ckpt-0
INFO:tensorflow:Running local_init_op.
I1022 20:41:52.779153 139794549495680 tf_logging.py:115] Running local_init_op.
INFO:tensorflow:Done running local_init_op.
I1022 20:41:52.997912 139794549495680 tf_logging.py:115] Done running local_init_op.
INFO:tensorflow:Starting Session.
I1022 20:41:59.072830 139794549495680 tf_logging.py:115] Starting Session.
INFO:tensorflow:Saving checkpoint to path training/model.ckpt
I1022 20:41:59.245162 139793493063424 tf_logging.py:115] Saving checkpoint to path training/model.ckpt
INFO:tensorflow:Starting Queues.
I1022 20:41:59.252097 139794549495680 tf_logging.py:115] Starting Queues.
INFO:tensorflow:global_step/sec: 0
I1022 20:42:10.151180 139793484670720 tf_logging.py:159] global_step/sec: 0
INFO:tensorflow:Recording summary at step 0.
I1022 20:42:16.119055 139793476278016 tf_logging.py:115] Recording summary at step 0.
INFO:tensorflow:global step 1: loss = 14.0911 (28.770 sec/step)
I1022 20:42:28.496783 139794549495680 tf_logging.py:115] global step 1: loss = 14.0911 (28.770 sec/step)
INFO:tensorflow:global step 2: loss = 12.4958 (10.529 sec/step)
I1022 20:42:39.334129 139794549495680 tf_logging.py:115] global step 2: loss = 12.4958 (10.529 sec/step)
INFO:tensorflow:global step 3: loss = 11.6073 (8.267 sec/step)
I1022 20:42:47.601801 139794549495680 tf_logging.py:115] global step 3: loss = 11.6073 (8.267 sec/step)
^C
Italo José asked Oct 22 '18 20:10


1 Answer

I agree with Bob Smith that this looks like an out-of-memory issue. You can work around it by upgrading your Colab runtime from 12 GB to 25 GB of RAM using a simple trick from Haohui. Run the following code in Colab:

a = []
while True:  # deliberately loops forever, allocating until the runtime runs out of RAM
    a.append('1')

This crashes the session, and a message 'Would you like to switch to a high-RAM runtime...' appears in the lower-left corner of the screen. Accept it to switch to the higher-memory runtime.
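If you want to check how much RAM the runtime actually has before you start training (and again after switching), something like the sketch below might help. It assumes the Colab VM is Linux and exposes /proc/meminfo; the function names parse_meminfo, memory_summary_gb and read_meminfo are just illustrative helpers I made up, not part of Colab or TensorFlow.

```python
def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values (kB for sizes)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def memory_summary_gb(text):
    """Return (total, available) memory in GiB from /proc/meminfo text."""
    info = parse_meminfo(text)
    kb_per_gb = 1024 ** 2  # /proc/meminfo reports sizes in kB
    total = info.get("MemTotal", 0) / kb_per_gb
    available = info.get("MemAvailable", 0) / kb_per_gb
    return total, available


def read_meminfo(path="/proc/meminfo"):
    """Read the raw meminfo text (assumes a Linux host)."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    total, available = memory_summary_gb(read_meminfo())
    print(f"Total RAM: {total:.1f} GiB, available: {available:.1f} GiB")
```

Run it in a cell before the training cell. If the available figure is already low at that point, that would be consistent with the out-of-memory explanation, though it does not prove that is why the process was stopped.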

SvGA answered Oct 20 '22 10:10