
Why does Google Colab randomly disconnect before 12 hours?

Sometimes my Colab notebooks disconnect before 12 hours, and I am curious why. Sometimes I get the message "Runtime disconnected".

At other times, there is no message. After I reconnect, the notebook looks like it hasn't run for a while (it doesn't show as busy). In addition, my TensorFlow .meta and .data files on Google Drive have not updated for about 6 of the last 8 hours or so.

I found some similar questions on SO, but in those cases the notebook gets "stuck" on initialization, whereas mine doesn't get "stuck": it connects with a checkmark. I even tried restarting the runtime, but I still see no sign that my notebook is connected to my old VM in any way.

EDIT: Are Google Colab VMs "preemptible" in any way? I know Google Compute Engine has "preemptible" machines that can disconnect at any time. Since paying customers use preemptible machines, it would make sense to me that Colab, which is used by non-paying customers, would be preemptible as well. I did not find any documentation that supports this claim for Colab.

lightbox142 asked Oct 05 '18 17:10


1 Answer

Google Colab is not intended for long-running tasks. From the Colab FAQ page (emphasis is mine):

Colaboratory is intended for interactive use. Long-running background computations, particularly on GPUs, may be stopped. Please do not use Colaboratory for cryptocurrency mining. Doing so is unsupported and may result in service unavailability. We encourage users who wish to run continuous or long-running computations through Colaboratory’s UI to use a local runtime.

In my experience, "long-running computations" include training neural networks and bash commands that run for more than two or three hours. As mentioned above, these types of long-running tasks may result in service unavailability, which usually lasts no longer than a few hours.
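Since a long job may be stopped without any message, one common workaround is to save progress regularly and resume from the last checkpoint after reconnecting. Below is a minimal sketch of that pattern using only the Python standard library. The names `load_state`, `save_state`, and `run` are hypothetical, the toy loop stands in for a real training step, and the `stop_at` parameter exists only to simulate a disconnect. It also assumes the checkpoint directory supports atomic renames, which may not hold for every mounted drive.

```python
import json
import os
import tempfile


def load_state(path):
    """Return the saved state, or a fresh one if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"step": 0, "total": 0}


def save_state(path, state):
    """Write to a temp file, then rename, so a stop mid-write leaves the old checkpoint intact."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp, path)


def run(path, num_steps, save_every=10, stop_at=None):
    """Run (or resume) a toy loop. stop_at is a hypothetical knob that simulates a disconnect."""
    state = load_state(path)
    while state["step"] < num_steps:
        if stop_at is not None and state["step"] >= stop_at:
            break  # simulated disconnect: progress since the last save is lost
        state["total"] += state["step"]  # stand-in for one unit of real work
        state["step"] += 1
        if state["step"] % save_every == 0 or state["step"] == num_steps:
            save_state(path, state)
    return state
```

Calling `run` again with the same path after a disconnect picks up from the last saved step instead of starting over, so at most `save_every` steps of work are repeated. With TensorFlow you would use its own checkpoint saving and restoring in place of the JSON file.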

ninjin answered Sep 20 '22 19:09