Caffe does not make snapshots on SIGINT

When I press Ctrl+C in the terminal, caffe stops training but does not write a snapshot. How can I fix this? My solver:

net: "course-work/testing/model.prototxt"
test_iter: 200
test_interval: 500

base_lr: 0.001
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"

display: 50
max_iter: 60000

snapshot: 5000
snapshot_format: HDF5
snapshot_prefix: "course-work/testing/by_solver_lr0"
snapshot_after_train: true

solver_mode: CPU

Bash script:

TOOLS=./build/tools
NET_DIR=course-work/testing

$TOOLS/caffe train \
    --solver=$NET_DIR/solver_lr0.prototxt 2>&1 | tee $NET_DIR/1.log

0x1337 asked Feb 16 '16 16:02


2 Answers

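Piping caffe's output through tee might change how the signal is handled: Ctrl+C sends SIGINT to every process in the foreground pipeline, so tee receives it as well and may exit while caffe is still writing to the pipe. Try running without | tee to make sure caffe alone handles the SIGINT.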
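
If you still want the output on screen and in a file, one possible workaround is tee's -i option (part of POSIX), which tells tee to ignore SIGINT. The sketch below is only an illustration: run_logged is a hypothetical helper, not something caffe provides, and the paths are copied from the question.

```shell
TOOLS=./build/tools
NET_DIR=course-work/testing
LOG=$NET_DIR/1.log

# Hypothetical helper (not part of caffe): run any command, show its output
# on screen and copy it to $LOG. "tee -i" makes tee ignore SIGINT, so Ctrl+C
# is left for the command itself to handle.
run_logged() {
    "$@" 2>&1 | tee -i "$LOG"
}

# Intended use with the command from the question:
# run_logged $TOOLS/caffe train --solver=$NET_DIR/solver_lr0.prototxt
```

With this in place, tee keeps reading until caffe closes its end of the pipe, so caffe's own signal handling decides what happens on Ctrl+C.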

Note that the caffe tool has two flags:

DEFINE_string(sigint_effect, "stop",
             "Optional; action to take when a SIGINT signal is received: "
              "snapshot, stop or none.");
DEFINE_string(sighup_effect, "snapshot",
             "Optional; action to take when a SIGHUP signal is received: "
             "snapshot, stop or none.");

These flags let you define caffe's behavior on SIGINT and SIGHUP. By default SIGINT stops training and SIGHUP triggers a snapshot.
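
As a rough sketch of how the two flags could be passed on the command line, the snippet below swaps the defaults so that Ctrl+C writes a snapshot and SIGHUP stops training. The flag names and values come from the definitions above. The train wrapper and the SIGNAL_FLAGS variable are hypothetical names used only for illustration, and the paths are taken from the question.

```shell
TOOLS=./build/tools
NET_DIR=course-work/testing

# Swap the default effects: SIGINT (Ctrl+C) now writes a snapshot and training
# carries on, while SIGHUP stops training.
SIGNAL_FLAGS="--sigint_effect=snapshot --sighup_effect=stop"

# Hypothetical wrapper around the command from the question. Output is
# redirected to a file instead of a pipe, so caffe is the only process involved.
# $SIGNAL_FLAGS is left unquoted on purpose so the shell splits it into two flags.
train() {
    "$TOOLS/caffe" train \
        --solver="$NET_DIR/solver_lr0.prototxt" \
        $SIGNAL_FLAGS > "$NET_DIR/1.log" 2>&1
}
```

After calling train, progress can be followed from a second terminal with tail -f on the log file. Keep in mind that with these settings Ctrl+C no longer ends the run; sending SIGHUP (for example with kill -HUP and the caffe process ID) does.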

Shai answered Oct 02 '22 23:10


A good way to log caffe's output is:

GLOG_log_dir=/path/to/log/dir $CAFFE_ROOT/bin/caffe.bin train \
    --solver=/path/to/solver.prototxt

This logs caffe's output live, and SIGINT definitely reaches caffe.

curio17 answered Oct 03 '22 00:10