I created a monit check that must restart my Go site on a crash:
$ cd /etc/monit/conf.d
$ vim checkSite
It starts the program with nohup and saves its pid to a file:
check process site with pidfile /root/go/path/to/goSitePath/run.pid
start program = "/bin/bash -c 'cd /root/go/path/to/goSitePath; nohup ./goSite > /dev/null 2>&1 & echo $! > run.pid'" with timeout 5 seconds
stop program = "/bin/kill -9 `cat /root/go/path/to/goSitePath/run.pid`"
It starts OK:
Process 'site'
status Running
monitoring status Monitored
pid 29723
parent pid 1
uptime 2m
children 0
memory kilobytes 8592
memory kilobytes total 8592
memory percent 0.4%
memory percent total 0.4%
cpu percent 0.0%
cpu percent total 0.0%
data collected Thu, 05 Mar 2015 07:20:32
Then, to test how it restarts on a crash, I killed the Go site manually.
Here I have two issues: the restart takes much longer than the with timeout 5 seconds I specified, and site in monit becomes Does not exist even after the site has in fact restarted. I guess the latter occurs because the site's pid changes randomly after it is killed and restarted, but I don't know how to overcome this. Status after restart:
Process 'site'
status Does not exist
monitoring status Monitored
data collected Thu, 05 Mar 2015 08:04:44
How can I reduce the restart time, and how can I repair the site's status in monit?
The monit log:
[Mar 5 08:04:44] error : 'site' process is not running
[Mar 5 08:04:44] info : 'site' trying to restart
[Mar 5 08:04:44] info : 'site' start: /bin/bash
[Mar 5 08:06:44] info : 'site' process is running with pid 31479
My Go site is rather simple:
package main

import (
	"fmt"

	"github.com/go-martini/martini"
)

func main() {
	m := martini.Classic()
	m.Get("/", func() {
		fmt.Println("main page")
	})
	m.Run()
}
I tried to speed up monit's reload of my Go site by removing the pid file itself: I ran kill 29723 && rm run.pid and started a timer to measure how long it took for the site to become accessible again. It took 85 seconds, so removing the pid file did not help monit reload the site any faster.
monit doesn't have any subscription mechanism to immediately discover that a process has died.
In daemon mode, as documented, monit works by periodically polling the status of all configured rules. Its poll cycle is set when the daemon starts, and in some Linux distributions it defaults to 2 minutes, which means that in this case monit can need up to 2 minutes to take any action.
Check this setting in your monitrc; it is configured with the set daemon
directive. For example, if you want to check the status every 5 seconds, you should set:
set daemon 5
On every cycle monit updates each service's status and executes actions if needed. So if it detects that the process doesn't exist, it will report Does not exist
until the next poll cycle, even if it has already decided to restart it.
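Putting this advice together, a sketch of how the relevant monitrc pieces could look (the 5-second cycle is only an example value; the paths are taken from the question):

```
set daemon 5

check process site with pidfile /root/go/path/to/goSitePath/run.pid
  start program = "/bin/bash -c 'cd /root/go/path/to/goSitePath; nohup ./goSite > /dev/null 2>&1 & echo $! > run.pid'" with timeout 5 seconds
  stop program = "/bin/kill -9 `cat /root/go/path/to/goSitePath/run.pid`"
```

With a 5-second cycle, monit should notice a dead process and refresh the reported status within seconds instead of minutes.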
The timeout
in the start program
statement has nothing to do with this poll cycle; it is the time monit gives the service to start. If the service doesn't start within this time, monit reports it as failed.
If monit doesn't meet your requirements, you can also try supervisord, which is always aware of the state of the programs it runs.