
How can my process detect if the computer is shutting down?

I'm running some applications on EC2 spot instances. Such instances can be killed by Amazon with no notice.

During shutdown, processes are killed in some order. We have monitoring/recovery programs that should behave differently depending on whether the server is shutting down or the process just crashed. Specifically, we don't want to do anything if the server is actually shutting down.

How can I detect in the recovery process (if it is still alive) that processes were killed because of a shutdown?

(More system details: I'm running unknown/untrusted/etc. code in a sandbox that doesn't modify external state. Generally, if sandboxed code crashes, it is the fault of the author of the untrusted code and we will not rerun it. But if the sandboxed code is terminated because the VM is shutting down or failing, we need to rerun it on another instance. The problem I'm having right now is that the user's code is terminated first, so the monitoring program incorrectly believes the crash is a user error.)

UsAaR33 asked May 21 '12 23:05




2 Answers

agent

Run an agent on each machine that spawns the sandbox child processes. The agent runs your own "crash-proof" code, while the sandbox runs user code that could crash.

The monitoring system, which is in charge of starting a new machine with a new sandbox process, checks which processes have been killed: both the agent and the sandbox process, or only the sandbox child process.

It does that by opening a TCP connection (RMI/RPC/HTTP) to the agent and querying it about its child processes. If the agent responds, the machine is still running, and the agent can be asked about its child sandbox processes. If the agent does not respond, the machine is suspected of having been terminated.
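As a rough illustration of the query described above, here is a minimal sketch in Python. The wire format (one JSON line per connection), the names `AgentHandler`, `start_agent`, and `query_agent`, and the child label `"sandbox-1"` are all assumptions made for the example; a real agent would more likely speak HTTP or an RPC protocol and would need authentication.

```python
import json
import socket
import socketserver
import threading


class AgentHandler(socketserver.StreamRequestHandler):
    """Hypothetical agent endpoint: replies with one JSON line describing its children."""

    def handle(self):
        # server.children is an illustrative dict: child label -> status string
        payload = json.dumps(self.server.children) + "\n"
        self.wfile.write(payload.encode("utf-8"))


def start_agent(children, host="127.0.0.1", port=0):
    """Start the agent in a background thread (port 0 lets the OS pick a free port)."""
    server = socketserver.ThreadingTCPServer((host, port), AgentHandler)
    server.children = children
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_agent(host, port, timeout=2.0):
    """Return the agent's child-status dict, or None if the agent cannot be reached."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with sock.makefile("r", encoding="utf-8") as reader:
                line = reader.readline()
        return json.loads(line)
    except (OSError, ValueError):
        # No usable answer: the machine is suspected of having been terminated.
        return None
```

A monitor would treat a `None` result as "machine suspected dead, rerun elsewhere" and a dict result as "machine alive, inspect the sandbox status". A single missed reply can also be a network hiccup, so retrying a few times before deciding is probably wise.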

agent (variation)

The agent is also in charge of restarting the child sandbox process on the same VM in case it crashes.

lookup service

Use a look-up service (such as ZooKeeper) to keep track of which processes have sent heartbeat keep-alives. If the agent is alive, the machine is still running; if the agent is not alive, the machine is not running.
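The heartbeat idea can be sketched without any external service. The following is an in-memory stand-in, not ZooKeeper and not its API; the class name `HeartbeatRegistry`, the agent ID, and the 10-second timeout are assumptions chosen for illustration.

```python
import time


class HeartbeatRegistry:
    """In-memory stand-in for a look-up service that tracks keep-alives."""

    def __init__(self, timeout_seconds=10.0, clock=time.monotonic):
        self._timeout = timeout_seconds
        self._clock = clock          # injectable so the logic can be tested
        self._last_seen = {}         # agent id -> time of last heartbeat

    def beat(self, agent_id):
        """Record a keep-alive from the given agent."""
        self._last_seen[agent_id] = self._clock()

    def is_alive(self, agent_id):
        """True if the agent has sent a heartbeat within the timeout window."""
        last = self._last_seen.get(agent_id)
        return last is not None and self._clock() - last <= self._timeout
```

Each agent would call `beat` periodically, and the monitor would call `is_alive` before deciding whether a dead sandbox process means a user crash or a lost machine.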

ec2 api

Poll the EC2 APIs to determine whether the machine is in the running or terminated state.
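A possible shape for that poll, assuming the third-party boto3 SDK is installed and AWS credentials are configured. The helper names and the decision that every state other than `pending` or `running` counts as "gone" are assumptions for this sketch, and the call may raise an error for instance IDs that EC2 no longer knows about, which a caller would need to handle.

```python
def instance_state(response):
    """Extract the state name from a describe_instances-style response dict."""
    for reservation in response.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            return instance["State"]["Name"]
    return None  # the instance was not listed at all


def machine_is_gone(state):
    """Assumption: any state other than pending/running means the work must be rerun."""
    return state in (None, "shutting-down", "terminated", "stopping", "stopped")


def fetch_instance_state(instance_id, region_name):
    """Ask EC2 for the current state of one instance (needs boto3 and credentials)."""
    import boto3  # imported lazily so the helpers above work without the SDK

    ec2 = boto3.client("ec2", region_name=region_name)
    return instance_state(ec2.describe_instances(InstanceIds=[instance_id]))
```

The monitor could call `machine_is_gone(fetch_instance_state(...))` when a sandbox process disappears, and only blame the user's code if the instance is still running.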

itaifrenkel answered Oct 05 '22 13:10


How does your recovery process work?

If you're using waitpid to monitor the process, when it exits you can determine:

  • Whether it exited normally, and what status the process returned if it did, or
  • Whether it exited due to a signal, and what that signal was.

Depending on how the process is shut down, I'd expect to see it either exit normally or exit via SIGTERM or SIGKILL. SIGILL, SIGABRT, SIGFPE, SIGBUS, SIGSEGV, and SIGSYS would indicate a crash from a programming error.
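The classification described above might look like this in Python on a Unix system. The category labels and the helper names `classify_status` and `run_and_classify` are made up for the example. Note that SIGTERM or SIGKILL on its own does not prove a shutdown, since an administrator or the kernel's out-of-memory killer can send them too.

```python
import os
import signal

# Signals the answer associates with a programming error in the child
CRASH_SIGNALS = {signal.SIGILL, signal.SIGABRT, signal.SIGFPE,
                 signal.SIGBUS, signal.SIGSEGV, signal.SIGSYS}
# Signals a process would typically receive while the system shuts down
SHUTDOWN_SIGNALS = {signal.SIGTERM, signal.SIGKILL}


def classify_status(status):
    """Turn a raw waitpid() status into a (category, detail) pair."""
    if os.WIFEXITED(status):
        return ("exited", os.WEXITSTATUS(status))
    if os.WIFSIGNALED(status):
        sig = os.WTERMSIG(status)
        if sig in CRASH_SIGNALS:
            return ("crashed", sig)
        if sig in SHUTDOWN_SIGNALS:
            return ("terminated", sig)
        return ("signaled", sig)
    return ("unknown", status)


def run_and_classify(argv):
    """Spawn a child, wait for it with waitpid(), and classify how it ended."""
    pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)
    _, status = os.waitpid(pid, 0)
    return classify_status(status)
```

A recovery process could treat `"crashed"` as the user's fault and `"terminated"` as a hint to check whether the machine is going down before deciding not to rerun.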

LnxPrgr3 answered Oct 05 '22 15:10