My application program crashes with EXIT CODE: 9 (SIGKILL)
I never run any command such as 'kill -9 (pid)' or 'pkill (process name)' that can kill the running process.
Where should I start for debugging in this case?
I tried to dump the stack trace when the program crashes, but I found that the SIGKILL cannot be caught for error handling.
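As a small illustration of why the stack-trace approach cannot work, the sketch below tries to register a handler for SIGTERM and for SIGKILL. The helper name `can_install_handler` is made up for this example. It assumes a POSIX system and that the code runs in the main thread.

```python
import signal

def can_install_handler(signum):
    """Return True if a Python-level handler can be registered for signum."""
    try:
        previous = signal.signal(signum, lambda s, frame: None)
    except (OSError, ValueError):
        # The OS refuses handlers for SIGKILL (and SIGSTOP).
        return False
    # Put the previous handler back so this check leaves no side effects.
    if previous is not None:
        signal.signal(signum, previous)
    return True

sigterm_ok = can_install_handler(signal.SIGTERM)
sigkill_ok = can_install_handler(signal.SIGKILL)
print("SIGTERM can be handled:", sigterm_ok)
print("SIGKILL can be handled:", sigkill_ok)
```

Because the kernel never delivers SIGKILL to user code, any diagnosis has to come from outside the process, for example from system logs.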
The program uses MPI and runs in cluster environments. It dies after around 1 hour of its run.
Are there any common causes of a SIGKILL?
(It's running on Linux; CentOS 7.)
It might be due to insufficient memory for the calculation. The memory assigned to each core is 30 GB, but the calculation still crashed with this error.
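One way to test the memory suspicion is to have the process report its own peak memory use from time to time. The snippet below is only a sketch in Python, not the MPI program itself. The function name `peak_rss_gib` is made up, and it assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource

def peak_rss_gib():
    """Peak resident set size of this process in GiB (assumes Linux, where ru_maxrss is in KiB)."""
    peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    return peak_kib / (1024 ** 2)

print(f"peak resident memory so far: {peak_rss_gib():.3f} GiB")
```

Logging this value periodically from each rank would show whether memory use keeps growing toward the per-core limit before the job dies.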
The purpose of the exit() function is to terminate the execution of a program. “return 0” (or EXIT_SUCCESS) means that the program ran successfully without any error. Any other exit code (such as EXIT_FAILURE) indicates an error.
In Linux, an exit code is the status that a command or script returns after it finishes. It ranges from 0 to 255. Exit codes help us determine whether a process ran successfully.
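A process killed by a signal never gets to choose an exit code, so its status is reported differently. The sketch below starts a throwaway child process, sends it SIGKILL, and inspects what the parent sees. It assumes a POSIX system. The child command is a placeholder that only sleeps.

```python
import signal
import subprocess
import sys

# Start a child that would sleep for a minute if left alone.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Kill it the same way the kernel would: with SIGKILL.
child.send_signal(signal.SIGKILL)
child.wait()

# On POSIX, subprocess reports "killed by signal N" as a return code of -N.
returncode = child.returncode
# Shells such as bash report the same event as 128 + N.
shell_status = 128 + (-returncode)

print("subprocess return code:", returncode)
print("status a shell would show:", shell_status)
```

This is why a SIGKILL often shows up as 137 in shell scripts and job logs, while some launchers print the signal number 9 directly.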
I am answering my own question so that it can help someone later.
The crash was caused by an out-of-memory condition.
The process allocates too much memory, which puts pressure on the OS. The OS has a hit man, the oom-killer, that kills such processes for the sake of system stability. The oom-killer uses bullets called SIGKILL.
However, since SIGKILL is invisible (it cannot be caught and handled by the application), for some newbies including me, it is not always easy to figure out the true reason for the crash.
The good news is that when the hit man kills your process, it normally logs its action in /var/log/messages.
Depending on your OS configuration, however, the oom-killer might not log any message at all. In that case, you can configure logging yourself. Search for rsyslog configuration on Google.
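To check the log on a compute node, a small script can filter it for lines that look like oom-killer activity. This is a rough sketch. The search phrases are an assumption about typical kernel messages, and the exact wording varies between kernel versions. The function names are made up, and reading /var/log/messages usually requires root.

```python
import re

# Phrases that commonly appear in kernel OOM messages (assumed, not exhaustive).
OOM_PATTERN = re.compile(r"oom-killer|out of memory|killed process", re.IGNORECASE)

def find_oom_lines(lines):
    """Return the lines that mention OOM-killer activity, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if OOM_PATTERN.search(line)]

def scan_log(path="/var/log/messages"):
    """Scan a log file for OOM lines; return an empty list if it cannot be read."""
    try:
        with open(path, errors="replace") as log:
            return find_oom_lines(log)
    except OSError as exc:
        print(f"cannot read {path}: {exc}")
        return []

if __name__ == "__main__":
    for line in scan_log():
        print(line)
```

On a cluster, the log that matters is the one on the node where the killed rank was running, which may not be the node you log in to.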
Finding which process was killed by Linux OOM killer