I have a bash script which waits for 3 minutes before signaling a daemon.
During these 3 minutes, I need it to exit if it receives SIGINT.
My current script works when run from bash; however, when I run it from within another (C) program using the system() call, it doesn't exit when it is sent SIGINT.
Here's my current script:
#!/bin/bash
trap 'exit' INT
sleep 180 &
wait
trap '' INT
/etc/init.d/myd sync
Here's how I'm running it:
kill -INT `pgrep myscript.sh` 2>/dev/null; ! pgrep -x "myscript.sh" > /dev/null && /opt/my/scripts/myscript.sh &
The same one-liner, when run using the system() call, doesn't work.
PS: Basically, I'm using this mechanism so that, if the script is called multiple times within 3 minutes, the /etc/init.d/myd sync command runs only once, after the last call.
The C code as requested:
system("kill -INT `pgrep myscript.sh` 2>/dev/null; ! pgrep -x \"myscript.sh\" > /dev/null && /opt/my/scripts/myscript.sh &");
The C program is quite huge (spanning tens of files), so I'm only pasting here the specific call. The program is supposed to run as a daemon, but I get this problem even when I don't run it as a daemon (using a command line switch).
I'm able to reproduce this with the following trivial C code:
#include <stdlib.h>
int main(int argc, char *argv[]) {
system("kill -INT `pgrep myscript.sh` 2>/dev/null; ! pgrep -x \"myscript.sh\" > /dev/null && /opt/my/scripts/myscript.sh &");
return 0;
}
The process shows up as bash, not myscript.sh.
sh is the shell used in the system() call.
After some discussion and testing on various OSes, and once the subtleties were weeded out, this really came down to:
When the script was started from an interactive bash prompt, ps would show this:
46694 s001 S+ 0:00.01 /bin/bash /opt/my/scripts/myscript.sh
When it was started through the system() call (via sh), ps would show this:
46796 s002 S 0:00.00 /bin/bash /opt/my/scripts/myscript.sh
From the ps manual:
state  The first character indicates the run state of the process:
       S  Marks a process that is sleeping for less than about 20 seconds.
       Additional characters after these, if any, indicate additional state information:
       +  The process is in the foreground process group of its control terminal.
So the processes without the + (those not in the foreground process group of a terminal) did not respond to SIGINT.
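A likely reason for the difference, going by POSIX and the bash manual rather than anything tested above: system() hands the command line to a non-interactive sh, which has job control off, and such a shell starts every & command with SIGINT and SIGQUIT set to ignored. Bash in turn cannot trap a signal that was already ignored when it started. The sketch below reproduces that without any C code; demo_script is a made-up name, and the inner script simply sends SIGINT to itself.

```shell
#!/bin/bash
# Sketch: a script started with '&' from a non-interactive shell (which is
# how system() runs its command line) cannot trap SIGINT.
# demo_script is a hypothetical name used only for this demonstration.

demo_script='
trap "echo trapped; exit 0" INT
kill -INT $$        # send SIGINT to ourselves
echo "SIGINT was ignored"
'

# Started in the foreground by sh: the trap can fire (assuming the caller
# has not itself ignored SIGINT).
foreground=$(sh -c 'bash -c "$1"' sh "$demo_script")

# Started in the background by a non-interactive sh: SIGINT is already
# ignored when bash starts, so the trap has no effect.
background=$(sh -c 'bash -c "$1" & wait' sh "$demo_script")

echo "foreground: $foreground"
echo "background: $background"
```

The background line should read "SIGINT was ignored". The foreground line reads "trapped" as long as the process running the demo was not itself started with SIGINT ignored.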
It turns out we can trap SIGUSR1 instead and send it with kill -USR1.
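A minimal sketch of SIGUSR1 getting through under the same conditions as the system() call (a non-interactive sh starting the script with &). The temporary file and the sleep_pid variable are illustrative names, not part of the original script, and the trap also kills the pending sleep so that it does not linger after the script exits.

```shell
#!/bin/bash
# Sketch: trap SIGUSR1 in a script that a non-interactive sh starts in the
# background, then cancel its wait with kill -USR1.
# The temp file and sleep_pid are hypothetical names for this demo.

script=$(mktemp)
cat > "$script" <<'EOF'
trap 'kill "$sleep_pid" 2>/dev/null; echo BYE; exit 0' USR1
sleep 30 &
sleep_pid=$!
wait                      # interrupted as soon as SIGUSR1 arrives
echo "timeout expired"
EOF

# Start the script with '&', give it a second to install its trap,
# send SIGUSR1, then wait for it to exit.
result=$(sh -c 'bash "$1" & pid=$!; sleep 1; kill -USR1 "$pid"; wait "$pid"' sh "$script")

rm -f "$script"
echo "result: $result"
```

This should print "result: BYE" after about a second, since only SIGINT and SIGQUIT are set to ignored for background commands, not SIGUSR1.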
cmon.c
#include <stdlib.h>
int main(int argc, char *argv[]) {
system("mypid=`cat /opt/my/scripts/myscript.pid` ; ps -p $mypid > /dev/null && kill -USR1 $mypid 2>/dev/null ; mypid=`cat /opt/my/scripts/myscript.pid` ; ps -p $mypid > /dev/null || /opt/my/scripts/myscript.sh &");
return 0;
}
/opt/my/scripts/myscript.sh
#!/bin/bash
#process for this instance of bash
bashpid="$$"
#ensure single instance
mypid="$(cat "$(dirname "$0")/myscript.pid" 2>/dev/null)"
ps -p $mypid &> /dev/null && (echo "already running, now going to exit this instance" ; echo "seems we never see this message, that's good, means the system() call is only spawning a new instance if there's no process match." ; kill -9 "$bashpid")
#set pidfile for this instance
echo "$bashpid" > "$(dirname "$0")/myscript.pid"
trap 'echo "BYE" ; exit' SIGUSR1
sleep 5 & # short timeout for testing
wait
trap '' SIGUSR1
echo "Sitting here in limbo land" & while true ; do sleep 5 ; done
A simple line to loop the ./cmon binary and check that we get a different process each time (whilst inside the timeout window):
unset i ; until [[ i -eq 5 ]] ; do ./cmon ; ps ax | grep bash.*myscript.sh | grep -v grep | awk -F ' ' '{print $1 " " $6}' ; sleep 1 ; (( i ++ )) ; done
The same line can be run once the timeout expires, to check that all the process numbers are the same.
A paste-able full unit test build and test sequence can be found here: ybin link
And a demo of it running here:
Notes about the process:
Update to follow when OP confirms final working