
Signal not received by bash script run using system(3)

Tags:

bash

I have a bash script which waits for 3 minutes before signaling a daemon. During these 3 minutes, I need it to exit if it receives SIGINT.

My current script works when run from bash; however, when I run it from within another (C) program using the system() call, it doesn't exit when it is sent SIGINT.

Here's my current script:

#!/bin/bash

trap 'exit' INT
sleep 180 &
wait
trap '' INT

/etc/init.d/myd sync

Here's how I'm running it:

kill -INT `pgrep myscript.sh` 2>/dev/null; ! pgrep -x "myscript.sh" > /dev/null && /opt/my/scripts/myscript.sh &

The same one liner, when run using the system() call, doesn't work.

PS: Basically, I'm using this mechanism so that, if the script is invoked several times within 3 minutes, the /etc/init.d/myd sync command runs only once, after the last invocation.

EDIT 1

The C code as requested:

system("kill -INT `pgrep myscript.sh` 2>/dev/null; ! pgrep -x \"myscript.sh\" > /dev/null && /opt/my/scripts/myscript.sh &");

The C program is quite huge (spanning tens of files), so I'm only pasting here the specific call. The program is supposed to run as a daemon, but I get this problem even when I don't run it as a daemon (using a command line switch).

I'm able to reproduce this with the following trivial C code:

#include <stdlib.h>

int main(int argc, char *argv[]) {
    system("kill -INT `pgrep myscript.sh` 2>/dev/null; ! pgrep -x \"myscript.sh\" > /dev/null && /opt/my/scripts/myscript.sh &");        
    return 0;
}

Zaxter asked Nov 07 '22 22:11


1 Answer

The why (Updated):

  • The actual process name is bash, not myscript.sh
  • pgrep can match itself via sh in the system() call
  • SIGINT never reaches the trap when the script is run in the background (I have used SIGUSR1 instead)

After some discussion and testing on various operating systems, and once the subtleties were weeded out, it really came down to this:

  1. When starting the script from a tty, ps would show this:
     46694 s001  S+     0:00.01 /bin/bash /opt/my/scripts/myscript.sh
  2. When starting the script from the system() call (via sh), ps would show this:
     46796 s002  S      0:00.00 /bin/bash /opt/my/scripts/myscript.sh

From the ps manual:

    state   The first character indicates the run state of the process:

            S   Marks a process that is sleeping for less than about 20 seconds.

            Additional characters after these, if any, indicate additional state information:

            +   The process is in the foreground process group of its control terminal.

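So the processes without the + (that is, the ones running in the background) weren't responding to SIGINT. This matches POSIX shell behaviour: when a non-interactive shell, such as the sh used by system(), starts a command with &, that command inherits SIGINT and SIGQUIT set to ignored, and bash cannot trap a signal that was already ignored when it started.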
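A quick way to see this for yourself, offered as a sketch rather than a definitive diagnosis: the snippet below asks a non-interactive sh to start bash in the background, roughly as system() does with a trailing &. The child installs an INT trap and then signals itself. The variable names and the inline child script are illustrative and not part of the original scripts.

```shell
#!/bin/bash
# Illustrative child: install an INT trap, send SIGINT to itself, report the outcome.
child='trap "echo trapped; exit 0" INT; kill -INT $$; echo ignored'

# Background start from a non-interactive sh, as in system("... &").
bg_result=$(sh -c "bash -c '$child' & wait")

# Same child started in the foreground, for comparison.
fg_result=$(sh -c "bash -c '$child'")

echo "background: $bg_result"
echo "foreground: $fg_result"
```

If the background line reports "ignored" while the foreground line reports "trapped", the trap itself is fine and the inherited signal disposition is what differs.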

It turns out we can trap SIGUSR1 instead and send it with kill -USR1.
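As a minimal sketch of that swap, assuming a shortened delay and a function standing in for myscript.sh (debounced_job, the temp file and the "ready" marker are all hypothetical names used only for this demo), the following starts the job in the background and cancels it with USR1:

```shell
#!/bin/bash
# Hypothetical stand-in for myscript.sh with a shortened delay.
out=$(mktemp)

debounced_job() {
    # Cancel quietly when a newer invocation sends USR1.
    trap 'kill "$sleep_pid" 2>/dev/null; echo cancelled; exit 0' USR1
    echo ready                      # tells the caller the trap is installed
    sleep 5 &
    sleep_pid=$!
    wait "$sleep_pid"               # returns early when USR1 arrives
    echo "delay elapsed, would run sync"
}

# Background start: SIGINT would be ignored here, USR1 is not.
debounced_job > "$out" &
job_pid=$!

# Wait until the trap is in place, then cancel the pending job.
until grep -q ready "$out"; do sleep 0.1; done
kill -USR1 "$job_pid"
wait "$job_pid"

result=$(tail -n 1 "$out")
echo "$result"
rm -f "$out"
```

Waiting for the "ready" marker avoids signalling before the trap exists, in which case the default action of USR1 would simply kill the job.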


Working test on Ubuntu 16.04 & Mac OS X 10.12:

cmon.c

    #include <stdlib.h>

    int main(int argc, char *argv[]) {
        system("mypid=`cat /opt/my/scripts/myscript.pid` ; ps -p $mypid > /dev/null && kill -USR1 $mypid 2>/dev/null ; mypid=`cat /opt/my/scripts/myscript.pid` ; ps -p $mypid > /dev/null || /opt/my/scripts/myscript.sh &");
        return 0;
    }
    

/opt/my/scripts/myscript.sh

    #!/bin/bash

    #process for this instance of bash
    bashpid="$$"

    #ensure single instance
    mypid="$(cat "$(dirname "$0")/myscript.pid" 2>/dev/null)"
    ps -p $mypid &> /dev/null && (echo "already running, now going to exit this instance" ; echo "seems we never see this message, that's good, means the system() call is only spawning a new instance if there's no process match." ; kill -9 "$bashpid")

    #set pidfile for this instance
    echo "$bashpid" > "$(dirname "$0")/myscript.pid"

    trap 'echo "BYE" ; exit' SIGUSR1
    sleep 5 & # 30 seconds for testing
    wait
    trap '' SIGUSR1

    echo "Sitting here in limbo land" & while true ; do sleep 5 ; done
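The pidfile check used above can also be written with kill -0 instead of ps -p. This is a swapped-in technique, not what the script above does, and pid_alive is a hypothetical helper name. Note that kill -0 also fails for a live process owned by another user, so this sketch assumes the script and its checker run as the same user.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the PID stored in the given file is alive.
pid_alive() {
    local pid
    pid=$(cat "$1" 2>/dev/null) || return 1    # no pidfile yet
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

pidfile=$(mktemp)

echo "$$" > "$pidfile"          # this shell is certainly running
pid_alive "$pidfile" && live=yes || live=no

sleep 0 &                       # create a process that has already exited
dead_pid=$!
wait "$dead_pid"
echo "$dead_pid" > "$pidfile"
pid_alive "$pidfile" && stale=yes || stale=no

echo "live=$live stale=$stale"
rm -f "$pidfile"
```

Treating an empty or missing pidfile as "not running" keeps the first invocation from failing before any instance has written its PID.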

Testing

A simple one-liner to loop the ./cmon binary and check that we get a different process each time (whilst inside the timeout window):

unset i ; until [[ i -eq 5 ]] ; do ./cmon ; ps ax | grep bash.*myscript.sh | grep -v grep | awk -F ' ' '{print $1 " " $6}' ; sleep 1 ; (( i ++ )) ; done

The same line can be run once the timeout expires, to check that all the process numbers are the same.

A paste-able full unit test build and test sequence can be found here: ybin link


Notes about the process:

  • The scripts are modified only for ease of testing (i.e. 10 second timeout window).
  • Single-instance checking was added redundantly (the single-instance trigger in myscript.sh should never fire, and never seems to).

Update to follow once the OP confirms that this works.

hmedia1 answered Nov 15 '22 05:11