
Bash: Inline Execution returns Duplicate "Process". Why?

Tags: linux, bash

bash: 4.3.42(1)-release (x86_64-pc-linux-gnu)

Executing the following script:

# This is myscript.sh
line=$(ps aux | grep [m]yscript)  # A => returns two duplicate processes (why?)
echo "'$line'"
ps aux | grep [m]yscript          # B => returns only one

Output:

'tom   31836  0.0  0.0  17656  3132 pts/25   S+   10:33   0:00 bash myscript.sh
tom   31837  0.0  0.0  17660  1736 pts/25   S+   10:33   0:00 bash myscript.sh'
tom   31836  0.0  0.0  17660  3428 pts/25   S+   10:33   0:00 bash myscript.sh

Why does the inline executed ps-snippet (A) return two lines?

tokosh asked Jun 14 '16 01:06


2 Answers

Summary

This runs in a subshell, so two bash processes for myscript.sh exist while ps runs:

line=$(ps aux | grep [m]yscript) 

This does not leave a bash subshell running, so ps sees only one process for myscript.sh:

ps aux | grep [m]yscript       

Demonstration

Let's modify the script slightly so that the process and subprocess PIDs are saved in the variable line:

$ cat myscript.sh 
# This is myscript.sh
line=$(ps aux | grep [m]yscript; echo $$ $BASHPID)
echo "'$line'"
ps aux | grep [m]yscript  

In a bash script, $$ is the PID of the script and is unchanged in subshells. By contrast, when a subshell is entered, bash updates $BASHPID with the PID of the subshell.
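The difference between $$ and $BASHPID can be checked in isolation. The following is a minimal sketch, not part of the original answer; the variable names (main_pid, sub_pid, and so on) are made up for illustration, and it assumes the script is run with bash.

```shell
#!/usr/bin/env bash
# Record both values in the main shell.
main_pid=$$
main_bashpid=$BASHPID

# Record both values inside a command substitution (a subshell).
sub=$(echo "$$ $BASHPID")
sub_pid=${sub%% *}       # first word: $$ as seen in the subshell
sub_bashpid=${sub##* }   # second word: $BASHPID as seen in the subshell

echo "main shell: \$\$=$main_pid BASHPID=$main_bashpid"
echo "subshell:   \$\$=$sub_pid BASHPID=$sub_bashpid"
```

The two $$ values should match, while the subshell's $BASHPID should differ from the main shell's.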

Here is the output:

$ bash myscript.sh 
'john1024  30226  0.0  0.0  13280  2884 pts/22   S+   18:50   0:00 bash myscript.sh
john1024   30227  0.0  0.0  13284  1824 pts/22   S+   18:50   0:00 bash myscript.sh
30226 30227'
john1024   30226  0.0  0.0  13284  3196 pts/22   S+   18:50   0:00 bash myscript.sh

In this case, 30226 is the PID of the main script and 30227 is the PID of the subshell running ps aux | grep [m]yscript.

John1024 answered Sep 27 '22 03:09


Both of the following cause Bash to create a subshell (a child process created by forking the current shell process):

  • a command substitution ($(...))
  • each segment of a pipeline[1]

However, Bash optimizes away a subshell if it results in a single call to an external utility.

(What I think is happening in the optimization scenario is that a subshell is actually created but then instantly replaced by the external utility's process, via something like exec. Do let me know if you know for sure.)

Applied to your example:

  • line=$(ps aux | grep [m]yscript) creates 3 child processes:

    • 1 subshell - the fork of your script you see as an additional match returned by grep.
    • 2 child processes (1 for each pipeline segment) - ps and grep; they take the place of the optimized-away subshells; their parent process is the 1 remaining subshell created by the command substitution.
  • ps aux | grep [m]yscript creates 2 child processes (1 for each pipeline segment):

    • ps and grep; they take the place of the optimized-away subshells; their parent process is the current shell.
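The parent/child relationships described above can be observed with a rough sketch like the one below. It is an illustration under assumptions, not part of the original answer: it uses sh -c 'echo "$PPID"' as a stand-in for ps (any external command would do), writes to a temporary file to avoid an extra command substitution, and all variable names are hypothetical.

```shell
#!/usr/bin/env bash
main_pid=$BASHPID

# Case 1: pipeline run directly by the script.
# The external command should report the script itself as its parent.
tmp=$(mktemp)
sh -c 'echo "$PPID"' | cat > "$tmp"
read -r direct_parent < "$tmp"
rm -f "$tmp"

# Case 2: the same pipeline inside a command substitution.
# The second line of output is the PID of the subshell created by $(...).
captured=$(sh -c 'echo "$PPID"' | cat; echo "$BASHPID")
{ read -r subst_parent; read -r subshell_pid; } <<< "$captured"

echo "script PID:                     $main_pid"
echo "parent in direct pipeline:      $direct_parent"
echo "parent in command substitution: $subst_parent"
echo "command substitution subshell:  $subshell_pid"
```

If the explanation holds, the parent in the direct pipeline is the script, and the parent inside the command substitution is the one remaining subshell.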

For an overview of the scenarios in which a subshell is created in Bash, see this answer of mine, which, however, doesn't cover the optimizing-away scenarios.


[1] In Bash v4.2+ you can set option lastpipe (off by default) in order to make the last pipeline segment run in the current shell instead of a subshell; aside from a slight efficiency gain, this allows you to declare variables in the last segment that the current shell can see after the pipeline exits.
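A small sketch of the lastpipe behaviour mentioned in the footnote, assuming Bash 4.2 or later and a non-interactive script (lastpipe only takes effect when job control is off). The variable names are made up for illustration.

```shell
#!/usr/bin/env bash
set +m  # make sure job control is off; this is already the default in scripts

# Default: read runs in a subshell, so the assignment is lost.
unset without_lastpipe
echo "hello" | read -r without_lastpipe
echo "without lastpipe: '${without_lastpipe:-}'"

# With lastpipe: read runs in the current shell, so the assignment survives.
shopt -s lastpipe
echo "hello" | read -r with_lastpipe
echo "with lastpipe: '${with_lastpipe:-}'"
```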

mklement0 answered Sep 27 '22 03:09