Why does ps output list the grep process after the pipe?

When I do

$ ps -ef | grep cron

I get

root      1036     1  0 Jul28 ?        00:00:00 cron
abc      21025 14334  0 19:15 pts/2    00:00:00 grep --color=auto cron

My question is: why do I see the second line? From my understanding, ps lists the processes and pipes the list to grep. grep hasn't even started running while ps is listing processes, so how come the grep process is listed in the output?

Related second question: when I do

$ ps -ef | grep [c]ron

I get only

root      1036     1  0 Jul28 ?        00:00:00 cron

What is the difference between the first and second grep executions?

Ankur Agarwal

2 Answers

When you execute the command:

ps -ef | grep cron

the shell you are using (I assume bash in your case: grep's color option suggests a GNU system such as a Linux distribution, but it works the same way on other Unix systems and shells) calls pipe() to create a pipe, then calls fork() to make a running copy of itself. This creates a new child process. The child close()s its standard output file descriptor (fd 1) and attaches fd 1 to the write side of the pipe created by the parent process (the shell where you executed the command). This is possible because a child created by fork() inherits the parent's open file descriptors, including those of the pipe. The child then calls exec() on the first ps command found in your PATH environment variable. With the exec() call, the child process becomes the command you executed.

So, you now have the shell process with a child that is, in your case, the ps command with the -ef options.

At this point, the parent (the shell) fork()s again. This newly created child close()s its standard input file descriptor (fd 0) and attaches fd 0 to the read side of the pipe created by the parent process (the shell where you executed the command).

It then calls exec() on the first grep command found in your PATH environment variable.

Now you have the shell process with two children (siblings of each other): the first is the ps command with the -ef options, and the second is the grep command with the cron argument. The read side of the pipe is attached to the STDIN of grep and the write side is attached to the STDOUT of ps, so the standard output of ps feeds the standard input of grep.
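A rough way to observe this plumbing from the shell itself is sketched below. It is not a trace of what any particular shell does internally: each side of a pipeline simply checks whether its standard stream is a pipe and reports its parent's PID. The function name show_pipe_ends is made up for this illustration, and the check assumes /dev/stdin and /dev/stdout exist, as they do on Linux.

```shell
# show_pipe_ends is a hypothetical helper name used only for this illustration.
# Left side:  its stdout is the pipe, so it reports on stderr instead.
# Right side: its stdin is the read end of the same pipe.
show_pipe_ends() {
  sh -c '[ -p /dev/stdout ] && echo "left: stdout is a pipe, parent pid $PPID" >&2' |
  sh -c '[ -p /dev/stdin ] && echo "right: stdin is a pipe, parent pid $PPID"'
}

show_pipe_ends
```

In shells such as bash or dash, both lines should report the same parent PID (the shell that built the pipeline), which is what makes the two processes siblings.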

Since ps writes information about each running process to its standard output, while grep reads its standard input looking for lines that match a given pattern, you have the answer to your first question:

1. the shell runs: ps -ef
2. the shell runs: grep cron
3. ps sends data (which includes the string "grep cron") to grep
4. grep searches its STDIN for the pattern and matches the line containing "grep cron": you told grep to match the string "cron", and its own command line contains it, because grep was already running when ps listed the processes.
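The matching step can be reproduced without a live process table. This is only a sketch: sample_ps_output is a made-up helper that prints the two lines from the question in place of real ps -ef output.

```shell
# sample_ps_output is a hypothetical stand-in for `ps -ef`;
# the two lines are copied from the question.
sample_ps_output() {
  printf '%s\n' \
    'root      1036     1  0 Jul28 ?        00:00:00 cron' \
    'abc      21025 14334  0 19:15 pts/2    00:00:00 grep --color=auto cron'
}

# Both lines contain the text "cron", so grep prints both of them.
sample_ps_output | grep 'cron'
```

grep has no notion of which line describes itself; it only sees text, and the second line happens to contain the pattern.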

When you execute:

ps -ef | grep '[c]ron'

the argument instructs grep to match a "c" (the bracket expression [c] matches exactly that one character) followed by "ron", that is, the string "cron", just like the first example. This time, however, grep's own entry in the ps output no longer matches, because:

1. the shell runs: ps -ef
2. the shell runs: grep '[c]ron'
3. ps sends data (which includes the string "grep [c]ron") to grep
4. grep finds no match on its own line: that line contains the literal text "[c]ron", in which "c" is followed by "]ron" rather than "ron", so the string "cron" never appears in it.
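The same effect can be checked on canned input. In this sketch, sample_ps_output_bracket is a made-up helper, and its second line is an assumption about what ps would show for the grep process (the question's first listing with cron replaced by [c]ron).

```shell
# sample_ps_output_bracket is a hypothetical stand-in for `ps -ef`.
# The second line is assumed: it mimics how the grep process itself would be listed.
sample_ps_output_bracket() {
  printf '%s\n' \
    'root      1036     1  0 Jul28 ?        00:00:00 cron' \
    'abc      21025 14334  0 19:15 pts/2    00:00:00 grep --color=auto [c]ron'
}

# The regex [c]ron matches the text "cron", which only the first line contains.
sample_ps_output_bracket | grep '[c]ron'

# A fixed-string search (-F) for the literal text "[c]ron" finds only the grep line.
sample_ps_output_bracket | grep -F '[c]ron'
```

Quoting the pattern, as in '[c]ron', is the safer form: unquoted, the shell treats [c]ron as a glob and would expand it to cron if a file with that name exists in the current directory.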

GNU grep does not have any string matching limit, but on some platforms (I think Solaris, HP-UX, AIX) the limit of the string is given by the "$COLUMNS" variable or by the terminal's screen width.

Hopefully this long response clarifies the shell pipe process a bit.

TIP:

ps -ef | grep cron | grep -v grep
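One thing to keep in mind with this tip is that grep -v grep drops every line containing "grep", not just the line for the grep process. The sketch below uses canned input; sample_ps_with_lookalike and the /opt/tools/cron-grepper entry are invented for the illustration.

```shell
# sample_ps_with_lookalike is a hypothetical stand-in for `ps -ef`.
# The third line is an invented process whose name happens to contain "grep".
sample_ps_with_lookalike() {
  printf '%s\n' \
    'root      1036     1  0 Jul28 ?        00:00:00 cron' \
    'abc      21025 14334  0 19:15 pts/2    00:00:00 grep --color=auto cron' \
    'abc      21100     1  0 19:10 ?        00:00:00 /opt/tools/cron-grepper'
}

# The second grep removes the grep process, but also the cron-grepper line.
sample_ps_with_lookalike | grep 'cron' | grep -v 'grep'
```

With this input only the cron line survives, so the tip works well as long as no process of interest has "grep" anywhere in its ps line.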

106

answered Oct 05 '22 23:10

dAm2K

The shell constructs your pipeline with a series of fork(), pipe() and exec() calls. Depending on the shell, any part of the pipeline may be constructed first, so grep may already be running before ps even starts. Or, even if ps starts first, it writes into a kernel pipe buffer of limited size (4k, say; the exact capacity depends on the system) and will eventually block, possibly in the middle of printing a line of process output, until grep starts up and begins consuming the data in the pipe. In the latter case, if ps is able to start and finish before grep even starts, you may not see the grep cron line in the output. You may have noticed this non-determinism at play already.
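That both sides of a pipeline run concurrently, rather than one after the other, can be seen with a small sketch that does not involve ps at all. The function name pipeline_order is made up, and the one-second sleep is just an arbitrary delay to make the ordering visible.

```shell
# pipeline_order is a hypothetical helper name used only for this illustration.
# The writer waits a second before producing output; the reader announces
# itself immediately and then copies whatever arrives on the pipe.
pipeline_order() {
  { sleep 1; echo "writer: finished"; } |
  { echo "reader: already running"; cat; }
}

pipeline_order
```

The reader's message should appear first, about a second before the writer's, showing that the right-hand side of a pipe does not wait for the left-hand side to finish before it starts.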

answered Oct 06 '22 00:10

Ben Jackson
