How do you use dup2 and fork together?

Tags: fork, unix, exec, dup2

I'm taking an operating systems course and I'm having a hard time understanding how input is redirected with dup2 when you have forks. I wrote this small program to get a sense of it, but I couldn't pass the output of a grand-child to a child. I am trying to mimic the Unix command ps -A | wc -l. I'm new to Unix, but I believe this should count the lines in the list of running processes, so my output should be a single number.

#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <iostream>

using namespace std;

int main( int argc, char *argv[] )  {

    char *searchArg = argv[ 1 ];
    pid_t pid;

    if ( ( pid = fork() ) > 0 ) {
        wait( NULL );
        cout << "Inside parent" << endl;
    } 
    else if ( pid == 0 ) {
            int fd1[ 2 ];
            pipe( fd1 );
            cout << "Inside child" << endl;

            if ( pid = fork() > 0 ) {
                dup2( fd1[ 0 ], 0 );
                close( fd1[ 0 ] );
                execlp( "/bin/wc", "-l", NULL );
            }
            else if ( pid == 0 ) {
                cout << "Inside grand child" << endl;
                execlp( "/bin/ps", "-A", NULL );
            }
        }
    return 0;
}

I don't have it in the code above, but here is my guess on how things should go down:

  • We need to redirect standard output of command ps -A (whatever is usually printed to the screen, correct?) so that the wc -l command can use it to count the lines.
  • This standard output can be redirected using dup2, like dup2( ?, 1 ) which means redirect standard output to ?. Then you close ?.

Question: Where do I redirect it to? I know it should be one of the file descriptors, but where should it be redirected so that wc can process it?

  • wc somehow receives the standard output.

Question: How does wc receive the output? Through an execlp parameter? Or does the operating system check one of the file descriptors?

  • Execute wc -l.

Which one of these is closed and left open for wc to receive and process ps's output? I keep thinking this needs to be thought of backwards since ps needs to give its output to wc...but that doesn't seem to make sense since both child and grand-child are being processed in parallel.

ShrimpCrackers asked Jan 14 '12 07:01



1 Answer

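First off, let's fix your code so that it works and does a little more error checking. Note the added parentheses around pid = fork(): without them, pid is assigned the result of the comparison fork() > 0 rather than the process ID. Replace the bottom part (the else if branch) with: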

else if ( pid == 0 ) {
        int fd1[ 2 ];
        pipe( fd1 );
        cout << "Inside child" << endl;

        if ( (pid = fork()) > 0 ) {
            if (dup2( fd1[ 0 ] , 0 ) < 0) {
              cerr << "Err dup2 in child" << endl;
            }
            close( fd1[ 0 ] );
            close( fd1[ 1 ] ); // important; see below
            // Note: /usr/bin, not /bin
            execlp( "/usr/bin/wc", "wc", "-l", NULL );
            cerr << "Err execing in child" << endl;
        }
        else if ( pid == 0 ) {
            cout << "Inside grand child" << endl;
            if (dup2( fd1[ 1 ] , 1 ) < 0) {
              cerr << "Err dup2 in gchild" << endl;
            }
            close( fd1[ 0 ] );
            close( fd1[ 1 ] );
            execlp( "/bin/ps", "ps", "-A", NULL );
            cerr << "Err execing in grandchild" << endl;
        }
}
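
For reference, here is a minimal self-contained sketch of the same idea, restructured so that one parent forks both stages of the pipeline and waits for them. The helper name run_pipeline, the use of execvp (which searches PATH, so no /bin versus /usr/bin guess is needed), and the 127 exit code on exec failure are my own choices for illustration, not part of the code above.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cstdio>

// Hypothetical helper (not from the original post): runs `producer | consumer`
// and returns the consumer's exit status, or -1 on failure.
int run_pipeline(const char *const producer[], const char *const consumer[]) {
    int fd[2];
    if (pipe(fd) < 0) { perror("pipe"); return -1; }

    pid_t writer = fork();
    if (writer < 0) { perror("fork"); close(fd[0]); close(fd[1]); return -1; }
    if (writer == 0) {
        // Producer: standard output becomes the write end of the pipe.
        if (dup2(fd[1], STDOUT_FILENO) < 0) { perror("dup2"); _exit(127); }
        close(fd[0]);
        close(fd[1]);
        execvp(producer[0], const_cast<char *const *>(producer));
        perror("execvp (producer)");   // only reached if exec failed
        _exit(127);
    }

    pid_t reader = fork();
    if (reader < 0) {
        perror("fork");
        close(fd[0]);
        close(fd[1]);
        waitpid(writer, nullptr, 0);
        return -1;
    }
    if (reader == 0) {
        // Consumer: standard input becomes the read end of the pipe.
        if (dup2(fd[0], STDIN_FILENO) < 0) { perror("dup2"); _exit(127); }
        close(fd[0]);
        close(fd[1]);
        execvp(consumer[0], const_cast<char *const *>(consumer));
        perror("execvp (consumer)");   // only reached if exec failed
        _exit(127);
    }

    // The parent uses neither end; leaving fd[1] open here would keep the
    // consumer from ever seeing end-of-file.
    close(fd[0]);
    close(fd[1]);

    int status = 0;
    waitpid(writer, nullptr, 0);
    if (waitpid(reader, &status, 0) < 0) return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With this helper, ps -A | wc -l would be run by passing the arrays {"ps", "-A", nullptr} and {"wc", "-l", nullptr} from main. Forking both stages from one parent is a design choice that lets the parent wait for each of them; the child and grand-child arrangement above works as well.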

Now, your questions:

  • Question: Where do I redirect it to? I know it should be one of the file descriptors, but where should it be redirected so that wc can process it?

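    File descriptors 0, 1, and 2 are special in Unix: they are standard input, standard output, and standard error. wc reads from standard input, so it reads from whatever is duped to 0. In the code above, that is the read end of the pipe, fd1[ 0 ]; ps's standard output (1) is redirected to the write end, fd1[ 1 ].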

  • Question: How does wc receive the output? Through an execlp parameter? Or does the operating system check one of the file descriptors?

    In general, after a process has had its image replaced by exec, it still has all the file descriptors it had open before the exec (except for descriptors with the close-on-exec flag, FD_CLOEXEC, set, but ignore that for now). Therefore, if you dup2 something to 0, wc will read from it.

  • Which one of these is closed and left open for wc to receive and process ps's output?

    As shown above, you can close both ends of the pipe in both child and grand-child, and that'll be fine. In fact, standard practice recommends that you do. However, the only truly necessary close line in this specific example is the one commented "important": the one that closes the write end of the pipe in the child.

    The idea is this: both child and grand-child have both ends of the pipe open when they start. Through dup2 we've connected wc's standard input to the read end of the pipe. wc is going to keep reading from that pipe until every descriptor on the write end is closed, at which point it sees end-of-file and stops. In the grand-child, we can get away with not closing anything, because ps -A only writes to descriptor 1, and when it finishes printing the process list it exits, which closes everything it had open. In the child, we don't really need to close the read descriptor stored in fd1[ 0 ], because wc isn't going to read from anything but descriptor 0. However, we do need to close the write end of the pipe in the child: the child becomes wc, so otherwise wc itself would hold the write end open and would wait forever on a pipe that is never completely closed.

    As you can see, the reasoning for why we didn't really need any of the close lines except the one marked "important" depends on the details of how wc and ps behave. The standard practice is therefore to close every descriptor for the end of the pipe you aren't using, and to keep exactly one descriptor open for the end you are using. Since you're using dup2 in both processes, that means four close statements, as above.
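
The end-of-file rule described above can be seen in a small sketch where the reading side is our own loop instead of wc. This is an illustration under my own assumptions: the helper name count_output_lines is hypothetical, and the parent plays the role that wc -l plays in the question.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper (not from the original post): runs argv as a child
// process and returns how many newline characters it writes to standard
// output, or -1 on failure.
long count_output_lines(const char *const argv[]) {
    int fd[2];
    if (pipe(fd) < 0) return -1;

    pid_t pid = fork();
    if (pid < 0) { close(fd[0]); close(fd[1]); return -1; }
    if (pid == 0) {
        // Writer: send standard output into the pipe, then drop the originals.
        if (dup2(fd[1], STDOUT_FILENO) < 0) _exit(127);
        close(fd[0]);
        close(fd[1]);
        execvp(argv[0], const_cast<char *const *>(argv));
        _exit(127);   // only reached if exec failed
    }

    // The "important" close: while this process still holds the write end,
    // read() below can never return 0 (end-of-file), even after the child
    // has exited.
    close(fd[1]);

    long lines = 0;
    char buf[4096];
    ssize_t n;
    while ((n = read(fd[0], buf, sizeof buf)) > 0) {
        for (ssize_t i = 0; i < n; ++i) {
            if (buf[i] == '\n') ++lines;
        }
    }
    close(fd[0]);
    waitpid(pid, nullptr, 0);
    return n < 0 ? -1 : lines;
}
```

Calling it with {"ps", "-A", nullptr} should give roughly what ps -A | wc -l prints. Commenting out the close( fd[1] ) in the parent makes the read loop block forever, which is the same hang wc shows when the child keeps the write end open.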

EDIT: Updated the arguments to execlp.

Daniel Martin answered Oct 03 '22 17:10