
How does a pipe work in Linux?

People also ask

What is a pipe in Linux, with examples?

Pipes are a powerful tool in the open-source Linux operating system. For example, Linux pipes let you run a series of commands over the same dataset, moving data efficiently from one command to the next.

How does a pipe work in an OS?

In computer programming, especially in UNIX operating systems, a pipe is a technique for passing information from one process to another. Unlike some other forms of interprocess communication (IPC), a pipe carries data in one direction only.

How do Linux pipes work under the hood?

Piping is one of the core concepts of Linux and other Unix-based operating systems. Pipes let you chain commands together in a very elegant way, passing the output of one program to the input of another to get the desired end result.


If you want to redirect the output of one program into the input of another, just use a simple pipeline:

program1 arg arg | program2 arg arg
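As a concrete illustration, here is a minimal sketch of such a pipeline built from standard utilities. The three input words and the variable name most_common are made up for the example; each stage reads the previous stage's output on its standard input.

```shell
# Find the most frequent line in a small, made-up input.
# sort groups identical lines, uniq -c counts each group,
# sort -rn puts the largest count first, and awk keeps only the
# word from the first line.
most_common=$(printf 'pear\napple\npear\n' | sort | uniq -c | sort -rn | awk 'NR == 1 { print $2 }')
echo "$most_common"
```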

If you want to save the output of program1 into a file and pipe it into program2, you can use tee(1):

program1 arg arg | tee output-file | program2 arg arg
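A small sketch of the tee(1) pattern, assuming a temporary file stands in for output-file and printf and wc stand in for program1 and program2 (all names here are illustrative):

```shell
# tee writes a copy of its input to the file and also passes it along
# the pipeline, so the file and wc both see the same three lines.
tmpfile=$(mktemp)
count=$(printf 'a\nb\nc\n' | tee "$tmpfile" | wc -l | tr -d ' ')
saved=$(cat "$tmpfile")
rm -f "$tmpfile"
echo "lines counted: $count"
```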

All programs in a pipeline run simultaneously. Most programs use blocking I/O: if they try to read their input and nothing is there, they block. That is, they stop, and the operating system de-schedules them until more input becomes available (to avoid eating up the CPU). Similarly, if a program earlier in the pipeline writes data faster than a later program can read it, the pipe's buffer eventually fills up and the writer blocks: the OS de-schedules it until the reader drains the pipe's buffer, and then it can continue writing.
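The blocking read can be observed with a rough sketch like the following. The one-second delay, the message text, and the variable names are arbitrary choices for the illustration.

```shell
# The reader starts at the same time as the writer and blocks in `read`
# until the writer produces a line, about one second later.
start=$(date +%s)
first_line=$( { sleep 1; echo "late data"; } | { read -r line; echo "$line"; } )
elapsed=$(( $(date +%s) - start ))
echo "got '$first_line' after about ${elapsed}s"
```

While it waits, the reader uses no CPU time; the kernel wakes it only when data arrives in the pipe.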


EDIT

If you want to use the output of program1 as the command-line arguments of program2, you can use backquotes or the $() syntax:

# Runs "program1 arg", and uses the output as the command-line arguments for
# program2
program2 `program1 arg`

# Same as above
program2 $(program1 arg)

The $() syntax should be preferred, since it is clearer and can be nested.
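To show what nesting looks like in practice, here is a short sketch using basename and dirname on an example path (the path and variable names are only illustrative). The backquote version needs its inner backquotes escaped, which is why it gets hard to read quickly.

```shell
# Nested $(): the inner command runs first, and its output becomes
# the argument of the outer command.
parent_name=$(basename "$(dirname /usr/local/bin)")

# The same thing with backquotes requires escaping the inner pair.
parent_name_bq=`basename \`dirname /usr/local/bin\``

echo "$parent_name $parent_name_bq"
```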


Piping does not complete the first command before running the second. Unix (and Linux) piping runs all commands concurrently. A command will be suspended if:

  • It is starved for input.

  • It has produced significantly more output than its successor is ready to consume.
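One way to convince yourself that the commands run concurrently is a pipeline whose first command never finishes on its own. This is a sketch; the variable name is made up.

```shell
# `yes` prints "y" forever. If the shell waited for it to finish before
# starting `head`, this line would never return. Because both run
# concurrently, `head` exits after three lines and `yes` is then stopped
# when it tries to write to the closed pipe.
# The `|| true` only matters in shells running with `set -o pipefail`,
# where the early-stopped `yes` would otherwise count as a failure.
first_three=$(yes | head -n 3 || true)
echo "$first_three"
```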

For most programs output is buffered, which means that the program's standard I/O library accumulates a substantial amount of output (often a few kilobytes) before passing it on to the next stage of the pipeline. This buffering avoids too much switching back and forth between the process and the kernel.

If you want output on a pipeline to be sent right away, you can flush or disable the buffering, which in C means calling something like fflush() to be sure that any buffered output is immediately sent on to the next process, or setvbuf() to turn buffering off for the stream. Unbuffered input is also possible but is generally unnecessary, because a process that is starved for input typically does not wait for a full buffer but processes whatever input it can get.

For typical applications unbuffered output is not recommended; you generally get the best performance with the defaults. In your case, however, where you want to do dynamic graphing as soon as the first process has the information available, you definitely want unbuffered output. If you're using C, calling fflush(stdout) whenever you want output sent will be sufficient.
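For programs you cannot modify, GNU coreutils provides stdbuf(1), which asks a command that relies on the default C stdio buffering to use a different mode. The sketch below is only an illustration and not part of the original answer: the helper name line_buffered is made up, and it falls back to running the command unchanged where stdbuf is not installed. Note that stdbuf has no effect on programs that set up their own buffering.

```shell
# Run a command with line-buffered stdout when stdbuf is available,
# otherwise run it with its default buffering.
line_buffered() {
    if command -v stdbuf >/dev/null 2>&1; then
        stdbuf -oL "$@"
    else
        "$@"
    fi
}

# The output is the same either way; only the timing of when each line
# reaches the next stage of the pipeline can differ.
upper=$(printf 'a\nb\n' | line_buffered tr 'a-z' 'A-Z' | cat)
echo "$upper"
```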


If your programs are communicating using stdin and stdout, then make sure that you either call fflush(stdout) after you write or find some way to disable standard I/O buffering. The best references I can think of that really describe how to implement pipelines in C/C++ are Advanced Programming in the UNIX Environment and UNIX Network Programming: Volume 2. You could probably start with this article as well.


If your two programs insist on reading from and writing to files and do not use stdin/stdout, you may find you can use a named pipe instead of a file.

Create a named pipe with the mknod(1) command (mkfifo(1) does the same job):

$ mknod /tmp/named-pipe p

Then configure your programs to read and write to /tmp/named-pipe (use whatever path/name you feel is appropriate).

In this case, both programs will run in parallel, blocking as necessary when the pipe becomes full/empty as described in the other answers.
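A minimal end-to-end sketch of the named-pipe approach follows. It uses mkfifo(1), which creates the same kind of FIFO as `mknod ... p`, and places the pipe in a temporary directory rather than at /tmp/named-pipe; printf and cat stand in for the two programs, and all names are illustrative.

```shell
# Create a FIFO in a private temporary directory.
fifo_dir=$(mktemp -d)
fifo="$fifo_dir/named-pipe"
mkfifo "$fifo"

# The writer runs in the background. Opening the FIFO for writing
# blocks until some process opens it for reading.
printf 'hello via fifo\n' > "$fifo" &

# The reader opens the FIFO, which unblocks the writer, and reads
# until the writer closes its end.
received=$(cat "$fifo")
wait
rm -rf "$fifo_dir"
echo "$received"
```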