
Can a pipe in Linux ever lose data?

Tags: linux, posix, pipe

And is there an upper limit on how much data it can contain?

asked Apr 26 '10 17:04 by Abhijeet Rastogi


People also ask

What happens when a pipe is full Linux?

It is not possible to apply lseek(2) to a pipe. A pipe has a limited capacity. If the pipe is full, then a write(2) will block or fail, depending on whether the O_NONBLOCK flag is set. Different implementations have different limits for the pipe capacity.

What is the purpose of pipe in Linux?

Pipe is used to combine two or more commands, and in this, the output of one command acts as input to another command, and this command's output may act as input to the next command and so on. It can also be visualized as a temporary connection between two or more commands/ programs/ processes.

How does pipeline work in Linux?

In Unix-like computer operating systems, a pipeline is a mechanism for inter-process communication using message passing. A pipeline is a set of processes chained together by their standard streams, so that the output text of each process (stdout) is passed directly as input (stdin) to the next one.

Are Linux pipes buffered?

Pipes provide asynchronous execution of commands using buffered I/O routines. Thus, all the commands in the pipeline operate in parallel, each in its own process. Since kernel version 2.6.11 the size of the buffer is 65536 bytes (64K); in older kernels it is equal to the system page size.


2 Answers

Barring a machine crash, no, a pipe can't lose data. It's easy to misuse one and think you're losing data, however: either a write wrote less than you requested and you didn't check the return value, or you did something wrong with the read.
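As a rough sketch of the "check the return value" advice (the helper name write_all is made up for illustration, and Python is used only because the thread itself has no code), a writer can loop until every byte has been accepted:

```python
import os


def write_all(fd: int, data: bytes) -> int:
    """Write every byte of data to fd, retrying after short writes.

    write_all is an illustrative name, not a standard library function.
    """
    view = memoryview(data)
    total = 0
    while total < len(view):
        # os.write returns the number of bytes actually written,
        # which may be fewer than the number requested.
        total += os.write(fd, view[total:])
    return total
```

Calling write_all(w, payload) on the write end of an os.pipe() returns len(payload) once everything has been handed to the kernel. It assumes a blocking descriptor; in non-blocking mode a full pipe would raise BlockingIOError instead.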

The maximum amount of data a pipe can hold is system dependent. If you try to write more than that, you'll either get a short write or the writer will block until space is available. The pipe(7) man page contains lots of useful info about pipes, including (on Linux at least) how big the buffer is: one page (4K on most architectures) before Linux 2.6.11, and 16 pages (64K with 4K pages) since then.
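If you'd rather measure the capacity than look it up, something like the following should work. It is only a sketch: both function names are invented here, and fcntl.F_GETPIPE_SZ is assumed to be available only on Linux with Python 3.10 or later, so the code falls back to None elsewhere.

```python
import fcntl
import os


def measure_pipe_capacity() -> int:
    """Fill a fresh pipe with non-blocking writes and count the bytes accepted."""
    r, w = os.pipe()
    try:
        os.set_blocking(w, False)  # a full pipe now raises instead of blocking
        total = 0
        # Fill with large chunks first, then top up byte by byte.
        for size in (4096, 1):
            chunk = b"\0" * size
            while True:
                try:
                    total += os.write(w, chunk)
                except BlockingIOError:  # EAGAIN: no room for this chunk
                    break
        return total
    finally:
        os.close(r)
        os.close(w)


def reported_pipe_capacity():
    """Ask the kernel directly; returns None where F_GETPIPE_SZ is missing."""
    op = getattr(fcntl, "F_GETPIPE_SZ", None)
    if op is None:
        return None
    r, w = os.pipe()
    try:
        return fcntl.fcntl(w, op)
    finally:
        os.close(r)
        os.close(w)
```

On a typical modern Linux box both functions are expected to report the 64K figure from the man page, but treat that as something to verify on your own system rather than a guarantee.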

edit

Tim mentions SIGPIPE, which is another potential issue that can seem to lose data. If the reader closes the pipe before reading everything in it, the unread data is thrown away, and the writer gets a SIGPIPE signal the next time it writes to the pipe, indicating that this has occurred. If the writer blocks or ignores SIGPIPE, the write fails with an EPIPE error instead. Closing the write end without writing again triggers neither. This covers the situation Paul mentioned.
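A small sketch of the EPIPE case, with SIGPIPE ignored so the failure comes back as an error rather than a signal. The function name is hypothetical, and it assumes it is called from the main thread, since that is where Python allows signal handlers to be changed:

```python
import errno
import os
import signal


def write_after_reader_closed() -> int:
    """Return the errno seen when writing to a pipe whose read end is closed."""
    # Ignore SIGPIPE so the failed write reports EPIPE instead of killing us.
    # (CPython already does this at startup; it is repeated here for clarity.)
    previous = signal.signal(signal.SIGPIPE, signal.SIG_IGN)
    r, w = os.pipe()
    try:
        os.write(w, b"unread data")  # succeeds: a reader still exists
        os.close(r)                  # reader goes away; buffered bytes are discarded
        try:
            os.write(w, b"more")
        except BrokenPipeError as exc:  # OSError subclass raised for EPIPE
            return exc.errno
        return 0
    finally:
        os.close(w)
        if previous is not None:
            signal.signal(signal.SIGPIPE, previous)
```

The first write succeeds even though nobody ever reads it; only the second write, made after the reader is gone, reports the problem.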

PIPE_BUF is a constant that tells you the limit of atomic writes to the buffer. Any write this size or smaller will either succeed completely or block until it can succeed completely (or give EWOULDBLOCK/EAGAIN if the pipe is in non-blocking mode). It has no relation to the actual size of the kernel's pipe buffer, though obviously the buffer must be at least PIPE_BUF in size to meet the atomicity guarantee.
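To see the all-or-nothing behaviour in non-blocking mode, one can fill a pipe with fixed-size records no larger than PIPE_BUF and check that no write is ever cut short. This is a sketch under assumptions: fill_with_records and its 500-byte default record size are made up for the example, and select.PIPE_BUF is used as Python's view of the constant.

```python
import os
import select


def fill_with_records(record_size: int = 500):
    """Write fixed-size records to a non-blocking pipe until it is full.

    Returns (full_writes, short_writes, bytes_read_back).
    """
    if record_size > select.PIPE_BUF:
        raise ValueError("record_size must not exceed PIPE_BUF")
    r, w = os.pipe()
    try:
        os.set_blocking(w, False)
        os.set_blocking(r, False)
        record = b"x" * record_size
        full = short = 0
        while True:
            try:
                n = os.write(w, record)
            except BlockingIOError:  # EAGAIN: no room for a whole record
                break
            if n == record_size:
                full += 1
            else:
                short += 1           # would violate the atomicity guarantee
        drained = 0
        while True:
            try:
                data = os.read(r, 65536)
            except BlockingIOError:  # pipe is empty again
                break
            if not data:
                break
            drained += len(data)
        return full, short, drained
    finally:
        os.close(r)
        os.close(w)
```

If the guarantee holds, short_writes stays at zero and the bytes read back are an exact multiple of the record size, even though the pipe may still have had a little free space when the last write was refused.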

answered Oct 04 '22 19:10 by Chris Dodd


Data can be lost in a pipe when the following happens:

  1. A process (the writer) writes n bytes of data to the pipe, where n≤PIPE_BUF. This write is guaranteed to be atomic, and it does not block as long as the pipe has room for all n bytes.
  2. A process (the reader) reads only m<n bytes of data and exits.
  3. The writer doesn’t attempt to write to the pipe again.

As a result, the kernel pipe buffer will contain n-m bytes, which are lost once all file descriptors for the pipe have been closed. The writer will not see SIGPIPE or EPIPE since it never attempts to write to the pipe again. Since the writer won’t ever learn that the pipe contains leftover data that will simply disappear, one can consider this data lost.

A non-standard way of detecting this would be for the writer to define a timeout and call the FIONREAD ioctl to determine the number of bytes left in the pipe buffer.
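Here is a sketch of the FIONREAD check applied to the scenario above. It assumes Linux, where the ioctl is accepted on either end of a pipe; both function names are illustrative, and the read end is kept open for simplicity rather than having a separate reader process exit.

```python
import fcntl
import os
import struct
import termios


def unread_bytes(fd: int) -> int:
    """Return the number of bytes still buffered in the pipe behind fd."""
    # FIONREAD fills in a C int with the count of unread bytes.
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def leftover_after_partial_read(n: int = 100, m: int = 40) -> int:
    """Write n bytes, read only m of them, and report what is left behind."""
    r, w = os.pipe()
    try:
        os.write(w, b"x" * n)   # step 1: n <= PIPE_BUF, so the write is atomic
        os.read(r, m)           # step 2: the reader consumes only m bytes
        return unread_bytes(w)  # queried on the writer's own descriptor
    finally:
        os.close(r)
        os.close(w)
```

With n=100 and m=40 the writer should see 60 bytes still sitting in the buffer. A real writer would poll this after its timeout and treat a non-zero result as undelivered data.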

answered Oct 04 '22 18:10 by eigengrau