I've never used a named pipe before and recently realized that it is just what I need.
I'm running a program using GNU parallel which could produce tons (GBs to 1 TB, hard to know right now) of output formatted for loading into a MySQL database.
I figured out that I can open two terminals: Terminal 1 gets something like:
find . -type f -name "*.h" | parallel --jobs 12 'cprogram {}' > /home/pipe
where pipe is a FIFO made with mkfifo.
On a second terminal, I run a command similar to this:
mysql DataBaseName -e "LOAD DATA LOCAL INFILE '/home/pipe' INTO TABLE tableName";
It works...
But this is janky... If I understand correctly, there's an EOF generated when the first process ends, causing the pipe to close.
Ideally I want to run the first process in a loop with varying parameters. Each iteration could take a long time, and I need to make sanity checks so I don't lose a week only to find out I've got bugs or faulty logic.
I'd like to know how to use a FIFO for this kind of procedure in a standard way.
"If I understand correctly, there's an EOF generated when the first process ends causing the pipe to close."
Sort of. There's a little bit more to it than that - it is technically incorrect to say that the pipe closes as soon as the first process ends.
Instead, pipes and FIFOs return EOF when there is no more data left in the pipe and it is not opened for writing by any process.
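To make that rule concrete, here is a minimal sketch of what happens without any persistent writer. It assumes a throwaway FIFO in a temporary directory and uses a plain cat as a stand-in for the real reader; neither is part of the original setup.

```shell
#!/bin/sh
# Sketch: the reader sees EOF once the FIFO is empty and no process
# has it open for writing any more.
dir=$(mktemp -d)          # scratch directory, used only for this demo
fifo="$dir/pipe"
mkfifo "$fifo"

# Stand-in reader: cat exits as soon as it sees EOF on the FIFO.
cat "$fifo" > "$dir/out.txt" &
reader=$!

# First (and only) writer: opens the FIFO, writes one line, then closes it.
echo "iteration 1" > "$fifo"

# No writer is left, so the reader drains the data, hits EOF and exits.
wait "$reader"
reader_status=$?
received=$(cat "$dir/out.txt")
echo "reader exited with status $reader_status after receiving: $received"

rm -rf "$dir"
```

A second iteration would find no reader left, which is exactly the problem described in the question.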
Usually, this is solved by having the reader process open the FIFO both for reading and for writing, even though it will never write - for example, a server that accepts local clients by reading from a FIFO could open the FIFO for reading and writing so that when there are no active clients the server doesn't have to deal with the special case of EOF. This is the "standard" way to deal with it, as outlined in Advanced Programming in the UNIX Environment in the chapter about IPC mechanisms.
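As a rough illustration of the read-write trick, the sketch below has the reading side open the FIFO with `<>` on file descriptor 3. It assumes Linux behaviour: POSIX leaves opening a FIFO for both reading and writing undefined, so treat this as a demonstration rather than a portable recipe. The temporary path and the "client" messages are made up for the example.

```shell
#!/bin/sh
# Sketch: a reader that opens the FIFO read-write never sees EOF when clients leave.
dir=$(mktemp -d)          # scratch directory, used only for this demo
fifo="$dir/pipe"
mkfifo "$fifo"

# Open for reading AND writing; on Linux this does not block,
# and the shell itself now counts as a writer.
exec 3<> "$fifo"

# Two short-lived "clients" open the FIFO, write one line each, and close it.
echo "client 1" > "$fifo"
echo "client 2" > "$fifo"

# Both lines are still in the pipe buffer, and no EOF was generated in between.
IFS= read -r first <&3
IFS= read -r second <&3
echo "got: $first / $second"

exec 3>&-                 # close the descriptor when done
rm -rf "$dir"
```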
In your case though, this is really not possible, because you have no permanent process that keeps running (that is, you don't have the equivalent of a server process). You basically need some sort of "persistent writer", i.e., a process that keeps the pipe open for writing across the different iterations.
One solution I can think of is to cat standard input to the FIFO in the background. This ensures that cat opens the FIFO for writing, so there is always an active writer, but by keeping it in the background, you don't actually feed it any input and it never writes to the FIFO. Just be aware that the job will be stopped (but not terminated) as soon as cat attempts to read from stdin: processes running in a background process group are usually sent SIGTTIN and stopped when they attempt to read from the terminal, because only the foreground process group is allowed to read from it. Anyway, as long as you don't feed it any input, you're good - the process is in a stopped state, but the FIFO is still open for writing nonetheless. You'll never see an EOF on the pipe as long as the background job is not terminated.
So, in short, you create the FIFO:
mkfifo /home/pipe
start the placeholder writer in the background:
cat >/home/pipe &
run your iterations, and when you are done, terminate cat by either bringing it to the foreground and sending it SIGINT (usually, Ctrl+C) or with kill PID.
Note that by doing this the reader process (mysql in this case) will never know when the input is over. It will always block for more input until you kill the background cat; only then does it see EOF and finish.
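The same idea can be sketched for a script rather than two interactive terminals. This is a variation, not the procedure above: in a non-interactive shell a background cat gets /dev/null as stdin and exits immediately, so the sketch lets the shell itself hold a write descriptor open instead. The temporary FIFO, the produce function (a stand-in for one find | parallel run) and the cat reader (a stand-in for mysql) are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: several writer iterations feeding one long-lived reader through a FIFO.
dir=$(mktemp -d)          # scratch directory, used only for this demo
fifo="$dir/pipe"
mkfifo "$fifo"

# Stand-in for the mysql reader; it keeps running until it sees EOF.
cat "$fifo" > "$dir/loaded.txt" &
reader=$!

# Persistent writer: the shell keeps fd 3 open for writing and never writes to it.
# This open blocks until the reader above has opened the FIFO.
exec 3> "$fifo"

# Hypothetical stand-in for one run of the real producer.
produce() {
    printf 'row for parameter %s\n' "$1"
}

for param in 1 2 3; do
    # Each iteration opens and closes the FIFO; fd 3 prevents an EOF in between.
    produce "$param" > "$fifo"
    # Sanity checks between iterations would go here.
done

# Closing the last write descriptor is what finally delivers EOF to the reader.
exec 3>&-
wait "$reader"

rows=$(wc -l < "$dir/loaded.txt" | tr -d ' ')
echo "reader finished after $rows rows"
rm -rf "$dir"
```

Closing fd 3 plays the role of killing the background cat: until that happens, the reader keeps waiting for more input.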