(via https://stackoverflow.com/a/8624829/23582) How does <code>(head; tail) < file</code> work? Note that <code>cat file | (head;tail)</code> doesn't. Also, why does <code>(head; wc -l) < file</code> give <code>0</code> for the output of <code>wc</code>? Note: I understand how head and tail work. Just not the subtleties involved with these particular invocations.


How does "(head; tail) < file" work?


2 Answers

OS X

For OS X, you can look at the source code for head and the source code for tail to figure out some of what's going on. In the case of tail, you'll want to look at forward.c.

So, it turns out that head doesn't do anything special. It just reads its input using the stdio library, so it reads a buffer at a time and might read too much. This means cat file | (head; tail) won't work for small files where head's buffering makes it read some (or all) of the last 10 lines.

On the other hand, tail checks the type of its input file. If it's a regular file, tail seeks to the end and reads backwards until it finds enough lines to emit. This is why (head; tail) < file works on any regular file, regardless of size.

Linux

You could look at the source for head and tail on Linux too, but it's easier to just use strace, like this:

(strace -o /tmp/head.trace head; strace -o /tmp/tail.trace tail) < file

Take a look at /tmp/head.trace. You'll see that the head command tries to fill a buffer (of 8192 bytes in my test) by reading from standard input (file descriptor 0). Depending on the size of file, it may or may not fill the buffer. Anyway, let's assume that it reads 10 lines in that first read. Then, it uses lseek to back up the file descriptor to the end of the 10th line, essentially “unreading” any extra bytes it read. This works because the file descriptor is open on a normal, seekable file. So (head; tail) < file will work for any seekable file, but it won't make cat file | (head; tail) work.
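If you don't want to read traces, the same difference can be seen with a minimal sketch like the one below. The temporary file, the 20-line input, and the variable names `from_file` and `from_pipe` are my own choices for illustration, not part of the original test. It throws away the output of `head` and counts how many lines are still left on standard input for the next command.

```shell
# Build a small 20-line input file (hypothetical test data).
tmp=$(mktemp)
seq 1 20 > "$tmp"

# Case 1: stdin is the file itself, so it is seekable.
# head consumes 3 lines; wc -l counts whatever is left on the same descriptor.
from_file=$( (head -n 3 > /dev/null; wc -l) < "$tmp" | tr -d ' ')

# Case 2: stdin is a pipe, which cannot be seeked.
# Anything head read beyond the 3 lines is gone for wc.
from_pipe=$(cat "$tmp" | (head -n 3 > /dev/null; wc -l) | tr -d ' ')

echo "lines left after head, redirected file: $from_file"
echo "lines left after head, pipe:            $from_pipe"

rm -f "$tmp"
```

With a `head` that seeks back as described above, I'd expect 17 for the redirected file. For the pipe the count is typically 0 with a file this small, because `head` swallows the whole input in its first read. With a `head` that doesn't seek back, both counts should be 0.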

On the other hand, tail does not (in my testing) seek to the end and read backwards, like it does on OS X. At least, it doesn't read all the way back to the beginning of the file.

Here's my test. Create a small, 12-line input file:

yes | head -n 12 | cat -n > /tmp/file

Then, try (head; tail) < /tmp/file on Linux. I get this with GNU coreutils 5.97:

     1  y
     2  y
     3  y
     4  y
     5  y
     6  y
     7  y
     8  y
     9  y
    10  y
    11  y
    12  y

But on OS X, I get this:

     1  y
     2  y
     3  y
     4  y
     5  y
     6  y
     7  y
     8  y
     9  y
    10  y
     3  y
     4  y
     5  y
     6  y
     7  y
     8  y
     9  y
    10  y
    11  y
    12  y
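To see which command produced which lines, a rough sketch is to send the output of `head` and `tail` to separate files and count each one. The temporary files and the names `head_out`, `tail_out`, `head_lines`, and `tail_lines` are assumptions for illustration. The input is a 12-line file of numbers built with `seq`, which has the same shape as the test file above.

```shell
# 12-line input, same shape as the test file above (hypothetical temp files).
tmp=$(mktemp)
head_out=$(mktemp)
tail_out=$(mktemp)
seq 1 12 > "$tmp"

# Same invocation as in the test, but each command writes to its own file.
(head > "$head_out"; tail > "$tail_out") < "$tmp"

head_lines=$(wc -l < "$head_out" | tr -d ' ')
tail_lines=$(wc -l < "$tail_out" | tr -d ' ')
echo "head printed $head_lines lines, tail printed $tail_lines lines"

rm -f "$tmp" "$head_out" "$tail_out"
```

Going by the outputs above, I'd expect 10 and 2 with GNU coreutils (`tail` only sees what `head` left behind) and 10 and 10 on OS X (`tail` seeks from the end of the file, so lines 3 to 10 show up twice).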

118

answered Sep 23 '22 11:09

rob mayoff

The parentheses here create a subshell, which is another instance of the interpreter that runs the commands inside. What is interesting is that the subshell acts as a single stdin/stdout combo: head and tail both inherit the subshell's stdin and stdout. head runs first and echoes the first 10 lines. Then tail runs on the same stdin, consumes the rest, and writes the last 10 lines to stdout. Because both commands write to the subshell's stdout, the output appears combined.

It's worth mentioning that the same effect can be achieved with command grouping, like { head; tail; } < file, which is cheaper because it doesn't create another instance of bash.
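Here is a small sketch of both points: commands in a group share one stdin and one stdout, and parentheses fork a subshell while braces don't. It uses the shell's `read` builtin instead of head and tail as a stand-in, because `read` takes exactly one line and so the result doesn't depend on buffering. The temporary file and all variable names are made up for the example.

```shell
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# Both reads share the group's stdin: the second continues where the first stopped.
{ read -r a; read -r b; } < "$tmp"
echo "$a $b"

# Both echos write to the subshell's stdout, so the pipe sees one combined stream.
n=$( (echo x; echo y) | wc -l | tr -d ' ')
echo "$n"

# Parentheses run in a separate shell process, so the assignment is lost.
v=outer
( v=subshell )
after_subshell=$v

# Braces run in the current shell, so the assignment sticks.
{ v=group; }
after_group=$v

echo "$after_subshell $after_group"
rm -f "$tmp"
```

The variable test at the end is just a cheap way to see that `( ... )` really is a separate shell while `{ ...; }` is not.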

answered Sep 22 '22 11:09

Samus_


Tags:

bash

shell

zellyn
