
Bash: Head & Tail behavior with bash script

Suppose I have following script:-

test.sh

    #!/bin/bash
    command1  # prints 5 lines
    command2  # prints 3 lines

I run the script with test.sh|head -n5

What will happen in this case? Will it run both commands, or will it stop after command1? What if I call it with -n1?

Background: I might be asking a very basic question, but I actually noticed something interesting. My script (a different one) was processing 7,000 files, and each file produces 1 line of output. The script takes 7 minutes to run to completion, but piping it to head -n1 gave me the prompt back immediately, as if the script had terminated after processing only the first file.

Edit: Following is my script

    for i in $(ls filepath); do
        echo "$i"  # issue here
        python mySript "$i" > "/home/user/output/$i.out"
    done

Removing the echo above lets the script run for the full 7 minutes with head -n1, but with the echo it just prints the first line and then exits.

Mangat Rai Modi asked Oct 20 '14 08:10


2 Answers

This is a fairly interesting issue! Thanks for posting it!

I assumed this happens because head exits after processing the first few lines, so a SIGPIPE signal is sent to the bash running the script the next time it tries to echo $x. I used RedX's script to test this theory:

    #!/usr/bin/bash
    rm x.log
    for((x=0;x<5;++x)); do
        echo $x
        echo $x>>x.log
    done

This works as you described! Using t.sh|head -n 2, it writes only 2 lines to the screen and to x.log. But trapping SIGPIPE changes this behavior...

    #!/usr/bin/bash
    trap "echo SIGPIPE>&2" PIPE
    rm x.log
    for((x=0;x<5;++x)); do
        echo $x
        echo $x>>x.log
    done

Output:

    $ ./t.sh |head -n 2
    0
    1
    ./t.sh: line 5: echo: write error: Broken pipe
    SIGPIPE
    ./t.sh: line 5: echo: write error: Broken pipe
    SIGPIPE
    ./t.sh: line 5: echo: write error: Broken pipe
    SIGPIPE

The write error occurs because the read end of the pipe was closed when head exited. Any attempt to write to a pipe with no reader raises a SIGPIPE signal, which terminates the program by default (see man 7 signal). With the trap installed, bash survives the signal, the write fails with "Broken pipe" instead, and the loop carries on, so x.log now contains 5 lines.
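If the goal is for the loop to keep doing its work after head has gone away (as in the question, where every file should still be processed), one option is to ignore SIGPIPE instead of trapping it. A minimal sketch, assuming bash; the produce function and the temporary log file are made-up names for illustration only:

```shell
#!/bin/bash
# Hypothetical demo: produce and log are illustrative names, not from the question.
log=$(mktemp)

produce() {
    # Ignore SIGPIPE. This runs in the pipeline's subshell, so the
    # setting does not leak into the rest of the script.
    trap '' PIPE
    for ((x = 0; x < 5; ++x)); do
        # After head exits, this echo may fail with "Broken pipe";
        # hide the message and carry on.
        echo "$x" 2>/dev/null || true
        # The side effect still happens on every iteration.
        echo "$x" >> "$log"
    done
}

produce | head -n 1
lines=$(wc -l < "$log")
echo "log lines: $lines"
rm -f "$log"
```

head prints only the first line, yet the log should still receive all five. In the question's script, redirecting the echo to stderr (or dropping it) would avoid the problem in the same way.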

This also explains why /bin/echo solved the problem. See the following script:

    rm x.log
    for((x=0;x<5;++x)); do
        /bin/echo $x
        echo "Ret: $?">&2
        echo $x>>x.log
    done

Output:

    $ ./t.sh |head -n 2
    0
    Ret: 0
    1
    Ret: 0
    Ret: 141
    Ret: 141
    Ret: 141

Decimal 141 = hex 8D. Hex 80 (128) means the process was terminated by a signal, and hex 0D (13) is the signal number of SIGPIPE. So when /bin/echo tried to write to stdout, it was /bin/echo that received the SIGPIPE and was terminated (the default behavior), not the bash running the script.
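The arithmetic above can be wrapped in a small helper. This is only a sketch: explain_status is a made-up name, and it assumes the usual shell convention that a status above 128 means "terminated by signal status-128" (a program could also choose to exit with such a value on its own).

```shell
#!/bin/bash
# Hypothetical helper: explain_status is an illustrative name, not a standard command.
explain_status() {
    local status=$1
    if [ "$status" -gt 128 ]; then
        # Shell convention: 128 + N means "terminated by signal N".
        local sig=$((status - 128))
        # kill -l N translates a signal number into its name (without the SIG prefix).
        echo "killed by signal $sig (SIG$(kill -l "$sig"))"
    else
        echo "exited normally with status $status"
    fi
}

explain_status 141   # prints: killed by signal 13 (SIGPIPE)
explain_status 0     # prints: exited normally with status 0
```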

TrueY answered Oct 14 '22 20:10


Nice finding. According to my tests, it's exactly like you said. For example, I have this script that just eats CPU, so that we can spot it in top:

    for i in `seq 10`
    do
        echo $i
        x=`seq 10000000`
    done

Piping the script to head -n1, we see the command return after the first line. This is how head behaves: it has completed its work, so it can exit and return control to you.

The script itself should keep running, but look what happens: once head exits, the read end of the pipe is closed. So when the script next tries to write its output to the pipe, the kernel finds no reader and sends the script a SIGPIPE, which kills it.
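One way to check how each side of the pipe ended is bash's PIPESTATUS array, which holds the exit status of every command in the last pipeline. A rough sketch, assuming bash and a seq command on the PATH; seq merely stands in for the script here:

```shell
#!/bin/bash
# seq stands in for the script: it produces far more output than a pipe buffer holds,
# so it is still writing when head exits.
seq 1 1000000 | head -n 1 > /dev/null

# Copy PIPESTATUS right away; the next command overwrites it.
statuses=("${PIPESTATUS[@]}")

echo "producer (seq):  ${statuses[0]}"   # typically 141 = 128 + 13 (SIGPIPE)
echo "consumer (head): ${statuses[1]}"
```

If SIGPIPE happens to be ignored in the environment that launched the shell, the producer reports an ordinary write error instead of status 141.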

Let's try it with a Python 2 script:

    for i in xrange(10):
        print i
        range(10000000)

When running it and piping to head you have this:

    $ python -u test.py | head -n1
    0
    Traceback (most recent call last):
      File "test.py", line 2, in <module>
        print i
    IOError: [Errno 32] Broken pipe

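The -u option makes Python's stdout unbuffered, so each print is written to the pipe immediately, as bash's echo does. Python ignores SIGPIPE by default, so the failed write shows up as an IOError rather than a silent kill. So you see that the program actually stops with an error.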

enrico.bacis answered Oct 14 '22 18:10