I would like to know the difference between the two commands below. I understand that 2) should be used, but I want to know the exact sequence of events in 1) and 2). Suppose filename has 200 characters in it.
1) cat filename | grep regex
2) grep regex filename
grep -c ^b example prints the number (count) of lines matching the pattern, while grep ^b example prints the lines themselves. Given a pipe and no file names, cat reads from the pipe, so the output of grep ^b example | cat -n is the output of grep with line numbers added: all matching lines, numbered from 1. This is different from grep -n ^b example, where grep adds the line numbers of the matches. grep knows the line numbers of the original file, while cat only sees the output of grep and numbers the lines accordingly. So, given an input file example containing the lines bar, foo and basf, grep -n reports the matches as lines 1 and 3, while cat -n numbers them 1 and 2.
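The numbering difference described above can be reproduced with a short sketch. The scratch directory and the variable names are assumptions made for illustration; the three sample lines are the ones from the example file.

```shell
# Hypothetical scratch file holding the three sample lines (bar, foo, basf).
workdir=$(mktemp -d)
printf 'bar\nfoo\nbasf\n' > "$workdir/example"

# grep -n numbers each match by its position in the original file.
grep_numbered=$(grep -n '^b' "$workdir/example")

# cat -n only sees what grep wrote into the pipe, so it counts from 1.
cat_numbered=$(grep '^b' "$workdir/example" | cat -n)

# grep -c prints only how many lines matched.
match_count=$(grep -c '^b' "$workdir/example")

printf 'grep -n:\n%s\n' "$grep_numbered"
printf 'grep | cat -n:\n%s\n' "$cat_numbered"
printf 'grep -c: %s\n' "$match_count"

rm -rf "$workdir"
```

Here grep -n labels the matches 1 and 3, cat -n labels the same two lines 1 and 2, and grep -c prints 2.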
egrep is a variant of grep that searches using extended regular expressions; it is equivalent to grep -E. The name grep comes from the ed editor command g/re/p (globally search for a regular expression and print the matching lines), and egrep stands for extended grep.
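As a small illustration of what "extended" changes, the sketch below uses alternation, which is only special in extended syntax. The sample lines and variable names are hypothetical, and the behaviour of the unescaped | in basic mode is what GNU grep and POSIX basic regular expressions do.

```shell
# Hypothetical sample input.
sample=$(printf 'bar\nfoo\nbasf\n')

# With -E (what egrep does), | means "or".
extended=$(printf '%s\n' "$sample" | grep -E 'foo|basf')

# In basic mode, | is an ordinary character, so the pattern looks for the
# literal text "foo|basf" and finds nothing. grep exits with status 1 on
# no match, hence the "|| true".
basic=$(printf '%s\n' "$sample" | grep 'foo|basf' || true)

printf 'extended matches:\n%s\n' "$extended"
printf 'basic matches: [%s]\n' "$basic"
```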
Functionally (in terms of output), those two are the same. The first one actually creates a separate process, cat, which simply sends the contents of the file to standard output, which shows up on the standard input of grep because the shell has connected the two with a pipe.
In that sense, grep regex <filename is also equivalent, but with one less process than the cat version.
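A quick way to convince yourself that the three forms produce the same output is sketched below. The file contents, the pattern and the variable names are assumptions chosen for the demonstration.

```shell
# Hypothetical scratch file to search.
workdir=$(mktemp -d)
printf 'bar\nfoo\nbasf\n' > "$workdir/filename"

# Two processes: cat writes into a pipe, grep reads from it.
via_cat=$(cat "$workdir/filename" | grep '^b')

# One process: the shell opens the file and hands it to grep as stdin.
via_redirect=$(grep '^b' < "$workdir/filename")

# One process: grep opens the file itself and knows its name.
via_argument=$(grep '^b' "$workdir/filename")

if [ "$via_cat" = "$via_redirect" ] && [ "$via_cat" = "$via_argument" ]; then
  echo "all three outputs match"
fi

rm -rf "$workdir"
```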
Where you'll start seeing the difference is in variants where the extra information (the file names) is used by grep, such as with:
grep -n regex filename1 filename2
The difference between that and:
cat filename1 filename2 | grep -n regex
is that the former knows about the individual files whereas the latter sees it as one file (with no name).
While the former may give you:
filename1:7:line with regex in 10-line file
filename2:2:another regex line
the latter will be more like:
7:line with regex in 10-line file
12:another regex line
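To try this yourself, the sketch below builds two files shaped like the ones in the output above (a 10-line file with a match on line 7, and a 2-line file with a match on line 2). The filler lines and the scratch directory are assumptions made for the demonstration.

```shell
workdir=$(mktemp -d)

# filename1: 10 lines, with the matching line at position 7.
i=1
while [ "$i" -le 10 ]; do
  if [ "$i" -eq 7 ]; then
    echo "line with regex in 10-line file"
  else
    echo "filler $i"
  fi
  i=$((i + 1))
done > "$workdir/filename1"

# filename2: 2 lines, with the matching line at position 2.
printf 'filler\nanother regex line\n' > "$workdir/filename2"

# grep opens both files, so it can report names and per-file line numbers.
with_names=$(cd "$workdir" && grep -n regex filename1 filename2)

# grep sees one unnamed stream, so the second match is line 10 + 2 = 12.
without_names=$(cd "$workdir" && cat filename1 filename2 | grep -n regex)

printf '%s\n---\n%s\n' "$with_names" "$without_names"
rm -rf "$workdir"
```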
Another executable that acts differently if it knows the file names is wc, the word count program:
$ cat qq.in
1
2
3
$ wc -l qq.in # knows file so prints it
3 qq.in
$ cat qq.in | wc -l # does not know file
3
$ wc -l <qq.in # also does not know file
3
First one:
cat filename | grep regex
Normally cat opens the file and writes its contents to stdout. Here the shell has connected cat's stdout to a pipe ('|'), so the contents go into the pipe instead. grep reads from the pipe (the pipe is its stdin) and prints each line that matches regex to stdout. One detail: grep is not run in a new shell. The shell starts cat and grep as two separate child processes, and the pipe carries the output of the first to the input of the second.
Second one:
grep regex filename
Here grep opens and reads the file directly (above it was reading from a pipe) and prints each line that matches regex to stdout.
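One way to see which input grep is actually reading is to ask it to label every match with its source. This is only a sketch: it assumes a grep that supports the -H option (GNU and BSD grep do, but it is not required by POSIX), and the file contents and variable names are hypothetical.

```shell
# Hypothetical scratch file.
workdir=$(mktemp -d)
printf 'bar\nfoo\n' > "$workdir/filename"

# grep opened the file itself, so it can print the file name.
# LC_ALL=C keeps grep's label for stdin untranslated.
from_file=$(cd "$workdir" && LC_ALL=C grep -H foo filename)

# grep only has an anonymous pipe on stdin, so there is no name to print.
from_pipe=$(cd "$workdir" && cat filename | LC_ALL=C grep -H foo)

printf 'from file: %s\n' "$from_file"
printf 'from pipe: %s\n' "$from_pipe"

rm -rf "$workdir"
```

With GNU grep the first line is prefixed with filename: and the second with (standard input):, which is the same missing-name effect shown for wc above.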
If you want to check the actual execution time difference, first create a file with 100000 lines:
user@server ~ $ for i in $(seq 1 100000); do echo line${i} >> test_f; done
user@server ~ $ wc -l test_f
100000 test_f
Now measure:
user@server ~ $ time grep line test_f
#...
real 0m1.320s
user 0m0.101s
sys 0m0.122s
user@server ~ $ time cat test_f | grep line
#...
real 0m1.288s
user 0m0.132s
sys 0m0.108s
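As we can see, the difference is not big: the two real times are within measurement noise of each other, so the extra cat process costs very little for a file of this size.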