
Using `awk` to print number of lines in file in the BEGIN section

I am trying to write an awk script that, before anything else is done, tells the user how many lines are in the file. I know how to do this in the END section but have been unable to do so in the BEGIN section. I have searched SE and Google but have only found half a dozen ways to do this in the END section or as part of a bash script, not how to do it before any processing has taken place. I was hoping for something like the following:

#!/usr/bin/awk -f

BEGIN {
    print "There are a total of " **TOTAL LINES** " lines in this file.\n"
}
{
    if ($0 == 4587) { print "Found record on line number " NR; exit 0 }
}

But I have been unable to determine how to do this, or whether it is even possible. Thanks.

Dylan asked Sep 21 '25 07:09


2 Answers

You can read the file twice:

awk 'NR!=1 && FNR==1 {print NR-1} <some more code here>' file{,}

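In your example (the NR>FNR guard restricts the search to the second pass, so the total is always printed first, and FNR reports the line number within the file rather than the running count across both passes):

awk 'NR!=1 && FNR==1 {print "There are a total of "NR-1" lines in this file.\n"} NR>FNR && $0==4587 {print "Found record on line number "FNR; exit 0}' file{,}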

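file{,} is shell brace expansion for file file, so you can write file file instead; either way awk reads the same file twice.
The condition NR!=1 && FNR==1 is true only on the first line of the second pass, at which point NR-1 is the number of lines read during the first pass.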
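
To see why that condition fires exactly once, it may help to print both counters while awk reads the same file twice. This is only an illustrative sketch: the temporary file and its three lines are made up for the demo, and the file is passed twice explicitly instead of relying on brace expansion.

```shell
# Demo file with three lines (hypothetical contents, not from the question).
tmp=$(mktemp)
printf 'a\nb\nc\n' > "$tmp"

# Pass the same file twice and show both counters on every record.
# NR keeps counting across both passes; FNR restarts at 1 for each pass.
out=$(awk '{ print "NR=" NR, "FNR=" FNR, $0 }' "$tmp" "$tmp")
printf '%s\n' "$out"

rm -f "$tmp"
```

The fourth record printed is NR=4 FNR=1 a, the only record where FNR is 1 and NR is not, and NR-1 there is 3, the line count of the file.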


To use an awk script containing:

#!/usr/bin/awk -f
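# First line of the second pass: NR-1 is the line count from the first pass.
NR!=1 && FNR==1 {
    print "There are a total of "NR-1" lines in this file.\n"
}
# Search only during the second pass; FNR is the line number within the file.
NR>FNR && $0==4587 {
    print "Found record on line number "FNR; exit 0
}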

call:

awk -f myscript file{,}
Jotne answered Sep 23 '25 17:09


To do this robustly and for multiple files you need something like:

$ cat tst.awk
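BEGINFILE {
    # Count the lines of the file that is about to be processed.
    numLines = 0
    while ( (getline line < FILENAME) > 0 ) {
        numLines++
    }
    close(FILENAME)    # release the extra handle so the same file can be counted again
    print "----\nThere are a total of", numLines, "lines in", FILENAME
}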
$0==4587 { print "Found record on line number", FNR, "of", FILENAME; nextfile }
$
$ cat file1
a
4587
c
$
$ cat file2
$
$ cat file3
d
e
f
4587
$
$ awk -f tst.awk file1 file2 file3
----
There are a total of 3 lines in file1
Found record on line number 2 of file1
----
There are a total of 0 lines in file2
----
There are a total of 4 lines in file3
Found record on line number 4 of file3

The above uses GNU awk for BEGINFILE. Any other solution is difficult to implement in a way that handles empty files: you need an array to track the files being parsed, and you need to print the info in the FNR==1 and END sections after the empty file has been skipped.
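
For the single-file case in the question, a rough portable sketch is to run the same kind of getline loop in a plain BEGIN block, which does not need GNU awk. It assumes the data file is the first operand (ARGV[1]), that no var=value assignment precedes it, and that the input is a regular file rather than a pipe, since the file is read twice. The temporary file and its contents are invented for the demo.

```shell
# Demo file (hypothetical contents): three lines, the second one is 4587.
tmp=$(mktemp)
printf 'a\n4587\nc\n' > "$tmp"

out=$(awk '
BEGIN {
    # Assumption: the data file is the first operand, ARGV[1].
    file = ARGV[1]
    while ((getline line < file) > 0)
        numLines++
    close(file)    # release the extra handle before the main pass starts
    print "There are a total of " (numLines + 0) " lines in this file."
}
$0 == 4587 { print "Found record on line number " NR; exit 0 }
' "$tmp")
printf '%s\n' "$out"

rm -f "$tmp"
```

This does not cover multiple files or report per-file counts for empty files, which is where BEGINFILE remains the more robust choice.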

Using getline has caveats and it should not be used lightly (see http://awk.info/?tip/getline), but this is one of the appropriate and robust uses of it. You can also test for non-readable files in BEGINFILE by checking ERRNO and skipping the file (see the gawk manual); that situation causes other scripts to abort.
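
The ERRNO check mentioned above could look roughly like the following. It is a sketch that assumes gawk is installed (it falls back to a message otherwise); the directory, the file names good and missing, and the wording of the warning are all made up for illustration.

```shell
# Demo directory with one readable file; "missing" is deliberately not created.
dir=$(mktemp -d)
printf 'd\ne\n' > "$dir/good"

if command -v gawk >/dev/null 2>&1; then
    out=$(gawk '
    BEGINFILE {
        # In BEGINFILE, a non-empty ERRNO means the file could not be opened.
        if (ERRNO != "") {
            print "skipping " FILENAME ": " ERRNO > "/dev/stderr"
            nextfile    # skip it instead of letting gawk abort
        }
        numLines = 0
        while ((getline line < FILENAME) > 0)
            numLines++
        close(FILENAME)
        print "There are a total of", numLines, "lines in", FILENAME
    }
    ' "$dir/missing" "$dir/good")
else
    out="gawk not installed"
fi
printf '%s\n' "$out"

rm -rf "$dir"
```

The warning goes to standard error, so only the count for the readable file ends up on standard output.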

Ed Morton answered Sep 23 '25 16:09