Consider a file with the following content:
abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq
In general, awk processes its input line by line and performs the given action on each line.
For example:
awk '{print substr($0,8,10)}' file
O/P:
hijklmn
wxyzabc
klmnopq
I would like to know an approach in which the entire contents of the file are treated as a single variable, so that awk prints just one output.
Example of the desired output:
hijklmnpqr
I am not after the output for this particular example. More generally, I would appreciate it if anyone could suggest a way to hand the content of a file to awk as a whole.
gawk solution
From the docs:
There are times when you might want to treat an entire data file as a single record. The only way to make this happen is to give RS a value that you know doesn’t occur in the input file. This is hard to do in a general way, such that a program always works for arbitrary input files.
$ cat file
abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq
RS must be set to a pattern that does not occur in the file. Following Denis Shirokov's suggestion in the docs (thanks @EdMorton):
$ gawk '{print ">>>"$0"<<<<"}' RS='^$' file
>>>abcdefghijklmn
pqrstuvwxyzabc
defghijklmnopq
<<<<
The docs explain the trick:
It works by setting RS to ^$, a regular expression that will never match if the file has contents. gawk reads data from the file into tmp, attempting to match RS. The match fails after each read, but fails quickly, such that gawk fills tmp with the entire contents of the file.
So:
$ gawk '{gsub(/\n/,"");print substr($0,8,10)}' RS='^$' file
Returns:
hijklmnpqr
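A multi-character RS is treated as a regular expression by gawk, but POSIX leaves that behavior unspecified, so the command above may not work in other awks. As a hedged alternative, here is a minimal sketch that appends every line to one variable and works on it once in the END block. The variable name `all` and the temporary directory are illustrative choices, not part of the original answer.

```shell
# Recreate the sample file from the question in a temporary directory
# (the location is an assumption made only for this sketch).
tmpdir=$(mktemp -d)
printf '%s\n' abcdefghijklmn pqrstuvwxyzabc defghijklmnopq > "$tmpdir/file"

# Main rule: append each line (without its newline) to the variable "all".
# END rule: runs once after the last line, so substr() sees the whole content.
result=$(awk '{ all = all $0 } END { print substr(all, 8, 10) }' "$tmpdir/file")

echo "$result"

# Clean up the temporary directory.
rm -r "$tmpdir"
```

Because the newlines are dropped while the lines are appended, no gsub() call is needed here. To keep the line breaks, one could append `$0 "\n"` instead.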