Update: this turned out to be a bug in gawk, and a fix is now available in the git repo.
I can't understand how a circumflex in FS is interpreted.
For example, here is my file:
$ cat file
foo bar
baz quz
I wrote this awk script:
BEGIN{FS="^.";OFS="|"}{$1=$1}1
and was expecting this output:
|oo bar
|az quz
but with gawk I got this:
$ gawk 'BEGIN{FS="^.";OFS="|"}{$1=$1}1' file
||o bar
||z quz
And it gets stranger with more dots:
$ gawk 'BEGIN{FS="^..";OFS="|"}{$1=$1}1' file
||bar
||quz
$ gawk 'BEGIN{FS="^...";OFS="|"}{$1=$1}1' file
||r
||z
$ gawk 'BEGIN{FS="^....";OFS="|"}{$1=$1}1' file
|bar
|quz
I couldn't find an explanation in either the POSIX awk specification or the gawk manual. Can you please help me understand what's going on? What am I missing here?
awk built-in variables: FS (field separator). The variable FS sets the input field separator. In awk, space and tab act as the default field separators. The resulting field values can be accessed through $1, $2, $3 and so on.
Pass your desired field separator with the -F option on the awk command line, then print the column number you want; columns are split according to that separator.
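As a minimal sketch of the -F option, assuming a made-up two-line input with ":" as the separator (the names and numbers are purely illustrative):

```shell
# Split each record on ":" and print the second column.
# The input lines are invented sample data, not taken from the question.
ages=$(printf 'alice:30\nbob:25\n' | awk -F: '{ print $2 }')
printf '%s\n' "$ages"
```

Setting FS in a BEGIN block, as the question does, should have the same effect as -F for a plain single-character separator like this one.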
BEGIN pattern: awk executes the action(s) specified in BEGIN once, before any input lines are read. END pattern: awk executes the action(s) specified in END once, after all input has been read and just before it exits.
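A small sketch of how BEGIN and END frame the per-line actions, assuming three invented input lines holding one number each:

```shell
# BEGIN runs before the first line, the middle action runs once per line,
# and END runs after the last line has been read.
report=$(printf '1\n2\n3\n' | awk 'BEGIN { print "start" } { sum += $1 } END { print "sum=" sum }')
printf '%s\n' "$report"
```

This is also why the one-liners in the question set FS and OFS inside BEGIN: the assignments have to take effect before the first record is split.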
It is clearly a bug, and probably a memory leak. When you ask to print NF before printing the record, the behaviour is as expected:
$ gawk 'BEGIN{FS="^.";OFS="|"; $0="foo"; $1=$1; print}'
||oo
$ gawk 'BEGIN{FS="^.";OFS="|"; $0="foo"; $1=$1; print NF; print}'
2
|oo
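To check a given installation, here is a rough sketch that runs the same probe through whichever awk is first on PATH (substitute gawk to test gawk specifically) and then shows one way to get the output the question expected without relying on an anchored FS. The variable names and status messages are my own, and other awk implementations may interpret ^ in FS differently, so anything besides the two outputs shown above is simply reported as-is.

```shell
# Probe: the failing case from above, run through the awk found on PATH.
probe=$(awk 'BEGIN { FS = "^."; OFS = "|"; $0 = "foo"; $1 = $1; print }')
case "$probe" in
  "|oo")  status="anchor matched once, at the start of the record" ;;
  "||oo") status="anchor matched twice (the behaviour reported above)" ;;
  *)      status="other result: $probe" ;;
esac
printf '%s\n' "$status"

# Workaround sketch: replace the first character directly instead of
# treating it as a field separator.
fixed=$(printf 'foo bar\nbaz quz\n' | awk '{ sub(/^./, "|") } 1')
printf '%s\n' "$fixed"
```

The sub() call only touches the first character of each record, so it does not depend on how FS is parsed.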