 

caret (^) in FS (gawk)

Tags:

awk

Update

This was a bug, and a fix is now available in the git repo.


I can't understand how a circumflex in FS is interpreted. For example, here is my file:

$ cat file
foo bar
baz quz

I wrote this awk script:

BEGIN{FS="^.";OFS="|"}{$1=$1}1

and was expecting this output:

|oo bar
|az quz

but with gawk I got this:

$ gawk 'BEGIN{FS="^.";OFS="|"}{$1=$1}1' file
||o bar
||z quz

And it gets stranger with more dots:

$ gawk 'BEGIN{FS="^..";OFS="|"}{$1=$1}1' file
||bar
||quz
$ gawk 'BEGIN{FS="^...";OFS="|"}{$1=$1}1' file
||r
||z
$ gawk 'BEGIN{FS="^....";OFS="|"}{$1=$1}1' file
|bar
|quz

I couldn't find an explanation in either the POSIX awk specification or the gawk manual. Can you please help me understand what's going on? What am I missing here?

asked Apr 16 '19 11:04 by oguz ismail



1 Answer

It is clearly a bug, and probably a memory leak. If you print NF before printing the record, the behaviour is as expected:

$ gawk 'BEGIN{FS="^.";OFS="|"; $0="foo"; $1=$1; print}'
||oo
$ gawk 'BEGIN{FS="^.";OFS="|"; $0="foo"; $1=$1; print NF; print}'
2
|oo
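
Until a fixed gawk is installed, one way to sidestep the problem is to keep the caret out of FS altogether. The following is only a sketch, assuming the goal is simply to replace the first character of each record with "|"; the function name replace_first_char is made up for illustration and the sample input mirrors the file from the question:

```shell
# Hypothetical helper: replace the first character of every record with "|"
# using sub() on the whole record, so FS is never set to a regex with a caret.
replace_first_char() {
    awk '{ sub(/^./, "|") } 1'
}

# Sample input taken from the question's file.
printf 'foo bar\nbaz quz\n' | replace_first_char
```

This should print "|oo bar" and "|az quz", which is the output the question expected. It relies only on sub() with an anchored regex, so it does not depend on how a given awk handles ^ in FS.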
answered Oct 18 '22 19:10 by kvantour