
Simple awk command issue (FS, OFS related)

I tried to reorganize the format of a file containing:

>Humanl|chr16:86430087-86430726 | element 1 | positive
>Humanl|chr16:85620095-85621736 | element 2 | negative
>Humanl|chr16:80423343-80424652 | element 3 | negative
>Humanl|chr16:80372593-80373755 | element 4 | positive
>Humanl|chr16:79969907-79971297 | element 5 | negative
>Humanl|chr16:79949950-79951518 | element 6 | negative
>Humanl|chr16:79026563-79028162 | element 7 | negative
>Humanl|chr16:78933253-78934686 | element 9 | negative
>Humanl|chr16:78832182-78833595 | element 10 | negative

My command is:

awk '{FS="|";OFS="\t"} {print $1,$2,$3,$4,$5}' 

Here is the output:

>Human|chr16:86430087-86430726  |      element 1      |
>Human  chr16:85620095-85621736         element 2      negative
>Human  chr16:80423343-80424652         element 3      negative
>Human  chr16:80372593-80373755         element 4      positive
>Human  chr16:79969907-79971297         element 5      negative
>Human  chr16:79949950-79951518         element 6      negative
>Human  chr16:79026563-79028162         element 7      negative
>Human  chr16:78933253-78934686         element 9      negative
>Human  chr16:78832182-78833595         element 10     negative

Every line works fine except for the first line. I don't understand why this happened.

Can someone help me with it? Thanks!

asked Apr 24 '13 22:04 by olala



2 Answers

Short answer

FS is set too late to affect the first line. Use something like this instead:

awk '{print $1,$2,$3,$4,$5}' FS='|' OFS='\t' 

You can also use this shorter version:

awk -v FS='|' -v OFS='\t' '$1=$1' 
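One caveat with the short form: `$1=$1` is used as a pattern, so a record is printed only when the assigned value is "true". A minimal sketch with made-up three-line input (not the data from the question) shows records whose first field is empty or `0` being dropped, and a variant that does the assignment inside an action so every record is printed:

```shell
# Hypothetical sample input: the second record has an empty first field,
# the third has a first field of 0.
printf 'a|b\n|c\n0|d\n' | awk -v FS='|' -v OFS='\t' '$1=$1'
# prints only the first record, rebuilt with tabs

# Assign in an action, then use the always-true pattern 1 to print.
printf 'a|b\n|c\n0|d\n' | awk -v FS='|' -v OFS='\t' '{ $1 = $1 } 1'
# prints all three records, rebuilt with tabs
```

For the data in the question the first field is never empty or zero, so both forms behave the same there.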

A bit longer answer

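It doesn't work because awk has already split the first record into fields, using the default FS, by the time the action assigns FS; the new value only applies from the next record on. (OFS is read at print time, which is why the first output line already contains tabs.) You can force a re-split by assigning $0 to itself, e.g.: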

awk '{FS="|";OFS="\t";$0=$0} {print $1,$2,$3,$4,$5}' 
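The timing can be observed directly by printing NF for each record. This is a small sketch with made-up input, assuming a POSIX-conforming awk such as gawk or mawk:

```shell
# FS is assigned inside the main action, so record 1 is still split on
# whitespace (one field); records 2 and 3 are split on "|".
printf 'a|b|c\nd|e|f\ng|h|i\n' |
awk '{ FS = "|" } { print NR ": NF=" NF }'
# 1: NF=1
# 2: NF=3
# 3: NF=3

# Adding $0 = $0 re-splits the current record with the new FS.
printf 'a|b|c\nd|e|f\ng|h|i\n' |
awk '{ FS = "|"; $0 = $0 } { print NR ": NF=" NF }'
# 1: NF=3
# 2: NF=3
# 3: NF=3
```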

The conventional ways to do this are 1. set FS and others in the BEGIN clause, 2. set them through the -v VAR=VALUE notation, or 3. append them after the script as VAR=VALUE. My preferred style is the last alternative:

awk '{print $1,$2,$3,$4,$5}' FS='|' OFS='\t' 

Note that -v assignments and post-script assignments take effect at different times: -v sets variables before the BEGIN clause runs, whereas post-script assignments are processed after the BEGIN clause, just before the input that follows them is read.
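A quick way to see the difference between the two kinds of assignment is to print a variable of each kind from both BEGIN and the main action. The variable names `early` and `late` are made up for this sketch, and `-` names standard input explicitly:

```shell
# early is set with -v, late is set as a post-script operand.
printf 'x\n' |
awk -v early=1 '
    BEGIN { print "BEGIN: early=" early ", late=" late }
          { print "main:  early=" early ", late=" late }
' late=2 -
# BEGIN: early=1, late=
# main:  early=1, late=2
```

So if a BEGIN block itself depends on FS or OFS, only the -v form (or an assignment inside BEGIN) is visible there.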

answered Oct 06 '22 20:10 by Thor


Try setting FS and OFS in a BEGIN block, which runs before the first record is read:

awk 'BEGIN{FS="|";OFS="\t"} {print $1,$2,$3,$4,$5}' 
answered Oct 06 '22 19:10 by Kent