
removing lines with special characters in awk

Tags:

awk

I have a text file like this:

VAREAKAVVLRDRKSTRLN 2888
ACP*VRWPIYTACGP 292
RDRKSTRLNSSHVVTSRMP 114
VAREA*KAVVLRDRRAHV*T    73

In the 1st column of some rows there is a "*". I want to remove all the lines that contain a "*". Here is the expected output:

VAREAKAVVLRDRKSTRLN 2888
RDRKSTRLNSSHVVTSRMP 114

To do so, I am using this code:

awk -F "\t" '{ if(($1 == '*')) { print $1 "," $2} }' infile.txt > outfile.txt

This code does not return the expected output. How can I fix it?

bzmby asked Nov 07 '25 09:11


1 Answer

how can I fix it?

You did

awk -F "\t" '{ if(($1 == '*')) { print $1 "," $2} }' infile.txt > outfile.txt

With $1 == "*" you are asking "is the first field exactly *?", not "does the first field contain *?". You can use the index function instead, which returns the position of the match if one is found, or 0 otherwise. Let the content of infile.txt be

VAREAKAVVLRDRKSTRLN 2888
ACP*VRWPIYTACGP 292
RDRKSTRLNSSHVVTSRMP 114
VAREA*KAVVLRDRRAHV*T    73

then

awk 'index($1,"*")==0{print $1,$2}' infile.txt

output

VAREAKAVVLRDRKSTRLN 2888
RDRKSTRLNSSHVVTSRMP 114
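
If the goal is to keep the surviving lines exactly as they appear in the input, rather than re-printing $1 and $2 joined by a single space, a pattern with no action should be enough. The following is a minimal sketch under that assumption; the temporary file and the variable names (infile, kept) are illustrative only and not part of the original question.

```shell
# Recreate the sample data from the question in a temporary file.
infile=$(mktemp)
printf '%s\n' \
  'VAREAKAVVLRDRKSTRLN 2888' \
  'ACP*VRWPIYTACGP 292' \
  'RDRKSTRLNSSHVVTSRMP 114' \
  'VAREA*KAVVLRDRRAHV*T    73' > "$infile"

# A pattern with no action prints every line for which the pattern is true,
# unchanged. index() returns 0 when "*" does not occur in the first field.
kept=$(awk 'index($1, "*") == 0' "$infile")
printf '%s\n' "$kept"

rm -f "$infile"
```

To write the result to a file as in the question, redirect the awk command's output, e.g. > outfile.txt.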

Note that if you use index rather than a regular expression pattern /.../, you do not have to worry about characters with special meaning, e.g. ".". Note also that for the data you have, you do not need to set the field separator (FS) explicitly. Important: ' is not a legal string delimiter in GNU AWK; use " for that purpose, unless your intent is to summon hard-to-find bugs.
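
To illustrate the difference between index and a regular expression, here is a small sketch using made-up two-line inputs (the sample strings and variable names are assumptions for demonstration, not data from the question). In a regex, "." matches any character, so /A.B/ also matches AXB, whereas index looks for the literal text. If you do prefer a regex for the original task, the "*" has to be made literal, for example with a bracket expression.

```shell
# Hypothetical sample: one field contains a literal dot, the other does not.
sample=$(printf '%s\n' 'A.B 1' 'AXB 2')

# Regex: "." matches any single character, so both lines match.
regex_hits=$(printf '%s\n' "$sample" | awk '$1 ~ /A.B/ {n++} END {print n+0}')

# index: the search string is taken literally, so only "A.B" matches.
index_hits=$(printf '%s\n' "$sample" | awk 'index($1, "A.B") {n++} END {print n+0}')

# Regex alternative for the original task: [*] is a literal asterisk.
star_kept=$(printf '%s\n' 'AB 1' 'A*B 2' | awk '$1 !~ /[*]/')

echo "regex matches: $regex_hits, index matches: $index_hits"
echo "kept by regex filter: $star_kept"
```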

(tested in gawk 4.2.1)

Daweo answered Nov 10 '25 10:11


