
Replace string according to dictionary file in awk

Tags:

bash

unix

sed

awk

cat input

aaa paul peter
bbb john mike
ccc paul mike 
bbb paul john

And my dictionary file dict:

cat dict

aaa OOO
bbb 111
ccc 222

I need to look up the first column of each line in input in the first column of dict and, on a match, replace it with the second column from dict. I could use sub and gsub, but my dict file has thousands of rows (with different letters).

cat output:

000 paul peter
111 john mike
222 paul mike 
111 paul john

Thank you for any help.

My solution:

  awk:

awk '{sub(/aaa/,"000",$1); sub(/bbb/,"111",$1); sub(/ccc/,"222",$1)}1' input

UPDATE:

If the first column of an input line has no match in dict, keep that word unchanged.

cat input

aaa paul peter
bbb john mike
ccc paul mike 
bbb paul john
ddd paul peter

cat dict

aaa OOO
bbb 111
ccc 222

cat output:

000 paul peter
111 john mike
222 paul mike 
111 paul john
ddd paul peter
Geroge asked Dec 13 '25 21:12


1 Answer

A more general approach, suggested by fedorqui in the comments, also handles first-column values in input that have no entry in dict by leaving them unchanged:

awk 'FNR==NR {dict[$1]=$2; next} {$1=($1 in dict) ? dict[$1] : $1}1' dict input
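The one-liner can be unpacked as follows. This is a minimal sketch, not a definitive implementation: it recreates the question's sample files in a scratch directory (the directory and the variable name out are assumptions made for illustration) and runs the same two-file lookup with each step commented.

```shell
# Recreate the sample files from the question in a scratch directory
# (the directory and the variable name "out" are illustrative only).
tmp=$(mktemp -d)
printf '%s\n' 'aaa paul peter' 'bbb john mike' 'ccc paul mike' \
              'bbb paul john' 'ddd paul peter' > "$tmp/input"
printf '%s\n' 'aaa OOO' 'bbb 111' 'ccc 222' > "$tmp/dict"

out=$(awk '
    # First file (dict): NR equals FNR only while reading it.
    FNR == NR { dict[$1] = $2; next }
    # Second file (input): swap column 1 if it is a key in dict.
    ($1 in dict) { $1 = dict[$1] }
    # Print every line, changed or not.
    { print }
' "$tmp/dict" "$tmp/input")

printf '%s\n' "$out"
rm -rf "$tmp"
```

Assigning to $1 makes awk rebuild the line with single-space separators, so runs of spaces or tabs in input are not preserved. Lines whose first column is not in dict are never assigned to and pass through untouched.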

My original solution below works only when every first-column value in input has a mapping in dict; lines without one are not printed.

awk 'FNR==NR{hash[$2FS$3]=$1; next}{for (i in hash) if (match(hash[i],$1)){print $2, i} }' input dict
OOO paul peter
111 john mike
111 paul john
222 paul mike

The idea is to build a hash map from the input file, keyed by $2 FS $3 with $1 as the value, e.g. hash["paul peter"]="aaa". Then, for each line of dict, every value in the hash is tested against that line's $1; on a match, the replacement from dict ($2) is printed, followed by the key.
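To make the hash map concrete, the sketch below builds the same array from the question's sample input and dumps it. The scratch directory and the variable name map are assumptions for illustration; the dump is piped through sort because awk does not define the order of a for-in loop.

```shell
# Recreate the sample input in a scratch directory (illustrative only).
tmp=$(mktemp -d)
printf '%s\n' 'aaa paul peter' 'bbb john mike' 'ccc paul mike' \
              'bbb paul john' > "$tmp/input"

# Build the same array as the original solution, then dump it.
# for-in order is unspecified in awk, so sort the dump for stable output.
map=$(awk '{ hash[$2 FS $3] = $1 }
           END { for (k in hash) print k " -> " hash[k] }' "$tmp/input" |
      LC_ALL=C sort)

printf '%s\n' "$map"
rm -rf "$tmp"
```

Two properties follow from this layout. Lines of input that share the same second and third columns collapse into one key, and because match() treats its second argument as a regular expression, a dict key such as aa would also match the stored value aaa.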

Inian answered Dec 16 '25 15:12


