I have a file that maps each key to a value:
sort keyFile.txt | head
ENSMUSG00000000001 ENSMUSG00000000001_Gnai3
ENSMUSG00000000003 ENSMUSG00000000003_Pbsn
ENSMUSG00000000003 ENSMUSG00000000003_Pbsn
ENSMUSG00000000028 ENSMUSG00000000028_Cdc45
ENSMUSG00000000028 ENSMUSG00000000028_Cdc45
ENSMUSG00000000028 ENSMUSG00000000028_Cdc45
ENSMUSG00000000031 ENSMUSG00000000031_H19
ENSMUSG00000000031 ENSMUSG00000000031_H19
ENSMUSG00000000031 ENSMUSG00000000031_H19
ENSMUSG00000000031 ENSMUSG00000000031_H19
I would like to replace every occurrence of a key in temp.txt with its corresponding value:
head temp.txt
ENSMUSG00000000001:001 515
ENSMUSG00000000001:002 108
ENSMUSG00000000001:003 64
ENSMUSG00000000001:004 45
ENSMUSG00000000001:005 58
ENSMUSG00000000001:006 63
ENSMUSG00000000001:007 46
ENSMUSG00000000001:008 11
ENSMUSG00000000001:009 13
ENSMUSG00000000003:001 0
The result should be:
out.txt
ENSMUSG00000000001_Gnai3:001 515
ENSMUSG00000000001_Gnai3:002 108
ENSMUSG00000000001_Gnai3:003 64
ENSMUSG00000000001_Gnai3:004 45
ENSMUSG00000000001_Gnai3:005 58
ENSMUSG00000000001_Gnai3:006 63
ENSMUSG00000000001_Gnai3:007 46
ENSMUSG00000000001_Gnai3:008 11
ENSMUSG00000000001_Gnai3:009 13
ENSMUSG00000000003_Pbsn:001 0
I have tried a few variations following this AWK example, but as you can see, the result is not what I expected:
awk 'NR==FNR{a[$1]=$1;next}{$1=a[$1];}1' keyFile.txt temp.txt | head
515
108
64
45
58
63
46
11
13
0
My guess is that column 1 of temp.txt does not exactly match column 1 of keyFile.txt. Could someone please help me with this?
R/python/sed solutions are also welcome.
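Your attempt stores the key itself (a[$1]=$1) instead of the value, and it looks up the whole first column of temp.txt, including the ":001" suffix, which is never in the array, so $1 is replaced with an empty string. Split the first column on ":" and look up only the part before it, using an awk command like this: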
awk 'NR==FNR {a[$1]=$2;next} {
split($1, b, ":");
if (b[1] in a)
print a[b[1]] ":" b[2], $2;
else
print $0;
}' keyFile.txt temp.txt
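Since Python solutions were also welcome, here is a minimal sketch of the same idea: load the key file into a dictionary, then replace the part of each line before the first ":" when it is a known key. The function names load_keys and translate and the inline sample data are assumptions made for illustration, not part of the original answer.

```python
def load_keys(lines):
    """Build a key -> value dict from whitespace-separated 'key value' lines."""
    mapping = {}
    for line in lines:
        fields = line.split()
        if len(fields) >= 2:
            # Duplicate keys simply overwrite each other with the same value.
            mapping[fields[0]] = fields[1]
    return mapping


def translate(lines, mapping):
    """Replace the text before the first ':' on each line if it is a known key."""
    result = []
    for line in lines:
        line = line.rstrip("\n")
        prefix, colon, rest = line.partition(":")
        if colon and prefix in mapping:
            line = mapping[prefix] + colon + rest
        # Lines without a known key are passed through unchanged.
        result.append(line)
    return result


if __name__ == "__main__":
    # Sample rows copied from the question.
    key_lines = [
        "ENSMUSG00000000001 ENSMUSG00000000001_Gnai3",
        "ENSMUSG00000000003 ENSMUSG00000000003_Pbsn",
        "ENSMUSG00000000003 ENSMUSG00000000003_Pbsn",
    ]
    temp_lines = [
        "ENSMUSG00000000001:001 515",
        "ENSMUSG00000000003:001 0",
    ]
    for row in translate(temp_lines, load_keys(key_lines)):
        print(row)
    # ENSMUSG00000000001_Gnai3:001 515
    # ENSMUSG00000000003_Pbsn:001 0
```

Both functions accept any iterable of lines, so to run this on the real files you could pass open("keyFile.txt") and open("temp.txt") in place of the sample lists.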
Code for GNU sed:
$ sed -nr '$!N;/^(.*)\n\1$/!bk;D;:k;s#\S+\s+(\w+)_(\w+)#/^\1/s/(\\w+)(:\\w+)\\s+(\\w+)/\\1_\2\\2 \\3/p#;P;s/^(.*)\n//' keyFile.txt | sed -nrf - temp.txt
ENSMUSG00000000001_Gnai3:001 515
ENSMUSG00000000001_Gnai3:002 108
ENSMUSG00000000001_Gnai3:003 64
ENSMUSG00000000001_Gnai3:004 45
ENSMUSG00000000001_Gnai3:005 58
ENSMUSG00000000001_Gnai3:006 63
ENSMUSG00000000001_Gnai3:007 46
ENSMUSG00000000001_Gnai3:008 11
ENSMUSG00000000001_Gnai3:009 13
ENSMUSG00000000003_Pbsn:001 0