I'm trying to join two files which are already sorted
File1
70 CBLB Cbl proto-oncogene B
70 HOXC11 centrosomal protein 57
70 CHD4 chromodomain helicase
70 FANCF FA complementation
70 LUZP2 leucine zipper protein 2
File2
0.700140820757797 ELAVL1
0.700229616476825 HOXC11
0.700328646327188 CHD4
0.700328951649384 LUZP2
Output
Gene Symbol Gene Description Target Score mirDB Target Score Diana
HOXC11 centrosomal protein 57 70 0.700229616476825
CHD4 chromodomain helicase 70 0.700328646327188
LUZP2 leucine zipper protein 2 70 0.700328951649384
To do this, I tried the following command, but it produces an empty file:
join -j 2 -o 1.1,1.2,1.3,1.4,2.4 File1 File2 | column -t | sed '1i Gene Symbol, Gene Description, Target Score mirDB, Target Score Diana' > Output
Any help using awk or join would be appreciated.
You can try this awk command:
$ awk 'BEGIN {OFS="\t"; print "Gene Symbol", "Gene Description", "Target Score mirDB", "Target Score Diana"} NR==FNR {array[$2]=$1; next} $2 in array {desc=$3; for (i=4; i<=NF; i++) desc=desc " " $i; print $2, desc, $1, array[$2]}' File2 File1
Gene Symbol Gene Description Target Score mirDB Target Score Diana
HOXC11 centrosomal protein 57 70 0.700229616476825
CHD4 chromodomain helicase 70 0.700328646327188
LUZP2 leucine zipper protein 2 70 0.700328951649384
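The same script, laid out for readability:
BEGIN {
    OFS = "\t"
    print "Gene Symbol", "Gene Description", "Target Score mirDB", "Target Score Diana"
}
# First file (File2): remember each Diana score by gene symbol.
NR == FNR {
    array[$2] = $1
    next
}
# Second file (File1): keep only genes that also appear in File2.
$2 in array {
    desc = $3                                   # description is fields 3..NF
    for (i = 4; i <= NF; i++) desc = desc " " $i
    print $2, desc, $1, array[$2]
}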
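Since the question also asks about join, here is a rough sketch of that route. join expects both inputs to be sorted on the join field, and the files here are ordered by score rather than by gene symbol, which is likely why the original attempt matched nothing. File2 also has only two fields, so 2.4 in the -o list refers to a field that does not exist. The sketch assumes a scratch directory created with mktemp -d, the sample data from the question, and hypothetical intermediate files named File1.sorted and File2.sorted.

```shell
# Work in a throwaway directory (assumption: mktemp -d is available).
workdir=$(mktemp -d) && cd "$workdir" || exit 1

# Sample data copied from the question.
cat > File1 <<'EOF'
70 CBLB Cbl proto-oncogene B
70 HOXC11 centrosomal protein 57
70 CHD4 chromodomain helicase
70 FANCF FA complementation
70 LUZP2 leucine zipper protein 2
EOF
cat > File2 <<'EOF'
0.700140820757797 ELAVL1
0.700229616476825 HOXC11
0.700328646327188 CHD4
0.700328951649384 LUZP2
EOF

# join needs both inputs sorted on the join field (gene symbol, field 2).
# LC_ALL=C keeps sort and join in agreement about the ordering.
LC_ALL=C sort -k 2b,2 File1 > File1.sorted
LC_ALL=C sort -k 2b,2 File2 > File2.sorted

# Default join output is: key, remaining File1 fields, remaining File2 fields.
# awk then rebuilds the description from fields 3..NF-1 and reorders columns.
{
  printf 'Gene Symbol\tGene Description\tTarget Score mirDB\tTarget Score Diana\n'
  LC_ALL=C join -1 2 -2 2 File1.sorted File2.sorted |
    awk 'BEGIN { OFS = "\t" }
         { desc = $3
           for (i = 4; i < NF; i++) desc = desc " " $i
           print $1, desc, $2, $NF }'
} > Output
```

Note that the rows come out in gene-symbol order (CHD4, HOXC11, LUZP2) rather than in File1's original order, because of the sort step. If the original order matters, the awk-only approach above avoids re-sorting.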