I have two files like shown below which are tab-delimited:
file A
chr1 123 aa b c d
chr1 234 a b c d
chr1 345 aa b c d
chr1 456 a b c d
....
file B
xxxx abcd chr1 123 aa c d e
yyyy defg chr1 345 aa e f g
...
I want to join the two files based on 3 columns with "chr1", "123" and "aa" and add first two columns from file B to file A, such that output looks as shown below: output:
chr1 123 aa b c d xxxx abcd
chr1 234 a b c d
chr1 345 aa b c d yyyy defg
chr1 456 a b c d
Could anyone help to do this in awk. If possible using awk oneliners?
Open the two files you want to merge. Select all text (Command+A/Ctrl+A) from one document, then paste it into the new document (Command+V/Ctrl+V). Repeat steps for the second document. This will finish combining the text of both documents into one.
Appending content to an existing file To append content after you merge multiple files in Linux to another file, use double redirection operator. (>>) along with cat command. Rather than overwriting the contents of the file, this command appends the content at the end of the file.
NR and FNR are two built-in awk variables. NR tells us the total number of records that we've read so far, while FNR gives us the number of records we've read in the current input file.
Here is one approach using awk
:
$ awk 'NR==FNR{a[$3,$4]=$1OFS$2;next}{$6=a[$1,$2];print}' OFS='\t' fileb filea
chr1 123 a b c xxxx abcd
chr1 234 a b c
chr1 345 a b c yyyy defg
chr1 456 a b c
Explanation:
NR==FNR # current recond num match the file record num i.e in filea
a[$3,$4]=$1OFS$2 # Create entry in array with fields 3 and 4 as the key
next # Grab the next line (don't process the next block)
$6=a[$1,$2] # Assign the looked up value to field 6 (+rebuild records)
print # Print the current line & the matching entry from fileb ($6)
OFS='\t' # Seperate each field with a single TAB on output
Edit:
For the 3 field problem you simple add the extra field:
$ awk 'NR==FNR{a[$3,$4,$5]=$1OFS$2;next}{$6=a[$1,$2,$3];print}' OFS='\t' fileb filea
chr1 123 aa b c xxxx abcd
chr1 234 a b c
chr1 345 aa b c yyyy defg
chr1 456 a b c
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With