unix: merge 2 files using 2nd columns

Tags: merge, unix, awk

I'd like to merge two files according to the content of their 2nd columns.

File 1:

"4742"  "209220_at"     2.60700394801826
"104"   "209396_s_at"   2.60651442103297
"749"   "202409_at"     2.59424724783704
"4168"  "209875_s_at"   2.58773204877464
"3973"  "1431_at"       2.52832098784342
"1826"  "207201_s_at"   2.41685345240968

File2:

"653"   "1431_at"       2.14595534191867
"1109"  "207201_s_at"   2.13777517447307
"353"   "212531_at"     2.12706340284672
"381"   "206535_at"     2.11456707231618
"1846"  "204534_at"     2.10919474441178

To have in the end:

"3973"  "1431_at"       2.52832098784342 "653"   "1431_at"       2.14595534191867
"1826"  "207201_s_at"   2.41685345240968 "1109"  "207201_s_at"   2.13777517447307

I have tried comm, diff, and some obscure awk one-liners without any success. Any help would be much appreciated. Ben

553

asked Feb 11 '11 16:02

Benoit B.

2 Answers

You can do that with a combination of the sort and join commands. The straightforward approach is:

join -j2 <(sort -k2b,2 file1) <(sort -k2b,2 file2)

Here -k2b,2 restricts the sort key to the second field and ignores leading blanks, so the sort order matches what join expects. The output differs slightly from what you're looking for: join prints the common join field first, followed by the remaining fields from each file:

"1431_at" "3973" 2.52832098784342 "653" 2.14595534191867
"207201_s_at" "1826" 2.41685345240968 "1109" 2.13777517447307

If you need the format exactly as you showed, tell join which fields to output:

join -o 1.1,1.2,1.3,2.1,2.2,2.3 -j2 <(sort -k2b,2 file1) <(sort -k2b,2 file2)

where -o accepts a list of FILENUM.FIELDNUM specifiers.
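
As a small illustration of how the -o list selects columns, the sketch below joins two tiny, already-sorted files and leaves out the second file's copy of the key. The file names a.txt and b.txt and the scratch directory are hypothetical, made up for this demo, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Scratch directory for the demo files (names are hypothetical).
dir=$(mktemp -d) || exit 1

# Two inputs already sorted on field 2, using rows from the question.
printf '%s\n' '"3973" "1431_at" 2.52832098784342' \
              '"1826" "207201_s_at" 2.41685345240968' > "$dir/a.txt"
printf '%s\n' '"653" "1431_at" 2.14595534191867' \
              '"1109" "207201_s_at" 2.13777517447307' > "$dir/b.txt"

# 1.1,1.2,1.3 = all three fields of the first file;
# 2.1,2.3     = fields 1 and 3 of the second file (its copy of the key is omitted).
out=$(LC_ALL=C join -1 2 -2 2 -o 1.1,1.2,1.3,2.1,2.3 "$dir/a.txt" "$dir/b.txt")
printf '%s\n' "$out"

rm -rf "$dir"
```

Each specifier is read as file number, dot, field number, so reordering or dropping entries in the list reorders or drops the corresponding output columns.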

Note that the <() process substitution syntax I'm using isn't POSIX sh (it is available in bash, ksh, and zsh), so sort each file to a temporary file first if you need POSIX sh syntax.
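
A minimal sketch of that temporary-file variant, run on the question's sample data, might look like the following. The scratch directory, the .sorted file names, and the use of mktemp are assumptions for the demo. LC_ALL=C is added so that sort and join agree on collation.

```shell
#!/bin/sh
# Scratch directory; mktemp -d is an assumption (widely available, but check your system).
tmp=$(mktemp -d) || exit 1

# Sample inputs copied from the question.
cat > "$tmp/file1" <<'EOF'
"4742"  "209220_at"     2.60700394801826
"104"   "209396_s_at"   2.60651442103297
"749"   "202409_at"     2.59424724783704
"4168"  "209875_s_at"   2.58773204877464
"3973"  "1431_at"       2.52832098784342
"1826"  "207201_s_at"   2.41685345240968
EOF
cat > "$tmp/file2" <<'EOF'
"653"   "1431_at"       2.14595534191867
"1109"  "207201_s_at"   2.13777517447307
"353"   "212531_at"     2.12706340284672
"381"   "206535_at"     2.11456707231618
"1846"  "204534_at"     2.10919474441178
EOF

# Sort each file on field 2 into a temporary file instead of using <(...).
# The b modifier skips leading blanks so the order matches what join expects.
LC_ALL=C sort -k 2b,2 "$tmp/file1" > "$tmp/file1.sorted"
LC_ALL=C sort -k 2b,2 "$tmp/file2" > "$tmp/file2.sorted"

# -1 2 -2 2 is the portable spelling of -j2.
merged=$(LC_ALL=C join -1 2 -2 2 -o 1.1,1.2,1.3,2.1,2.2,2.3 \
    "$tmp/file1.sorted" "$tmp/file2.sorted")
printf '%s\n' "$merged"

rm -rf "$tmp"
```

With this data the script prints the two merged lines the question asks for, separated by single spaces.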

197

answered Sep 24 '22 02:09

jamessan

awk '
  # store the first file, indexed by col2
  NR==FNR {f1[$2] = $0; next}
  # output only if file1 contains file2's col2
  ($2 in f1) {print f1[$2], $0}
' file1 file2
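
To see the NR==FNR idiom at work, the sketch below runs the same awk program on a cut-down copy of the question's data. The scratch directory and the trimmed, single-space inputs are assumptions made for this demo, and mktemp is assumed to be available.

```shell
#!/bin/sh
# Scratch directory for the demo inputs (hypothetical, trimmed from the question).
d=$(mktemp -d) || exit 1

printf '%s\n' '"4742" "209220_at" 2.60700394801826' \
              '"3973" "1431_at" 2.52832098784342' \
              '"1826" "207201_s_at" 2.41685345240968' > "$d/file1"
printf '%s\n' '"1109" "207201_s_at" 2.13777517447307' \
              '"653" "1431_at" 2.14595534191867' \
              '"353" "212531_at" 2.12706340284672' > "$d/file2"

# NR==FNR is true only while the first file is being read, so file1 is
# loaded into the array f1; every file2 line is then looked up by column 2.
pairs=$(awk '
  NR==FNR {f1[$2] = $0; next}
  ($2 in f1) {print f1[$2], $0}
' "$d/file1" "$d/file2")
printf '%s\n' "$pairs"

rm -rf "$d"
```

No sorting is needed with this approach, and the output follows the order of file2. If a key occurs more than once in file1, only the last line with that key is kept in the array.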

answered Sep 24 '22 02:09

glenn jackman
