
Inner join on two text files

Tags: linux, bash, join

Looking to perform an inner join on two different text files. Basically I'm looking for the inner join equivalent of the GNU join program. Does such a thing exist? If not, an awk or sed solution would be most helpful, but my first choice would be a Linux command.

Here's an example of what I'm looking to do:

file 1:

0|Alien Registration Card LUA|Checklist Update
1|Alien Registration Card LUA|Document App Plan
2|Alien Registration Card LUA|SA Application Nbr
3|Alien Registration Card LUA|tmp_preapp-DOB
0|App - CSCE Certificate LUA|Admit Type
1|App - CSCE Certificate LUA|Alias 1
2|App - CSCE Certificate LUA|Alias 2
3|App - CSCE Certificate LUA|Alias 3
4|App - CSCE Certificate LUA|Alias 4

file 2:

Alien Registration Card LUA

Results:

0|Alien Registration Card LUA|Checklist Update
1|Alien Registration Card LUA|Document App Plan
2|Alien Registration Card LUA|SA Application Nbr
3|Alien Registration Card LUA|tmp_preapp-DOB

Dave Snigier asked Nov 07 '12 15:11


4 Answers

Here's an awk option, so you can avoid the bash dependency (for portability):

$ awk -F'|' 'NR==FNR{check[$0];next} $2 in check' file2 file1

How does this work?

  • -F'|' -- sets the field separator to the pipe character.
  • NR==FNR{check[$0];next} -- if the total record number equals the per-file record number (i.e. we're still reading the first file listed), store the whole line as a key in the check array and skip to the next record.
  • $2 in check -- if the second field is a key in that array, print the line (printing is the default action when a pattern has no action block).
  • file2 file1 -- the input files. The order matters because of the NR==FNR construct: the file holding the keys must come first.
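
To see the two passes in action, here is a minimal sketch that runs the same awk program on a cut-down copy of the question's data. The temporary directory and the `result` variable are only there for illustration and are not part of the original answer.

```shell
# Illustrative sample data, shortened from the question's file 1 and file 2.
dir=$(mktemp -d)
printf '%s\n' \
  '0|Alien Registration Card LUA|Checklist Update' \
  '1|Alien Registration Card LUA|Document App Plan' \
  '0|App - CSCE Certificate LUA|Admit Type' > "$dir/file1"
printf '%s\n' 'Alien Registration Card LUA' > "$dir/file2"

# First pass (file2): remember every line as an array key.
# Second pass (file1): print rows whose 2nd field is one of those keys.
result=$(awk -F'|' 'NR==FNR { check[$0]; next } $2 in check' \
  "$dir/file2" "$dir/file1")
printf '%s\n' "$result"

rm -rf "$dir"
```

With this input, only the two "Alien Registration Card LUA" rows are printed, in the order they appear in file1.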

ghoti answered Oct 27 '22 07:10


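Shouldn't file2 contain LUA at the end?

If so, you can still use join:

join -t'|' -1 2 -o 1.1,1.2,1.3 <(sort -t'|' -k2,2 file1) <(sort file2)

join expects both inputs to be sorted on the join field, hence the two sorts (the <(...) process substitution needs bash). Without -o, join prints the join field first instead of keeping file1's column order.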
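
The same idea can be sketched without process substitution, using temporary files so it also runs under plain sh. The sample rows, the `.sorted` file names, the `result` variable and the `LC_ALL=C` setting are assumptions made for the illustration; `-o 1.1,1.2,1.3` is added so the output keeps file1's column order.

```shell
# Use one fixed collation so sort and join agree on ordering.
LC_ALL=C
export LC_ALL

# Illustrative sample data, deliberately unsorted on field 2.
dir=$(mktemp -d)
printf '%s\n' \
  '0|App - CSCE Certificate LUA|Admit Type' \
  '0|Alien Registration Card LUA|Checklist Update' \
  '1|Alien Registration Card LUA|Document App Plan' > "$dir/file1"
printf '%s\n' 'Alien Registration Card LUA' > "$dir/file2"

# join needs both inputs sorted on their join field.
sort -t'|' -k2,2 "$dir/file1" > "$dir/file1.sorted"
sort "$dir/file2" > "$dir/file2.sorted"

# -1 2           join on field 2 of the first file (field 1 of the second is the default)
# -o 1.1,1.2,1.3 print the first file's three fields in their original order
result=$(join -t'|' -1 2 -o 1.1,1.2,1.3 "$dir/file1.sorted" "$dir/file2.sorted")
printf '%s\n' "$result"

rm -rf "$dir"
```

Note that the output follows the sorted order of file1, not its original line order.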

choroba answered Oct 27 '22 07:10


Looks like you just need

grep -F -f file2 file1

glenn jackman answered Oct 27 '22 05:10


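You could adapt this script:

# Read file2 line by line; IFS= and -r keep each line exactly as written.
while IFS= read -r line; do
    # Quote "$line" so a key containing spaces stays a single fixed-string pattern.
    grep -F -- "$line" file1 # or whatever you want to do with the $line variable
done < file2

The while loop reads file2 line by line and passes each line to grep, which searches file1 for it. If it prints more than you want, grep options can narrow the matches.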

hcg answered Oct 27 '22 06:10