
Diff and intersection reporting between two text files

Disclaimer: I am new to programming and scripting in general, so please excuse the lack of technical terms.

I have two text files, each containing a list of names:

First File | Second File
bob        | bob
mark       | mark
larry      | bruce
tom        | tom

I would like to run a script (preferably Python) that writes the lines common to both files to one text file and the lines that differ to another text file, for example:

matches.txt:

bob 
mark 
tom 

differences.txt:

bruce

How would I accomplish this with Python? Or with a Unix command line, if it's easy enough?

Mark Halpern asked Nov 27 '22 03:11



2 Answers

sort | uniq is good, but comm might be even better. Note that comm expects both input files to be sorted already. See "man comm" for more information.

From the manual page:

EXAMPLES
       comm -12 file1 file2
              Print only lines present in both file1 and file2.

       comm -3 file1 file2
              Print lines in file1 not in file2, and vice versa.

You can also use the Python set type, but comm is easier.
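For completeness, here is a minimal sketch of the set-based approach in Python. It assumes that "differences" means the lines of the second file that are missing from the first, as in the question's example (which lists bruce but not larry). The function names and the default output file names are illustrative choices, not anything prescribed by the question.

```python
from pathlib import Path


def read_lines(path):
    """Return the non-empty lines of a text file, stripped of surrounding whitespace."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def compare(first_lines, second_lines):
    """Split second_lines into (matches, differences) relative to first_lines.

    Assumption: a "difference" is a line of the second file that does not
    occur anywhere in the first file. Order follows the second file.
    """
    first = set(first_lines)  # set membership tests are fast
    matches = [name for name in second_lines if name in first]
    differences = [name for name in second_lines if name not in first]
    return matches, differences


def write_report(first_path, second_path,
                 matches_path="matches.txt",
                 differences_path="differences.txt"):
    """Compare two files and write the two result files (hypothetical names)."""
    matches, differences = compare(read_lines(first_path), read_lines(second_path))
    Path(matches_path).write_text("".join(n + "\n" for n in matches), encoding="utf-8")
    Path(differences_path).write_text("".join(n + "\n" for n in differences), encoding="utf-8")
```

Calling write_report("first.txt", "second.txt") would then produce the two output files. Unlike comm, this does not need sorted input. If lines unique to either file are wanted instead, the symmetric difference set(first_lines) ^ set(second_lines) gives that.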

dstromberg answered Dec 07 '22 01:12



Unix shell solution:

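# lines that occur more than once across both files (the matches,
# provided neither file repeats a line internally)
sort text1.txt text2.txt | uniq -d

# lines that occur exactly once across both files (unique to either file)
sort text1.txt text2.txt | uniq -u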
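
A rough Python counterpart of the two pipelines above, as a sketch only: it counts how often each line occurs across both inputs with collections.Counter. The function name uniq_report is made up for illustration, and Python's sorted() orders by code point, which may differ from the locale-dependent order of sort.

```python
from collections import Counter


def uniq_report(first_lines, second_lines):
    """Mimic `sort a b | uniq -d` and `sort a b | uniq -u` on two lists of lines."""
    # Total number of occurrences of each line across both inputs.
    counts = Counter(first_lines) + Counter(second_lines)
    repeated = sorted(line for line, n in counts.items() if n > 1)  # like uniq -d
    single = sorted(line for line, n in counts.items() if n == 1)   # like uniq -u
    return repeated, single
```

With the names from the question, the second list contains both bruce and larry, because this approach reports lines unique to either file rather than only those missing from the first.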
suspectus answered Dec 06 '22 23:12
