
How to remove common lines between two files without sorting? [duplicate]

I have two unsorted files which have some lines in common.

file1.txt

Z
B
A
H
L

file2.txt

S
L
W
Q
A

The way I currently remove the common lines is the following:

sort -u file1.txt > file1_sorted.txt
sort -u file2.txt > file2_sorted.txt

comm -23 file1_sorted.txt file2_sorted.txt > file_final.txt

Output:

B
H
Z

The problem is that I want to keep the order of file1.txt, I mean:

Desired output:

Z
B
H

One solution I thought of is a loop that reads each line of file2.txt and runs:

sed -i "/^${line_file2}\$/d" file1.txt

But if the files are big, the performance may suck, since sed rewrites file1.txt once for every line of file2.txt.
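For reference, a minimal sketch of that loop. It assumes GNU sed (for -i without a suffix) and that no line of file2.txt contains regex metacharacters or slashes. The scratch directory and the copy named file_final.txt are only there so the example does not modify real files.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Recreate the sample files from the question.
printf '%s\n' Z B A H L > file1.txt
printf '%s\n' S L W Q A > file2.txt

# Edit a copy so file1.txt itself stays intact.
cp file1.txt file_final.txt

while IFS= read -r line_file2; do
    # Double quotes let the shell expand the variable;
    # \$ keeps the end-of-line anchor literal for sed.
    sed -i "/^${line_file2}\$/d" file_final.txt
done < file2.txt

loop_result=$(cat file_final.txt)
printf '%s\n' "$loop_result"
```

Each iteration rereads and rewrites the whole file, so the cost grows with the product of the two file sizes.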

  • Do you like my idea?
  • Do you have an alternative way to do it?
mllamazares asked Jun 20 '14 09:06


People also ask

How do I find the common line between two files?

Use comm -12 file1 file2 to get the lines common to both files. The files need to be sorted for comm to work as expected. Alternatively, with grep, add the -x option to match whole lines only; the -F option tells grep to treat each pattern as a fixed string rather than a regex.


2 Answers

You can use just grep (-v to invert the match, -f to read patterns from a file). This prints the lines of input1 that do not match any line of input2:

grep -vf input2 input1 

Gives:

Z
B
H
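
With -f alone, every line of the pattern file is treated as a regex that may match anywhere in a line. A hedged variant, assuming lines should be compared literally and in full, adds -x and -F. The sketch below recreates the question's sample files in a scratch directory; partial1.txt and partial2.txt are hypothetical extra data that only illustrate the substring pitfall.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Recreate the sample files from the question.
printf '%s\n' Z B A H L > file1.txt
printf '%s\n' S L W Q A > file2.txt

# -v invert the match, -x match whole lines only,
# -F fixed strings instead of regexes, -f read patterns from file2.txt.
strict=$(grep -vxFf file2.txt file1.txt)
printf '%s\n' "$strict"

# Hypothetical data: the pattern "A" is a substring of the line "AB".
printf '%s\n' AB A > partial1.txt
printf '%s\n' A > partial2.txt

# Without -x, "AB" is dropped too because it contains "A".
# grep exits non-zero when it prints nothing, hence the "|| true".
loose=$(grep -vf partial2.txt partial1.txt || true)

# With -x and -F, only the exact line "A" is removed.
exact=$(grep -vxFf partial2.txt partial1.txt)
printf 'loose=[%s] exact=[%s]\n' "$loose" "$exact"
```

The order of the first file is preserved in both forms, because grep filters its input in a single pass.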
perreal answered Oct 07 '22 20:10


grep or awk:

awk 'NR==FNR{a[$0]=1;next}!a[$0]' file2 file1
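
To show how the one-liner works, here is the same logic spread over several lines with comments. This is a sketch run against the question's sample files in a scratch directory; the array name seen and the variable awk_result are illustrative choices, not part of the original answer.

```shell
# Work in a scratch directory so no real files are touched.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1

# Recreate the sample files from the question.
printf '%s\n' Z B A H L > file1.txt
printf '%s\n' S L W Q A > file2.txt

awk_result=$(awk '
    # NR counts records across all files, FNR restarts for each file,
    # so NR == FNR is true only while reading the first file (file2.txt).
    NR == FNR { seen[$0] = 1; next }

    # Second file (file1.txt): print the line unless it was recorded above.
    !($0 in seen)
' file2.txt file1.txt)

printf '%s\n' "$awk_result"
```

This reads each file once and compares whole lines literally, so the order of file1.txt is kept. One caveat: if file2.txt is empty, NR == FNR also holds for file1.txt and nothing is printed.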
Kent answered Oct 07 '22 21:10