I have two unsorted files that have some lines in common.
file1.txt
Z
B
A
H
L
file2.txt
S
L
W
Q
A
This is how I currently remove the common lines:
sort -u file1.txt > file1_sorted.txt
sort -u file2.txt > file2_sorted.txt
comm -23 file1_sorted.txt file2_sorted.txt > file_final.txt
Output:
B
H
Z
The problem is that I want to keep the order of file1.txt, I mean:
Desired output:
Z
B
H
One solution I thought of is to loop over all the lines of file2.txt and run:
sed -i "/^${line_file2}\$/d" file1.txt
But if files are big the performance may suck.
Use comm -12 file1 file2 to get the lines common to both files. comm needs both files to be sorted to work as expected. Alternatively, use grep: add the -x option so that a pattern must match the whole line, and the -F option so that grep treats each pattern as a fixed string rather than a regular expression.
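As a rough illustration of the two approaches above, here is a minimal sketch using the sample data from the question. The scratch directory and the variable names (workdir, common_sorted, common_in_order) are only assumptions for the demo.

```shell
# Recreate the sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' Z B A H L > "$workdir/file1.txt"
printf '%s\n' S L W Q A > "$workdir/file2.txt"

# comm needs sorted input, so its result comes back in sorted order.
sort -u "$workdir/file1.txt" > "$workdir/file1_sorted.txt"
sort -u "$workdir/file2.txt" > "$workdir/file2_sorted.txt"
common_sorted=$(comm -12 "$workdir/file1_sorted.txt" "$workdir/file2_sorted.txt")

# grep: -F fixed strings, -x whole-line match, -f read patterns from a file.
# No sorting is needed, and lines are reported in the order of file1.txt.
common_in_order=$(grep -Fxf "$workdir/file2.txt" "$workdir/file1.txt")

printf 'comm -12:\n%s\n' "$common_sorted"
printf 'grep -Fxf:\n%s\n' "$common_in_order"
```

Both commands list the common lines here. The difference is that grep reads the files as they are, which matters when the original order has to be kept.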
You can use just grep (-v for invert, -f for file). Grep lines from input1 that do not match any line in input2:
grep -vf input2 input1
Gives:
Z
B
H
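One caveat: with plain -f, every line of input2 is treated as a regular expression and may match anywhere inside a line. A stricter variant, sketched here under the assumption that whole-line comparison is what is wanted, adds -x and -F. The extra line ALPHA is a hypothetical addition that is not in the question; it is there only to make the difference visible.

```shell
# Sample data from the question, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' Z B A H L > "$workdir/input1"
printf '%s\n' S L W Q A > "$workdir/input2"

# Whole-line, fixed-string exclusion; keeps the order of input1.
kept=$(grep -vxFf "$workdir/input2" "$workdir/input1")
printf '%s\n' "$kept"

# Hypothetical extra line: it contains "A" and "L" but is not equal to either.
printf '%s\n' ALPHA >> "$workdir/input1"

# Plain -vf drops ALPHA because the pattern "A" matches inside it.
loose=$(grep -vf "$workdir/input2" "$workdir/input1")
# -vxFf keeps ALPHA because no line of input2 equals it exactly.
strict=$(grep -vxFf "$workdir/input2" "$workdir/input1")

printf 'grep -vf:\n%s\n' "$loose"
printf 'grep -vxFf:\n%s\n' "$strict"
```

For the single-letter sample data both forms give the same result, so the stricter flags only matter once lines can contain each other or include regex metacharacters.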
grep or awk:
awk 'NR==FNR{a[$0]=1;next}!a[$0]' file2 file1
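For readers unfamiliar with the NR==FNR idiom, here is a commented sketch of the same idea, run against the sample data from the question. The array name seen and the scratch directory are only illustrative choices, and the membership test ($0 in seen) is a small substitution for !a[$0] that avoids creating an array entry for every line looked up.

```shell
# Sample data from the question, written to a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' Z B A H L > "$workdir/file1"
printf '%s\n' S L W Q A > "$workdir/file2"

result=$(awk '
    # NR counts lines across all files, FNR restarts for each file,
    # so NR == FNR holds only while reading the first file (file2).
    NR == FNR { seen[$0] = 1; next }
    # Second file (file1): print each line that was never recorded.
    !($0 in seen)
' "$workdir/file2" "$workdir/file1")

printf '%s\n' "$result"
```

This reads each file once and keeps only the lines of file2 in memory, so it should scale much better than one sed call per line. One thing to watch for: if file2 is empty, NR == FNR also holds for file1 and nothing is printed.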