I have two lists of files that I want to diff. The second list has more files in it, and because both lists are in alphabetical order, diffing them reports files (lines) that exist in both lists but at different positions.
I want to diff these two lists while ignoring each line's position in the list. This way I would get only the new or missing lines.
Thank you.
Explanation: When two files are identical, the diff command does not produce any output; you simply get the shell prompt ($) back. However, the -s option makes diff print an informative message when the files are identical.
The Linux diff command compares two files line by line and displays the differences between them. This command-line utility lists the changes you need to apply to make the files identical.
You can try this approach which involves "subtracting" the two lists as follows:
$ cat file1
a.txt
b.txt
c.txt
$ cat file2
a.txt
a1.txt
b.txt
b2.txt
1) Print everything in file2 that is not in file1, i.e. file2 - file1:
$ grep -vxFf file1 file2
a1.txt
b2.txt
2) Print everything in file1 that is not in file2, i.e. file1 - file2:
$ grep -vxFf file2 file1
c.txt
(You can then do what you want with these diffs, e.g. write them to a file, sort them, etc.)
grep options descriptions:
-v, --invert-match select non-matching lines
-x, --line-regexp force PATTERN to match only whole lines
-F, --fixed-strings PATTERN is a set of newline-separated strings
-f, --file=FILE obtain PATTERN from FILE
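The two subtractions above can be combined into one small helper. This is a minimal sketch, assuming a POSIX shell with grep, sed and mktemp available; the function name list_diff and the "+ " / "- " prefixes are illustrative and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical helper: list_diff OLD NEW
# Prints lines found only in NEW with a "+ " prefix,
# then lines found only in OLD with a "- " prefix.
list_diff() {
    ld_old=$1
    ld_new=$2
    # NEW - OLD: whole-line (-x), fixed-string (-F) patterns read from OLD (-f)
    grep -vxFf "$ld_old" "$ld_new" | sed 's/^/+ /'
    # OLD - NEW: the same subtraction in the other direction
    grep -vxFf "$ld_new" "$ld_old" | sed 's/^/- /'
    return 0
}

# Demo with the same contents as file1 and file2 above
demo_dir=$(mktemp -d)
printf '%s\n' a.txt b.txt c.txt > "$demo_dir/file1"
printf '%s\n' a.txt a1.txt b.txt b2.txt > "$demo_dir/file2"
list_diff "$demo_dir/file1" "$demo_dir/file2"
rm -rf "$demo_dir"
```

Neither list needs to be sorted for this to work, since grep checks every line of one file against the whole set of lines in the other.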
Do the following:
cat file1 file2 | sort | uniq -u
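This will give you a list of lines which are unique (i.e., not duplicated). Note that a line repeated within a single file is also dropped, so this assumes neither list contains duplicates of its own. It also does not tell you which file each line came from.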
Explanation:
1) cat file1 file2 will put all of the entries into one list
2) sort will sort the combined list
3) uniq -u will only output the entries which don't have duplicates
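If either list might contain the same line more than once, the pipeline above would silently drop that line. One possible workaround, sketched here under the assumption of a POSIX shell, is to de-duplicate each file before combining them; the function name sym_diff is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: sym_diff FILE1 FILE2
# Prints lines that appear in exactly one of the two files,
# even when a file repeats a line internally.
sym_diff() {
    # sort -u removes duplicates inside each file first, so any line
    # still appearing twice in the combined stream is in both files.
    { sort -u "$1"; sort -u "$2"; } | sort | uniq -u
}

# Demo: c.txt is listed twice in the first file
demo_dir=$(mktemp -d)
printf '%s\n' a.txt b.txt c.txt c.txt > "$demo_dir/file1"
printf '%s\n' a.txt a1.txt b.txt b2.txt > "$demo_dir/file2"
sym_diff "$demo_dir/file1" "$demo_dir/file2"
rm -rf "$demo_dir"
```

With this input the plain cat | sort | uniq -u pipeline would omit c.txt, because it sees two copies of it, whereas the sketch reports it as present in only one file.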
Use the comm command:
To demonstrate, let's create two input files:
$ cat <<EOF >a
> a.txt
> b.txt
> c.txt
> EOF
$ cat <<EOF >b
> a.txt
> a1.txt
> b.txt
> b2.txt
> EOF
Now, using the comm command to get what the question wanted:
$ comm -3 a b
	a1.txt
	b2.txt
c.txt
This shows a columnar output with missing files (lines in a but not in b) in the first column and extra files (lines in b but not in a) in the second column.
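comm expects both inputs to be sorted in the collation order of the current locale. If the lists are not already sorted, one option is to sort them first and run everything in the C locale so that sort and comm agree on the ordering. The following is a rough sketch under those assumptions; the helper name unsorted_comm and the demo file names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: unsorted_comm FLAGS FILE1 FILE2
# Sorts both lists in the C locale, then runs comm with the given flags.
unsorted_comm() {
    uc_dir=$(mktemp -d) || return 1
    LC_ALL=C sort "$2" > "$uc_dir/1"
    LC_ALL=C sort "$3" > "$uc_dir/2"
    LC_ALL=C comm "$1" "$uc_dir/1" "$uc_dir/2"
    uc_status=$?
    rm -rf "$uc_dir"
    return "$uc_status"
}

# Demo: the same entries as files a and b, but in arbitrary order
demo_dir=$(mktemp -d)
printf '%s\n' c.txt a.txt b.txt > "$demo_dir/old.list"
printf '%s\n' b2.txt a.txt b.txt a1.txt > "$demo_dir/new.list"
unsorted_comm -3 "$demo_dir/old.list" "$demo_dir/new.list"
rm -rf "$demo_dir"
```

The temporary copies keep the original lists untouched; in bash, process substitution would be a shorter alternative to the temporary directory.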
What does comm do?
Here's the output if the command is typed without any switches:
$ comm a b
		a.txt
	a1.txt
		b.txt
	b2.txt
c.txt
This shows three columns thus:
1) lines in a but not in b
2) lines in b but not in a
3) lines in both a and b
What the numbered switches -1, -2 and -3 do is hide the specified column from the output.
So for example:
-12 results in common lines only
-13 results in lines only in b
-23 results in lines only in a
-3 results in the symmetric difference
-123 results in no output
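Suppressing two columns at a time gives each category as a plain list without tab indentation, which is easier to feed into other commands. A small sketch, assuming both inputs are already sorted in the C locale; the function name report_changes and the section labels are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report_changes OLD NEW
# Both files must already be sorted (C locale order is assumed here).
report_changes() {
    echo "Removed:"
    LC_ALL=C comm -23 "$1" "$2"   # column 1 only: lines only in OLD
    echo "Added:"
    LC_ALL=C comm -13 "$1" "$2"   # column 2 only: lines only in NEW
    echo "Unchanged:"
    LC_ALL=C comm -12 "$1" "$2"   # column 3 only: lines in both
}

# Demo with the same contents as files a and b above
demo_dir=$(mktemp -d)
printf '%s\n' a.txt b.txt c.txt > "$demo_dir/a"
printf '%s\n' a.txt a1.txt b.txt b2.txt > "$demo_dir/b"
report_changes "$demo_dir/a" "$demo_dir/b"
rm -rf "$demo_dir"
```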