I often need to compare two files while ignoring certain changes within them. I don't want to ignore entire lines, just a portion of each line. The most common case is timestamps on the lines, but there are a couple dozen other patterns that I need to ignore too.
File1:
[2012-01-02] Some random text foo
[2012-01-02] More output here
File2:
[1999-01-01] Some random text bar
[1999-01-01] More output here
In this example, I want to see the difference on line number 1, but not on line number 2.
Using diff's -I option will not work because it ignores the entire line. Ideal output:
--- file1 2013-04-05 13:39:46.000000000 -0500
+++ file2 2013-04-05 13:39:56.000000000 -0500
@@ -1,2 +1,2 @@
-[2012-01-02] Some random text foo
+[1999-01-01] Some random text bar
 [2012-01-02] More output here
I can pre-process these files with sed:
sed -e's/^\[....-..-..\]//' < file1 > file1.tmp
sed -e's/^\[....-..-..\]//' < file2 > file2.tmp
diff -u file1.tmp file2.tmp
but then I need to put those temporary files somewhere, and remember to clean them up afterwards. Also, my diff output no longer refers to the original filenames, and no longer emits the original lines.
Is there a widely available variant of diff, or a similar tool, that can do this as a single command?
You can use process substitution (available in bash, zsh, and ksh) to avoid creating and cleaning up temporary files. The syntax is as follows:
$ diff <(command with output) <(other command with output)
In your case:
diff -u <(sed -e 's/^\[....-..-..\]//' file1) <(sed -e 's/^\[....-..-..\]//' file2)
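To make this reusable, and to get the original filenames back into the diff header, the command can be wrapped in a small function. This is only a sketch: the function name `diff_ignoring_dates` is made up for illustration, it needs a shell with process substitution (bash is assumed here), and it assumes GNU diff, whose `--label` option replaces the filename and timestamp in the header.

```shell
#!/usr/bin/env bash

# Hypothetical helper: compare two files while ignoring a leading
# [YYYY-MM-DD] stamp on each line. Assumes bash and GNU diff.
diff_ignoring_dates() {
    # --label puts the original filenames in the ---/+++ header
    # instead of the /dev/fd/NN paths created by process substitution.
    diff -u --label "$1" --label "$2" \
        <(sed -e 's/^\[....-..-..\]//' "$1") \
        <(sed -e 's/^\[....-..-..\]//' "$2")
}
```

Call it as `diff_ignoring_dates file1 file2`. More patterns can be handled by adding further `-e` expressions to both sed calls. Note that the output still shows the filtered lines rather than the original ones, so this covers the temporary-file and filename concerns but not that one.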
Hope this helps.