
diff while ignoring patterns within a line, but not the entire line

I often need to compare two files while ignoring certain changes within them. I don't want to ignore entire lines, just a portion of each line. The most common case is timestamps at the start of each line, but there are a couple dozen other patterns that I need to ignore too.

File1:

[2012-01-02] Some random text foo
[2012-01-02] More output here

File2:

[1999-01-01] Some random text bar
[1999-01-01] More output here

In this example, I want to see the difference on line number 1, but not on line number 2.

Using diff's -I option will not work because it ignores the entire line. Ideal output:

--- file1       2013-04-05 13:39:46.000000000 -0500
+++ file2       2013-04-05 13:39:56.000000000 -0500
@@ -1,2 +1,2 @@
-[2012-01-02] Some random text foo
+[1999-01-01] Some random text bar
 [2012-01-02] More output here

I can pre-process these files with sed:

sed -e's/^\[....-..-..\]//' < file1 > file1.tmp
sed -e's/^\[....-..-..\]//' < file2 > file2.tmp
diff -u file1.tmp file2.tmp

but then I need to put those temporary files somewhere, and remember to clean them up afterwards. Also, my diff output no longer refers to the original filenames, and no longer emits the original lines.

Is there a widely available variant of diff, or a similar tool, that can do this as a single command?

asked Apr 05 '13 18:04 by Eric


1 Answer

You can use process substitution (supported by bash, zsh, and ksh) to avoid creating and cleaning up temporary files. The syntax is as follows:

$ diff <(command with output) <(other command with output)

In your case:

diff -u <(sed -e 's/^\[....-..-..\]//' file1) <(sed -e 's/^\[....-..-..\]//' file2)
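One thing to be aware of: with process substitution the diff header shows names like /dev/fd/63 instead of your file names. If that matters, here is a minimal sketch of a wrapper that relabels the output. It assumes bash and a diff that supports `--label` (GNU diff does); the names `strip_noise` and `diff_ignoring` are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash (process substitution) and GNU diff (--label).

# Hypothetical helper: strips the ignorable parts of each line.
# Add one -e expression per pattern you want to ignore.
strip_noise() {
    sed -e 's/^\[....-..-..\]//' "$1"
}

# Hypothetical wrapper: diffs the filtered streams, but labels them
# with the original file names in the unified-diff header.
diff_ignoring() {
    diff -u --label "$1" --label "$2" \
        <(strip_noise "$1") <(strip_noise "$2")
}
```

Call it as `diff_ignoring file1 file2`. Note that the hunks still show the filtered lines, not the original ones, so this covers the temp-file and filename concerns but not the wish to see the original lines.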

Hope this helps.

answered Sep 18 '22 11:09 by Stan Prokop