
How do I "diff" multiple files against a single base file?

Tags: diff, delphi

I have a configuration file that I consider to be my "base" configuration. I'd like to compare up to 10 other configuration files against that single base file. I'm looking for a report where each file is compared against the base file.

I've been looking at diff and sdiff, but they don't completely offer what I am looking for.

I've considered diffing the base against each file individually, but my problem then becomes merging those results into a single report. Ideally, if the same line is missing from all 10 config files (when compared to the base config), I'd like that reported in an easy-to-visualize manner.

Notice that some rows are missing in several of the config files (when compared individually to the base). I'd like to be able to put those on the same line (as above).

Note, the screenshot above is simply a mockup, and not an actual application.

I've looked at using some Delphi controls for this and writing my own (I have Delphi 2007), but if there is a program that already does this, I'd prefer it.

The Delphi controls I've looked at are TDiff, and the TrmDiff* components included in rmcontrols.

Mick, asked Apr 21 '09 14:04




2 Answers

For anyone still wondering how to do this, diffuse is the closest answer. It does an N-way merge by displaying all the files side by side and doing three-way merges among neighbors.

Seb64, answered Oct 03 '22 18:10


None of the existing diff/merge tools will do what you want. Based on your sample screenshot you're looking for an algorithm that performs alignments over multiple files and gives appropriate weights based on line similarity.

The first issue is weighting the alignment based on line similarity. Most popular alignment algorithms, including the ones used by GNU diff, TDiff, and TrmDiff, do an alignment based on line hashes, and just check whether the lines match exactly or not. You can pre-process the lines to remove whitespace or change everything to lower-case, but that's it. Add, remove, or change a letter and the alignment thinks the entire line is different. Any alignment of different lines at that point is purely accidental.
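As a rough illustration of that difference, here's a minimal sketch using Python's standard `difflib` module. The module and the `line_similarity` helper are my own choices for the example, not what any of the tools above use internally. An exact comparison only ever says "same" or "different", while a similarity score can tell a one-character edit from an unrelated line.

```python
import difflib


def line_similarity(a, b):
    """Return a character-level similarity score between 0.0 and 1.0.

    Hypothetical helper for illustration; it wraps difflib.SequenceMatcher.
    """
    return difflib.SequenceMatcher(None, a, b).ratio()


base_line = "timeout = 30"
edited_line = "timeout = 45"
unrelated_line = "[logging]"

# An exact (hash-style) comparison treats both lines as equally different.
print(base_line == edited_line)     # False
print(base_line == unrelated_line)  # False

# A similarity score separates the near match from the unrelated line.
print(line_similarity(base_line, edited_line))
print(line_similarity(base_line, unrelated_line))
```

Scoring every candidate pair of lines this way is much more expensive than comparing hashes, which is the trade-off mentioned below.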

Beyond Compare does take line similarity into account, but it really only works for 2-way comparisons. Compare It! also has some sort of similarity algorithm, but it is also limited to 2-way comparisons. Similarity matching can slow down the comparison dramatically, and I'm not aware of any other component or program, commercial or open source, that even tries.

The other issue is that you also want a multi-file comparison. That means either running the 2-way diff algorithm a bunch of times and stitching the results together or finding an algorithm that does multiple alignments at once.

Stitching will be difficult: your sample shows that the original file can have missing lines, so you'd need to compare every file to every other file to get a bunch of alignments, and then you'd need to work out the best way to match those alignments up. A naive stitching algorithm is pretty easy to write, but it will get messed up by trivial matches (blank lines, for example).
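To give an idea of what the naive version looks like, here's a minimal sketch in Python using the standard `difflib` module. It only diffs each file against the base (not every file against every other file), and it uses exact line matching, so it has exactly the weaknesses described above. The file names and the `presence_matrix` and `missing_report` helpers are hypothetical.

```python
import difflib


def presence_matrix(base_lines, other_files):
    """For each file, mark which base lines it matches in an aligned position.

    other_files maps a file name to that file's list of lines.
    """
    present = {name: [False] * len(base_lines) for name in other_files}
    for name, lines in other_files.items():
        matcher = difflib.SequenceMatcher(None, base_lines, lines, autojunk=False)
        for block in matcher.get_matching_blocks():
            for i in range(block.a, block.a + block.size):
                present[name][i] = True
    return present


def missing_report(base_lines, other_files):
    """Return (line number, base line, files missing it) for each gap."""
    present = presence_matrix(base_lines, other_files)
    report = []
    for i, line in enumerate(base_lines):
        missing = [name for name in other_files if not present[name][i]]
        if missing:
            report.append((i + 1, line, missing))
    return report


# Made-up sample data standing in for the real configuration files.
base = ["host=a", "port=80", "debug=off", "log=on"]
others = {
    "cfg1": ["host=a", "debug=off", "log=on"],
    "cfg2": ["host=a", "port=80", "log=on"],
    "cfg3": ["host=a", "log=on"],
}

for number, line, missing in missing_report(base, others):
    print(f"{number:>3}  {line:<12} missing in: {', '.join(missing)}")
```

With this sample data, base line 2 is reported as missing in cfg1 and cfg3, and base line 3 as missing in cfg2 and cfg3. Lines that exist in the other files but not in the base are ignored entirely, and repeated lines such as blanks can pull the alignment to the wrong place.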

There are research papers that cover aligning multiple sequences at once, but they're usually focused on DNA comparisons, so you'd definitely have to code it up yourself. Wikipedia covers a lot of the basics; after that you'd probably need to switch to Google Scholar.

  • Sequence alignment
  • Multiple sequence alignment
  • Gap penalty
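
To make the gap penalty idea a bit more concrete, here's a minimal sketch of pairwise global alignment (the Needleman-Wunsch dynamic programming approach from the sequence alignment literature) applied to lines of text. It is only a 2-way building block, not a multiple alignment. The `GAP_PENALTY` value and the way the `difflib` ratio is rescaled into a score are assumptions I picked for the example.

```python
import difflib

GAP_PENALTY = -0.5  # assumed value; would need tuning on real files


def similarity(a, b):
    """Score in [-1, 1]: identical lines score 1, unrelated lines go negative."""
    return 2.0 * difflib.SequenceMatcher(None, a, b).ratio() - 1.0


def align(xs, ys, gap=GAP_PENALTY):
    """Globally align two lists of lines; None marks a gap on that side."""
    n, m = len(xs), len(ys)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] + gap
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + similarity(xs[i - 1], ys[j - 1]),
                score[i - 1][j] + gap,  # line of xs aligned to a gap
                score[i][j - 1] + gap,  # line of ys aligned to a gap
            )
    # Trace back from the bottom-right corner to recover the pairing.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                score[i][j] == score[i - 1][j - 1] + similarity(xs[i - 1], ys[j - 1])):
            pairs.append((xs[i - 1], ys[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and (j == 0 or score[i][j] == score[i - 1][j] + gap):
            pairs.append((xs[i - 1], None))
            i -= 1
        else:
            pairs.append((None, ys[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


base = ["host=a", "port=80", "debug=off"]
other = ["host=a", "port=8080"]
for left, right in align(base, other):
    print(f"{left or '':<12} | {right or ''}")
```

Here `port=80` pairs up with `port=8080` because they are similar, and `debug=off` is left facing a gap. Extending this to many files at once is where the multiple sequence alignment material comes in.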

Zoë Peterson, answered Oct 03 '22 20:10