
What is the simplest way to compare two Word documents in Java and get a precise result?

I have two Word documents that I am trying to compare in Java. I tried using an MD5 hash code:

// Guava: hashes the raw bytes of each file, including any embedded metadata
HashCode newFile = Files.asByteSource(newFileInput).hash(Hashing.md5());
HashCode oldFile = Files.asByteSource(oldFileInput).hash(Hashing.md5());

and also using:

// Apache Commons IO: byte-for-byte comparison of two java.io.File arguments
boolean isEqual = FileUtils.contentEquals(oldFile, newFile);

Even though the contents are the same (I compared them using online tools and Beyond Compare), both of the methods above report a MISMATCH.

Are there any solutions, or a way to compare any file type using an API in Java? I need to do a deep comparison of two Word files, covering spaces, fonts, content, etc.

Expected result: both files should match.

backToStack asked Nov 22 '25 20:11



1 Answer

Even if both of your documents look the same, or even contain the same formatted content, a slight change such as a different last-modified date will make a byte-level comparison fail. Plain-text formats such as JSON are easier to compare, but Word documents are binary containers (.doc is a binary format and .docx is a ZIP archive of XML parts), so the smallest change can alter the file's bytes completely.
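To illustrate the point about metadata, here is a minimal sketch that uses only the JDK. It assumes the files are .docx, which are ZIP archives; the class and method names (ZipTimestampDemo, zipWithTimestamp) and the sample XML are made up for the demo and are not part of any library. It builds two in-memory archives whose single entry has identical content but a different stored timestamp, then hashes both with MD5.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

class ZipTimestampDemo {

    // Builds a one-entry ZIP in memory. The entry name mimics a .docx part;
    // only the stored entry timestamp varies between calls.
    static byte[] zipWithTimestamp(String text, long epochMillis) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            ZipEntry entry = new ZipEntry("word/document.xml");
            entry.setTime(epochMillis);
            zip.putNextEntry(entry);
            zip.write(text.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // MD5 over the raw bytes, comparable to what a whole-file hash sees.
    static byte[] md5(byte[] data) {
        try {
            return MessageDigest.getInstance("MD5").digest(data);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Same text, different timestamps (hypothetical values).
        byte[] first = zipWithTimestamp("<w:t>Hello</w:t>", 1_600_000_000_000L);
        byte[] second = zipWithTimestamp("<w:t>Hello</w:t>", 1_700_000_000_000L);
        System.out.println("same MD5: " + Arrays.equals(md5(first), md5(second)));
    }
}
```

The two archives hold the same text, yet their bytes are expected to differ because the timestamp is stored in the ZIP headers. A whole-file hash or a byte-for-byte comparison has no way to tell that kind of difference apart from a real content change.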

So you have to do it the hard way: find a library that reads the content of the Word files, then compare the content of both files yourself.
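As a rough starting point, the sketch below compares two files part by part instead of byte by byte, using only java.util.zip. It assumes both files are .docx (it will not work for legacy .doc files), and it assumes that docProps/core.xml and docProps/app.xml hold the volatile metadata and can be skipped. DocxPartComparer, sameContent, and fakeDocx are hypothetical names, and fakeDocx builds a tiny stand-in archive for the demo, not a valid Word file.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class DocxPartComparer {

    // Assumption: these parts carry metadata such as the last-modified date.
    static final Set<String> IGNORED_PARTS =
            new HashSet<>(Arrays.asList("docProps/core.xml", "docProps/app.xml"));

    // Reads each ZIP entry into memory, keyed by part name. ZIP header
    // timestamps are never read, so they cannot affect the comparison.
    static Map<String, byte[]> readParts(InputStream docx) {
        Map<String, byte[]> parts = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            byte[] chunk = new byte[8192];
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                if (e.isDirectory() || IGNORED_PARTS.contains(e.getName())) {
                    continue;
                }
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                for (int n = zip.read(chunk); n != -1; n = zip.read(chunk)) {
                    content.write(chunk, 0, n);
                }
                parts.put(e.getName(), content.toByteArray());
            }
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return parts;
    }

    // True when both archives contain the same part names with identical bytes.
    static boolean sameContent(byte[] firstDocx, byte[] secondDocx) {
        Map<String, byte[]> a = readParts(new ByteArrayInputStream(firstDocx));
        Map<String, byte[]> b = readParts(new ByteArrayInputStream(secondDocx));
        if (!a.keySet().equals(b.keySet())) {
            return false;
        }
        for (Map.Entry<String, byte[]> part : a.entrySet()) {
            if (!Arrays.equals(part.getValue(), b.get(part.getKey()))) {
                return false;
            }
        }
        return true;
    }

    // Demo helper (hypothetical): a two-part archive standing in for a .docx.
    static byte[] fakeDocx(long zipTimeMillis, String body, String modified) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            writePart(zip, "word/document.xml", body, zipTimeMillis);
            writePart(zip, "docProps/core.xml", modified, zipTimeMillis);
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
        return buffer.toByteArray();
    }

    private static void writePart(ZipOutputStream zip, String name, String text, long time)
            throws IOException {
        ZipEntry entry = new ZipEntry(name);
        entry.setTime(time);
        zip.putNextEntry(entry);
        zip.write(text.getBytes(StandardCharsets.UTF_8));
        zip.closeEntry();
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 2) {
            // Real usage: pass the paths of two .docx files.
            byte[] first = Files.readAllBytes(Paths.get(args[0]));
            byte[] second = Files.readAllBytes(Paths.get(args[1]));
            System.out.println("same parts: " + sameContent(first, second));
            return;
        }
        byte[] oldDoc = fakeDocx(1_600_000_000_000L, "<w:t>Hello</w:t>", "2020");
        byte[] newDoc = fakeDocx(1_700_000_000_000L, "<w:t>Hello</w:t>", "2023");
        System.out.println("same bytes: " + Arrays.equals(oldDoc, newDoc));
        System.out.println("same parts: " + sameContent(oldDoc, newDoc));
    }
}
```

This only removes the differences caused by ZIP timestamps and the skipped metadata parts. Word may also write revision identifiers (rsid attributes) and similar bookkeeping into the remaining XML, so two documents that look the same can still fail this check. For a comparison at the level of text, spacing, and fonts, you would need to parse the XML or use a Word-aware library such as Apache POI.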

Milgo answered Nov 25 '25 10:11



