Calculating an MD5 hash (RFC 1321 compliant) in MATLAB via Java

I want to calculate MD5 (or other) file hashes (RFC 1321 compliant) within MATLAB using the Java security implementations. So I wrote

mddigest=java.security.MessageDigest.getInstance('MD5');
filestream=java.io.FileInputStream(java.io.File(filename));
digestream=java.security.DigestInputStream(filestream,mddigest);
md5hash=reshape(dec2hex(typecast(mddigest.digest,'uint8')),1,[])

and the routine runs without errors. However, the result differs from what other tools report.
Could the file encoding be the problem? Shouldn't MATLAB handle that internally?
I'd like to reproduce the results of md5sum (on Linux), which match those of HashCalc (Windows).

Bastian Ebeling asked Aug 27 '12 10:08


2 Answers

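There are two problems:

  1. You never read the file, so no data passes through the DigestInputStream and the digest is computed over empty input.
  2. You have to transpose the character matrix before reshaping it: dec2hex returns one row per byte, and reshape reads the matrix column by column.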

This code works:

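mddigest   = java.security.MessageDigest.getInstance('MD5');
filestream = java.io.FileInputStream(java.io.File(filename));
digestream = java.security.DigestInputStream(filestream,mddigest);

% Read the stream to its end so that every byte passes through the digest.
while(digestream.read() ~= -1) end
digestream.close();

% Transpose so that reshape emits the two hex digits of each byte together.
md5hash=reshape(dec2hex(typecast(mddigest.digest(),'uint8'))',1,[]);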
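
The effect of the transpose can be seen on a small example. This is only a sketch: the three byte values are made up for illustration and are not a real digest, and the variable names are hypothetical.

```matlab
% Hypothetical sample bytes, not a real digest.
bytes = uint8([1; 171; 16]);

% One row per byte, two hex digits per row: ['01'; 'AB'; '10'].
hexRows = dec2hex(bytes, 2);

% reshape walks the matrix column by column, so without the transpose
% the first digits of all bytes come before the second digits.
wrong = reshape(hexRows, 1, []);    % '0A11B0'
right = reshape(hexRows', 1, []);   % '01AB10'

disp(wrong);
disp(right);
```

Only the transposed version keeps the two digits of each byte next to each other, which is the order md5sum prints.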

Edit: p.vitzliputzli posted a much faster solution, which should be used instead of this one.

Stéphane Pinchaux answered Sep 30 '22 00:09


Stéphane's solution works but is quite slow, because MATLAB cannot supply a Java byte[] array to the read method of the DigestInputStream (or any other InputStream), so the file has to be read one byte per call.

However, we can adapt Thomas Pornin's solution (discarding the FileInputStream) in order to arrive at:

mddigest   = java.security.MessageDigest.getInstance('MD5');

bufsize = 8192;

fid = fopen(filename);
if fid == -1
    error('Could not open file: %s', filename);
end

% Feed the file to the digest in chunks of up to bufsize bytes.
while ~feof(fid)
    [currData,len] = fread(fid, bufsize, '*uint8');
    if ~isempty(currData)
        mddigest.update(currData, 0, len);
    end
end

fclose(fid);

hash = reshape(dec2hex(typecast(mddigest.digest(),'uint8'))',1,[]);

This solution takes about 0.018 s to compute the hash of a 713 kB file, whereas the other solution takes about 31 s.
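
To check that this approach really matches RFC 1321, one option is to hash a known input from the RFC's test suite before trusting it on files. The following is a minimal sketch, not part of the original answer: it hashes the three bytes of 'abc' in memory instead of reading a file, and it adds two assumptions of mine, an explicit width of 2 for dec2hex and a call to lower, since dec2hex produces uppercase digits while md5sum prints lowercase.

```matlab
% Sanity check against the RFC 1321 test suite: MD5("abc").
mddigest = java.security.MessageDigest.getInstance('MD5');

data = uint8('abc');                      % bytes 97 98 99
mddigest.update(data, 0, numel(data));

% The explicit width 2 guarantees two hex digits per byte;
% lower() converts the uppercase output of dec2hex to md5sum's style.
raw  = typecast(mddigest.digest(), 'uint8');
hash = lower(reshape(dec2hex(raw, 2)', 1, []));

expected = '900150983cd24fb0d6963f7d28e17f72';   % value listed in RFC 1321
assert(strcmp(hash, expected), 'MD5 self-test failed');
disp(hash);
```

If the assertion passes, any remaining mismatch with md5sum is more likely to come from how the file is read or from the letter case of the compared strings than from the digest itself.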

p.vitzliputzli answered Sep 30 '22 02:09