I want to calculate MD5 (or other) file hashes that conform to RFC 1321 within MATLAB, using the Java security implementations. So I wrote:
mddigest=java.security.MessageDigest.getInstance('MD5');
filestream=java.io.FileInputStream(java.io.File(filename));
digestream=java.security.DigestInputStream(filestream,mddigest);
md5hash=reshape(dec2hex(typecast(mddigest.digest,'uint8')),1,[])
The routine runs without errors, but the result differs from what other tools report.
Maybe there are problems with the file encoding? Shouldn't MATLAB handle that internally?
I'd like to reproduce the results that md5sum gives (on Linux), which equal those from HashCalc (Windows).
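There are two problems: the original code never reads from the stream, so the digest is computed over no data, and the output of dec2hex must be transposed before reshape so that the hex digits come out in the right order.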
This code works:
mddigest = java.security.MessageDigest.getInstance('MD5');
filestream = java.io.FileInputStream(java.io.File(filename));
digestream = java.security.DigestInputStream(filestream,mddigest);
while(digestream.read() ~= -1) end
md5hash=reshape(dec2hex(typecast(mddigest.digest(),'uint8'))',1,[]);
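To see why the transpose and the read loop both matter, here is a minimal sketch. The variable names (bytes, hexRows, wrong, right, emptyHash) are illustrative only, and the explicit width argument to dec2hex is an added safeguard that is not in the answer above.

```matlab
% dec2hex returns one row of characters per byte.
bytes   = uint8([1; 171]);
hexRows = dec2hex(bytes, 2);          % 2-by-2 char matrix: ['01'; 'AB']

% reshape walks the matrix column by column, so without the
% transpose the digits of different bytes get interleaved.
wrong = reshape(hexRows,  1, []);     % '0A1B'
right = reshape(hexRows', 1, []);     % '01AB'

% If nothing is ever read through the digest, digest() returns the
% MD5 of empty input, whatever file was opened.
md = java.security.MessageDigest.getInstance('MD5');
emptyHash = reshape(dec2hex(typecast(md.digest(), 'uint8'), 2)', 1, []);
% emptyHash is 'D41D8CD98F00B204E9800998ECF8427E'
```

If the code from the question always prints this same value for every file, the stream was never consumed.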
Edit: p.vitzliputzli posted a much faster solution, which should be used instead of this one.
Stephane's solution works but is quite slow, because MATLAB cannot supply a Java byte[] array to the read method of the DigestInputStream (or any other InputStream), so the file has to be read one byte at a time.
However, we can adapt Thomas Pornin's solution (discarding the FileInputStream) in order to arrive at:
mddigest = java.security.MessageDigest.getInstance('MD5');
bufsize = 8192;
fid = fopen(filename);
while ~feof(fid)
    [currData,len] = fread(fid, bufsize, '*uint8');
    if ~isempty(currData)
        mddigest.update(currData, 0, len);
    end
end
fclose(fid);
hash = reshape(dec2hex(typecast(mddigest.digest(),'uint8'))',1,[]);
This solution takes about 0.018 s to compute the hash of a 713 kB file, whereas the other solution takes about 31 s.
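For reuse, the buffered approach could be wrapped in a function along the following lines. This is only a sketch: the name fileMd5 is hypothetical, and the error check, the onCleanup guard, the fixed-width dec2hex call and the lower call are additions that the answer above does not include. The lowercase conversion is there because md5sum prints lowercase hex, while dec2hex produces uppercase.

```matlab
function hash = fileMd5(filename)
% fileMd5  Hypothetical helper: lowercase MD5 hex digest of a file.
    mddigest = java.security.MessageDigest.getInstance('MD5');
    bufsize  = 8192;

    fid = fopen(filename, 'r');
    if fid == -1
        error('fileMd5:cannotOpen', 'Cannot open file: %s', filename);
    end
    closer = onCleanup(@() fclose(fid));   % close the file even on error

    % Feed the file to the digest in chunks of up to bufsize bytes.
    while ~feof(fid)
        [currData, len] = fread(fid, bufsize, '*uint8');
        if ~isempty(currData)
            mddigest.update(currData, 0, len);
        end
    end

    % Two hex digits per byte, transposed so the digits stay in byte order.
    bytes = typecast(mddigest.digest(), 'uint8');
    hash  = lower(reshape(dec2hex(bytes, 2)', 1, []));
end
```

Called as hash = fileMd5('somefile.bin'), the result can be compared directly with the first column of md5sum's output. For other hashes, the algorithm name passed to getInstance can be changed, for example to 'SHA-1' or 'SHA-256'.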