
Can md5 be broken up to run across multiple cores/threads?

Tags: c, md5

When calculating the md5 sum of large files, I see a single CPU core jump to 100% for however long it takes, leaving all other cores idle.

My rudimentary understanding of md5 is that the entire process is completely linear: each value depends on all previous values read, so there is nothing we can do to make it multi-threaded. Is this true?

Or is there a way to break the files into sections, calculate <something> over multiple parts using multi-cores, and then combine those <something> values into the final md5?

The library we're using to calculate the md5sum is http://libmd5-rfc.sourceforge.net/ but I'd switch to a different one if it were possible to split the md5sum calculation across multiple cores so it completes faster.

(Note: changing to something other than md5 is not the question, nor can it be done because of the other closed systems to which this interfaces. Nor is this question about the safety of using md5.)

Stéphane asked May 23 '12 19:05


1 Answer

No, you cannot break it apart at the file level. MD5 maintains a running state as it processes the data, so the result for each block depends on every block that came before it.
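To make the state dependency concrete, here is a minimal sketch in Python (chosen only because `hashlib` keeps the example self-contained; the same behaviour applies to any C MD5 library). It shows that feeding chunks sequentially into one MD5 context reproduces the whole-file digest, while hashing chunks independently and then combining the results does not. The function names, the 64 KiB chunk size, and the "hash the concatenated chunk digests" combine step are all assumptions made up for illustration; no combine step is known that recovers the standard MD5.

```python
import hashlib


def md5_whole(data: bytes) -> str:
    """MD5 of the entire buffer in one call."""
    return hashlib.md5(data).hexdigest()


def md5_streamed(data: bytes, chunk_size: int) -> str:
    """Feed chunks, in order, into ONE context. The context carries the
    running state from chunk to chunk, so the result matches md5_whole."""
    ctx = hashlib.md5()
    for start in range(0, len(data), chunk_size):
        ctx.update(data[start:start + chunk_size])
    return ctx.hexdigest()


def md5_of_chunk_digests(data: bytes, chunk_size: int) -> str:
    """Hypothetical 'parallel-friendly' scheme: hash each chunk on its own
    (these calls could run on separate cores), then hash the concatenated
    digests. This is a different function from MD5 of the data."""
    outer = hashlib.md5()
    for start in range(0, len(data), chunk_size):
        outer.update(hashlib.md5(data[start:start + chunk_size]).digest())
    return outer.hexdigest()


# 1 MiB of deterministic sample data (illustrative only).
data = bytes(range(256)) * 4096
chunk = 64 * 1024

print("whole file      :", md5_whole(data))
print("streamed chunks :", md5_streamed(data, chunk))
print("chunk digests   :", md5_of_chunk_digests(data, chunk))
print("streamed == whole     :", md5_streamed(data, chunk) == md5_whole(data))
print("chunk-digest == whole :", md5_of_chunk_digests(data, chunk) == md5_whole(data))
```

The per-chunk scheme only helps if every party that verifies the checksum agrees to use it, which the question rules out. With a fixed requirement for standard MD5, the practical gains usually come from elsewhere, such as reading the file on a separate thread from the one doing the hashing, or hashing several files concurrently.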

pizza answered Oct 17 '22 09:10