Expected performance of MD5 calculation in javascript?

I am trying out MD5 calculation in JavaScript. According to the "fastest MD5 Implementation in JavaScript" post, the 'JKM' implementation is supposed to be one of the faster ones. I am using SparkMD5, which is based on the JKM implementation. However, the example provided at https://github.com/satazor/SparkMD5/blob/master/test/readme_example.html takes about 10 seconds for a 13 MB file (~23 seconds with the debugger), while the same file takes only 0.03 seconds with md5sum on the Linux command line. Are these results unusually slow for a JavaScript implementation, or is this poor performance expected?

Nickolay Kondratyev asked Mar 04 '15 02:03


1 Answer

It is expected.

First, I don't think I need to tell you that JAVASCRIPT IS SLOW. Yes, even with modern JIT optimization, JavaScript is still slow compared with native code such as md5sum.

To show you that it is not your JS implementation's fault, I will run some comparisons in Node.js, so that browser and DOM overhead doesn't get in the way of the benchmark.

Test file generation:

$ dd if=/dev/zero of=file bs=6M count=1

(My server has only 512 MB of RAM, and Node.js couldn't handle anything larger than 6 MB on it.)

Script:

//var md5 = require('crypto-js/md5')
var md5 = require('MD5')
//var md5 = require('spark-md5').hash
//var md5 = require('blueimp-md5').md5

require('fs').readFile('file', 'utf8', function(e, b) {  // Using string here to be fair for all md5 engines
  console.log(md5(b))
})

(Uncomment one require line at a time to switch between the libraries being benchmarked.)

The results, with file-reading overhead subtracted:

  • MD5: 5.250s - 0.072s = 5.178s
  • crypto-js/md5: 4.914s - 0.072s = 4.842s
  • Blueimp: 4.904s - 0.072s = 4.832s
  • MD5 with Node.js binary buffer instead of string: 1.143s - 0.063s = 1.080s
  • spark: 0.311s - 0.072s = 0.239s
  • md5sum: 0.023s - 0.003s = 0.020s
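
To make the comparison easier to read, the figures above can be turned into throughput and a slowdown factor relative to md5sum. This is only a sketch of the arithmetic: the timings are copied from the list, the 6 MB file size comes from the dd command, and the names `timings`, `summarize`, and `FILE_MB` are made up for this illustration.

```javascript
// Timings copied from the list above: [total seconds, read overhead seconds].
const timings = {
  'MD5': [5.250, 0.072],
  'crypto-js/md5': [4.914, 0.072],
  'Blueimp': [4.904, 0.072],
  'MD5 (Buffer)': [1.143, 0.063],
  'spark': [0.311, 0.072],
  'md5sum': [0.023, 0.003]
}

const FILE_MB = 6 // size of the test file created with dd

// For each contestant, compute net hashing time, throughput,
// and how many times slower it is than md5sum.
function summarize(timings) {
  const baseline = timings['md5sum'][0] - timings['md5sum'][1]
  const rows = {}
  for (const name of Object.keys(timings)) {
    const net = timings[name][0] - timings[name][1]
    rows[name] = {
      net: net,
      mbPerSec: FILE_MB / net,
      slowdown: net / baseline
    }
  }
  return rows
}

const rows = summarize(timings)
for (const name of Object.keys(rows)) {
  const r = rows[name]
  console.log(name + ': ' + r.net.toFixed(3) + 's, ' +
    r.mbPerSec.toFixed(1) + ' MB/s, ' + r.slowdown.toFixed(1) + 'x md5sum')
}
```

On these figures, spark-md5 takes roughly 12 times as long as md5sum, while the slowest JavaScript contestant takes roughly 260 times as long.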

So no, spark-md5 is in reality not bad at all.

When looking at the example HTML page, I saw that they are using the incremental API. So I did another benchmark:

var md5 = require('spark-md5')

var md5obj = new md5()
var chunkNum = 0

require('fs').createReadStream('file')
  .on('data', function (b) {
    chunkNum ++
    md5obj.append(b.toString())
  })
  .on('end', function () {
    console.log('total ' + chunkNum + ' chunks')
    console.log(md5obj.end())
  })

With 96 chunks, it is 0.313s.
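
The 96 chunks are consistent with the 64 KiB default chunk size of fs.createReadStream (6 MiB / 64 KiB = 96). As a sanity check that feeding data in chunks does not change the digest, here is a minimal sketch. Note that it swaps in Node's built-in crypto module for spark-md5 so that it runs without extra packages; the helper names `md5OneShot` and `md5Chunked` are hypothetical, and the chunk size is an assumption.

```javascript
const crypto = require('crypto')

// One-shot: hash the whole buffer in a single call.
function md5OneShot(buf) {
  return crypto.createHash('md5').update(buf).digest('hex')
}

// Incremental: feed the same bytes in fixed-size chunks.
function md5Chunked(buf, chunkSize) {
  const hash = crypto.createHash('md5')
  let chunks = 0
  for (let off = 0; off < buf.length; off += chunkSize) {
    hash.update(buf.subarray(off, off + chunkSize))
    chunks++
  }
  return { digest: hash.digest('hex'), chunks: chunks }
}

const data = Buffer.alloc(6 * 1024 * 1024) // 6 MiB of zeros, like the dd test file
const CHUNK = 64 * 1024 // assumed chunk size

const whole = md5OneShot(data)
const parts = md5Chunked(data, CHUNK)
console.log(parts.chunks + ' chunks, digests match: ' + (whole === parts.digest))
```

The incremental API mainly helps with memory use on large files; the total amount of hashing work stays the same, which fits the nearly identical timings above.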

So no, it is not the MD5 implementation's fault at all. Performance this poor is TBH a little surprising, but not impossible either, given that you are running this code in a browser.

BTW, my server is a DigitalOcean VPS with SSD. The file reading overhead is about 0.072s:

require('fs').readFile('file', 'utf8', function() {})

while with native cat it's about 0.003s.

For MD5 with native Buffer, the overhead is about 0.063s:

require('fs').readFile('file', function() {})
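
The answer does not say how these overheads were timed. One way to measure the two read variants in-process might look like the sketch below. It is an assumption-laden illustration, not the method used above: it writes its own 6 MiB scratch file of zeros to the temp directory (the path and the `timeSeconds` helper are hypothetical), uses the synchronous read calls for simplicity, and excludes Node.js startup time.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Run a synchronous function and return the elapsed time in seconds.
function timeSeconds(fn) {
  const start = process.hrtime.bigint()
  fn()
  return Number(process.hrtime.bigint() - start) / 1e9
}

// Hypothetical scratch file standing in for the dd-generated test file.
const file = path.join(os.tmpdir(), 'md5-bench-' + process.pid)
fs.writeFileSync(file, Buffer.alloc(6 * 1024 * 1024))

// Reading as a utf8 string includes decoding; reading as a Buffer does not.
const asString = timeSeconds(function () { fs.readFileSync(file, 'utf8') })
const asBuffer = timeSeconds(function () { fs.readFileSync(file) })
fs.unlinkSync(file)

console.log('utf8 string read: ' + asString.toFixed(3) + 's')
console.log('raw Buffer read:  ' + asBuffer.toFixed(3) + 's')
```

The numbers will vary with disk cache state and machine, so they are best compared against each other on the same run rather than against the figures quoted here.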

Timothy Gu answered Nov 15 '22 12:11