
Ridiculously poor performance with OpenSSL AES/GCM on Raspberry PI 2

I wrote a simple C++ program to benchmark OpenSSL AES/GCM through the EVP interface. It takes a 1024-byte string, encrypts it with a key, then encrypts the result with the same key, and so on, again and again. I am using incrementing 4-byte initialization vectors.

When I tested it on my Macbook Pro (Intel i7) the result was quite impressive: it took exactly one second to run 1048576 iterations of the above procedure on a single core. That's 1 GB/s of encryption throughput, or 8 GB/s (more or less) if we make use of all the cores simultaneously.

Then I ported the same benchmark to a Raspberry PI 2. When I ran it, however, it took 0.16 seconds to do 1024 iterations. That's more or less 6 MB/s, on a single core.

Now, I obviously understand that there's a huge, huge difference between a modern, costly i7 processor and the small ARM processor that runs on a Raspberry, but that still makes the i7 about 170 times faster. So before assuming that the Raspberry PI 2 is really that bad, I wanted to check whether those figures are reasonable.

Has anybody done some kind of benchmark on that? Is a 6 MB/s encryption speed reasonable on a Raspberry? Or am I doing something wrong?

(I am powering it via my Macbook's USB port: could it be so slow because it is not receiving enough power? That doesn't sound plausible, since I would expect it not to power on at all. Or could there be a downclocking mechanism to save power?)

UPDATE 1: I ran openssl speed -evp aes-256-cbc on both my Macbook and the Raspberry.

On the Macbook:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc     534591.95k   564057.62k   566522.81k   570717.87k   574876.33k

On the Raspberry:

type             16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc      14288.53k    16653.74k    17165.31k    17298.43k    17337.00k

That's still a factor of 33, but the Intel processor can make use of hardware-accelerated AES instructions. Still, as far as I know GCM mode should be considerably faster than CBC. I couldn't find an openssl benchmark specifically for GCM, but even assuming the two modes perform identically, I am missing a factor of 3.

UPDATE 2: I checked this page: http://elinux.org/RPi_Performance#OpenSSL. It looks like I am missing another 10 MB/s. Grand total: 27 MB/s with AES/CBC (as it should be) vs 6 MB/s with AES/GCM (as it actually is).

Matteo Monti asked Jul 09 '15 01:07



2 Answers

Your Intel CPU has dedicated hardware support for AES through the AES-NI extension. If you compile without it, you will get about 250 MB/s. That difference in performance sounds reasonable. (And the clock speed of a CPU tells you nothing about its performance, unless you are comparing the exact same CPU type at different clocks.)

Josef says Reinstate Monica answered Oct 08 '22 09:10



If you haven't already accounted for it, why wouldn't the factor of 3 be explained by the factor of roughly 3 difference in processing power?

The Raspberry Pi 2 has a 900 MHz processor, and a typical i7 processor runs at 2.8 GHz, which results in the Pi having roughly a third of the processing power.

Also, I don't know where you're getting the conclusion that GCM should be faster than CBC under these circumstances. CBC doesn't provide authentication, so that alone should make GCM measurably slower (though perhaps not the factor of 4 you're seeing).

Of course, that goes out the window when you bring in multiple cores, given that CBC encryption cannot be parallelized and GCM can.

Andrew Michael Felsher answered Oct 08 '22 09:10
