How long does SHA-1 take to create hashes?

Tags: java, c++, php, hash, sha1

Roughly how long does it take, and how much processing power is required, to create SHA-1 hashes of data? Does this differ a lot depending on the original data size? Would generating the hash of a standard HTML file take significantly longer than hashing the string "blah"? How would C++, Java, and PHP compare in speed?

asked Aug 14 '11 22:08 by Test Tester


3 Answers

You've asked a lot of questions, so I'll try to answer each one in turn.

SHA-1, like many other hashes designed to be cryptographically strong, works by repeatedly applying a compression function (built from a block-cipher-like routine) to fixed-size blocks of data. Consequently, computing the hash of a long string takes proportionally more time than computing the hash of a short one. Mathematically, we say that the runtime to hash a string of length N is O(N) when using SHA-1. Hashing an HTML document should therefore take longer than hashing the string "blah", but only in proportion to its size. It won't take dramatically longer to do the hash.
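
To get a feel for the O(N) behavior on your own machine, here is a minimal timing sketch. It uses Python's standard `hashlib` purely for illustration (not one of the three languages asked about); the helper name `seconds_per_sha1` and the roughly 50 KB stand-in for an HTML page are assumptions of mine, and the absolute numbers will vary by machine.

```python
import hashlib
import time


def seconds_per_sha1(data: bytes, repeats: int = 200) -> float:
    """Return the average wall-clock seconds taken by one SHA-1 of `data`."""
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.sha1(data).digest()
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    short = b"blah"
    # Hypothetical stand-in for an HTML page: 2600 lines of 19 bytes, about 50 KB.
    html = b"<p>lorem ipsum</p>\n" * 2600
    for label, payload in (("blah", short), ("html", html)):
        micros = seconds_per_sha1(payload) * 1e6
        print(f"{label}: {len(payload)} bytes, {micros:.2f} microseconds per hash")
```

For very short inputs the measurement is dominated by per-call overhead rather than by the hash itself, so the ratio between the two timings will usually be smaller than the ratio between the two input sizes.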

As for comparing C++, Java, and PHP in terms of speed, this is dangerous territory and my answer is likely to get blasted, but generally speaking C++ is slightly faster than Java, which is slightly faster than PHP. That said, a well-written hash implementation in any one of those languages can dramatically outperform a poorly written one in another. However, you shouldn't need to worry about this. It is generally considered a bad idea to implement your own hash functions, encryption routines, or decryption routines, because hand-written versions are often vulnerable to side-channel attacks, in which an attacker breaks your security by exploiting incidental behavior of the implementation (timing differences, for example) that is extremely difficult to anticipate. If you want to use a good hash function, use a prewritten version. It's likely to be faster, safer, and less error-prone than anything you do by hand.

Finally, I'd suggest not using SHA-1 at all. SHA-1 has known cryptographic weaknesses, and you should consider using a stronger hash algorithm instead, such as SHA-256.
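
As a small illustration of "use a prewritten version", here is a sketch that calls the SHA-256 and SHA-1 implementations in Python's standard `hashlib` instead of hand-rolling anything. The wrapper names `sha256_hex` and `sha1_hex` are hypothetical helpers for this example, and encoding the text as UTF-8 is an assumption; Java (`MessageDigest`) and PHP (`hash`) ship comparable library routines.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Hash `text` with the library's SHA-256, encoding it as UTF-8 first."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sha1_hex(text: str) -> str:
    """Same idea with SHA-1, shown only for comparison with SHA-256."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print("SHA-256:", sha256_hex("blah"))
    print("SHA-1:  ", sha1_hex("blah"))
```

Switching algorithms is a one-line change, which is another reason to lean on the library rather than your own code.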

Hope this helps!

answered Oct 20 '22 06:10 by templatetypedef


The "speed" of cryptographic hash functions is often measured in "clock cycles per byte". See this page for an admittedly outdated comparison; you can see how implementation and architecture influence the results. The results vary widely not only with the algorithm being used but also with your processor architecture, the quality of the implementation, and whether the implementation uses the hardware efficiently. That's why some companies specialize in creating hardware especially well suited to performing certain cryptographic algorithms as efficiently as possible.

SHA-512 is a good example. Because it works on larger data chunks than SHA-256, one might be inclined to think that it should generally be slower than SHA-256. However, SHA-512 is especially well suited to 64-bit processors and sometimes performs even better than SHA-256 there.
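
If you want to check this on your own hardware, a rough benchmark along the following lines should do. This is only a sketch using Python's standard `hashlib`; the function name `throughput_mb_per_s`, the 4 MiB payload, and the repeat count are arbitrary choices of mine, and the ranking you get depends on your CPU and on the library build.

```python
import hashlib
import time


def throughput_mb_per_s(algorithm: str, payload: bytes, repeats: int = 20) -> float:
    """Return the measured hashing throughput in MB/s (1 MB = 10**6 bytes)."""
    start = time.perf_counter()
    for _ in range(repeats):
        hashlib.new(algorithm, payload).digest()
    elapsed = time.perf_counter() - start
    return len(payload) * repeats / elapsed / 1e6


if __name__ == "__main__":
    payload = bytes(4 * 1024 * 1024)  # 4 MiB of zero bytes
    for name in ("sha1", "sha256", "sha512"):
        print(f"{name}: {throughput_mb_per_s(name, payload):.0f} MB/s")
```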

All modern hash algorithms work on fixed-size blocks of data. They perform a fixed number of deterministic operations on a block, and do this for every block until you finally get the result. This also means that the longer your input, the longer the operation will take. From the characteristics just explained we can deduce that the duration of the operation is directly proportional to the input size of a message. Mathematically or computer-scientifically speaking, we call this an O(n) operation, where n is the input size of the message, as templatetypedef already pointed out.
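
The block-by-block processing is visible in most library APIs, which let you feed the input piece by piece. Here is a minimal sketch with Python's standard `hashlib`; the helper name `sha1_streaming` is hypothetical, and the 64-byte default simply mirrors SHA-1's block size (any chunk size gives the same digest).

```python
import hashlib

BLOCK_SIZE = 64  # SHA-1 consumes its input in 64-byte blocks


def sha1_streaming(data: bytes, chunk_size: int = BLOCK_SIZE) -> str:
    """Feed `data` to SHA-1 one chunk at a time and return the hex digest."""
    hasher = hashlib.sha1()
    for offset in range(0, len(data), chunk_size):
        hasher.update(data[offset:offset + chunk_size])
    return hasher.hexdigest()


if __name__ == "__main__":
    message = b"x" * 1000
    # Chunked and one-shot hashing agree, since the work is done per block.
    print(sha1_streaming(message) == hashlib.sha1(message).hexdigest())
```

Each call to `update` adds work proportional to the bytes supplied, which is where the O(n) total comes from.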

You should not let the speed of hashing influence your choice of programming language. All modern hash algorithms are really, really fast, regardless of the language. Although C-based implementations will do slightly better than Java, which in turn will probably be slightly faster than PHP, I bet that in practice you won't notice the difference.

answered Oct 20 '22 06:10 by emboss


SHA-1 processes the data in chunks of 64 bytes. The CPU time needed to hash a file of length n bytes is thus roughly equal to n/64 times the CPU time needed to process one chunk. For a short string, you must first convert the string to a sequence of bytes (SHA-1 works on bytes, not on characters); the string "blah" will become 4 or 8 bytes (if you use UTF-8 or UTF-16, respectively), so it will be hashed as a single chunk. Note that the conversion from characters to bytes may take more time than the hashing itself.
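
The chunk arithmetic can be sketched as follows. The exact count is slightly above n/64 because SHA-1 pads the message with one 0x80 byte plus an 8-byte length field before the last chunk; the function name `sha1_block_count` is a hypothetical helper, and Python is used only for illustration.

```python
def sha1_block_count(message_length: int) -> int:
    """Number of 64-byte chunks SHA-1 processes for a message of this many bytes.

    Padding adds one 0x80 byte and an 8-byte length field, so 9 extra bytes
    must fit before the total is rounded up to a whole number of chunks.
    """
    return (message_length + 9 + 63) // 64


if __name__ == "__main__":
    utf8 = "blah".encode("utf-8")       # 4 bytes
    utf16 = "blah".encode("utf-16-le")  # 8 bytes (no byte-order mark)
    for encoded in (utf8, utf16):
        print(len(encoded), "bytes ->", sha1_block_count(len(encoded)), "chunk(s)")
```

Note that Python's plain `"utf-16"` codec prepends a 2-byte byte-order mark, which is why the sketch uses `"utf-16-le"` to get exactly 8 bytes.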

Using the pure Java SHA-1 implementation from sphlib, on my PC (x86 Core2, 2.4 GHz, 64-bit mode), I can hash long messages at a throughput of 132 MB/s (that's using a single CPU core). Note that this exceeds the speed of a common hard disk, so when hashing a big file, chances are that the disk will be the bottleneck, not the CPU: the time needed to hash the file will be the time needed to read the file from the disk.

(Also, using native code written in C, SHA-1 speed goes up to 330 MB/s.)
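
To relate these figures to the "clock cycles per byte" metric and to the original question, here is a back-of-the-envelope calculation in Python. It assumes 1 MB = 10^6 bytes, a 50 KB "standard HTML file" (a size I picked for illustration), and throughput that stays constant regardless of input size, which ignores per-call overhead on tiny inputs.

```python
CLOCK_HZ = 2.4e9               # 2.4 GHz Core2, from the figures above
JAVA_SHA1_BYTES_PER_S = 132e6  # pure Java SHA-1 (sphlib)
C_SHA1_BYTES_PER_S = 330e6     # native C SHA-1


def cycles_per_byte(clock_hz: float, bytes_per_second: float) -> float:
    """Clock cycles spent per hashed byte at the given throughput."""
    return clock_hz / bytes_per_second


def seconds_to_hash(size_bytes: int, bytes_per_second: float) -> float:
    """Estimated CPU time to hash `size_bytes` at a steady throughput."""
    return size_bytes / bytes_per_second


if __name__ == "__main__":
    print(f"Java: {cycles_per_byte(CLOCK_HZ, JAVA_SHA1_BYTES_PER_S):.1f} cycles/byte")
    print(f"C:    {cycles_per_byte(CLOCK_HZ, C_SHA1_BYTES_PER_S):.1f} cycles/byte")
    html_size = 50_000  # assumed size of a typical HTML file, in bytes
    millis = seconds_to_hash(html_size, JAVA_SHA1_BYTES_PER_S) * 1e3
    print(f"50 KB HTML file in Java: about {millis:.2f} ms")
```

Under these assumptions the Java figure works out to about 18 cycles per byte, the C figure to about 7, and the HTML file to well under a millisecond.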

SHA-256 is considered to be much more secure than SHA-1, and a pure Java implementation of SHA-256 runs at 85 MB/s on my PC, which is still quite fast. As of 2011, SHA-1 is not recommended.

answered Oct 20 '22 07:10 by Thomas Pornin