Is using 2 different hash functions a good way to check for file integrity?

I have a website where users can upload their files; these are stored on the server and their metadata is recorded in a database. I'm implementing some simple integrity checks, i.e. "is the content of this file now byte-for-byte identical to what it was when it was uploaded?"

An example: for the content of userfile.jpg, the MD5 hash is 39f9031a154dc7ba105eb4f76f1a0fd4 and the SHA-1 hash is 878d8d667721e356bf6646bd2ec21fff50cdd4a9. If this file's content changes, but has the same MD5 hash before and after, is it probable that the SHA-1 hash will also stay the same? (With hashing, sometimes you can get a hash collision - could this happen with two different hashing algorithms at once?)

Or is computing two different hashes for a file pointless (and I should try some other mechanism for verifying integrity)?


Edit: I'm not really worried about accidental corruption, but I'm supposed to prevent users changing the file unnoticed (birthday attack and friends).

I'll probably go with one hash, SHA-512 - the checks don't happen often enough to be a performance bottleneck, and anyway, as @MichaelGG put it in the comments: "As Bruce Schneier says, there's enough fast, insecure systems out there already."

Piskvor left the building, asked Feb 11 '09 17:02




1 Answer

MD5 is probably safe for what you're doing, but there's no reason to continue to use a hash with known flaws. In fact, there's no reason you shouldn't be using SHA-256 or SHA-512, unless you have some known major performance bottleneck.

Edit: To clarify, there's no reason to use two algorithms; just use one that fits what you need. If you're worried about people producing MD5 collisions against you (as in, is this a security threat?), then use an algorithm that isn't as weak, such as SHA-256.
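As a rough illustration of the single-hash approach, here is a minimal Python sketch using the standard hashlib module. The function names (file_digest, is_unchanged), the chunk size, and the idea that the upload-time digest is kept as a lowercase hex string in the database are assumptions made for this example, not something from the question.

```python
import hashlib
import hmac


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of the file at `path`.

    The file is read in chunks so large uploads are not loaded
    into memory all at once. `algorithm` can be e.g. "sha512".
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def is_unchanged(path, stored_hex_digest, algorithm="sha256"):
    """Compare the file's current digest with the one recorded at upload.

    Assumes `stored_hex_digest` is the lowercase hex string that
    file_digest() produced when the file was first stored.
    """
    current = file_digest(path, algorithm)
    return hmac.compare_digest(current, stored_hex_digest)
```

At upload time you would store file_digest(path) alongside the file's metadata; a later check is then is_unchanged(path, stored_value). Switching to SHA-512 only means passing algorithm="sha512" in both places.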

Edit 2: To address an apparently still common misunderstanding: finding a random collision on an n-bit hash does not take on the order of 2^n attempts. It takes closer to 2^(n/2). So a collision on a 128-bit hash can probably be found with about 2^64 attempts. See birthday attack for details.
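To put numbers on the birthday bound, here is a small Python sketch. It uses the standard approximation p ≈ 1 - exp(-k(k-1) / 2^(n+1)) for k random outputs of an idealized n-bit hash; the function names are made up for this example. It only models random collisions and says nothing about the much cheaper structural attacks known against MD5.

```python
import math


def collision_probability(n_bits, attempts):
    """Approximate chance of at least one collision among `attempts`
    random outputs of an ideal n-bit hash (birthday approximation)."""
    exponent = attempts * (attempts - 1) / 2 ** (n_bits + 1)
    return -math.expm1(-exponent)  # equals 1 - exp(-exponent)


def attempts_for_half(n_bits):
    """Approximate number of random inputs needed for a 50% chance
    of a collision: sqrt(2 ln 2) * 2^(n/2)."""
    return math.sqrt(2 * math.log(2)) * 2 ** (n_bits / 2)


if __name__ == "__main__":
    for bits in (128, 160, 256, 512):
        k = attempts_for_half(bits)
        print(f"{bits}-bit hash: about 2^{math.log2(k):.2f} attempts for 50%")
```

For a 128-bit hash, collision_probability(128, 2**64) comes out at roughly 0.39, which is why 2^64 rather than 2^128 is the relevant work factor.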

MichaelGG, answered Sep 21 '22 17:09
