Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

HashCode vs SHA-1

Tags:

java

hashcode

I'd like to compare some large objects representing trees and cache something to avoid comparing each time the new object with one already existing...

The question is what would be the best something ? (a compromise between performance and collisions...).

On the one hand, I have a regular hashCode function based on the value of various fields (following the chapter 3 of effective Java. But I'm not able to evaluate the potential collisions entailed by such an approach.

On the other hand, I have the MessageDigest approach from the standard java distribution with SHA-1 algorithm. I presume it's not going to be efficient but I may have less collision. Am I right ? Is it a correct solution in my context or am I completely wrong ?

The thing is that I don't know what would be the size of the objects. Please also note that the value computed is not going to be used in a HashTable.

thx...

like image 731
LB40 Avatar asked May 12 '09 15:05

LB40


People also ask

Is SHA1 still used?

In cryptography, SHA-1 (Secure Hash Algorithm 1) is a cryptographically broken but still widely used hash function which takes an input and produces a 160-bit (20-byte) hash value known as a message digest – typically rendered as 40 hexadecimal digits.

How was SHA1 cracked?

A proof-of-concept attack has been pioneered that “fully and practically” breaks the Secure Hash Algorithm 1 (SHA-1) code-signing encryption, used by legacy computers to sign the certificates that authenticate software downloads and prevent man-in-the-middle tampering.

Is SHA1 faster than MD5?

The message digest length in MD5 can be up to 128 bits. The message digest length for SHA1 can be up to 160 bits. MD5 is faster than SHA.

How long is a SHA-1 hash?

The hash size for the SHA1 algorithm is 160 bits.


1 Answers

Because of the birthday problem, the chance of a collision depends on how many items you are working with.

The 160-bit space of SHA-1 is so large that I doubt you could ever have enough items to see a collision.

The 32-bit space of hashCode() should not have a significant number of collisions until you have over 50,000 items. However, this depends on using a good hash algorithm.

In order to apply a cryptographic digest like SHA-1, you'll need to convert your graph to a string of bytes, which is likely to be computationally expensive, and could be complicated.

like image 184
erickson Avatar answered Oct 04 '22 13:10

erickson