How can it be impossible to "decrypt" an MD5 hash? [duplicate]

Tags: hash, md5, encryption

Possible Duplicate:
How come MD5 hash values are not reversible?

I was reading a question about MD5, and it reminded me of something that has always puzzled me. It's a very simple question, and I'm sorry if it's not a good one. I just can't understand how you can convert something into something else using an algorithm and yet have no way to convert it back by running the algorithm in reverse.

So how is this possible?

Also, since multiple strings can produce the same MD5 hash, because the hash contains less data than the input string, how would any other hashing system be any better?

439

asked Apr 27 '10 00:04

Rob

1 Answer

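It's because the output of MD5 contains less information than the input: MD5 accepts input of any length but always produces a 128-bit output. This is what distinguishes a hash algorithm from an encryption algorithm, which has to preserve all of the information so that the data can be decrypted.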
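To make the fixed output size concrete, here is a minimal sketch using Python's standard `hashlib` module. The helper name `md5_hex` and the two sample inputs are just illustrative choices, not anything from the question.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 digest of `data` as a hexadecimal string."""
    return hashlib.md5(data).hexdigest()


# Illustrative inputs: one byte versus one million bytes.
short_input = b"a"
long_input = b"a" * 1_000_000

for label, data in (("short", short_input), ("long", long_input)):
    digest = md5_hex(data)
    # Each hex character encodes 4 bits, so 32 characters are 128 bits.
    print(label, len(data), "bytes ->", len(digest) * 4, "bits:", digest)
```

Whatever the input length, the digest is 32 hex characters, so a million bytes of input cannot all be recoverable from it.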

Here's a simple example: imagine an algorithm to compute the hash of a 10-digit number. The algorithm is "return the last 2 digits." If I take the hash of 8023798734, I get 34, but if all you had was the 34, you would have no way to tell what the original number was, because the hashing algorithm discarded 8 digits' worth of information. It's similar with MD5, except that the hash is computed via a complex procedure instead of just chopping off part of the data.
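The "last 2 digits" toy hash can be sketched in a few lines of Python. The function name `last_two_digits` is made up for illustration, and the count assumes "10-digit number" means 1000000000 through 9999999999 (no leading zeros).

```python
def last_two_digits(n: int) -> int:
    """Toy hash from the example: keep only the last 2 digits."""
    return n % 100


original = 8023798734
digest = last_two_digits(original)
print(digest)

# A few other 10-digit numbers that hash to the same value.
others = [1000000034, 5555555534, 9999999934]
print([last_two_digits(n) for n in others])

# Assumption: 10-digit numbers run from 1000000000 to 9999999999,
# so there are 9 * 10**9 inputs spread evenly over 100 outputs.
inputs_per_digest = (9 * 10**9) // 100
print(inputs_per_digest, "different inputs share each hash value")
```

With that many candidates for every output, the hash alone cannot tell you which one was the original.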

So then how can one hash be better than another? For one thing, different hash algorithms can be more or less resistant to collisions (when two inputs produce the same output). The probability of a collision is inversely related to the number of possible hash outputs. Collisions are an undesirable feature of hashes because if your data changes, you want the hash to change too, so one way to get a better hash algorithm is to use a hash with more possible outputs. In the digits example above, taking the last 4 digits instead of the last 2 digits reduces the probability that a random input matches a given hash (such an input is technically called a preimage) to 1 in 10000 instead of 1 in 100, so it's more likely that all the 10-digit numbers in whatever set you have will have different hash values.
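As a rough illustration of the effect of a larger output space, the sketch below hashes the same random sample with the last 2 digits and with the last 4 digits, then counts how many distinct hash values come out. The helper names, the sample size of 50, and the fixed seed are arbitrary assumptions for the demonstration.

```python
import random


def last_digits_hash(n: int, digits: int) -> int:
    """Toy hash: keep only the last `digits` digits of n."""
    return n % 10**digits


def count_distinct(numbers, digits: int) -> int:
    """Number of different hash values produced by `numbers`."""
    return len({last_digits_hash(n, digits) for n in numbers})


# Arbitrary sample of 10-digit numbers; the seed only makes runs repeatable.
rng = random.Random(0)
sample = [rng.randrange(10**9, 10**10) for _ in range(50)]

for digits in (2, 4):
    distinct = count_distinct(sample, digits)
    print(f"last {digits} digits: {distinct} distinct hashes "
          f"for {len(sample)} inputs")
```

The 4-digit version can never produce fewer distinct values than the 2-digit version on the same sample, because two numbers that agree in their last 4 digits also agree in their last 2.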

There's also the issue of cryptographic security. When you want to use a hash to make sure that some data has not been tampered with, it's desirable that whoever's doing the tampering can't predict what inputs will produce a given output. If they could, they would be able to alter the input data in such a way that the output (the hash) remains the same. Going back to the digits example again, let's say I'm going to email you the number 1879483129 and it is critically important that this number gets to you unaltered. I might call you up and tell you the hash of the number, which would be 29, but since the "last 2 digits" algorithm is not cryptographically secure, a nefarious hacker could change the number en route to, say, 5555555529 and you wouldn't know the difference.
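Here is the tampering scenario as a small Python sketch. The names `weak_hash` and `verify` are hypothetical; `verify` only stands in for the phone-call check described above.

```python
def weak_hash(n: int) -> int:
    """The insecure 'last 2 digits' hash from the example."""
    return n % 100


def verify(received: int, expected_hash: int) -> bool:
    """Check a received number against the hash given over the phone."""
    return weak_hash(received) == expected_hash


sent = 1879483129
announced_hash = weak_hash(sent)  # what I would tell you on the phone

tampered = 5555555529  # altered en route, but crafted to end in 29

print(verify(sent, announced_hash))      # the genuine number passes
print(verify(tampered, announced_hash))  # the forged number passes too
```

Both checks succeed, which is exactly the problem: the attacker can trivially predict which inputs produce the hash 29.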

It's been shown that MD5 is not cryptographically secure (and SHA-1 is also compromised). Specifically, it is practical to construct two different inputs that produce the same MD5 hash, so a matching hash no longer proves that the data is what the sender intended. MD5 is still a fine algorithm for protecting against random bit flips and the like, but if there's a chance someone might want to intentionally corrupt your data, you should really use something more secure, like SHA-256 or greater, probably as part of an HMAC scheme.
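For what an HMAC scheme with SHA-256 might look like, here is a minimal sketch using Python's standard `hmac` and `hashlib` modules. The function names `sign` and `verify` and the key value are placeholders of my own; in practice the key would be a random secret that both parties already share.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> str:
    """Compute an HMAC-SHA256 tag for `message` as a hex string."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(key: bytes, message: bytes, tag: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(sign(key, message), tag)


# Placeholder key for illustration only; use a random shared secret.
key = b"hypothetical-shared-secret"

message = b"1879483129"
tag = sign(key, message)

print(verify(key, message, tag))        # genuine message
print(verify(key, b"5555555529", tag))  # altered message
```

Unlike the plain "last 2 digits" check, an attacker who does not know the key cannot compute a valid tag for an altered message.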

178

answered Sep 29 '22 10:09

David Z
