
Hashes (MD5, SHA1, SHA256, SHA384, SHA512) - why isn't it possible to get the value back from the hash?

In this blog post, there is the following statement:

This hash is unique for the given text. If you use the hash function on the same text again, you'll get the same hash. But there is no way to get the given text from the hash.

Forgive my ignorance of math, but I cannot understand why it is not possible to get the given text back from the hash.

I would understand if we use one key to encrypt the value and another to decrypt but I cannot figure it out in my mind. What is really going on here behind the scenes?

Anything that clears my mind will be appreciated.

tugberk asked Jan 13 '12 12:01


4 Answers

Hashing is not encryption.

A hash produces a "digest", a summary of the input. Whatever the input size, the hash size is always the same (MD5, for example, returns a 128-bit result for any input size).
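To make the fixed-size point concrete, here is a minimal sketch in C. It uses FNV-1a, a simple non-cryptographic hash, purely as a stand-in (an assumption for illustration; MD5 itself is not implemented here), and `digest_hex` is a hypothetical helper name. Whatever the input length, the digest is 32 bits, written as 8 hex characters:

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* FNV-1a, 32-bit: a simple NON-cryptographic hash, used here only as a
   stand-in for MD5/SHA to show that the output size never changes. */
uint32_t fnv1a32(const char *text)
{
    uint32_t h = 2166136261u;            /* FNV offset basis */
    for (size_t i = 0; text[i] != '\0'; i++) {
        h ^= (unsigned char)text[i];     /* mix in one input byte */
        h *= 16777619u;                  /* multiply by the FNV prime */
    }
    return h;
}

/* Hypothetical helper: write the digest of text as 8 hex characters
   plus a terminating NUL into out. */
void digest_hex(const char *text, char out[9])
{
    snprintf(out, 9, "%08" PRIx32, fnv1a32(text));
}
```

Calling `digest_hex` on a single character and on a whole paragraph fills the buffer with exactly 8 hex characters both times, so the length of the digest says nothing about the length of the input.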

With a hash, you can get the same hash from several different inputs (hash collisions) - how would you reverse this? Which is the correct input?

I suggest reading this blog post from Troy Hunt on the matter in order to gain better understanding of hashes, passwords and security.

Encryption is a different thing: the ciphertext depends on both the input and the key, and its size grows with the size of the input. Encryption is reversible if you have the right key.
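For contrast, here is a rough sketch of what "reversible with the right key" means. The XOR scheme below is a toy chosen only because it is short; it is not a secure cipher, and `xor_cipher` is a hypothetical name. The output is as long as the input, and applying the same function again with the same key restores the original bytes:

```c
#include <stddef.h>

/* Toy XOR "cipher": NOT secure, shown only because it is reversible.
   Each byte of buf is XORed in place with a repeating key byte.
   key_len must be greater than zero. */
void xor_cipher(unsigned char *buf, size_t len,
                const unsigned char *key, size_t key_len)
{
    for (size_t i = 0; i < len; i++) {
        buf[i] ^= key[i % key_len];
    }
}
```

Running `xor_cipher` once scrambles the buffer; running it a second time with the same key undoes the first pass. Nothing comparable exists for a hash, because the digest does not retain enough information to rebuild the input.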


Update (following the different comments):

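Though collisions can happen, with a cryptographic hash they are meant to be rare and difficult to produce. Of the ones you have posted about, this still holds for SHA256, SHA384 and SHA512; practical collision attacks are known for MD5 and SHA1, so those two should not be relied on for collision resistance.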

When hashing passwords, always use a salt, ideally a random one that is unique per password. This reduces the chances of the hash being reversed by rainbow tables to almost nothing, because a precomputed table cannot cover every possible salt.

You need to weigh the cost of hashing (it can be processor intensive) against the value of what you are protecting.
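As a rough illustration of why a salt defeats precomputed tables, the sketch below feeds the salt and then the password through the same hash. FNV-1a is used only as a small stand-in (an assumption for brevity; a real system would use a cryptographic password hash), and `salted_hash` is a hypothetical name:

```c
#include <stddef.h>
#include <stdint.h>

/* Continue an FNV-1a computation from state h over the bytes of s.
   FNV-1a is NOT cryptographic; it only keeps this sketch short. */
static uint32_t fnv1a32_update(uint32_t h, const char *s)
{
    for (size_t i = 0; s[i] != '\0'; i++) {
        h ^= (unsigned char)s[i];
        h *= 16777619u;                  /* FNV prime */
    }
    return h;
}

/* Hash the salt first, then the password, as one continuous input. */
uint32_t salted_hash(const char *salt, const char *password)
{
    uint32_t h = 2166136261u;            /* FNV offset basis */
    h = fnv1a32_update(h, salt);
    h = fnv1a32_update(h, password);
    return h;
}
```

The same password stored under two different salts produces two different digests, so a table built from unsalted (or differently salted) hashes no longer matches anything in the database. The salt is stored next to the digest so the check can be repeated at login.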

As you are simply protecting the login details, using the .NET membership provider should provide enough security.

Oded answered Oct 31 '22 23:10


Hash functions are many-to-one functions. This means that many inputs will give the same result, but for any given input you get one and only one result.

Why this is so can be seen intuitively by considering a hash function that takes a string of any length and generates a 32-bit integer. There are far more possible strings than the 2^32 possible outputs, which means that the hash function cannot give each input string a unique output. (See http://en.wikipedia.org/wiki/Pigeonhole_principle for more discussion; the "Uses and applications" section specifically talks about hashes.)

Since any result from our hash function could have been generated by more than one input, and we have no information other than the result, we have no way to determine which input was used, so the hash cannot be reversed.
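The pigeonhole argument can be checked on a small scale. The sketch below assumes a toy hash with only 8 bits of output (`toy_hash8` and `find_collision` are hypothetical names, and the mixing constant is just a common choice), so there are 256 possible results. Hashing 257 distinct inputs must therefore produce at least one repeated result:

```c
#include <stdint.h>

/* Toy 8-bit hash: only 256 possible outputs. NOT cryptographic. */
uint8_t toy_hash8(uint32_t x)
{
    x *= 2654435761u;            /* multiplicative mixing */
    return (uint8_t)(x >> 24);   /* keep the top 8 bits */
}

/* Hash the inputs 0, 1, 2, ... until two of them share an output.
   With 256 possible outputs this must happen within 257 inputs.
   Returns 1 and stores the colliding pair in *a and *b. */
int find_collision(uint32_t *a, uint32_t *b)
{
    int seen[256];               /* seen[h] = first input hashing to h */
    for (int i = 0; i < 256; i++) {
        seen[i] = -1;
    }
    for (uint32_t x = 0; x <= 256; x++) {
        uint8_t h = toy_hash8(x);
        if (seen[h] >= 0) {
            *a = (uint32_t)seen[h];
            *b = x;
            return 1;
        }
        seen[h] = (int)x;
    }
    return 0;                    /* not reachable: 257 inputs, 256 slots */
}
```

Given only the shared output, nothing distinguishes the first input from the second. The same reasoning applies to a 32-bit or 512-bit hash; the collisions are just much harder to find.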

Chris answered Oct 31 '22 23:10


There are at least two reasons:

  1. Hashing usually uses one-way functions for its calculations, meaning that reversing an operation is MUCH more difficult (in time, resources and effort) than performing it in the forward direction.

  2. Hashes produced by the same algorithm are always of the same length, meaning there is a limited set of possible hashes. Since the set of possible inputs is unlimited, every hash value has an infinite number of collisions: different source data blocks which produce the same hash value.
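Both reasons show up in a small brute-force experiment. The sketch below assumes a toy 16-bit hash (`toy_hash16` and `find_preimage` are hypothetical names; the hash is FNV-1a folded to 16 bits, which is not cryptographic). Computing the hash is one call, while "reversing" it means trying inputs one by one, and the input found is only some input with that hash, not necessarily the one that was originally used:

```c
#include <stdint.h>

/* Toy 16-bit hash: FNV-1a over the 4 bytes of x, folded to 16 bits.
   NOT cryptographic; real hashes have far larger outputs. */
uint16_t toy_hash16(uint32_t x)
{
    uint32_t h = 2166136261u;            /* FNV offset basis */
    for (int i = 0; i < 4; i++) {
        h ^= (x >> (8 * i)) & 0xffu;     /* mix in one byte of x */
        h *= 16777619u;                  /* FNV prime */
    }
    return (uint16_t)(h ^ (h >> 16));    /* fold 32 bits down to 16 */
}

/* Brute force: try 0, 1, 2, ... until some input hashes to target.
   Returns 1 and stores that input in *found, or 0 if none exists. */
int find_preimage(uint16_t target, uint32_t *found)
{
    uint32_t x = 0;
    do {
        if (toy_hash16(x) == target) {
            *found = x;
            return 1;
        }
    } while (x++ != UINT32_MAX);
    return 0;
}
```

With 16 bits the search is quick, which is exactly why real hash functions use 128 to 512 bits of output and much stronger mixing: the same search then becomes infeasible, and any input it did find would still be one of many candidates.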

Sergey Kudriavtsev answered Oct 31 '22 23:10


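Hashing is not encryption/decryption. For example, take this simple hash function:

/* Toy hash: keeps only the remainder of data divided by 2. */
int hash(int data)
{
    return data % 2;
}

If the result is 0, all you know is that the input was even. Which even number was it? Problem?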

okay answered Oct 31 '22 22:10