
Can a deterministic hashing function be easily decrypted? [duplicate]

Possible Duplicates:
Is it possible to decrypt md5 hashes?
Is it possible to reverse a sha1?

I asked this question: working with HUGE spreadsheet

and got a great answer, and I followed the advice. I used this: http://splinter.com.au/blog/?p=86

and I hashed about 300,000 different elements in a column in an Excel spreadsheet.

Since you can do:

=SHA1HASH("The quick brown fox jumps over the lazy dog")

And you'd get back:

2fd4e1c67a2d28fced849ee1bb76e7391b93eb12

Couldn't you go backwards as well?

I'm saying, if it encrypts the same text the same way every single time, what is the point?

If you do know the hash algorithm, is it possible to go backwards?

Can you please explain to me very simply how hashing works? How can you convert 20 GB to a 40-character hash? Does it take a long time to hash a 20 GB hard drive?

Alex Gordon asked Jun 30 '10 20:06



2 Answers

General answer

A cryptographic hash function cannot be easily reversed. This is why it is also sometimes called a one-way function. There is no going back.

You should also be careful about calling this 'decryption'. Hashing is not the same as encryption. The set of possible hash values is typically smaller than the set of possible inputs, so multiple inputs map to the same output.

For any hash function, given only the output, you can't know which of the many possible inputs was used to generate it.
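The "multiple inputs map to the same output" point can be seen directly if the hash is made artificially short. The sketch below is only an illustration: it truncates SHA-1 to its first 4 hex characters (16 bits), which is my assumption to make a collision easy to find, and the names short_hash and find_collision are made up for this example. With only 65,536 possible outputs, two different inputs must eventually share one.

```python
import hashlib


def short_hash(text: str, hex_chars: int = 4) -> str:
    """Return only the first hex_chars characters of the SHA-1 hex digest."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:hex_chars]


def find_collision(hex_chars: int = 4):
    """Hash "input-0", "input-1", ... until two inputs share a short hash."""
    seen = {}  # short hash -> first input that produced it
    n = 0
    while True:
        text = f"input-{n}"
        h = short_hash(text, hex_chars)
        if h in seen:
            return seen[h], text, h
        seen[h] = text
        n += 1


first, second, shared = find_collision()
print(first, second, shared)
```

Given only the shared value, there is no way to tell which of the two inputs was hashed. The full 160-bit SHA-1 output has the same property in principle, but its output space is far too large for a simple search like this one to find a collision.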

For cryptographic hashes like SHA-1, it is very difficult to find even one input that produces a given output.

The simplest way to reverse a cryptographic hash is to guess the input and hash it to see if it gives the right output. If you are wrong, guess again. Another approach is to use rainbow tables, which are precomputed tables that trade storage space for faster lookups of likely inputs.
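The guess-and-check approach can be sketched in a few lines. This is a minimal illustration, not an attack tool: the candidate list is invented for the example, and guess_input is a hypothetical helper name. The target is the hash from the question.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as 40 hex characters."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


def guess_input(target_hex: str, candidates):
    """Hash each candidate in turn; return the first one whose digest matches."""
    for candidate in candidates:
        if sha1_hex(candidate) == target_hex:
            return candidate
    return None  # none of the guesses was right


target = "2fd4e1c67a2d28fced849ee1bb76e7391b93eb12"
wordlist = [
    "hello world",
    "The quick brown fox jumps over the lazy cog",
    "The quick brown fox jumps over the lazy dog",
]
print(guess_input(target, wordlist))
```

Nothing is being "decrypted" here. The hash is only ever computed forwards, so the search succeeds only if the real input happens to be among the guesses.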

Regarding using hashing to protect SSNs

With your use case of SSNs, an attack is feasible due to the relatively small number of possible input values: a nine-digit number has at most 10^9 possibilities, so an attacker can simply hash every one of them. If you are worried about people getting access to SSNs then it might be best to not store or use the SSN at all in your application, and in particular do not use them as an identifier. Instead you could find or create another identifier, for example an email address, a login name, a GUID or just an incrementing number. It can be tempting to use the SSN as it is already there and at first glance appears to be a unique unchanging identifier, but in practice using it just causes problems. If you absolutely need to store it for some reason then use strong non-deterministic encryption with a secret key and make sure you keep that key safe.
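To give a feel for why a small input space is a problem, here is a rough sketch that recovers a made-up nine-digit number from its SHA-1 hash by trying every possibility. To keep the demo fast I assume the first four digits are already known and search only the last five (100,000 guesses). The value "000123456" and the helper name recover_number are inventions for this example, not real data.

```python
import hashlib


def sha1_hex(text: str) -> str:
    """Return the SHA-1 digest of text as 40 hex characters."""
    return hashlib.sha1(text.encode("ascii")).hexdigest()


def recover_number(target_hex: str, prefix: str, unknown_digits: int):
    """Try every digit string of the given length appended to a known prefix."""
    for n in range(10 ** unknown_digits):
        candidate = prefix + str(n).zfill(unknown_digits)
        if sha1_hex(candidate) == target_hex:
            return candidate
    return None


# Made-up nine-digit value standing in for an SSN (not a real one).
secret = "000123456"
stored_hash = sha1_hex(secret)

# Only the last 5 digits are searched here so the demo finishes quickly.
print(recover_number(stored_hash, "0001", 5))
```

Searching all nine digits is the same loop over 10^9 candidates. That takes longer, but it is a fixed and modest amount of work, which is why hashing alone does not hide values drawn from such a small set.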

Mark Byers answered Feb 20 '23 19:02



The whole point of a cryptographic hash is that you can't reverse ("decrypt") it and that it does produce the same output for the same input every time.

A very common use case for cryptographic hashes is password validation. Imagine I have the password "mypass123", and the hash is "aef8976ea17371bbcd". Then a program or website wishing to validate my password can store the hash "aef8976ea17371bbcd" in their database, instead of the password, and every time I want to log in, the site or program re-hashes my password and makes sure that the hashes match. This allows the site or program to avoid storing my actual password, and so protects my password (in case it's a password I use elsewhere) in the case that the data is stolen or otherwise compromised -- a hacker would not be able to go backwards from the hash to the password.
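The store-the-hash, re-hash-on-login flow described above might look roughly like this. It is a simplified sketch of that idea only: the in-memory users dictionary, the username "alice", and the function names are all assumptions for illustration, and SHA-256 is just my choice of hash function.

```python
import hashlib
import hmac


def hash_password(password: str) -> str:
    """Return the SHA-256 hex digest of the password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical "database": username -> stored hash. The password itself is never kept.
users = {"alice": hash_password("mypass123")}


def check_login(username: str, attempt: str) -> bool:
    """Re-hash the attempted password and compare it with the stored hash."""
    stored = users.get(username)
    if stored is None:
        return False
    # compare_digest takes the same time whether or not the strings match early on.
    return hmac.compare_digest(stored, hash_password(attempt))


print(check_login("alice", "mypass123"))  # correct password
print(check_login("alice", "letmein"))    # wrong password
```

Real systems usually go a step further and mix a per-user random salt into a deliberately slow function (Python's hashlib.pbkdf2_hmac is one such option), so that identical passwords do not produce identical stored hashes and guessing is more expensive.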

Another common use of cryptographic hashes is integrity checking. Suppose a given file (e.g. an image of a Linux distribution CD) has a known, publicly available cryptographic hash. If you have a file which purports to be the same thing, you can hash it yourself and see if the hashes match. Here, the fact that it hashes the same way every time allows you to independently validate it, and the fact that it is cryptographically secure means that no one can feasibly create a different, fake file (e.g. with a trojan in it) that has the same hash.
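An integrity check of this kind can be sketched as follows. The file is read in chunks, so even a very large file never has to fit in memory. The temporary demo file, the 1 MiB chunk size, and the names file_sha1 and matches_published_hash are assumptions made for the example.

```python
import hashlib
import os
import tempfile


def file_sha1(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file incrementally, one chunk at a time."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_hash(path: str, published_hex: str) -> bool:
    """Return True if the file's SHA-1 equals the published hex digest."""
    return file_sha1(path) == published_hex.lower()


# Small stand-in for a downloaded file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"The quick brown fox jumps over the lazy dog")
    demo_path = tmp.name

published = "2fd4e1c67a2d28fced849ee1bb76e7391b93eb12"
ok = matches_published_hash(demo_path, published)
bad = matches_published_hash(demo_path, "0" * 40)
os.remove(demo_path)
print(ok, bad)
```

Because the hash is computed in a single pass over the data, the time taken grows in proportion to the file size. Hashing 20 GB mostly costs as long as it takes to read 20 GB from the disk.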

Keep in mind the very important distinction between hashing and encryption, though: hashing loses information. This is why you can't go backwards from ("decrypt") the hash. You can hash a 20 GiB file and end up with a 40-character hash. Obviously, this has lost a lot of information in the process. How could you possibly "decrypt" 40 characters into 20 GiB? There's no such thing as compression that works that well! But this is also an advantage, because in order to check the integrity of a 20 GiB file, you only have to distribute a 40-character hash.
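A quick way to convince yourself that the output size does not depend on the input size is to hash a tiny input and a much larger one. The 10 MB buffer below is just a stand-in for a big file, chosen so the example runs quickly.

```python
import hashlib

small_digest = hashlib.sha1(b"a").hexdigest()
large_digest = hashlib.sha1(b"a" * 10_000_000).hexdigest()

# Both digests are 40 hex characters (160 bits), whatever the input size.
print(len(small_digest), len(large_digest))
print(small_digest != large_digest)
```

Forty hex characters carry 160 bits, so SHA-1 can produce at most 2^160 distinct outputs, while the number of possible inputs is unlimited. That gap is the information that gets thrown away.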

Because information is lost, many files will have the same hash, but the key feature of a cryptographic hash (which is what you're talking about) is that despite the fact that information is lost, it is computationally infeasible to start with a file and construct a second, slightly different file that has the same hash. Any other file with the same hash would almost certainly be radically different, and not easily mistakable for the original file.
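One way to see why a slightly different file does not end up with the same or a similar hash is to change a single letter and count how many output bits change. This is only an illustrative sketch, and differing_bits is a made-up helper name.

```python
import hashlib


def sha1_int(text: str) -> int:
    """Return the 160-bit SHA-1 digest of text as an integer."""
    return int.from_bytes(hashlib.sha1(text.encode("utf-8")).digest(), "big")


def differing_bits(a: str, b: str) -> int:
    """Count the bit positions where the two SHA-1 digests differ."""
    return bin(sha1_int(a) ^ sha1_int(b)).count("1")


original = "The quick brown fox jumps over the lazy dog"
altered = "The quick brown fox jumps over the lazy cog"  # one letter changed

print(differing_bits(original, altered), "of 160 bits differ")
```

For a well-behaved hash function you would expect roughly half of the 160 bits to flip, so the two digests look unrelated even though the inputs are nearly identical.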

Tyler McHenry answered Feb 20 '23 19:02
