Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is an efficient way to write password cracking algorithm (python)

This problem might be relatively simple, but I'm given two text files. One text file contains all encrypted passwords encrypted via crypt.crypt in python. The other list contains over 400k+ normal dictionary words.

The assignment is that given 3 different functions which transform strings from their normal case to all different permutations of capitalizations, transforms a letter to a number (if it looks alike, e.g. G -> 6, B -> 8), and reverses a string. The thing is that given the 10 - 20 encrypted passwords in the password file, what is the most efficient way to get the fastest running solution in python to run those functions on dictionary word in the words file? It is given that all those words, when transformed in whatever way, will encrypt to a password in the password file.

Here is the function which checks if a given string, when encrypted, is the same as the encrypted password passed in:

def check_pass(plaintext,encrypted):
 crypted_pass = crypt.crypt(plaintext,encrypted)
 if crypted_pass == encrypted:
  return True
 else:
  return False

Thanks in advance.

like image 949
Luminance Avatar asked May 23 '10 20:05

Luminance


People also ask

Which method of algorithm would be used when trying to crack a password?

Brute force attack– This method is similar to the dictionary attack. Brute force attacks use algorithms that combine alpha-numeric characters and symbols to come up with passwords for the attack. For example, a password of the value “password” can also be tried as p@$$word using the brute force attack.

Which method of password cracking needs large amounts of memory?

Rainbow tables require large amounts of storage space and can take a long time to generate, but their primary shortcoming is that they may be ineffective against password hashing that uses salting.


2 Answers

Without knowing details about the underlying hash algorithm and possible weaknesses of the algorithm all you can do is to run a brute-force attack trying all possible transformations of the words in your password list.

The only way to speed up such a brute-force attack is to get more powerful hardware and to split the task and run the cracker in parallel.

like image 200
Dirk Vollmar Avatar answered Oct 19 '22 13:10

Dirk Vollmar


On my slow laptop, crypt.crypt takes about 20 microseconds:

$ python -mtimeit -s'import crypt' 'crypt.crypt("foobar", "zappa")'
10000 loops, best of 3: 21.8 usec per loop

so, the brute force approach (really the only sensible one) is "kinda" feasible. By applying your transformation functions you'll get (ballpark estimate) about 100 transformed words per dictionary word (mostly from the capitalization changes), so, about 40 million transformed words out of your whole dictionary. At 20 microseconds each, that will take about 800 seconds, call it 15 minutes, for the effort of trying to crack one of the passwords that doesn't actually correspond to any of the variations; expected time about half that, to crack a password that does correspond.

So, if you have 10 passwords to crack, and they all do correspond to a transformed dictionary word, you should be done in an hour or two. Is that OK? Because there isn't much else you can do except distribute this embarassingly parallel problem over as many nodes and cores as you can grasp (oh, and, use a faster machine in the first place -- that might buy you perhaps a factor of two or thereabouts).

There is no deep optimization trick that you can add, so the general logic will be that of a triple-nested loop: one level loops over the encrypted passwords, one over the words in the dictionary, one over the variants of each dictionary word. There isn't much difference regarding how you nest things (except the loop on the variants must come within the loop on the words, for simplicity). I recommend encapsulating "give me all variants of this word" as a generator (for simplicity, not for speed) and otherwise minimizing the number of function calls (e.g. there is no reason to use that check_pass function since the inline code is just as clear, and will be microscopically faster).

like image 34
Alex Martelli Avatar answered Oct 19 '22 14:10

Alex Martelli