I tried calculating millions of different combinations of the string below, but I was only getting roughly 1,750 combinations per second, which isn't anywhere near the speed I need. How would I restructure this so that multiple processes calculate different parts of the search space, without recomputing parts that have already been done, while maintaining fast speeds? The code below is roughly what I've been using. Any examples would be appreciated!
from itertools import product

# Every 4-character combination of letters, digits, and punctuation
for chars in product("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*?,()-=+[]/;", repeat=4):
    print(chars)
itertools.zip_longest(*iterables, fillvalue=None) makes an iterator that aggregates elements from each of the iterables. If the iterables are of uneven length, missing values are filled in with fillvalue. Iteration continues until the longest iterable is exhausted.
itertools.product(*iterables, repeat=1) computes the Cartesian product of the given iterables; the output is in lexicographic order (assuming the inputs are sorted).
itertools.combinations(iterable, r) returns the r-length subsequences of elements from the input iterable. Combinations are emitted in lexicographic sorted order, so if the input iterable is sorted, the combination tuples will be produced in sorted order.
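For concreteness, here is how each of these behaves on tiny inputs:

from itertools import product, combinations, zip_longest

# product: Cartesian product of the inputs, in lexicographic order
print(list(product("ab", repeat=2)))   # [('a', 'a'), ('a', 'b'), ('b', 'a'), ('b', 'b')]

# combinations: r-length subsequences, no repeated positions
print(list(combinations("abc", 2)))    # [('a', 'b'), ('a', 'c'), ('b', 'c')]

# zip_longest: pads the shorter iterable with fillvalue
print(list(zip_longest("ab", "xyz", fillvalue="-")))   # [('a', 'x'), ('b', 'y'), ('-', 'z')]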
One way to break the product up into parts is to break up the first component of the product, so that each independent job has all the elements starting with a certain set of first letters. For example:
import string
import multiprocessing as mp
import itertools

alphabet = string.ascii_letters + string.digits + "!@#$%^&*?,()-=+[]/;"
num_parts = 4
part_size = len(alphabet) // num_parts

def do_job(first_bits):
    # Each job only enumerates products whose first character is in its slice
    for x in itertools.product(first_bits, alphabet, alphabet, alphabet):
        print(x)

if __name__ == "__main__":
    pool = mp.Pool()
    results = []
    for i in range(num_parts):
        if i == num_parts - 1:
            first_bit = alphabet[part_size * i :]  # last slice takes the remainder
        else:
            first_bit = alphabet[part_size * i : part_size * (i + 1)]
        # Pass the function and its arguments separately; calling do_job(first_bit)
        # here would run the job in the parent process instead of the pool
        results.append(pool.apply_async(do_job, (first_bit,)))
    pool.close()
    pool.join()
(where obviously you'd only use results if do_job actually returned something).
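If do_job did return something, collecting the results could look like the minimal sketch below. Here count_job is a hypothetical variant that returns a count instead of printing, and the alphabet is split by striding rather than slicing (either works, as long as each first character lands in exactly one part):

import string
import multiprocessing as mp
import itertools

alphabet = string.ascii_letters + string.digits + "!@#$%^&*?,()-=+[]/;"

def count_job(first_bits):
    # Hypothetical variant of do_job: counts the combinations instead of printing
    return sum(1 for _ in itertools.product(first_bits, alphabet, alphabet, alphabet))

if __name__ == "__main__":
    with mp.Pool() as pool:
        # alphabet[i::4] splits the alphabet into 4 roughly equal strided slices
        jobs = [pool.apply_async(count_job, (alphabet[i::4],)) for i in range(4)]
        totals = [job.get() for job in jobs]  # .get() blocks until each result is ready
    print(sum(totals) == len(alphabet) ** 4)  # True: every combination is covered exactly once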
Are you sure you're only getting 1,750 combinations per second? I'm getting about 10 million per second.
import time
from itertools import product

def test(n):
    # Time how long it takes to generate the first n combinations
    start = time.time()
    count = 0
    for chars in product("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*?,()-=+[]/;", repeat=4):
        count += 1
        if count == n:
            break
    return time.time() - start
>>> test(10000)
0.03300023078918457
>>> test(1000000)
0.15799999237060547
>>> test(10000000)
1.0469999313354492
I don't think my computer is that much faster than yours.
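One likely explanation for the gap: the loop in the question prints every combination, and console I/O will dominate and can easily slow the loop down by several orders of magnitude. A quick way to check is to time the same loop with the print left in (test_with_print is just an illustrative variant of the test function above):

import time
from itertools import product

def test_with_print(n):
    # Same loop as test(), but printing each tuple the way the question's code does
    start = time.time()
    count = 0
    for chars in product("abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890!@#$%^&*?,()-=+[]/;", repeat=4):
        print(chars)
        count += 1
        if count == n:
            break
    return time.time() - start

Comparing test(10000) with test_with_print(10000) should make the difference obvious; dropping the print (or writing results to a file in large batches) is worth trying before reaching for multiprocessing.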
note: I posted this as an answer because I wanted to show code. It's really more of a comment. So please, no upvotes or downvotes.