I have a list of approximately 60,000 items. I would like to send queries to the database to check whether each item exists and, if it does, return some computed results. I ran an ordinary query while iterating through the list one by one, and it has been running for the last 4 days. I thought I could use the threading module to improve on this, so I did something like this:
import threading

if __name__ == '__main__':
    for ra, dec in candidates:
        t = threading.Thread(target=search_sl, args=(ra, dec, q))
        t.start()
    t.join()
I tested with only 10 items and it worked fine, but when I submitted the whole list of 60k items I ran into errors, i.e. "maximum number of sessions exceeded". What I want to do is create maybe 10 threads at a time; when the first bunch of threads has finished executing, I send another batch, and so on.
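Roughly, what I have in mind is something like the sketch below (search_sl, candidates and q are from my code above; the batch size of 10 is just a guess at what the database will tolerate):

import threading

BATCH_SIZE = 10  # number of queries to run concurrently

for i in range(0, len(candidates), BATCH_SIZE):
    batch = candidates[i:i + BATCH_SIZE]
    threads = []
    for ra, dec in batch:
        t = threading.Thread(target=search_sl, args=(ra, dec, q))
        t.start()
        threads.append(t)
    # wait for the whole batch to finish before starting the next one
    for t in threads:
        t.join()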
You could try using a process pool, which is available in the multiprocessing module. Here is the example from the Python docs:
from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)            # start 4 worker processes
    result = pool.apply_async(f, [10])  # evaluate "f(10)" asynchronously
    print result.get(timeout=1)         # prints "100" unless your computer is *very* slow
    print pool.map(f, range(10))        # prints "[0, 1, 4,..., 81]"
http://docs.python.org/library/multiprocessing.html#using-a-pool-of-workers
Try increasing the number of processes until you reach the maximum your system can support.
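Adapted to your case, it might look something like the sketch below. This assumes search_sl is changed to return its computed result directly (rather than putting it on the queue q), so that pool.map can collect the results, and it caps the pool at 10 workers so you stay under the session limit:

from multiprocessing import Pool

def check_candidate(args):
    # unpack one (ra, dec) pair and run the database query for it;
    # search_sl is assumed here to *return* its computed result
    ra, dec = args
    return search_sl(ra, dec)

if __name__ == '__main__':
    pool = Pool(processes=10)  # at most 10 concurrent database sessions
    results = pool.map(check_candidate, candidates)
    pool.close()
    pool.join()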