I was sure there was something like this in the standard library, but it seems I was wrong.
I have a bunch of urls that I want to urlopen
in parallel. I want something like the builtin map
function, except the work is done in parallel by a bunch of threads.
Is there a good module that does this?
Python now has the concurrent.futures module, which is the simplest way of getting map to work with either multiple threads or multiple processes.
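A minimal sketch of the thread-pool version: `ThreadPoolExecutor.map` has the same semantics as the builtin `map`, but runs the calls on worker threads. Here `slow_double` is a hypothetical stand-in for an I/O-bound call such as `urllib.request.urlopen`; swap in your own fetch function and URL list.

```python
from concurrent.futures import ThreadPoolExecutor
import time

def slow_double(x):
    # Stand-in for an I/O-bound call like urlopen; the sleep
    # simulates network latency during which threads can overlap.
    time.sleep(0.1)
    return x * 2

# map() semantics, but each call runs on one of up to 5 worker threads.
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(slow_double, range(5)))

print(results)  # [0, 2, 4, 6, 8]
```

Note that `pool.map` returns results in input order, even though the calls complete in whatever order the threads finish.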
Python (on the CPython interpreter) does have a threading library, but the Global Interpreter Lock (GIL) prevents threads from executing Python bytecode on multiple cores at the same time. The GIL does not prevent threading itself, though: it is released during blocking I/O, so threads work well for I/O-bound tasks like urlopen.
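A small sketch illustrating that point: four threads that each block for 0.1s (here `time.sleep` stands in for blocking I/O) finish in roughly 0.1s total, not 0.4s, because the GIL is released while they wait.

```python
import threading
import time

results = [None] * 4

def work(i):
    time.sleep(0.1)  # GIL is released during blocking sleep/I/O
    results[i] = i * i

threads = [threading.Thread(target=work, args=(i,)) for i in range(4)]
start = time.monotonic()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.monotonic() - start

print(results)          # [0, 1, 4, 9]
print(elapsed < 0.4)    # True: the waits overlapped instead of running serially
```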
Multiprocessing is easier to just drop in than threading but has a higher memory overhead. If your code is CPU bound, multiprocessing is most likely going to be the better choice—especially if the target machine has multiple cores or CPUs.
There is a map method in multiprocessing.Pool, which farms the calls out across multiple processes.
And if multiple processes aren't your dish, you can use multiprocessing.dummy which uses threads.
```python
import urllib.request          # urllib.urlopen was Python 2; use urllib.request in Python 3
import multiprocessing.dummy   # same Pool API, backed by threads

p = multiprocessing.dummy.Pool(5)

def f(post):
    return urllib.request.urlopen('http://stackoverflow.com/questions/%u' % post)

print(p.map(f, range(3329361, 3329361 + 5)))
```