Python 3.4, requests 2.18

I want to make multiple requests.get calls and return the first that gets a response, stopping the rest. What's the best way to do this?

Edit: I ended up using Python's multiprocessing module and a queue to achieve the result, like petre suggested in the comments. Here's the code:
from multiprocessing import Queue, Process
from queue import Empty          # Queue.get() raises queue.Empty on timeout
from requests import head

def _req(url, queue):
    a = head(url)
    if a.status_code == 200:
        queue.put(a)             # report a successful response to the parent
    return

def head_all_get_first(urls):
    jobs = []
    q = Queue()
    for url in urls:
        p = Process(target=_req, args=(url, q))
        jobs.append(p)
    for p in jobs:
        p.start()
    try:
        ret = q.get(timeout=20)  # blocking get - wait at most 20 seconds for a return
    except Empty:                # raised if the timeout is exceeded
        ret = None
    for p in jobs:               # stop everything that has not answered yet
        p.terminate()
    return ret
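For completeness, a hypothetical call of the function above; the mirror URLs are placeholders, not taken from the question:

if __name__ == "__main__":                       # guard needed where processes are spawned
    mirrors = ["https://mirror-a.example.com/big.iso",   # placeholder URLs
               "https://mirror-b.example.com/big.iso",
               "https://mirror-c.example.com/big.iso"]
    first = head_all_get_first(mirrors)
    print("no mirror answered" if first is None else first.url)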
Python uses GIL-based execution stepping, which keeps pure-python code principally [SERIAL], so the maximum one can reach within a single interpreter is a "just"-[CONCURRENT] flow of code-execution, not a true-[PARALLEL] system scheduling ( further details go way beyond this post, but are not needed for presenting the options below ).
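A tiny, illustrative timing (not from the original answer) makes the point concrete: two threads doing pure-python CPU-bound work take about as long as doing the same work back to back in one thread.

import time
from threading import Thread

def burn(n=10**7):
    while n:                   # pure-python CPU work: holds the GIL while running
        n -= 1

t0 = time.perf_counter()
burn(); burn()                 # the two work units, one after the other
serial = time.perf_counter() - t0

threads = [Thread(target=burn) for _ in range(2)]
t0 = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - t0

print("serial  : %.2f s" % serial)
print("threaded: %.2f s  (roughly the same, not ~half)" % threaded)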
Your requirement could thus be met with any of the following ( a thread-based sketch follows this list ):
- using a distributed system ( a coordinated, multi-agent, bidirectionally talking network of processors )
- using a GIL-multiplexed, python thread-based backend ( as in threading )
- using a GIL-independent, python process-based backend ( as in multiprocessing or joblib )
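A minimal thread-based sketch of the same "first answer wins" pattern, using concurrent.futures ( in the standard library since Python 3.2 ); the first_response_threaded name and the per-request timeout are illustrative choices, not taken from the question or the answer. Blocking socket I/O releases the GIL, so threads are usually enough for this I/O-bound case:

from concurrent.futures import ThreadPoolExecutor, as_completed
from concurrent.futures import TimeoutError as FuturesTimeout
from requests import head

def first_response_threaded(urls, timeout=20):
    # hypothetical helper: returns the first 200 OK response, or None
    pool = ThreadPoolExecutor(max_workers=len(urls))
    futures = [pool.submit(head, url, timeout=timeout) for url in urls]
    winner = None
    try:
        for fut in as_completed(futures, timeout=timeout):
            try:
                resp = fut.result()
            except Exception:          # connection error etc. - try the next future
                continue
            if resp.status_code == 200:
                winner = resp
                break
    except FuturesTimeout:             # nobody answered within the time limit
        pass
    # already-running requests cannot be killed; just stop waiting for them
    pool.shutdown(wait=False)
    return winner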
In all cases, the sum of the "costs" of setting up and terminating such a code-execution infrastructure will differ ( where a process-based backend is the most expensive one, as it always first creates a full copy of the initial state of the python interpreter, including all its memory allocations et al, which is very expensive for "small pieces of meat to process" - so if in need for speed and for shaving any latency down to fractions of a millisecond, one may straight away exclude this option ).
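As a rough, illustrative measure of that setup overhead ( not part of the original answer ), one can time spawning no-op processes against no-op threads:

import time
from multiprocessing import Process
from threading import Thread

def _noop():
    pass

def spawn_cost(worker_cls, n=20):
    # average wall-clock cost of creating, starting and joining n no-op workers
    t0 = time.perf_counter()
    workers = [worker_cls(target=_noop) for _ in range(n)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return (time.perf_counter() - t0) / n

if __name__ == "__main__":                    # guard needed for process spawning on Windows
    print("Process:", spawn_cost(Process))    # typically orders of magnitude slower
    print("Thread :", spawn_cost(Thread))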
A nice example of a ( ideally semi-persistent ) multi-agent network is built with ZeroMQ or nanomsg as inter-agent mediation tools, where a wide range of transport classes { inproc:// | ipc:// | tcp:// | ... | vmci:// } lets one operate smart solutions in both local and distributed layouts, and reaching a "first answered" state can be smartly, fast and cheaply "advertised" to all the rest of the group of co-operating agents that were not the first one.
Both the Ventilator and the Sink roles could live inside the same python process, be it as class-methods or as locally-multiplexed parts of an imperative code, and the uniform PUSH-strategy could also grow smarter once non-homogeneous tasking gets implemented ( yet the principle of operating a smart network of multi-agent processing is the way to go ).
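A minimal PUSH/PULL sketch of that pattern, under the assumption that pyzmq is installed; the Sink binds a PULL socket, each worker agent connects a PUSH socket and "advertises" its result, and the first message wins. The SINK_ADDR endpoint and the first_head_zmq helper name are illustrative, not from the original answer.

import zmq
from multiprocessing import Process
from requests import head

SINK_ADDR = "tcp://127.0.0.1:5558"        # hypothetical local endpoint

def _agent(url):
    # worker agent: probes one URL and PUSHes the winning URL to the Sink
    ctx = zmq.Context()
    push = ctx.socket(zmq.PUSH)
    push.connect(SINK_ADDR)
    r = head(url)
    if r.status_code == 200:
        push.send_string(url)
    push.close()
    ctx.term()

def first_head_zmq(urls, timeout_ms=20000):
    ctx = zmq.Context()
    pull = ctx.socket(zmq.PULL)           # the Sink role
    pull.RCVTIMEO = timeout_ms            # wait at most timeout_ms for an answer
    pull.bind(SINK_ADDR)
    agents = [Process(target=_agent, args=(u,)) for u in urls]
    for p in agents:
        p.start()
    try:
        winner = pull.recv_string()       # "first answered" wins
    except zmq.Again:                     # nobody answered in time
        winner = None
    for p in agents:                      # stop the agents that were not first
        p.terminate()
    pull.close()
    ctx.term()
    return winner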