Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python : run multiple queries in parallel and get the first finished

I try to create a Python script that performs queries to multiple sites. The script works well (I use urllib2) but just for one link. For multiples sites, I make multiple requests one after the other but it is not very powerful.

What is the ideal solution (the threads I guess) to run multiple queries in parallel and stop others when a query returns a specific string please ?

I found this question but I have not found how to change it to stop the remaining threads... : Python urllib2.urlopen() is slow, need a better way to read several urls

Thank you in advance !

(sorry if I made mistakes in English, I'm French ^^)

like image 270
ncrocfer Avatar asked Nov 04 '22 19:11

ncrocfer


1 Answers

You can use Twisted to deal with multiple requests concurrently. Internally it will use epoll (or iocp or kqueue depending on the platform) to get notified of tcp availability efficently, which is cheaper than using threads. Once one request matches, you cancel the others.

Here is the Twisted http agent tutorial.

like image 126
Tobu Avatar answered Nov 09 '22 16:11

Tobu