Ideal method for sending multiple HTTP requests in Python? [duplicate]

Possible Duplicate:
Multiple (asynchronous) connections with urllib2 or other http library?

I am working on a Linux web server that runs Python code to fetch real-time data over HTTP from a third-party API. The data is put into a MySQL database. I need to make a lot of queries to a lot of URLs, and I need to do it fast (faster is better). Currently I'm using urllib3 as my HTTP library. What is the best way to go about this? Should I spawn multiple threads (if so, how many?) and have each query a different URL? I would love to hear your thoughts about this - thanks!

asked May 11 '12 by user1094786


People also ask

How do you send multiple HTTP requests in Python?

There are two basic ways to generate concurrent HTTP requests: via multiple threads or via async programming. In the multi-threaded approach, each request is handled by its own thread. In asynchronous programming, there is (usually) one thread and an event loop, which periodically checks for the completion of tasks.
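For instance, here is a minimal sketch of the multi-threaded approach, assuming the requests library is installed; the URL list and worker count are placeholders:

# A minimal multi-threaded sketch; the URL list is a hypothetical placeholder.
from concurrent.futures import ThreadPoolExecutor

import requests

urls = ['http://httpbin.org/get?page=%d' % i for i in range(20)]

def fetch(url):
    # Each call to fetch() runs in its own worker thread.
    return requests.get(url, timeout=10).status_code

# Ten worker threads issue the requests concurrently; map() yields
# the results in the same order as the input URLs.
with ThreadPoolExecutor(max_workers=10) as pool:
    for status in pool.map(fetch, urls):
        print(status)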

How do you make HTTP requests faster in Python?

We can leverage the Session object to further increase speed. The Session object will use urllib3's connection pooling, which means that for repeated requests to the same host, the Session object's underlying TCP connection will be reused, giving a performance increase.
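For example, a minimal sketch of reusing one Session; the URL is a placeholder:

import requests

# A single Session reuses the underlying TCP connection for repeated
# requests to the same host (via urllib3's connection pooling), so
# each request after the first skips the connection setup.
session = requests.Session()
for page in range(10):
    resp = session.get('http://httpbin.org/get', params={'page': page}, timeout=10)
    print(resp.status_code)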

How does Python handle multiple Web requests?

One way to handle multiple requests is to have multiple physical servers; each request is given to a server that is free. This approach is highly inefficient, though, as it adds more servers without effectively using the resources of the existing ones.


2 Answers

If a lot really is a lot, then you probably want to use asynchronous I/O, not threads.

requests + gevent = grequests

GRequests allows you to use Requests with Gevent to make asynchronous HTTP Requests easily.

import grequests

urls = [
    'http://www.heroku.com',
    'http://tablib.org',
    'http://httpbin.org',
    'http://python-requests.org',
    'http://kennethreitz.com'
]

# Build the unsent requests lazily, then send them all concurrently;
# map() returns the responses in the same order as the input.
rs = (grequests.get(u) for u in urls)
grequests.map(rs)
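Because gevent runs the requests on lightweight greenlets rather than OS threads, this approach scales to hundreds of concurrent requests with little overhead, and grequests.map() hands you back the list of responses to inspect.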
answered by Piotr Dobrogost


You should use multithreading as well as pipelining the requests - for example, search -> details -> save, as in the sketch below.
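Here is a minimal sketch of such a pipeline, assuming the requests library; the URLs, thread count, and the commented-out database call are hypothetical placeholders:

# A sketch of a threaded search -> details -> save pipeline using work queues.
import queue
import threading

import requests

fetch_q = queue.Queue()   # URLs produced by the "search" stage
save_q = queue.Queue()    # payloads waiting to be persisted

def fetch_worker():
    # Stage 2: fetch details for each URL found by the search stage.
    session = requests.Session()  # reuses TCP connections per host
    while True:
        url = fetch_q.get()
        if url is None:  # sentinel: no more work
            break
        resp = session.get(url, timeout=10)
        save_q.put(resp.text)

def save_worker():
    # Stage 3: persist results (stubbed; a real version would write to MySQL).
    while True:
        payload = save_q.get()
        if payload is None:
            break
        # db.insert(payload)  # hypothetical database call

NUM_FETCHERS = 10
fetchers = [threading.Thread(target=fetch_worker) for _ in range(NUM_FETCHERS)]
saver = threading.Thread(target=save_worker)
for t in fetchers:
    t.start()
saver.start()

# Stage 1: the "search" stage produces detail URLs (hardcoded for the sketch).
for i in range(100):
    fetch_q.put('http://httpbin.org/get?id=%d' % i)

for _ in fetchers:
    fetch_q.put(None)   # one sentinel per fetcher thread
for t in fetchers:
    t.join()
save_q.put(None)
saver.join()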

The number of threads you can use doesn't depend only on your own hardware. How many requests can the service serve? How many concurrent requests does it allow? Even your bandwidth can be a bottleneck.

If you're talking about a kind of scraping, the service could block you after a certain number of requests, so you need to use proxies or multiple IP bindings; a rotation sketch follows.
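A minimal sketch of rotating through a proxy pool with requests; the proxy addresses are hypothetical placeholders:

# Rotate through a pool of proxies so requests come from different IPs.
import itertools

import requests

proxy_pool = itertools.cycle([
    {'http': 'http://10.0.0.1:8080'},  # placeholder proxy
    {'http': 'http://10.0.0.2:8080'},  # placeholder proxy
])

for url in ['http://httpbin.org/ip'] * 4:
    resp = requests.get(url, proxies=next(proxy_pool), timeout=10)
    print(resp.status_code, resp.text)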

As for me, in most cases I can run 50-300 concurrent requests on my laptop from Python scripts.

answered by Maksym Polshcha