First of all, sorry for my bad English. My project makes a lot of network I/O requests. The main data is stored in other projects, and access is provided through web APIs (JSON/XML) via polling. We call these APIs for each new user session (to get information about the user), and sometimes we have problems waiting for a response. We use nginx + uWSGI + Django. As you know, Django is synchronous (blocking), so we run uWSGI with multiple threads to work around the network I/O waits. I decided to read about gevent. I understand the difference between cooperative and preemptive multitasking, and I hoped gevent would be a better solution than uWSGI threads for this network I/O bottleneck. But the results were almost identical, and sometimes gevent was even slower. Maybe I'm doing something wrong somewhere; please tell me.
Here are the uwsgi config examples. Gevent:
$ uwsgi --http :8001 --module ugtest.wsgi --gevent 40 --gevent-monkey-patch
Threading:
$ uwsgi --http :8001 --module ugtest.wsgi --enable-threads --threads 40
Controller example:
def simple_test_action(request):
    # get data from the API without parsing (only for a simple I/O test)
    data = _get_data_by_url(API_URL)
    return JsonResponse(data, safe=False)
import httplib
from urlparse import urlparse

def _get_data_by_url(url):
    u = urlparse(url)
    if u.scheme.strip().lower() == 'https':
        conn = httplib.HTTPSConnection(u.netloc)
    else:
        conn = httplib.HTTPConnection(u.netloc)
    path_with_params = '%s?%s' % (u.path, u.query)
    conn.request("GET", path_with_params)
    resp = conn.getresponse()
    print resp.status, resp.reason
    body = resp.read()
    conn.close()
    return body
Test (with geventhttpclient):
import gevent
from datetime import datetime as dt
from geventhttpclient import HTTPClient
from geventhttpclient.url import URL

def get_info(i):
    url = URL('http://localhost:8001/simpletestaction/')
    http = HTTPClient.from_url(url, concurrency=100, connection_timeout=60, network_timeout=60)
    try:
        response = http.get(url.request_uri)
        s = response.status_code
        body = response.read()
    finally:
        http.close()

dt_start = dt.now()
print 'Start: %s' % dt_start
threads = [gevent.spawn(get_info, i) for i in xrange(401)]
gevent.joinall(threads)
dt_end = dt.now()
print 'End: %s' % dt_end
print dt_end - dt_start
In both cases I get similar times. What are the advantages of gevent/greenlets and cooperative multitasking in a problem like this (API proxying)?
gevent allows writing asynchronous, coroutine-based code that looks like standard synchronous Python. It uses greenlet to enable task switching without writing async/await or using asyncio. eventlet is another library that does the same thing.
By default, Gunicorn uses a synchronous worker class to serve requests, but it can be configured to use gevent simply by adding -k gevent to the run command.
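For example (the module name ugtest.wsgi matches the uWSGI commands above; the worker and connection counts here are illustrative, not recommendations):

```shell
# gevent worker class; each of the 4 workers can juggle up to
# 1000 greenlet-backed connections cooperatively
gunicorn ugtest.wsgi -k gevent --worker-connections 1000 -w 4 -b :8001
```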
monkey – Make the standard library cooperative. The primary purpose of this module is to carefully patch, in place, portions of the standard library with gevent-friendly functions that behave in the same way as the originals (at least as closely as possible).
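A minimal plain-Python sketch of what "patching in place" means (this is not gevent itself; gevent's monkey patch swaps socket operations, time.sleep, and friends for cooperative versions that yield to the event loop):

```python
import time

# Keep a reference to the original, then replace the module attribute.
# Existing code that calls time.sleep now runs the wrapper instead,
# without being changed or re-imported.
_original_sleep = time.sleep
calls = []

def patched_sleep(seconds):
    calls.append(seconds)      # here gevent would switch to another greenlet
    _original_sleep(seconds)   # delegate to the real implementation

time.sleep = patched_sleep

time.sleep(0.01)               # unmodified caller, but the patch is in effect
print(calls)
```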
Serving non-blocking is not about per-request performance, it's about concurrency. If 99% of request time is spent in a sub-request, you can't just optimize that 99%. But when all available threads are busy serving, new clients are refused, even though 99% of the threads' time is spent waiting for sub-request completion. Non-blocking serving lets you utilize that idle time by sharing it between "handlers" that are no longer limited by the number of available threads. So if 99% of the time is waiting, the other 1% is CPU-bound processing, hence you can have 100x more simultaneous connections before you max out your CPU, without having 100x more threads, which may be too expensive (and with Python's GIL issue, you have to use sub-processes, which are even more expensive).
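The arithmetic above can be sketched like this (the 99%/1% split and the 40-thread cap are the hypothetical numbers from this answer, not measurements):

```python
# Suppose a request takes 1000 ms in total, of which 990 ms is spent
# waiting on the upstream API, leaving only 10 ms of actual CPU work.
total_ms = 1000
wait_ms = 990
cpu_ms = total_ms - wait_ms                 # 10 ms of CPU time per request

# A blocking server with 40 threads caps out at 40 in-flight requests,
# even though the CPU is ~99% idle:
threads = 40
cpu_utilization = threads * cpu_ms / (threads * total_ms)

# A non-blocking server is limited by CPU time, not by thread count:
max_concurrent_per_core = total_ms // cpu_ms

print(cpu_utilization, max_concurrent_per_core)  # 0.01 and 100
```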
Now, as roberto said, your code must be 100% non-blocking to be able to salvage the idle time. However, as you can see from the percentage example above, it becomes critical only when requests are almost completely I/O-bound. If that's the case, it's likely you don't need Django, at least for that part of your app.
A concurrency of 40 is not a level at which gevent shines. Gevent is about concurrency, not parallelism (or per-request performance), so such a "low" level of concurrency is not a good way to see improvements.
Generally you will see gevent used at concurrency levels in the thousands, not 40 :)
For blocking I/O, Python threads are not bad (the GIL is released during I/O); the advantage of gevent is in resource usage (1000 Python threads would be overkill) and in removing the need to think about locking and friends.
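A quick way to see that plain threads already overlap blocking waits (time.sleep stands in for a network call here, since the GIL is released while sleeping just as it is during socket I/O):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_io(i):
    # Stand-in for a blocking network request; the GIL is released
    # while this thread sleeps, so other threads keep running.
    time.sleep(0.2)
    return i

start = time.monotonic()
with ThreadPoolExecutor(max_workers=10) as pool:
    results = list(pool.map(fake_io, range(10)))
elapsed = time.monotonic() - start

# Ten 0.2-second waits overlap: total wall time is ~0.2 s, not 2 s.
print(results, round(elapsed, 1))
```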
And obviously, remember that your whole app must be gevent-friendly to gain an advantage, and Django (by default) requires a bit of tuning (for example, the database adapters must be replaced with something gevent-friendly).