Uwsgi with gevent vs threads

First of all, sorry for my English. In my project I have a lot of network I/O: the main data lives in other projects and is accessed through web APIs (JSON/XML) by polling. We call these APIs for every new user session (to fetch information about the user), and sometimes we have trouble waiting for the responses. We use nginx + uWSGI + Django. As you know, Django is synchronous (blocking), so we use uWSGI with multithreading to cope with the network I/O waits.

I decided to read about gevent. I understand the difference between cooperative and preemptive multitasking, and I hoped gevent would be a better solution than uWSGI threads for this problem (a network I/O bottleneck). But the results were almost identical, and sometimes gevent was even worse. Maybe I'm wrong somewhere. Please tell me.

Here are the uwsgi config examples. Gevent:

$ uwsgi --http :8001 --module ugtest.wsgi --gevent 40 --gevent-monkey-patch

Threading:

$ uwsgi --http :8001 --module ugtest.wsgi --enable-threads --threads 40
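The same options can also be kept in an ini file instead of being passed on the command line (a sketch only; ugtest.ini is a hypothetical filename, and the keys mirror the flags above):

[uwsgi]
http = :8001
module = ugtest.wsgi
gevent = 40
gevent-monkey-patch = true

For the threaded variant, replace the two gevent lines with enable-threads = true and threads = 40. Start it with uwsgi --ini ugtest.ini.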

Controller example:

def simple_test_action(request):
    # get data from API without parsing (only for simple I/O test)
    data = _get_data_by_url(API_URL)
    return JsonResponse(data, safe=False)

import httplib
from urlparse import urlparse
def _get_data_by_url(url):
    u = urlparse(url)
    if str(u.scheme).strip().lower() == 'https':
        conn = httplib.HTTPSConnection(u.netloc)
    else:
        conn = httplib.HTTPConnection(u.netloc)
    path_with_params = '%s?%s' % (u.path, u.query, )
    conn.request("GET", path_with_params)
    resp = conn.getresponse()
    print resp.status, resp.reason
    body = resp.read()
    return body

Test (with geventhttpclient):

import gevent
from datetime import datetime as dt
from geventhttpclient import HTTPClient, URL

def get_info(i):
    url = URL('http://localhost:8001/simpletestaction/')
    http = HTTPClient.from_url(url, concurrency=100, connection_timeout=60, network_timeout=60)
    try:
        response = http.get(url.request_uri)
        s = response.status_code
        body = response.read()
    finally:
        http.close()


dt_start = dt.now()
print 'Start: %s' % dt_start

threads = [gevent.spawn(get_info, i) for i in xrange(401)]
gevent.joinall(threads)
dt_end = dt.now()

print 'End: %s' % dt_end
print dt_end-dt_start

In both cases I get similar times. What are the advantages of gevent/greenlets and cooperative multitasking for a problem like this (API proxying)?

asked Jan 11 '15 by OLMER



2 Answers

Serving non-blocking is not about performance, it's about concurrency. If 99% of request time is spent in a sub-request, you can't just optimize away that 99%. But while all available threads are busy serving, new clients are refused, even though 99% of the threads' time is spent waiting for sub-requests to complete. Non-blocking serving lets you use that idle time by sharing it between "handlers" that are no longer limited by the number of available threads. So if 99% of the time is waiting and the remaining 1% is CPU-bound processing, you can hold roughly 100x more connections at once before you max out your CPU, without having 100x more threads, which may be too expensive (and with Python's GIL you would have to use sub-processes, which are even more expensive).

Now, as roberto said, your code must be 100% non-blocking to be able to salvage the idle time. However, as you can see from the percentage example above, that becomes critical only when the requests are almost completely I/O-bound. If that's the case, you probably don't need Django, at least for that part of your app.
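To make this concrete, here is a minimal, self-contained sketch (not from the answer): 1000 simulated requests that each spend one second doing nothing but waiting. Because the greenlets yield while they wait, the total wall-clock time stays close to one second rather than one thousand:

from gevent import monkey
monkey.patch_all()   # make blocking calls such as time.sleep cooperative

import time
import gevent

def handle_request(i):
    # stands in for a request whose time is ~100% waiting on a sub-request
    time.sleep(1)
    return i

start = time.time()
jobs = [gevent.spawn(handle_request, i) for i in range(1000)]
gevent.joinall(jobs)
print('1000 waits took %.2f seconds' % (time.time() - start))
# prints roughly 1.0: the idle time is shared between the greenlets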

answered by jwalker


A concurrency level of 40 is not enough to let gevent shine. Gevent is about concurrency, not parallelism (or per-request performance), so such a "low" level of concurrency is not a good way to see improvements.

Generally you will run gevent with a concurrency level in the thousands, not 40 :)
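For example (a sketch only; the numbers are illustrative, and a backlog this large may also need net.core.somaxconn raised on the host):

$ uwsgi --http :8001 --module ugtest.wsgi --gevent 2000 --gevent-monkey-patch --listen 4096

and then drive it from the test script with a few thousand spawned clients instead of ~400.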

For blocking I/O, Python threads are not bad (the GIL is released during I/O); the advantage of gevent is in resource usage (having 1000 Python threads would be overkill) and in removing the need to think about locking and friends.
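To make the resource-usage point concrete, here is a small self-contained sketch (not from the answer) that runs the same number of sleeping tasks as greenlets and as OS threads; the greenlets all share one OS thread, while every Python thread gets its own stack:

import time
import threading
import gevent

N = 1000

def wait_with_greenlet():
    gevent.sleep(0.1)      # cooperative: yields to the gevent hub while "waiting"

def wait_with_thread():
    time.sleep(0.1)        # blocking, but the GIL is released during the sleep

start = time.time()
gevent.joinall([gevent.spawn(wait_with_greenlet) for _ in range(N)])
print('%d greenlets: %.2fs' % (N, time.time() - start))

start = time.time()
threads = [threading.Thread(target=wait_with_thread) for _ in range(N)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('%d threads:   %.2fs' % (N, time.time() - start))

Both versions finish in roughly 0.1s plus overhead, but the second one creates 1000 OS threads (each reserving its own stack), which is exactly the resource cost gevent avoids.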

And obviously, remember that your whole app must be gevent-friendly to get an advantage, and Django (by default) requires a bit of tuning (for example, the database adapter must be replaced with something gevent-friendly), as sketched below.
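As a rough illustration of that tuning (a sketch only; psycogreen is one common way to make the psycopg2 PostgreSQL adapter cooperative, and gevent_patch.py is just a hypothetical module name):

# gevent_patch.py -- must run before anything opens sockets or DB connections
# (uwsgi's --gevent-monkey-patch covers the stdlib part for you)
from gevent import monkey
monkey.patch_all()   # sockets, httplib, time.sleep, etc. become cooperative

try:
    # psycopg2 is written in C, so monkey-patching alone does not help it;
    # psycogreen makes it yield to the gevent hub while waiting on PostgreSQL
    from psycogreen.gevent import patch_psycopg
    patch_psycopg()
except ImportError:
    pass

Without something like the psycopg2 patch, every database call still blocks the whole gevent worker.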

answered by roberto