
Fastapi python code execution speed impacted by deployment with uvicorn vs gunicorn

I wrote a FastAPI app, and now I am thinking about deploying it. However, I seem to get strange, unexpected performance differences that depend on whether I use uvicorn or gunicorn. In particular, all code (even standard-library, pure-Python code) seems to get slower if I use gunicorn. For performance debugging I wrote a small app that demonstrates this:

import asyncio, time
from fastapi import FastAPI, Path
from datetime import datetime

app = FastAPI()

@app.get("/delay/{delay1}/{delay2}")
async def get_delay(
    delay1: float = Path(..., title="Nonblocking time taken to respond"),
    delay2: float = Path(..., title="Blocking time taken to respond"),
):
    total_start_time = datetime.now()
    times = []
    for i in range(100):
        start_time = datetime.now()
        await asyncio.sleep(delay1)
        time.sleep(delay2)
        times.append(str(datetime.now()-start_time))
    return {"delays":[delay1,delay2],"total_time_taken":str(datetime.now()-total_start_time),"times":times}

Running the FastAPI app with:

gunicorn api.performance_test:app -b localhost:8001 -k uvicorn.workers.UvicornWorker --workers 1

The response body of a GET to http://localhost:8001/delay/0.0/0.0 is consistently something like:

{
  "delays": [
    0.0,
    0.0
  ],
  "total_time_taken": "0:00:00.057946",
  "times": [
    "0:00:00.000323",
    ...similar values omitted for brevity...
    "0:00:00.000274"
  ]
}

However using:

uvicorn api.performance_test:app --port 8001 

I consistently get timings like these:

{
  "delays": [
    0.0,
    0.0
  ],
  "total_time_taken": "0:00:00.002630",
  "times": [
    "0:00:00.000037",
    ...snip...
    "0:00:00.000020"
  ]
}

The difference becomes even more pronounced when I uncomment the await asyncio.sleep(delay1) statement.

So I am wondering what gunicorn/uvicorn do to the Python/FastAPI runtime to create this factor-10 difference in code execution speed.

For what it is worth, I performed these tests using Python 3.8.2 on OS X 11.2.3 with an Intel i7 processor.

And these are the relevant parts of my pip freeze output

fastapi==0.65.1
gunicorn==20.1.0
uvicorn==0.13.4
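As an aside on measurement: datetime.now() has limited resolution for sub-millisecond intervals, so time.perf_counter() is usually the better clock for a loop like this. A standalone sketch of the same measuring loop (no FastAPI or server involved; the iteration count and delays are just illustrative):

```python
import asyncio
import time

async def measure(delay1: float, delay2: float, iterations: int = 100):
    # Same loop as in the app above, but timed with time.perf_counter(),
    # a monotonic, high-resolution clock intended for benchmarking.
    durations = []
    for _ in range(iterations):
        start = time.perf_counter()
        await asyncio.sleep(delay1)
        time.sleep(delay2)
        durations.append(time.perf_counter() - start)
    return durations

durations = asyncio.run(measure(0.0, 0.0))
print(f"min={min(durations):.6f}s max={max(durations):.6f}s")
```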
Asked May 29 '21 by M.D.



2 Answers

I can't reproduce your results.

My environment: Ubuntu on WSL2 on Windows 10

Relevant parts of my pip freeze output:

fastapi==0.65.1
gunicorn==20.1.0
uvicorn==0.14.0

I modified the code a little:

import asyncio, time
from fastapi import FastAPI, Path
from datetime import datetime
import statistics

app = FastAPI()

@app.get("/delay/{delay1}/{delay2}")
async def get_delay(
    delay1: float = Path(..., title="Nonblocking time taken to respond"),
    delay2: float = Path(..., title="Blocking time taken to respond"),
):
    total_start_time = datetime.now()
    times = []
    for i in range(100):
        start_time = datetime.now()
        await asyncio.sleep(delay1)
        time.sleep(delay2)
        # .microseconds is the microsecond component only, which is fine
        # here because each iteration is well under one second
        time_delta = (datetime.now()-start_time).microseconds
        times.append(time_delta)

    times_average = statistics.mean(times)

    return {"delays":[delay1,delay2],"total_time_taken":(datetime.now()-total_start_time).microseconds,"times_avarage":times_average,"times":times}

Apart from the first load of the website, my results for both methods are nearly the same.

Times are between 0:00:00.000530 and 0:00:00.000620 most of the time for both methods.

The first attempt for each takes longer: around 0:00:00.003000. However, after I restarted Windows and tried these tests again, I noticed I no longer had increased times on the first requests after server startup (I think this was thanks to a lot of free RAM after the restart).


Examples of non-first runs (3 attempts):

# `uvicorn performance_test:app --port 8083`

{"delays":[0.0,0.0],"total_time_taken":553,"times_avarage":4.4,"times":[15,7,5,4,4,4,4,5,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,5,5,4,4,4,4,4,4,5,4,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,5,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,4,5,4]}
{"delays":[0.0,0.0],"total_time_taken":575,"times_avarage":4.61,"times":[15,6,5,5,5,5,5,5,5,5,5,4,5,5,5,5,4,4,4,4,4,5,5,5,4,5,4,4,4,5,5,5,4,5,5,4,4,4,4,5,5,5,5,4,4,4,4,5,5,4,4,4,4,4,4,4,4,5,5,4,4,4,4,5,5,5,5,5,5,5,4,4,4,4,5,5,4,5,5,4,4,4,4,4,4,5,5,5,4,4,4,4,5,5,5,5,4,4,4,4]}
{"delays":[0.0,0.0],"total_time_taken":548,"times_avarage":4.31,"times":[14,6,5,4,4,4,4,4,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,4,5,4,4,4,4,4,4,4,4,5,4,4,4,4,4,4,5,4,4,4,4,4,5,5,4,4,4,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4]}


# `gunicorn performance_test:app -b localhost:8084 -k uvicorn.workers.UvicornWorker --workers 1`

{"delays":[0.0,0.0],"total_time_taken":551,"times_avarage":4.34,"times":[13,6,5,5,5,5,5,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,4,4,4,4,5,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,5,4,4,4,4,4,4,4,5,4,4,4,4,4,4,4,4,4,5,4,4,5,4,5,4,4,5,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,5]}
{"delays":[0.0,0.0],"total_time_taken":558,"times_avarage":4.48,"times":[14,7,5,5,5,5,5,5,4,4,4,4,4,4,5,5,4,4,4,4,5,4,4,4,5,5,4,4,4,5,5,4,4,4,5,4,4,4,5,5,4,4,4,4,5,5,4,4,5,5,4,4,5,5,4,4,4,5,4,4,5,4,4,5,5,4,4,4,5,4,4,4,5,4,4,4,5,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4]}
{"delays":[0.0,0.0],"total_time_taken":550,"times_avarage":4.34,"times":[15,6,5,4,4,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,4,4,4,5,5,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4]}

Examples of non-first runs with await asyncio.sleep(delay1) commented out (3 attempts):

# `uvicorn performance_test:app --port 8083`

{"delays":[0.0,0.0],"total_time_taken":159,"times_avarage":0.6,"times":[3,1,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,0,0,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,0,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0]}
{"delays":[0.0,0.0],"total_time_taken":162,"times_avarage":0.49,"times":[3,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,1,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1]}
{"delays":[0.0,0.0],"total_time_taken":156,"times_avarage":0.61,"times":[3,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,0,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1]}


# `gunicorn performance_test:app -b localhost:8084 -k uvicorn.workers.UvicornWorker --workers 1`

{"delays":[0.0,0.0],"total_time_taken":159,"times_avarage":0.59,"times":[2,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,0,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0]}
{"delays":[0.0,0.0],"total_time_taken":165,"times_avarage":0.62,"times":[3,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,1,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1]}
{"delays":[0.0,0.0],"total_time_taken":164,"times_avarage":0.54,"times":[2,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1]}

I made a Python script to benchmark those times more precisely:

import statistics
import requests
from time import sleep

number_of_tests=1000

sites_to_test=[
    {
        'name':'only uvicorn    ',
        'url':'http://127.0.0.1:8083/delay/0.0/0.0'
    },
    {
        'name':'gunicorn+uvicorn',
        'url':'http://127.0.0.1:8084/delay/0.0/0.0'
    }]


for test in sites_to_test:

    total_time_taken_list=[]
    times_avarage_list=[]

    requests.get(test['url']) # first request may be slower, so better to not measure it

    for a in range(number_of_tests):
        r = requests.get(test['url'])
        json= r.json()

        total_time_taken_list.append(json['total_time_taken'])
        times_avarage_list.append(json['times_avarage'])
        # sleep(1) # results are slightly different with sleep between requests

    total_time_taken_avarage=statistics.mean(total_time_taken_list)
    times_avarage_avarage=statistics.mean(times_avarage_list)

    print({'name':test['name'], 'number_of_tests':number_of_tests, 'total_time_taken_avarage':total_time_taken_avarage, 'times_avarage_avarage':times_avarage_avarage})

Results:

{'name': 'only uvicorn    ', 'number_of_tests': 2000, 'total_time_taken_avarage': 586.5985, 'times_avarage_avarage': 4.820865}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 571.8415, 'times_avarage_avarage': 4.719035}

Results with await asyncio.sleep(delay1) commented out:

{'name': 'only uvicorn    ', 'number_of_tests': 2000, 'total_time_taken_avarage': 151.301, 'times_avarage_avarage': 0.602495}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 144.4655, 'times_avarage_avarage': 0.59196}
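When two means are this close, the spread is worth reporting alongside them; statistics.stdev could be added to the script above. A sketch with hypothetical samples (the real values would come from the collected total_time_taken_list):

```python
import statistics

# Hypothetical total_time_taken samples in microseconds; in the script
# above these would come from the total_time_taken_list values.
only_uvicorn = [586, 571, 589, 584, 595]
gunicorn_uvicorn = [572, 580, 575, 569, 590]

for name, samples in [("only uvicorn", only_uvicorn),
                      ("gunicorn+uvicorn", gunicorn_uvicorn)]:
    print(name, statistics.mean(samples), statistics.stdev(samples))
```

If the standard deviations are larger than the gap between the means, the two setups are effectively indistinguishable.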

I also made another version of the above script, which switches URLs on every request (it gives slightly higher times):

import statistics
import requests
from time import sleep

number_of_tests=1000

sites_to_test=[
    {
        'name':'only uvicorn    ',
        'url':'http://127.0.0.1:8083/delay/0.0/0.0',
        'total_time_taken_list':[],
        'times_avarage_list':[]
    },
    {
        'name':'gunicorn+uvicorn',
        'url':'http://127.0.0.1:8084/delay/0.0/0.0',
        'total_time_taken_list':[],
        'times_avarage_list':[]
    }]


for test in sites_to_test:
    requests.get(test['url']) # first request may be slower, so better to not measure it

for a in range(number_of_tests):

    for test in sites_to_test:
        r = requests.get(test['url'])
        json= r.json()

        test['total_time_taken_list'].append(json['total_time_taken'])
        test['times_avarage_list'].append(json['times_avarage'])
        # sleep(1) # results are slightly different with sleep between requests


for test in sites_to_test:
    total_time_taken_avarage=statistics.mean(test['total_time_taken_list'])
    times_avarage_avarage=statistics.mean(test['times_avarage_list'])

    print({'name':test['name'], 'number_of_tests':number_of_tests, 'total_time_taken_avarage':total_time_taken_avarage, 'times_avarage_avarage':times_avarage_avarage})

Results:

{'name': 'only uvicorn    ', 'number_of_tests': 2000, 'total_time_taken_avarage': 589.4315, 'times_avarage_avarage': 4.789385}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 589.0915, 'times_avarage_avarage': 4.761095}

Results with await asyncio.sleep(delay1) commented out:

{'name': 'only uvicorn    ', 'number_of_tests': 2000, 'total_time_taken_avarage': 152.8365, 'times_avarage_avarage': 0.59173}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 154.4525, 'times_avarage_avarage': 0.59768}

This answer should help you debug your results better.

It may help to investigate your results if you share more details about your OS/machine.

Also, please restart your computer/server; it may have an impact.


Update 1:

I see that I used a newer version of uvicorn (0.14.0) than stated in the question (0.13.4). I also tested with the older version 0.13.4, but the results are similar; I still can't reproduce your results.


Update 2:

I ran some more benchmarks and noticed something interesting:

with uvloop in requirements.txt:

whole requirements.txt:

uvicorn==0.14.0
fastapi==0.65.1
gunicorn==20.1.0
uvloop==0.15.2

Results:

{'name': 'only uvicorn    ', 'number_of_tests': 500, 'total_time_taken_avarage': 362.038, 'times_avarage_avarage': 2.54142}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 500, 'total_time_taken_avarage': 366.814, 'times_avarage_avarage': 2.56766}

without uvloop in requirements.txt:

whole requirements.txt:

uvicorn==0.14.0
fastapi==0.65.1
gunicorn==20.1.0

Results:

{'name': 'only uvicorn    ', 'number_of_tests': 500, 'total_time_taken_avarage': 595.578, 'times_avarage_avarage': 4.83828}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 500, 'total_time_taken_avarage': 584.64, 'times_avarage_avarage': 4.7155}
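These numbers suggest the installed event loop implementation (uvloop vs. the stdlib loop) matters far more than the choice of process manager. To check which loop a given process actually gets, the loop class can be logged; a minimal stdlib-only sketch (the assumption here, consistent with the results above, is that uvicorn picks uvloop automatically whenever it is importable):

```python
import asyncio

def loop_class() -> str:
    # Report which event loop implementation asyncio hands out in this
    # process: "uvloop.Loop" when a uvloop event loop policy is installed,
    # otherwise the stdlib selector (or, on Windows, proactor) loop.
    loop = asyncio.new_event_loop()
    try:
        return f"{type(loop).__module__}.{type(loop).__name__}"
    finally:
        loop.close()

print(loop_class())
```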

Update 3:

I used Python 3.9.5 throughout this answer.

Answered Oct 27 '22 by Karol Zlot


The difference is due to the underlying web server you use.

An analogy can be: two cars, same brand, same options, just a different engine. What's the difference?

Web servers are not exactly like a car, but I guess you get the point I'm trying to make.

Basically, gunicorn is a synchronous web server, while uvicorn is an asynchronous web server. Since you're using FastAPI and await keywords, I guess you already know what asyncio/asynchronous programming is.

I don't know the code differences, so take my answer with a grain of salt, but uvicorn is more performant because of the asynchronous part. My guess for the timing difference is that an async web server is already configured on startup for handling async functions, while a sync web server isn't, and there is some kind of overhead in abstracting that part.

It's not a proper answer, but it gives you a hint as to where the difference could lie.
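The blocking vs. non-blocking distinction this answer leans on can be seen with plain asyncio, independent of either server. A small sketch showing that a synchronous time.sleep() stalls every other coroutine on the loop, while await asyncio.sleep() does not:

```python
import asyncio
import time

async def ticker(results):
    # Non-blocking: yields control to the event loop at each await.
    for _ in range(3):
        await asyncio.sleep(0.01)
        results.append("tick")

async def blocking_handler(results):
    # Blocking: time.sleep() holds the event loop hostage, so no tick
    # can fire until it returns.
    time.sleep(0.05)
    results.append("blocked done")

async def main():
    results = []
    await asyncio.gather(ticker(results), blocking_handler(results))
    return results

print(asyncio.run(main()))
```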

Answered Oct 27 '22 by lsabi