I wrote a FastAPI app, and now I am thinking about deploying it. However, I get strange, unexpected performance differences depending on whether I use uvicorn or gunicorn. In particular, all code (even standard-library, pure-Python code) seems to get slower if I use gunicorn. For performance debugging I wrote a small app that demonstrates this:
import asyncio, time
from fastapi import FastAPI, Path
from datetime import datetime

app = FastAPI()

@app.get("/delay/{delay1}/{delay2}")
async def get_delay(
    delay1: float = Path(..., title="Nonblocking time taken to respond"),
    delay2: float = Path(..., title="Blocking time taken to respond"),
):
    total_start_time = datetime.now()
    times = []
    for i in range(100):
        start_time = datetime.now()
        await asyncio.sleep(delay1)
        time.sleep(delay2)
        times.append(str(datetime.now() - start_time))
    return {"delays": [delay1, delay2],
            "total_time_taken": str(datetime.now() - total_start_time),
            "times": times}
Running the FastAPI app with:
gunicorn api.performance_test:app -b localhost:8001 -k uvicorn.workers.UvicornWorker --workers 1
the response body of a GET to http://localhost:8001/delay/0.0/0.0 is consistently something like:
{
  "delays": [
    0.0,
    0.0
  ],
  "total_time_taken": "0:00:00.057946",
  "times": [
    "0:00:00.000323",
    ...similar values omitted for brevity...
    "0:00:00.000274"
  ]
}
However using:
uvicorn api.performance_test:app --port 8001
I consistently get timings like these:
{
  "delays": [
    0.0,
    0.0
  ],
  "total_time_taken": "0:00:00.002630",
  "times": [
    "0:00:00.000037",
    ...snip...
    "0:00:00.000020"
  ]
}
The difference becomes even more pronounced when I comment out the await asyncio.sleep(delay1) statement.
So I am wondering what gunicorn/uvicorn do to the Python/FastAPI runtime to create this factor-10 difference in code execution speed.
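One way to see what actually differs at runtime is to inspect which event loop implementation each setup ends up with. A minimal diagnostic sketch, assuming it is added to the same app object as above:

import asyncio

@app.get("/loop")
async def which_loop():
    # Report the concrete event loop class the server is running,
    # e.g. uvloop's Loop vs. the stdlib selector event loop.
    loop = asyncio.get_running_loop()
    return {"loop": f"{type(loop).__module__}.{type(loop).__name__}"}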
For what it is worth, I performed these tests using Python 3.8.2 on macOS 11.2.3 with an Intel i7 processor.
And these are the relevant parts of my pip freeze output:
fastapi==0.65.1
gunicorn==20.1.0
uvicorn==0.13.4
Gunicorn by itself is not compatible with FastAPI, as FastAPI uses the newest ASGI standard. But Gunicorn can work as a process manager, letting users tell it which specific worker process class to use; Gunicorn then starts one or more worker processes using that class.
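For example, a sketch with several workers (the module path is the one from the question; the worker count of 4 is just an illustrative choice):

gunicorn api.performance_test:app -b localhost:8001 -k uvicorn.workers.UvicornWorker --workers 4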
Running with Gunicorn: for production deployments, the recommendation is to use gunicorn with the uvicorn worker class; for a PyPy compatible configuration, use uvicorn.workers.UvicornH11Worker.
The main thing you need to run a FastAPI application on a remote server machine is an ASGI server program like Uvicorn. There are 3 main alternatives: Uvicorn, a high-performance ASGI server; Hypercorn, an ASGI server compatible with HTTP/2 and Trio; and Daphne, the ASGI server built for Django Channels.
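For completeness, a typical setup (note that the [standard] extra pulls in optional speedups such as uvloop, which turns out to matter below):

python -m pip install fastapi 'uvicorn[standard]'
uvicorn api.performance_test:app --port 8001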
My environment: Ubuntu on WSL2 on Windows 10
Relevant parts of my pip freeze output:
fastapi==0.65.1
gunicorn==20.1.0
uvicorn==0.14.0
I modified the code a little:
import asyncio, time
from fastapi import FastAPI, Path
from datetime import datetime
import statistics

app = FastAPI()

@app.get("/delay/{delay1}/{delay2}")
async def get_delay(
    delay1: float = Path(..., title="Nonblocking time taken to respond"),
    delay2: float = Path(..., title="Blocking time taken to respond"),
):
    total_start_time = datetime.now()
    times = []
    for i in range(100):
        start_time = datetime.now()
        await asyncio.sleep(delay1)
        time.sleep(delay2)
        time_delta = (datetime.now() - start_time).microseconds
        times.append(time_delta)
    times_average = statistics.mean(times)
    return {"delays": [delay1, delay2],
            "total_time_taken": (datetime.now() - total_start_time).microseconds,
            "times_avarage": times_average,
            "times": times}
Apart from the first load of the website, my results for both methods are nearly the same: times are between 0:00:00.000530 and 0:00:00.000620 most of the time for both methods. The first attempt for each takes longer, around 0:00:00.003000.
However, after I restarted Windows and tried those tests again, I noticed I no longer had increased times on the first requests after server startup (I think it is thanks to a lot of free RAM after the restart).
Examples of non-first runs (3 attempts):
# `uvicorn performance_test:app --port 8083`
{"delays":[0.0,0.0],"total_time_taken":553,"times_avarage":4.4,"times":[15,7,5,4,4,4,4,5,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,5,5,4,4,4,4,4,4,5,4,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,5,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,4,5,4]}
{"delays":[0.0,0.0],"total_time_taken":575,"times_avarage":4.61,"times":[15,6,5,5,5,5,5,5,5,5,5,4,5,5,5,5,4,4,4,4,4,5,5,5,4,5,4,4,4,5,5,5,4,5,5,4,4,4,4,5,5,5,5,4,4,4,4,5,5,4,4,4,4,4,4,4,4,5,5,4,4,4,4,5,5,5,5,5,5,5,4,4,4,4,5,5,4,5,5,4,4,4,4,4,4,5,5,5,4,4,4,4,5,5,5,5,4,4,4,4]}
{"delays":[0.0,0.0],"total_time_taken":548,"times_avarage":4.31,"times":[14,6,5,4,4,4,4,4,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,4,5,4,4,4,4,4,4,4,4,5,4,4,4,4,4,4,5,4,4,4,4,4,5,5,4,4,4,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4]}
# `gunicorn performance_test:app -b localhost:8084 -k uvicorn.workers.UvicornWorker --workers 1`
{"delays":[0.0,0.0],"total_time_taken":551,"times_avarage":4.34,"times":[13,6,5,5,5,5,5,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,4,4,4,4,5,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,5,4,4,4,4,4,4,4,5,4,4,4,4,4,4,4,4,4,5,4,4,5,4,5,4,4,5,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,5,4,4,5]}
{"delays":[0.0,0.0],"total_time_taken":558,"times_avarage":4.48,"times":[14,7,5,5,5,5,5,5,4,4,4,4,4,4,5,5,4,4,4,4,5,4,4,4,5,5,4,4,4,5,5,4,4,4,5,4,4,4,5,5,4,4,4,4,5,5,4,4,5,5,4,4,5,5,4,4,4,5,4,4,5,4,4,5,5,4,4,4,5,4,4,4,5,4,4,4,5,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4,4,4,5,4]}
{"delays":[0.0,0.0],"total_time_taken":550,"times_avarage":4.34,"times":[15,6,5,4,4,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,4,4,4,5,4,4,4,4,5,5,4,4,4,4,5,4,4,4,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,5,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4,4,4,4,5,4,4,5,4,4,4,4,4]}
Examples of non-first runs with await asyncio.sleep(delay1) commented out (3 attempts):
# `uvicorn performance_test:app --port 8083`
{"delays":[0.0,0.0],"total_time_taken":159,"times_avarage":0.6,"times":[3,1,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,0,0,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,0,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0]}
{"delays":[0.0,0.0],"total_time_taken":162,"times_avarage":0.49,"times":[3,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,0,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,1,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,0,0,0,0,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1,1,1,0,0,0,0,1,1]}
{"delays":[0.0,0.0],"total_time_taken":156,"times_avarage":0.61,"times":[3,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,0,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1]}
# `gunicorn performance_test:app -b localhost:8084 -k uvicorn.workers.UvicornWorker --workers 1`
{"delays":[0.0,0.0],"total_time_taken":159,"times_avarage":0.59,"times":[2,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,0,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0,0,0,1,1,1,1,1,0,0]}
{"delays":[0.0,0.0],"total_time_taken":165,"times_avarage":0.62,"times":[3,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,1,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1]}
{"delays":[0.0,0.0],"total_time_taken":164,"times_avarage":0.54,"times":[2,0,0,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,1,1,0,0,0,0,0,0,0,1,1,1,1,1,0,0,0,1,1,0,0,0,0,0,1,1,1,1,1,1,0,0,0,0,0,1,1,1,1,1]}
I made a Python script to benchmark those times more precisely:
import statistics
import requests
from time import sleep

number_of_tests = 1000

sites_to_test = [
    {
        'name': 'only uvicorn ',
        'url': 'http://127.0.0.1:8083/delay/0.0/0.0'
    },
    {
        'name': 'gunicorn+uvicorn',
        'url': 'http://127.0.0.1:8084/delay/0.0/0.0'
    }]

for test in sites_to_test:
    total_time_taken_list = []
    times_avarage_list = []
    requests.get(test['url'])  # first request may be slower, so better to not measure it
    for a in range(number_of_tests):
        r = requests.get(test['url'])
        json = r.json()
        total_time_taken_list.append(json['total_time_taken'])
        times_avarage_list.append(json['times_avarage'])
        # sleep(1)  # results are slightly different with sleep between requests
    total_time_taken_avarage = statistics.mean(total_time_taken_list)
    times_avarage_avarage = statistics.mean(times_avarage_list)
    print({'name': test['name'], 'number_of_tests': number_of_tests,
           'total_time_taken_avarage': total_time_taken_avarage,
           'times_avarage_avarage': times_avarage_avarage})
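Averages can hide tail latency. As a possible extension (not part of the runs below), statistics.quantiles could summarize the spread of the collected values at the end of the per-site loop:

# Sketch: tail-latency summary for the values collected above.
# statistics.quantiles with n=20 returns 19 cut points; the one at
# index 18 is the 95th percentile.
cut_points = statistics.quantiles(total_time_taken_list, n=20)
print({'median': statistics.median(total_time_taken_list),
       'p95': cut_points[18]})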
Results:
{'name': 'only uvicorn ', 'number_of_tests': 2000, 'total_time_taken_avarage': 586.5985, 'times_avarage_avarage': 4.820865}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 571.8415, 'times_avarage_avarage': 4.719035}
Results with await asyncio.sleep(delay1) commented out:
{'name': 'only uvicorn ', 'number_of_tests': 2000, 'total_time_taken_avarage': 151.301, 'times_avarage_avarage': 0.602495}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 144.4655, 'times_avarage_avarage': 0.59196}
I also made another version of the above script which alternates URLs on every request (it gives slightly higher times):
import statistics
import requests
from time import sleep

number_of_tests = 1000

sites_to_test = [
    {
        'name': 'only uvicorn ',
        'url': 'http://127.0.0.1:8083/delay/0.0/0.0',
        'total_time_taken_list': [],
        'times_avarage_list': []
    },
    {
        'name': 'gunicorn+uvicorn',
        'url': 'http://127.0.0.1:8084/delay/0.0/0.0',
        'total_time_taken_list': [],
        'times_avarage_list': []
    }]

for test in sites_to_test:
    requests.get(test['url'])  # first request may be slower, so better to not measure it

for a in range(number_of_tests):
    for test in sites_to_test:
        r = requests.get(test['url'])
        json = r.json()
        test['total_time_taken_list'].append(json['total_time_taken'])
        test['times_avarage_list'].append(json['times_avarage'])
        # sleep(1)  # results are slightly different with sleep between requests

for test in sites_to_test:
    total_time_taken_avarage = statistics.mean(test['total_time_taken_list'])
    times_avarage_avarage = statistics.mean(test['times_avarage_list'])
    print({'name': test['name'], 'number_of_tests': number_of_tests,
           'total_time_taken_avarage': total_time_taken_avarage,
           'times_avarage_avarage': times_avarage_avarage})
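To reproduce, both servers are started first (the commands shown as comments next to the results), then the script is run; the filename here is hypothetical:

# start both servers first, then:
python benchmark_urls.py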
Results:
{'name': 'only uvicorn ', 'number_of_tests': 2000, 'total_time_taken_avarage': 589.4315, 'times_avarage_avarage': 4.789385}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 589.0915, 'times_avarage_avarage': 4.761095}
Results with await asyncio.sleep(delay1) commented out:
{'name': 'only uvicorn ', 'number_of_tests': 2000, 'total_time_taken_avarage': 152.8365, 'times_avarage_avarage': 0.59173}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 2000, 'total_time_taken_avarage': 154.4525, 'times_avarage_avarage': 0.59768}
This should help you debug your results better. It may help to investigate further if you share more details about your OS / machine. Also, please restart your computer/server, as it may have an impact.
Update 1:
I see that I used a newer version of uvicorn (0.14.0) than the one stated in the question (0.13.4). I also tested with the older version 0.13.4, but the results are similar; I still can't reproduce your results.
Update 2:
I ran some more benchmarks and noticed something interesting: whether uvloop is installed appears to matter far more than which server launches the app.
Whole requirements.txt (with uvloop):
uvicorn==0.14.0
fastapi==0.65.1
gunicorn==20.1.0
uvloop==0.15.2
Results:
{'name': 'only uvicorn ', 'number_of_tests': 500, 'total_time_taken_avarage': 362.038, 'times_avarage_avarage': 2.54142}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 500, 'total_time_taken_avarage': 366.814, 'times_avarage_avarage': 2.56766}
Whole requirements.txt (without uvloop):
uvicorn==0.14.0
fastapi==0.65.1
gunicorn==20.1.0
Results:
{'name': 'only uvicorn ', 'number_of_tests': 500, 'total_time_taken_avarage': 595.578, 'times_avarage_avarage': 4.83828}
{'name': 'gunicorn+uvicorn', 'number_of_tests': 500, 'total_time_taken_avarage': 584.64, 'times_avarage_avarage': 4.7155}
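One way to confirm this directly (a suggested check, not one of the runs above) is uvicorn's --loop flag, which pins the event loop implementation regardless of what is installed:

uvicorn performance_test:app --port 8083 --loop asyncio   # stdlib event loop
uvicorn performance_test:app --port 8083 --loop uvloop    # requires uvloop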
Update 3:
I was using only Python 3.9.5 in this answer.
The difference is due to the underlying web server you use. An analogy could be: two cars, same brand, same options, just a different engine; what's the difference? Web servers are not exactly like cars, but I guess you get the point I'm trying to make.
Basically, gunicorn is a synchronous web server, while uvicorn is an asynchronous web server. Since you're using fastapi and the await keyword, I guess that you already know what asyncio / asynchronous programming is.
I don't know the code differences, so take my answer with a grain of salt, but uvicorn is more performant because of the asynchronous part. My guess about the timing difference is that with an async web server, everything is already configured at startup to handle async functions, whereas a sync web server isn't, and there is some kind of overhead to abstract that part away.
It's not a proper answer, but it gives you a hint on where the difference could lie.
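For context on what "synchronous vs asynchronous web server" means here, a minimal illustration (not the questioner's app) of the two calling conventions: gunicorn natively speaks WSGI, uvicorn speaks ASGI.

def wsgi_app(environ, start_response):
    # WSGI: one plain, blocking function call per request.
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello from WSGI']

async def asgi_app(scope, receive, send):
    # ASGI: a coroutine per request, so the server can interleave many.
    assert scope['type'] == 'http'
    await send({'type': 'http.response.start', 'status': 200,
                'headers': [(b'content-type', b'text/plain')]})
    await send({'type': 'http.response.body', 'body': b'hello from ASGI'})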