What is the difference between Uvicorn and Gunicorn+Uvicorn?

What is the difference between deploying a dockerized FastAPI app with plain Uvicorn versus Tiangolo's Gunicorn+Uvicorn image? And why do my results show better performance when deploying with Uvicorn alone than with Gunicorn+Uvicorn?

Tiangolo's documentation says:

You can use Gunicorn to manage Uvicorn and run multiple of these concurrent processes. That way, you get the best of concurrency and parallelism.

From this, can I assume that using Gunicorn will give better results?

This is my testing using JMeter. I deployed my script to Google Cloud Run, and these are the results:

Using Python and Uvicorn:

[screenshot: JMeter results for the Uvicorn deployment]

Using Tiangolo's Gunicorn+Uvicorn:

[screenshot: JMeter results for the Gunicorn+Uvicorn deployment]

This is my Dockerfile for Python (Uvicorn):

FROM python:3.8-slim-buster
RUN apt-get update --fix-missing
RUN DEBIAN_FRONTEND=noninteractive apt-get install -y libgl1-mesa-dev python3-pip git
RUN mkdir /usr/src/app
WORKDIR /usr/src/app
COPY ./requirements.txt /usr/src/app/requirements.txt
RUN pip3 install -U setuptools
RUN pip3 install --upgrade pip
RUN pip3 install -r ./requirements.txt --use-feature=2020-resolver
COPY . /usr/src/app
CMD ["python3", "/usr/src/app/main.py"]

This is my Dockerfile for Tiangolo's Gunicorn+Uvicorn:

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.8-slim
RUN apt-get update && apt-get install wget gcc -y
RUN mkdir -p /app
WORKDIR /app
COPY ./requirements.txt /app/requirements.txt
RUN python -m pip install --upgrade pip
RUN pip install --no-cache-dir -r /app/requirements.txt
COPY . /app
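
For context, the tiangolo/uvicorn-gunicorn-fastapi base image supplies the startup command itself, so this Dockerfile needs no CMD. Its entrypoint effectively runs something like the line below (a simplified sketch; the real start script reads a gunicorn_conf.py and environment variables such as WORKERS_PER_CORE):

gunicorn -k uvicorn.workers.UvicornWorker -c /gunicorn_conf.py main:app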

You can see the errors in Tiangolo's Gunicorn+Uvicorn results. Are they caused by Gunicorn?

Edit:

In my case, I use lazy loading for my machine learning model. This is the class that loads it:

import os

import joblib


class MyModelPrediction:
    def __init__(self, brand):
        self.brand = brand

    def load_model(self):
        pathfile_model = os.path.join("modules", "model/")
        brand = self.brand.lower()
        top5_brand = ["honda", "toyota", "nissan", "suzuki", "daihatsu"]

        # Brands outside the top five share a single fallback model
        if brand not in top5_brand:
            brand = "ex_Top5"

        with open(pathfile_model + f'{brand}_all_in_one.pkl', 'rb') as file:
            model = joblib.load(file)

        return model
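
For reference, the class is used like this (a small sketch based on the endpoint code below):

obj = MyModelPrediction("Honda")
model = obj.load_model()  # reads modules/model/honda_all_in_one.pkl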

And this is my API endpoint:

@router.post("/predict", response_model=schemas.ResponsePrediction, responses={422: schemas.responses_dict[422], 400: schemas.responses_dict[400], 500: schemas.responses_dict[500]}, tags=["predict"], response_class=ORJSONResponse)
async def detect(
    *,
    # db: Session = Depends(deps.get_db_api),
    car: schemas.Car = Body(...),
    customer_id: str = Body(None, title='Customer unique identifier')
) -> Any:
    """
    Predict price for used vehicle.\n
    """
    global list_detections
    try:
        start_time = time.time()
        brand = car.dict()['brand']
        obj = MyModelPrediction(brand)

        top5_brand = ["honda", "toyota", "nissan", "suzuki", "daihatsu"]
        if brand not in top5_brand:
            brand = "non"

        # Lazy-load: read the model from disk only on the first request
        # for this brand; afterwards it is reused from the cache dict
        if not usedcar.price_engine_4w[brand]:
            usedcar.price_engine_4w[brand] = obj.load_model()
            print("Load success")

        elapsed_time = time.time() - start_time
        print(usedcar.price_engine_4w)
        print("ELAPSED MODEL TIME : ", elapsed_time)

        list_detections = await get_data_model(**car.dict())

        if list_detections is None:
            result_data = None
        else:
            result_data = schemas.Prediction(**list_detections)
            result_data = result_data.dict()

    except Exception as e:  # noqa
        raise HTTPException(
            status_code=500,
            detail=str(e),
        )
    else:
        if result_data is None or result_data['prediction_price'] == 0:
            raise HTTPException(
                status_code=400,
                detail="The system cannot process your request",
            )
        else:
            result = {
                'code': 200,
                'message': 'Successfully fetched data',
                'data': result_data
            }

    return schemas.ResponsePrediction(**result)
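
For reference, a request to this endpoint might look like the following. The exact route prefix and any fields of schemas.Car beyond brand are assumptions:

curl -X POST http://localhost:8080/predict \
  -H "Content-Type: application/json" \
  -d '{"car": {"brand": "Toyota"}, "customer_id": "abc123"}'
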
asked Feb 25 '21 by Rudy Tri Saputra

People also ask

Should I use Gunicorn or Uvicorn?

Gunicorn supports working as a process manager, letting users tell it which specific worker process class to use; Gunicorn then starts one or more worker processes using that class. Uvicorn provides a Gunicorn-compatible worker class.

What is Uvicorn used for?

Uvicorn is an ASGI web server implementation for Python. Until recently Python has lacked a minimal low-level server/application interface for async frameworks. The ASGI specification fills this gap, and means we're now able to start building a common set of tooling usable across all async frameworks.

Does FastAPI need Uvicorn?

Initial setup: a virtual environment provides an isolated environment to run Python applications. A FastAPI application needs one to manage its dependencies, such as Uvicorn, an Asynchronous Server Gateway Interface (ASGI) server.
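
For example, a typical setup sequence might be (one possible way, on Linux/macOS):

python -m venv venv
source venv/bin/activate
pip install fastapi uvicorn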

Can I use Uvicorn in production?

Gunicorn is probably the simplest way to run and manage Uvicorn in a production setting. Uvicorn includes a Gunicorn worker class, which means you can get set up with very little configuration. The UvicornWorker implementation uses the uvloop and httptools implementations.
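
The uvloop and httptools dependencies ship with Uvicorn's "standard" extras, so an install for this setup might look like this (a sketch):

pip install "uvicorn[standard]" gunicorn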


1 Answer

Gunicorn is mainly an application server using the WSGI standard. That means that Gunicorn can serve applications like Flask and Django. Gunicorn by itself is not compatible with FastAPI, as FastAPI uses the newest ASGI standard.

Uvicorn is an ASGI web server implementation for Python. However, its capabilities as a process manager leave much to be desired.

Uvicorn has a Gunicorn-compatible worker class.

Using that combination, Gunicorn would act as a process manager, listening on the given IP and port and forwarding the communication to the worker processes running the Uvicorn class. The Gunicorn-compatible Uvicorn worker class would then be in charge of converting the data sent by Gunicorn to the ASGI standard for FastAPI to use.
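
For example, a typical invocation of that combination looks like the line below; the module path main:app and the worker count are placeholders:

gunicorn main:app --workers 4 --worker-class uvicorn.workers.UvicornWorker --bind 0.0.0.0:8080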

If you have a cluster of machines with Kubernetes, Docker Swarm mode, Nomad, or another similar system for managing distributed containers on multiple machines, then you will probably want to handle replication at the cluster level instead of using a process manager (like Gunicorn with workers) in each container. A distributed container management system like Kubernetes normally has some integrated way of handling replication of containers while still supporting load balancing for incoming requests, all at the cluster level. In those cases, you would probably want to build a Docker image from scratch, install your dependencies, and run a single Uvicorn process instead of running something like Gunicorn with Uvicorn workers.
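
A minimal image of that single-process kind might look like the following sketch; the paths and the main:app module name are assumptions:

FROM python:3.8-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]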

answered Oct 29 '22 by zo0M