 

How to Send a Streaming Response via LlamaIndex to a FastAPI Endpoint?

I need to send a streaming response using LlamaIndex to my FastAPI endpoint. Below is the code I've written so far:

@bot_router.post("/bot/pdf_convo")
async def pdf_convo(query: QuestionInput):
    chat_engine = cache["chat_engine"]
    user_question = query.content
    streaming_response = chat_engine.stream_chat(user_question)
    for token in streaming_response.response_gen:
        print(token, end="")

I'd appreciate any guidance on how to properly implement the streaming response with LlamaIndex. Thank you!

Asked Oct 27 '25 by Mubashir Ahmed Siddiqui

1 Answer

Reference

FastAPI - StreamingResponse

Solution

To use FastAPI's StreamingResponse class, you need to pass it an async generator (or a regular generator/iterator). In your case, wrap streaming_response.response_gen in a small generator function and pass the result to StreamingResponse. For example, this is how I would do it:

from fastapi.responses import StreamingResponse

async def response_streamer(response):
    # Re-yield each token from the LlamaIndex token generator.
    # FastAPI's StreamingResponse accepts both sync and async generators.
    for token in response:
        yield f"{token}"

@bot_router.post("/bot/pdf_convo")
async def pdf_convo(query: QuestionInput):
    chat_engine = cache["chat_engine"]
    user_question = query.content
    streaming_response = chat_engine.stream_chat(user_question)
    return StreamingResponse(
        response_streamer(streaming_response.response_gen),
        media_type="text/plain",
    )
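To see why this pattern works without running a server, here is a minimal, self-contained sketch. The fake_response_gen function is a hypothetical stand-in for streaming_response.response_gen, and the async for loop consumes response_streamer the same way StreamingResponse does internally:

```python
import asyncio

async def response_streamer(response):
    # Iterate the (synchronous) token generator and re-yield each
    # token from an async generator.
    for token in response:
        yield f"{token}"

def fake_response_gen():
    # Hypothetical stand-in for streaming_response.response_gen.
    yield from ["Hello", ", ", "world", "!"]

async def main():
    chunks = []
    async for chunk in response_streamer(fake_response_gen()):
        chunks.append(chunk)
    return "".join(chunks)

print(asyncio.run(main()))  # Hello, world!
```

Each yielded chunk is sent to the client as soon as it is produced, which is what gives you token-by-token streaming in the browser or API client.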
Answered Oct 29 '25 by stevenong99


