I have a .csv file that I would like to render in a FastAPI app. I only managed to render the .csv file in JSON format as follows:
def transform_question_format(csv_file_name):
    json_file_name = f"{csv_file_name[:-4]}.json"
    # transforms the csv file into json file
    pd.read_csv(csv_file_name, sep=",").to_json(json_file_name)
    with open(json_file_name, "r") as f:
        json_data = json.load(f)
    return json_data

@app.get("/questions")
def load_questions():
    question_json = transform_question_format(question_csv_filename)
    return question_json
When I tried directly returning pd.read_csv(csv_file_name, sep=",").to_json(json_file_name), it worked, as it returns a string.
How should I proceed? I don't believe this is a good way to do it.
The following shows four different ways of returning the data stored in a .csv file/Pandas DataFrame (for solutions without using Pandas DataFrame, have a look here). Related answers on how to efficiently return a large DataFrame can be found here and here as well.
The first option is to convert the file data into JSON and then parse it into a dict. You can optionally change the orientation of the data using the orient parameter in the .to_json() method.
Note: Better not to use this option. See Updates below.
from fastapi import FastAPI
import pandas as pd
import json

app = FastAPI()
df = pd.read_csv("file.csv")

def parse_csv(df):
    res = df.to_json(orient="records")
    parsed = json.loads(res)
    return parsed

@app.get("/questions")
def load_questions():
    return parse_csv(df)
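To make the effect of the orient parameter concrete, here is a minimal sketch comparing three of its values. The DataFrame contents are made up for illustration and stand in for whatever file.csv holds.

```python
import json
import pandas as pd

# Hypothetical sample data standing in for the contents of file.csv
df = pd.DataFrame({"id": [1, 2], "question": ["What is 2 + 2?", "Capital of France?"]})

# orient="records": a list with one object per row
records = json.loads(df.to_json(orient="records"))

# orient="columns" (the default for a DataFrame): {column: {index: value}}
columns = json.loads(df.to_json(orient="columns"))

# orient="split": separate "columns", "index" and "data" keys
split = json.loads(df.to_json(orient="split"))

print(records)
print(columns)
print(split)
```

For an API that returns one object per row, "records" is usually the most convenient shape for clients; note that with "columns" the index labels become string keys in the JSON.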
Update 1: Using the .to_dict() method would be a better option, as it returns a dict directly, instead of converting the DataFrame into JSON (using df.to_json()) and then parsing that JSON string back into a dict (using json.loads()), as described earlier. Example:
@app.get("/questions")
def load_questions():
    return df.to_dict(orient="records")
Update 2: When you return the dict produced by the .to_dict() method, FastAPI, behind the scenes, first converts it into JSON-compatible data using the jsonable_encoder, then puts that data inside of a JSONResponse, which serializes it using the Python standard json.dumps() (see this answer for more details). Thus, to avoid that extra processing, you could still use the .to_json() method, but this time put the JSON string in a custom Response and return it directly, as shown below:
from fastapi import Response

@app.get("/questions")
def load_questions():
    return Response(df.to_json(orient="records"), media_type="application/json")
Another option is to return the data in string format, using the .to_string() method.
@app.get("/questions")
def load_questions():
    return df.to_string()
You could also return the data as an HTML table, using the .to_html() method.
from fastapi.responses import HTMLResponse

@app.get("/questions")
def load_questions():
    return HTMLResponse(content=df.to_html(), status_code=200)
Finally, you can always return the file as is using FastAPI's FileResponse.
from fastapi.responses import FileResponse

@app.get("/questions")
def load_questions():
    return FileResponse(path="file.csv", filename="file.csv")
With the DataFrame.to_dict() method, not all Pandas datatypes are serializable by the json package:
df = pd.DataFrame({
    "TrainID": ["T001", "T002", "T003"],
    "Route": ["Amsterdam - Utrecht", "Rotterdam - Den Haag", "Eindhoven - Tilburg"],
    "DepartureTime": [
        pd.Timestamp("2022-03-09 08:00:00"),
        pd.Timestamp("2022-03-09 09:15:00"),
        pd.Timestamp("2022-03-09 10:30:00"),
    ],
    "ArrivalTime": [
        pd.Timestamp("2022-03-09 09:00:00"),
        pd.Timestamp("2022-03-09 09:45:00"),
        pd.Timestamp("2022-03-09 11:00:00"),
    ],
    "Status": ["On Time", "Delayed", "Cancelled"],
})
json.dumps(df.to_dict(orient="records"))
TypeError: Object of type Timestamp is not JSON serializable
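The failure can be reproduced in a few lines, along with two ways around it. This is a reduced sketch with a single made-up row; the default=str variant is an extra workaround from the standard library, not something the snippet above uses.

```python
import json
import pandas as pd

# Reduced, hypothetical version of the DataFrame above
df = pd.DataFrame({
    "TrainID": ["T001"],
    "DepartureTime": [pd.Timestamp("2022-03-09 08:00:00")],
})

# json.dumps() does not know how to encode pandas Timestamp objects
try:
    json.dumps(df.to_dict(orient="records"))
    failed = False
except TypeError:
    failed = True

# pandas' own serializer handles datetimes; date_format="iso" emits ISO 8601 strings
iso_rows = json.loads(df.to_json(orient="records", date_format="iso"))

# Standard-library workaround: fall back to str() for unknown types
str_rows = json.loads(json.dumps(df.to_dict(orient="records"), default=str))

print(failed)
print(iso_rows)
print(str_rows)
```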
I use a DataFrameJSONResponse class that serializes with pandas.DataFrame.to_json instead of json.dumps:
from fastapi.responses import Response
from typing import Any

class DataFrameJSONResponse(Response):
    media_type = "application/json"

    def render(self, content: Any) -> bytes:
        return content.to_json(orient="records", date_format="iso").encode("utf-8")

@app.get("/test", response_class=DataFrameJSONResponse)
async def test_dataframe():
    df = pd.DataFrame(
        {
            "TrainID": ["T001", "T002", "T003"],
            "Route": [
                "Amsterdam - Utrecht",
                "Rotterdam - Den Haag",
                "Eindhoven - Tilburg",
            ],
            "DepartureTime": [
                pd.Timestamp("2022-03-09 08:00:00"),
                pd.Timestamp("2022-03-09 09:15:00"),
                pd.Timestamp("2022-03-09 10:30:00"),
            ],
            "ArrivalTime": [
                pd.Timestamp("2022-03-09 09:00:00"),
                pd.Timestamp("2022-03-09 09:45:00"),
                pd.Timestamp("2022-03-09 11:00:00"),
            ],
            "Status": ["On Time", "Delayed", "Cancelled"],
        }
    )
    return DataFrameJSONResponse(df)