I have a standard DRF web application that outputs CSV data for one of its routes. The data set is quite large and rendering the entire CSV representation takes a while, so I want a streaming HTTP response to keep the client from timing out.
However, the example provided in https://github.com/mjumbewu/django-rest-framework-csv/blob/2ff49cff4b81827f3f450fd7d56827c9671c5140/rest_framework_csv/renderers.py#L197 doesn't quite accomplish this: the data still arrives as one large payload instead of being chunked, and the client waits for the whole response to be rendered before it receives any bytes.
The structure is similar to the following:
models.py
from django.db import models

class Report(models.Model):
    count = models.PositiveIntegerField(blank=True)
    ...
renderers.py
from rest_framework_csv.renderers import CSVStreamingRenderer

class ReportCSVRenderer(CSVStreamingRenderer):
    header = ['count']
serializers.py
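from rest_framework import fields, serializers

class ReportSerializer(serializers.ModelSerializer):
    count = fields.IntegerField()

    class Meta:
        model = Report
        fields = ['count']  # recent DRF versions require an explicit field list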
views.py
from django.http import StreamingHttpResponse
from rest_framework import mixins, viewsets

class ReportCSVView(mixins.ListModelMixin, viewsets.GenericViewSet):
    def get_queryset(self):
        return Report.objects.all()

    def list(self, request, *args, **kwargs):
        queryset = self.get_queryset()
        data = ReportSerializer(queryset, many=True)
        renderer = ReportCSVRenderer()
        response = StreamingHttpResponse(renderer.render(data), content_type='text/csv')
        response['Content-Disposition'] = 'attachment; filename="f.csv"'
        return response
NOTE: had to comment out or change some things.
Thank you
A simpler solution, inspired by @3066d0's answer:
renderers.py
class ReportsRenderer(CSVStreamingRenderer):
    header = [ ... ]
    labels = { ... }
views.py
from django.core.paginator import Paginator
from django.http import StreamingHttpResponse
from rest_framework.mixins import ListModelMixin
from rest_framework.viewsets import GenericViewSet

class ReportCSVViewset(ListModelMixin, GenericViewSet):
    queryset = Report.objects.select_related('stuff')
    serializer_class = ReportCSVSerializer
    renderer_classes = [ReportsRenderer]
    PAGE_SIZE = 1000

    def list(self, request, *args, **kwargs):
        queryset = self.filter_queryset(self.get_queryset())
        response = StreamingHttpResponse(
            request.accepted_renderer.render(self._stream_serialized_data(queryset)),
            status=200,
            content_type="text/csv",
        )
        response["Content-Disposition"] = 'attachment; filename="reports.csv"'
        return response

    def _stream_serialized_data(self, queryset):
        # Serialize one page at a time so the full result set is never held in memory.
        serializer = self.get_serializer_class()
        paginator = Paginator(queryset, self.PAGE_SIZE)
        for page in paginator.page_range:
            yield from serializer(paginator.page(page).object_list, many=True).data
The point is that you need to pass a generator that yields serialized data as the data argument to the renderer; the CSVStreamingRenderer then does its thing and streams the response itself. I prefer this approach because you do not need to override the code of a third-party library.
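To illustrate why passing a generator matters, here is a minimal standard-library sketch. It is not the code of CSVStreamingRenderer; `stream_csv_rows` and `fake_serialized_pages` are hypothetical stand-ins for the renderer and for `_stream_serialized_data`, and the page loads are only simulated. It shows that each CSV line is produced as soon as its row is available, and that a page is only fetched when the consumer asks for more output.

```python
import csv
import io


def stream_csv_rows(rows, header):
    """Yield CSV text one line at a time from an iterable of dicts."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=header, extrasaction="ignore")

    def flush():
        # Hand back whatever was just written, then reset the buffer.
        line = buffer.getvalue()
        buffer.seek(0)
        buffer.truncate(0)
        return line

    writer.writeheader()
    yield flush()
    for row in rows:  # rows is consumed lazily, one item per output line
        writer.writerow(row)
        yield flush()


def fake_serialized_pages(total, page_size, log):
    """Hypothetical stand-in for _stream_serialized_data: yields dicts page by page."""
    for start in range(0, total, page_size):
        log.append(f"loaded page starting at {start}")
        for i in range(start, min(start + page_size, total)):
            yield {"count": i}


log = []
stream = stream_csv_rows(fake_serialized_pages(5, 2, log), header=["count"])

first_line = next(stream)            # the header line
pages_loaded_after_header = len(log)  # nothing has been fetched yet
second_line = next(stream)           # asking for a data row triggers the first page load
```

If the view instead builds the complete serialized list before calling the renderer, all the work happens up front and the response cannot start until it is finished, which matches the behaviour described in the question.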
Django's StreamingHttpResponse can be much slower than a traditional HttpResponse for small responses. Don't use it if you don't need to; the Django docs recommend that StreamingHttpResponse "should only be used in situations where it is absolutely required that the whole content isn't iterated before transferring the data to the client."
For your problem, you may also find it useful to set the chunk_size, switch to FileResponse, or go back to a normal Response (if you are using REST framework) or HttpResponse.
Edit 1: About setting the chunk size:
With Django's File API you can read a file in chunks, so the whole file is never loaded into memory at once.
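As a rough illustration of chunked reading, the sketch below uses only the standard library; `read_in_chunks` is a hypothetical helper, not a Django function. Django's `File.chunks()` follows the same idea, and a generator like this one could be handed to StreamingHttpResponse as its content. The 64 KB default here is just an assumed starting point to tune.

```python
import os
import tempfile


def read_in_chunks(path, chunk_size=64 * 1024):
    """Yield successive byte chunks of at most chunk_size from the file at path."""
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Demo on a throwaway 10-byte file with a deliberately tiny chunk size.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"0123456789")
    demo_path = tmp.name

chunks = list(read_in_chunks(demo_path, chunk_size=4))
os.remove(demo_path)
```

Only one chunk is held in memory at a time, so memory use stays bounded by chunk_size regardless of the file size.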
I hope you find this useful.