
Adding rows manually to StreamingHttpResponse (Django)

Tags: python, csv, django

I am using Django's StreamingHttpResponse to stream a large CSV file on the fly. According to the docs, an iterator is passed to the response's streaming_content parameter:

import csv
from django.http import StreamingHttpResponse

def get_headers():
    return ['field1', 'field2', 'field3']

def get_data(item):
    return {
        'field1': item.field1,
        'field2': item.field2,
        'field3': item.field3,
    }

# csv.DictWriter needs a file-like object with a 'write' method;
# this one returns the value instead of storing it
class Echo(object):
    def write(self, value):
        return value


def get_response(queryset):
    writer = csv.DictWriter(Echo(), fieldnames=get_headers())
    writer.writeheader() # this line does not work

    response = StreamingHttpResponse(
        # the iterator
        streaming_content=(writer.writerow(get_data(item)) for item in queryset),
        content_type='text/csv',
    )
    response['Content-Disposition'] = 'attachment;filename=items.csv'

    return response

My question is: how can I manually write a row with the CSV writer? Calling writer.writerow(data) or writer.writeheader() (which internally calls writerow()) by hand does not seem to write anything to the output. Only the rows generated by the iterator passed as streaming_content end up in the downloaded file.
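
To illustrate the symptom outside Django, here is a minimal sketch that uses only the standard library. It reuses the Echo idea from the code above; the sample row values are made up for illustration. It suggests that the formatted line exists only as the return value of writerow(), so a call whose result is not yielded never reaches the response.

```python
import csv


class Echo:
    """Pseudo-buffer: write() returns the value instead of storing it."""
    def write(self, value):
        return value


writer = csv.DictWriter(Echo(), fieldnames=['field1', 'field2', 'field3'])

# The formatted CSV line is only available as the return value of writerow().
line = writer.writerow({'field1': 1, 'field2': 2, 'field3': 3})

# Calling writerow() without using the result simply discards the line.
writer.writerow({'field1': 4, 'field2': 5, 'field3': 6})

# Only values produced by the iterator itself would reach the response.
streamed = [writer.writerow({'field1': n, 'field2': n, 'field3': n})
            for n in range(2)]
```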

silentbugs asked Aug 08 '17 21:08

2 Answers

The answer is to yield the rows from a generator function instead of computing them inline in StreamingHttpResponse's streaming_content argument, and to use the pseudo-buffer we created (the Echo class) to write rows to the response:

import csv
from django.http import StreamingHttpResponse

def get_headers():
    return ['field1', 'field2', 'field3']

def get_data(item):
    return {
        'field1': item.field1,
        'field2': item.field2,
        'field3': item.field3,
    }

# csv.DictWriter needs a file-like object with a 'write' method;
# this one returns the value instead of storing it
class Echo(object):
    def write(self, value):
        return value

def iter_items(items, pseudo_buffer):
    writer = csv.DictWriter(pseudo_buffer, fieldnames=get_headers())
    yield pseudo_buffer.write(get_headers())

    for item in items:
        yield writer.writerow(get_data(item))

def get_response(queryset):
    response = StreamingHttpResponse(
        streaming_content=(iter_items(queryset, Echo())),
        content_type='text/csv',
    )
    response['Content-Disposition'] = 'attachment;filename=items.csv'
    return response

silentbugs answered Sep 26 '22 21:09


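The solution above can actually produce an incorrect CSV whose header does not match the data rows: yield pseudo_buffer.write(get_headers()) hands the raw list of field names to the response without passing it through the csv writer, so it is not formatted like the other rows. You'd want to replace that line with something like: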

# map every field name to itself so the header goes through writerow()
header = dict(zip(writer.fieldnames, writer.fieldnames))
yield writer.writerow(header)

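instead. This mirrors what writeheader() does internally: https://github.com/python/cpython/blob/08045391a7aa87d4fbd3e8ef4c852c2fa4e81a8a/Lib/csv.py#L141:L143

Calling writeheader() itself does not work with yield on older Python versions, because it discards the return value of writerow() and returns None, so there is nothing to stream.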
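
The difference between the two ways of producing the header can be checked without Django. This is a small sketch using only the standard library; the names raw_header and csv_header are just illustrative.

```python
import csv


class Echo:
    """Pseudo-buffer: write() returns the value instead of storing it."""
    def write(self, value):
        return value


fieldnames = ['field1', 'field2', 'field3']
pseudo_buffer = Echo()
writer = csv.DictWriter(pseudo_buffer, fieldnames=fieldnames)

# Writing the list straight to the pseudo-buffer bypasses the csv module,
# so what comes back is still a plain Python list, not a CSV line.
raw_header = pseudo_buffer.write(fieldnames)

# Going through writerow() formats the header like every data row.
csv_header = writer.writerow(dict(zip(fieldnames, fieldnames)))
```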

Hope this helps someone in the future :)

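Also note that no workaround is needed on Python 3.8+: since this change, writeheader() returns the value of the underlying writerow() call, so yield writer.writeheader() works as expected: https://bugs.python.org/issue27497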
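
If the code has to run on both sides of that version boundary, something along these lines might do. It is only a sketch: iter_csv is a hypothetical helper (not part of Django or the csv module), it takes plain dicts rather than model instances, and the sample row is made up.

```python
import csv
import sys


class Echo:
    """Pseudo-buffer: write() returns the value instead of storing it."""
    def write(self, value):
        return value


def iter_csv(rows, fieldnames):
    """Yield a header line, then one CSV line per dict in rows."""
    writer = csv.DictWriter(Echo(), fieldnames=fieldnames)
    if sys.version_info >= (3, 8):
        # writeheader() returns what writerow() returned, so it can be yielded.
        yield writer.writeheader()
    else:
        # Older versions return None here, so build the header row by hand.
        yield writer.writerow(dict(zip(fieldnames, fieldnames)))
    for row in rows:
        yield writer.writerow(row)


lines = list(iter_csv([{'field1': 1, 'field2': 2, 'field3': 3}],
                      ['field1', 'field2', 'field3']))
```

In a view, a generator like this could be passed as streaming_content in place of iter_items(queryset, Echo()), with each item converted to a dict by get_data() first.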

tr33hous answered Sep 22 '22 21:09