In Scrapy, I have my items specified in a certain order in items.py, and my spider populates those items in the same order. However, when I run the spider and save the results as a CSV, the column order from items.py or the spider is not maintained. How can I get the CSV to show columns in a specific order? Example code would be much appreciated.
Thanks.
This is related to the question "Modifying CSV export in scrapy".
The problem is that the exporter is instantiated without any keyword parameters, so the keywords like EXPORT_FIELDS are ignored. The solution is the same: you need to subclass the CSV item exporter to pass the keyword parameters.
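As a standalone illustration of why an explicit field list fixes the column order, here is a minimal sketch using only Python's standard csv module. It is not Scrapy's exporter code; the item contents and the to_csv helper are made up for the example, and the fields_to_export name only mirrors the keyword argument discussed here.

```python
import csv
import io

# Hypothetical scraped item; the keys are deliberately not in the
# desired column order.
item = {"field3": "c", "field1": "a", "field2": "b"}

# Explicit column order, analogous to the exporter's fields_to_export.
fields_to_export = ["field1", "field2", "field3"]


def to_csv(rows, fieldnames):
    """Serialize dict rows to CSV text with columns in the given order."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


output = to_csv([item], fields_to_export)
print(output)
```

The header and every row follow the order of the list, regardless of how the item's keys happen to be ordered.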
Following the recipe from that related question, I created a new file xyzzy/feedexport.py (change "xyzzy" to whatever your Scrapy project package is named):
"""
The standard CsvItemExporter class does not pass the kwargs through to the
CSV writer, resulting in EXPORT_FIELDS and EXPORT_ENCODING being ignored
(EXPORT_EMPTY is not used by CSV).
"""
# Note: these import paths are from older Scrapy releases; newer releases
# expose CsvItemExporter in scrapy.exporters instead.
from scrapy.conf import settings
from scrapy.contrib.exporter import CsvItemExporter


class CSVkwItemExporter(CsvItemExporter):

    def __init__(self, *args, **kwargs):
        # Read the column order and encoding from the project settings.
        kwargs['fields_to_export'] = settings.getlist('EXPORT_FIELDS') or None
        kwargs['encoding'] = settings.get('EXPORT_ENCODING', 'utf-8')
        super(CSVkwItemExporter, self).__init__(*args, **kwargs)
I then registered it as the CSV exporter in xyzzy/settings.py:
FEED_EXPORTERS = {
    'csv': 'xyzzy.feedexport.CSVkwItemExporter'
}
Now the CSV exporter will honor the EXPORT_FIELDS setting, which also goes in xyzzy/settings.py:
# By specifying the fields to export, the CSV export honors the order
# rather than using a random order.
EXPORT_FIELDS = [
    'field1',
    'field2',
    'field3',
]
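For what it's worth, Scrapy 1.0 and later ship a built-in FEED_EXPORT_FIELDS setting that the stock CSV exporter reads, so on those versions the custom subclass should not be needed. A minimal settings.py fragment, assuming a recent Scrapy release and the same placeholder field names as above:

```python
# xyzzy/settings.py (assumes Scrapy >= 1.0; the field names are placeholders)
# Columns are written to the CSV in exactly this order.
FEED_EXPORT_FIELDS = [
    'field1',
    'field2',
    'field3',
]
```

With this setting in place, no FEED_EXPORTERS override is required just to control column order.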