
Scrapy CSV file has uniform empty rows?

Tags: python, scrapy

Here is the spider:

import scrapy
from danmurphys.items import DanmurphysItem

class MySpider(scrapy.Spider):
    name = 'danmurphys'
    allowed_domains = ['danmurphys.com.au']
    start_urls = ['https://www.danmurphys.com.au/dm/navigation/navigation_results_gallery.jsp?params=fh_location%3D%2F%2Fcatalog01%2Fen_AU%2Fcategories%3C%7Bcatalog01_2534374302084767_2534374302027742%7D%26fh_view_size%3D120%26fh_sort%3D-sales_value_30_days%26fh_modification%3D&resetnav=false&storeExclusivePage=false']


    def parse(self, response):
        # Collect the product links from the listing page and follow each one.
        urls = response.xpath('//h2/a/@href').extract()
        for url in urls:
            request = scrapy.Request(url, callback=self.parse_page)
            yield request

    def parse_page(self, response):
        # Build one item per product page.
        item = DanmurphysItem()
        item['brand'] = response.xpath('//span[@itemprop="brand"]/text()').extract_first().strip()
        item['name'] = response.xpath('//span[@itemprop="name"]/text()').extract_first().strip()
        item['url'] = response.url
        return item

And here is the item definition:

import scrapy


class DanmurphysItem(scrapy.Item):
    brand = scrapy.Field()
    name = scrapy.Field()
    url = scrapy.Field()

When I run the spider with this command:

scrapy crawl danmurphys -o output.csv

The output contains uniformly spaced empty rows between the data rows.

Ibrahim asked Sep 13 '16 19:09


1 Answer

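In Scrapy 1.3, you can fix this by patching the __init__ method of the CsvItemExporter class in scrapy.exporters: add newline='' as a parameter to the io.TextIOWrapper call. This typically shows up on Windows, where the text wrapper translates each '\n' it writes into '\r\n'. The csv module already ends every row with '\r\n', so each row ends up terminated by '\r\r\n', which spreadsheet programs display as an empty row after every record. Passing newline='' turns that translation off.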
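The effect of the newline parameter can be reproduced with the standard library alone. The following is a minimal sketch, not Scrapy's own code: write_rows is a hypothetical helper, the two sample rows are made up, and newline='\r\n' is used to imitate the translation Windows applies by default so that the result is the same on any platform.

```python
import csv
import io


def write_rows(newline):
    """Write two CSV rows through a TextIOWrapper and return the raw bytes."""
    raw = io.BytesIO()
    # newline='\r\n' imitates the Windows default, where every '\n' written
    # is translated to '\r\n'. newline='' disables the translation, which is
    # what the patch described above asks for.
    stream = io.TextIOWrapper(raw, encoding='utf-8', newline=newline,
                              write_through=True)
    writer = csv.writer(stream)  # csv ends each row with '\r\n' by default
    writer.writerow(['brand', 'name'])
    writer.writerow(['Acme', 'Shiraz'])  # made-up sample data
    stream.flush()
    data = raw.getvalue()
    stream.detach()  # release the buffer so the wrapper does not close it
    return data


translated = write_rows('\r\n')   # behaves like the unpatched exporter on Windows
untranslated = write_rows('')     # behaves like the patched exporter

print(repr(translated))
print(repr(untranslated))
```

With translation enabled, each row ends in '\r\r\n', and the stray '\r' is what appears as an empty row. With newline='' each row ends in a single '\r\n'.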

Cito answered Oct 15 '22 10:10