Logging in Scrapy

I am having trouble with logging in Scrapy, and most of what I can find is out of date.

I have set LOG_FILE="log.txt" in the settings.py file, and according to the documentation this should work:

Scrapy provides a logger within each Spider instance, that can be accessed and used like this:

import scrapy

class MySpider(scrapy.Spider):

    name = 'myspider'
    start_urls = ['http://scrapinghub.com']

    def parse(self, response):
        self.logger.info('Parse function called on %s', response.url)

But when I do:

class MySpider(CrawlSpider):
    #other code
    def parse_page(self,response):
        self.logger.info("foobar")

I get nothing. If I set

logger = logging.basicConfig(filename="log.txt",level=logging.INFO)

At the top of my file, after my imports, it creates a log file, and the default output gets logged just fine, but

class MySpider(CrawlSpider):
    #other code
    def parse_page(self,response):
        logger.info("foobar")

Fails to make an appearance. I have also tried putting it in the class __init__, as such:

def __init__(self, *a, **kw):
    super(FanfictionSpider, self).__init__(*a, **kw)
    logging.basicConfig(filename="log.txt",level=logging.INFO)

I once again get no output to the file, just to the console, and foobar does not show up. Can someone please direct me on how to correctly log in Scrapy?

Devon M asked Jul 16 '16 17:07



3 Answers

For logging I just put this on the spider class:

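import logging

import scrapy
from scrapy.utils.log import configure_logging


class SomeSpider(scrapy.Spider):
    # Stop Scrapy from installing its own root log handler...
    configure_logging(install_root_handler=False)
    # ...so that basicConfig can attach a file handler to the root logger.
    logging.basicConfig(
        filename='log.txt',
        format='%(levelname)s: %(message)s',
        level=logging.INFO
    )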

This writes all Scrapy output to a log.txt file in the directory the crawl is run from (normally the project root).
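A possible reason the basicConfig call inside __init__ in the question had no effect: logging.basicConfig does nothing when the root logger already has a handler, and Scrapy's configure_logging installs one by default unless install_root_handler=False is passed. The stdlib-only sketch below illustrates that behaviour. The helper name basic_config_creates_file_handler and the StreamHandler stand-in are assumptions made for illustration; they are not Scrapy code.

```python
import logging
import os
import tempfile


def basic_config_creates_file_handler(root_already_configured):
    """Return True if logging.basicConfig(filename=...) attached a FileHandler."""
    root = logging.getLogger()
    # Remember the current root logger state so it can be restored afterwards.
    saved_handlers = root.handlers[:]
    saved_level = root.level
    for handler in saved_handlers:
        root.removeHandler(handler)

    if root_already_configured:
        # Hypothetical stand-in for a handler a framework installed earlier.
        root.addHandler(logging.StreamHandler())

    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        # basicConfig is a no-op if the root logger already has handlers.
        logging.basicConfig(filename=path, level=logging.INFO)
        attached = any(isinstance(h, logging.FileHandler) for h in root.handlers)
    finally:
        # Undo everything this function did to the root logger.
        for handler in root.handlers[:]:
            root.removeHandler(handler)
            handler.close()
        for handler in saved_handlers:
            root.addHandler(handler)
        root.setLevel(saved_level)
        os.remove(path)
    return attached
```

Calling the helper with False models a root logger without handlers, where basicConfig attaches the file handler; calling it with True models a root logger that was configured earlier, where basicConfig is silently ignored.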

If you want to log something manually, don't use the old scrapy.log module; it is deprecated. Use Python's standard logging module instead:

import logging
logging.error("Some error")
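As a side note on the question's first attempt: logging.basicConfig returns None, so the result of that call cannot be used as a logger. A named logger from logging.getLogger is the usual way to log manually. The sketch below is a minimal stdlib-only illustration, not the answerer's exact setup; the logger name "myspider_sketch" and the helper write_and_read_back are hypothetical.

```python
import logging
import os
import tempfile


def write_and_read_back(message):
    """Log one INFO message to a temporary file and return the file contents."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)

    # Hypothetical logger name; in a spider module, __name__ is a common choice.
    logger = logging.getLogger("myspider_sketch")
    logger.setLevel(logging.INFO)

    handler = logging.FileHandler(path)
    # Same format string as in the answer above.
    handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(handler)
    try:
        logger.info(message)
    finally:
        # Detach and close the handler so the file can be read and removed.
        logger.removeHandler(handler)
        handler.close()

    try:
        with open(path) as log_file:
            return log_file.read()
    finally:
        os.remove(path)
```

For example, write_and_read_back("foobar") returns the single line "INFO: foobar" followed by a newline.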
Rafael Almeida answered Oct 06 '22 00:10



It seems that your parse_page method is never called. Try commenting out your parse method: you will get a NotImplementedError, because the spider is started without any callback telling it what to do.

It may work if you call parse_page from your parse method:

def parse(self, response):
    self.logger.info('Parse function called on %s', response.url)
    self.parse_page(response)

Hope it helps you.

Sebastian Palma answered Oct 06 '22 00:10



I was unable to make @Rafael Almeida's solution work until I added the following to the import section of my spider.py code:

from scrapy.utils.log import configure_logging 
mdkb answered Oct 05 '22 23:10
