How to profile a Scrapy Python script?

Take the following example script:

import scrapy
from scrapy.spiders import CrawlSpider, Rule
from scrapy.linkextractors import LinkExtractor

class MySpider(CrawlSpider):
    name = 'example.com'
    allowed_domains = ['example.com']
    start_urls = ['http://www.example.com']

    rules = (
        # Extract links matching 'category.php' (but not matching 'subsection.php')
        # and follow links from them (since no callback means follow=True by default).
        Rule(LinkExtractor(allow=(r'category\.php', ), deny=(r'subsection\.php', ))),

        # Extract links matching 'item.php' and parse them with the spider's method parse_item
        Rule(LinkExtractor(allow=(r'item\.php', )), callback='parse_item'),
    )

    def parse_item(self, response):
        self.logger.info('Hi, this is an item page! %s', response.url)
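        # A plain dict is used here: a bare scrapy.Item() declares no fields,
        # so assigning item['id'] on it would raise KeyError.
        item = {}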
        item['id'] = response.xpath('//td[@id="item_id"]/text()').re(r'ID: (\d+)')
        item['name'] = response.xpath('//td[@id="item_name"]/text()').extract()
        item['description'] = response.xpath('//td[@id="item_description"]/text()').extract()
        return item

Can anyone show me, in practical terms, how to profile this script?

Thanks

835

asked Oct 21 '17 16:10

Shanthi

1 Answer

That depends on what you want to profile:

For general debugging, see the "Debugging Spiders" page of the Scrapy documentation.
For memory usage, see "Debugging memory leaks".
For crawl rate and other statistics, see "Stats Collection".
To interact with the spider at runtime, the Telnet Console may be the way to go.
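
To make the stats option more concrete, here is a minimal sketch that turns a Scrapy stats dict into a crawl-rate summary. The helper name `summarize_stats` is hypothetical. The keys it reads (`response_received_count`, `item_scraped_count`, `elapsed_time_seconds`, `start_time`, `finish_time`) are ones Scrapy's core stats normally record, but which of them are present can depend on the Scrapy version and on when the dict is read, so every lookup is defensive.

```python
def summarize_stats(stats):
    """Return crawl-rate figures computed from a Scrapy stats dict."""
    responses = stats.get("response_received_count", 0)
    items = stats.get("item_scraped_count", 0)

    # Prefer the precomputed duration; otherwise derive it from the timestamps.
    elapsed = stats.get("elapsed_time_seconds")
    if elapsed is None and "start_time" in stats and "finish_time" in stats:
        elapsed = (stats["finish_time"] - stats["start_time"]).total_seconds()

    # Leave the rate undefined when no usable duration is available.
    rate = responses / elapsed if elapsed else None
    return {
        "responses": responses,
        "items": items,
        "elapsed_seconds": elapsed,
        "responses_per_second": rate,
    }
```

One way to use it would be from a `closed(self, reason)` method on the spider, passing `self.crawler.stats.get_stats()` and logging the result. The finish-time keys may not be set yet at that point, in which case the rate comes back as `None`. Scrapy also dumps the full stats dict to the log at the end of a crawl by default.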

There is also Logging, of course.
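
If "profiling" means CPU profiling in the usual Python sense, the standard library's `cProfile` can wrap an in-process crawl. This is a swapped-in technique, not something the pages above describe, and the sketch rests on assumptions: `profile_call` and `run_crawl` are hypothetical helpers, and `myspider` is a hypothetical module name for the file holding the spider from the question.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=15, **kwargs):
    """Run func under cProfile; return (result, report sorted by cumulative time)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    report = io.StringIO()
    stats = pstats.Stats(profiler, stream=report)
    stats.sort_stats("cumulative").print_stats(top)  # keep only the top entries
    return result, report.getvalue()


def run_crawl():
    """Run the spider in-process (requires Scrapy to be installed)."""
    # Imports are deferred so profile_call stays usable without Scrapy.
    from scrapy.crawler import CrawlerProcess
    from myspider import MySpider  # hypothetical module containing the spider

    process = CrawlerProcess(settings={"LOG_LEVEL": "INFO"})
    process.crawl(MySpider)
    process.start()  # blocks until the crawl finishes
```

Calling `_, report = profile_call(run_crawl)` and printing `report` would list the functions with the highest cumulative time. A crawl is often dominated by network waits, so the entries for your own callbacks and pipelines are usually the ones worth reading first.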

134

answered Oct 21 '22 17:10

Gallaecio

Tags: python, profiling, scrapy, scrapy-spider
