
How to use PyCharm to debug Scrapy projects

I am working with Scrapy 0.20 and Python 2.7. PyCharm has a good Python debugger, and I would like to use it to test my Scrapy spiders. Does anyone know how to do that?

What I have tried

I tried to run the spider as a script, so I built that script. Then I tried to add my Scrapy project to PyCharm as a module, like this:
File->Settings->Project Structure->Add Content Root.

But I don't know what else I have to do.

William Kinaan asked Feb 14 '14 20:02



1 Answer

The scrapy command is a Python script, which means you can start it from inside PyCharm.

When you examine the scrapy binary (run which scrapy to find it), you will notice that it is actually a Python script:

    #!/usr/bin/python

    from scrapy.cmdline import execute
    execute()

This means that a command like scrapy crawl IcecatCrawler can also be executed like this:

    python /Library/Python/2.7/site-packages/scrapy/cmdline.py crawl IcecatCrawler

Find where the scrapy.cmdline module lives on your system. In my case the location was: /Library/Python/2.7/site-packages/scrapy/cmdline.py
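If you are not sure where that file is, one option is to ask the interpreter that PyCharm uses for the project. The snippet below is a minimal sketch, assuming Python 3.4 or newer (for importlib.util.find_spec); locate_module is a hypothetical helper name, not part of Scrapy or PyCharm. It returns None when the module, or its parent package, is not installed.

```python
import importlib.util


def locate_module(name):
    """Return the file path of an importable module, or None if it is not found.

    `locate_module` is an illustrative helper, not a Scrapy or PyCharm API.
    """
    try:
        # For a dotted name, find_spec imports the parent package first.
        spec = importlib.util.find_spec(name)
    except ImportError:
        # Raised when the parent package (for example "scrapy") is missing.
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    # Prints the path to cmdline.py, or None if Scrapy is not installed
    # for the interpreter running this snippet.
    print(locate_module("scrapy.cmdline"))
```

Run it with the same interpreter that is configured for the PyCharm project, since each interpreter has its own site-packages directory.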

Create a run/debug configuration inside PyCharm with that file as the script. Fill in the script parameters with the scrapy command and the spider name, in this case crawl IcecatCrawler.

Like this: [screenshot: PyCharm Run/Debug Configuration]

Put your breakpoints anywhere in your crawling code and it should work™.
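A variation on the same idea, sketched here as an assumption rather than as part of the steps above, is to keep a small runner script inside the project and point the run/debug configuration at it, so the configuration does not depend on the site-packages path. It calls the same scrapy.cmdline.execute function that the scrapy script uses; build_argv, main and the file name are hypothetical, and the spider name is read from the script parameters.

```python
import sys


def build_argv(spider_name, *extra):
    """Build the argument list that `scrapy crawl <spider>` would receive.

    `build_argv` is an illustrative helper, not part of Scrapy.
    """
    return ["scrapy", "crawl", spider_name] + list(extra)


def main(spider_name, *extra):
    # Imported here so the module can be loaded even without Scrapy installed.
    from scrapy.cmdline import execute

    # Same entry point the scrapy command uses; it exits the process when done.
    execute(build_argv(spider_name, *extra))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example script parameters in PyCharm: IcecatCrawler
    main(*sys.argv[1:])
```

Saved as, say, run_spider.py next to scrapy.cfg, this would be used with IcecatCrawler as the script parameter. The working directory of the configuration most likely needs to be the project root, because scrapy crawl looks for the project from there.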

Pullie answered Sep 24 '22 04:09