I'm trying to write my first Scrapy spider. I've been following the tutorial at http://doc.scrapy.org/en/latest/intro/tutorial.html but I'm getting the error KeyError: 'Spider not found: juno'.
I think I'm running the command from the correct directory (the one with the scrapy.cfg file):
(proscraper)#( 10/14/14@ 2:06pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
tree
.
├── scrapy
│ ├── __init__.py
│ ├── items.py
│ ├── pipelines.py
│ ├── settings.py
│ └── spiders
│ ├── __init__.py
│ └── juno_spider.py
└── scrapy.cfg
2 directories, 7 files
(proscraper)#( 10/14/14@ 2:13pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
ls
scrapy scrapy.cfg
Here is the error I'm getting:
(proscraper)#( 10/14/14@ 2:13pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
scrapy crawl juno
/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/twisted/internet/_sslverify.py:184: UserWarning: You do not have the service_identity module installed. Please install it from <https://pypi.python.org/pypi/service_identity>. Without the service_identity module and a recent enough pyOpenSSL to support it, Twisted can perform only rudimentary TLS client hostname verification. Many valid certificate/hostname mappings may be rejected.
  verifyHostname, VerificationError = _selectVerifyImplementation()
Traceback (most recent call last):
  File "/home/tim/.virtualenvs/proscraper/bin/scrapy", line 9, in <module>
    load_entry_point('Scrapy==0.24.4', 'console_scripts', 'scrapy')()
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/cmdline.py", line 143, in execute
    _run_print_help(parser, _run_command, cmd, args, opts)
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
    func(*a, **kw)
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command
    cmd.run(args, opts)
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/commands/crawl.py", line 58, in run
    spider = crawler.spiders.create(spname, **opts.spargs)
  File "/home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/spidermanager.py", line 44, in create
    raise KeyError("Spider not found: %s" % spider_name)
KeyError: 'Spider not found: juno'
This is my virtualenv:
(proscraper)#( 10/14/14@ 2:13pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
pip freeze
Scrapy==0.24.4
Twisted==14.0.2
cffi==0.8.6
cryptography==0.6
cssselect==0.9.1
ipdb==0.8
ipython==2.3.0
lxml==3.4.0
pyOpenSSL==0.14
pycparser==2.10
queuelib==1.2.2
six==1.8.0
w3lib==1.10.0
wsgiref==0.1.2
zope.interface==4.1.1
Here is the code for my spider with the name attribute filled in:
(proscraper)#( 10/14/14@ 2:14pm )( tim@localhost ):~/Workspace/Development/hacks/prosum-scraper/scrapy
cat scrapy/spiders/juno_spider.py
import scrapy

class JunoSpider(scrapy.Spider):
    name = "juno"
    allowed_domains = ["http://www.juno.co.uk/"]
    start_urls = [
        "http://www.juno.co.uk/dj-equipment/"
    ]

    def parse(self, response):
        filename = response.url.split("/")[-2]
        with open(filename, 'wb') as f:
            f.write(response.body)
The scrapy.cfg file sits in the project root directory and names the project's settings module, for instance:

[settings]
default = [name of the project].settings

[deploy]
#url = http://localhost:6800/
project = [name of the project]
When you start a project with scrapy as the project name, it creates the directory structure you printed:
.
├── scrapy
│ ├── __init__.py
│ ├── items.py
│ ├── pipelines.py
│ ├── settings.py
│ └── spiders
│ ├── __init__.py
│ └── juno_spider.py
└── scrapy.cfg
But using scrapy as the project name has a collateral effect. If you open the generated scrapy.cfg you will see that the default settings entry points to your scrapy.settings module:

[settings]
default = scrapy.settings
When we cat that scrapy/settings.py file we see:
BOT_NAME = 'scrapy'
SPIDER_MODULES = ['scrapy.spiders']
NEWSPIDER_MODULE = 'scrapy.spiders'
Well, nothing strange here: the bot name, the list of modules where Scrapy will look for spiders, and the module where new spiders will be created by the genspider command. So far, so good.
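To see why SPIDER_MODULES is the setting that matters for this error, here is roughly what Scrapy's spider manager does at startup. This is a simplified sketch with a made-up helper name, not the actual implementation (which lives in scrapy/spidermanager.py and also recurses into submodules):

from importlib import import_module

def find_spiders(spider_modules):
    # Build a name -> class mapping from every module listed in SPIDER_MODULES
    spiders = {}
    for modname in spider_modules:
        module = import_module(modname)
        for obj in vars(module).values():
            # A spider is, roughly, any class exposing a non-empty 'name' attribute
            if isinstance(obj, type) and getattr(obj, 'name', None):
                spiders[obj.name] = obj
    return spiders

scrapy crawl juno then looks up "juno" in that mapping; if the mapping is empty or the name is missing, Scrapy raises KeyError: 'Spider not found: juno'.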
Now let's check the scrapy library itself. It has been properly installed under your isolated proscraper virtualenv, in the /home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy directory. Remember that site-packages is always added to sys.path, the list of paths Python searches for modules. So, guess what: the scrapy library also has a settings module, /home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/settings, which imports /home/tim/.virtualenvs/proscraper/lib/python2.7/site-packages/scrapy/settings/default_settings.py, which holds the default values for all the settings. Pay special attention to the default SPIDER_MODULES entry:
SPIDER_MODULES = []
Maybe you are starting to see what is happening. Choosing scrapy as the project name also generated a scrapy.settings module that clashes with the scrapy library's own scrapy.settings. From there, the order in which the corresponding paths were inserted into sys.path determines which one Python imports: the first to appear wins. In this case the library's settings win, SPIDER_MODULES stays an empty list, no spiders are ever loaded, and hence the KeyError: 'Spider not found: juno'.
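If you want to see the clash with your own eyes, a quick diagnostic (assuming the layout above) is to ask Python which scrapy package it resolves:

import scrapy

# Whichever directory containing a 'scrapy' package appears first in sys.path
# wins; depending on how the interpreter was launched, this prints either your
# project package or the library under site-packages.
print(scrapy.__file__)

Two different packages answering to the same import name is exactly the ambiguity that breaks the crawl.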
To solve this conflict you could rename your project folder to another name, let's say scrap:
.
├── scrap
│ ├── __init__.py
Modify your scrapy.cfg to point to the proper settings module:
[settings]
default = scrap.settings
And update your scrap.settings to point to the proper spiders module:
SPIDER_MODULES = ['scrap.spiders']
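After the rename, running the crawl again from the directory containing scrapy.cfg should find your spider, because Python now imports your project's scrap.settings (no library module claims that name) and SPIDER_MODULES points at the package that actually contains it:

scrapy crawl juno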
But, as @paultrmbrth suggested, the simplest fix is to recreate the project with another name.