I am a newbie in Python. I installed Scrapy successfully and am using PyDev in Eclipse. When I run the program, it shows the output in the figure below:
[figure]
I am running this code:
import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            title = sel.xpath('a/text()').extract()
            link = sel.xpath('a/@href').extract()
            desc = sel.xpath('text()').extract()
            print title, link, desc
What does this mean? I am unable to run the program.
Your current directory is not a Scrapy project.
A Scrapy project has a defined layout and set of files. Have a look at: http://doc.scrapy.org/en/latest/intro/tutorial.html
You really should go through the tutorial once.
Basically, a Scrapy project has the following directory structure:
tutorial/
    scrapy.cfg            # deploy configuration file
    tutorial/             # project's Python module, you'll import your code from here
        __init__.py
        items.py          # project items file
        pipelines.py      # project pipelines file
        settings.py       # project settings file
        spiders/          # a directory where you'll later put your spiders
            __init__.py
            ...
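If you are not sure whether the folder your IDE runs from is inside a Scrapy project, a quick check is to look for scrapy.cfg in the current directory and its parents. The snippet below is only a sketch under that assumption; find_scrapy_root is a hypothetical helper written for this answer, not part of Scrapy.

```python
import os


def find_scrapy_root(start_dir):
    """Return the nearest directory at or above start_dir that contains
    scrapy.cfg, or None if no such directory exists.

    Hypothetical helper for illustration; it is not a Scrapy API.
    """
    current = os.path.abspath(start_dir)
    while True:
        if os.path.isfile(os.path.join(current, "scrapy.cfg")):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


if __name__ == "__main__":
    root = find_scrapy_root(os.getcwd())
    if root is None:
        print("Not inside a Scrapy project: no scrapy.cfg found")
    else:
        print("Scrapy project root: " + root)
```

Running it from the directory where your run configuration starts tells you which folder, if any, would count as the project root.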
To create a Scrapy project, go to your project folder and run:
scrapy startproject projectname
Once the project has been created, you can run scrapy from the project root folder (the one that contains scrapy.cfg). Make sure you are in that folder when you run scrapy.
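When the command is launched from an IDE, the working directory is often not the one you expect. One possible workaround, sketched here under the assumption that scrapy is on your PATH, is to start the crawl from a small script that sets the working directory explicitly. build_crawl_command and run_spider are hypothetical helpers, and the path in the usage comment is a placeholder.

```python
import os
import subprocess


def build_crawl_command(spider_name):
    """Build the argument list for `scrapy crawl <spider_name>`."""
    return ["scrapy", "crawl", spider_name]


def run_spider(project_root, spider_name):
    """Run a spider with the working directory set to the project root.

    Hypothetical helper; project_root is assumed to be the folder
    that holds scrapy.cfg.
    """
    if not os.path.isfile(os.path.join(project_root, "scrapy.cfg")):
        raise ValueError("no scrapy.cfg in " + project_root)
    # cwd makes the command behave as if it had been typed at the
    # project root, wherever the IDE started this script.
    return subprocess.call(build_crawl_command(spider_name), cwd=project_root)


# Example usage (placeholder path, adjust to your own project):
# run_spider("/path/to/tutorial", "dmoz")
```

The check for scrapy.cfg fails early with a clear message instead of letting scrapy report that the directory is not a project.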
I had the same issue and the solution turned out to be trivial.
I was trying to run scrapy crawl name_of_project
at the PyCharm project directory level, not at the Scrapy project directory level.
Maybe this will help someone who, like me, had the same issue despite having the right settings.