
Scrapy 1.1.0 - no active project

I am a newbie in Python. I installed Scrapy successfully and am using PyDev in Eclipse. When I run the program, it shows the following error (see the screenshot):

[error screenshot]

I am running this code :

import scrapy

class DmozSpider(scrapy.Spider):
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Resources/"
    ]

    def parse(self, response):
        for sel in response.xpath('//ul/li'):
            title = sel.xpath('a/text()').extract()
            link = sel.xpath('a/@href').extract()
            desc = sel.xpath('text()').extract()
            print(title, link, desc)

What does this error mean? I am unable to run the program.

Awais asked Jun 27 '16 11:06


2 Answers

Your current directory is not a Scrapy project.

A Scrapy project has a defined directory layout and set of files. Have a look at the tutorial: http://doc.scrapy.org/en/latest/intro/tutorial.html

It is really worth going through the tutorial once.

Basically, a Scrapy project has the following directory structure:

tutorial/
    scrapy.cfg            # deploy configuration file

    tutorial/             # project's Python module, you'll import your code from here
        __init__.py

        items.py          # project items file

        pipelines.py      # project pipelines file

        settings.py       # project settings file

        spiders/          # a directory where you'll later put your spiders
            __init__.py
            ...
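
As a rough way to check a folder against the layout above, here is a minimal sketch. The function name `missing_project_files` and its parameters are hypothetical helpers written for this answer, not part of Scrapy; the file list simply mirrors the tree shown above.

```python
from pathlib import Path
from typing import List


def missing_project_files(root: Path, module: str) -> List[str]:
    """Return the files from the tutorial layout that are absent under `root`.

    `module` is the name of the inner Python package ("tutorial" in the
    tree above). This is an illustrative helper, not a Scrapy API.
    """
    expected = [
        "scrapy.cfg",
        f"{module}/__init__.py",
        f"{module}/items.py",
        f"{module}/pipelines.py",
        f"{module}/settings.py",
        f"{module}/spiders/__init__.py",
    ]
    # Keep only the relative paths that do not exist as regular files.
    return [rel for rel in expected if not (root / rel).is_file()]
```

For example, `missing_project_files(Path("tutorial"), "tutorial")` returns an empty list when every file from the tree is present.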

To create a Scrapy project, go to the folder where you want it to live and run:

scrapy startproject projectname

After you have created the project, you can run scrapy from the project's root folder (the one that contains scrapy.cfg). Make sure you are in that folder when you run scrapy.
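
If you are not sure whether your working directory (for example the one your IDE uses for its run configuration) is inside a project, a small check like the one below can help. It is only a sketch: `find_scrapy_project_root` is a hypothetical helper, and it assumes that a project is identified by a scrapy.cfg file in the current directory or one of its parents.

```python
from pathlib import Path
from typing import Optional


def find_scrapy_project_root(start: Path) -> Optional[Path]:
    """Return the nearest directory at or above `start` that holds scrapy.cfg.

    Returns None when no such directory exists. Illustrative helper only.
    """
    start = start.resolve()
    # Check the starting directory first, then each parent up to the filesystem root.
    for directory in (start, *start.parents):
        if (directory / "scrapy.cfg").is_file():
            return directory
    return None


if __name__ == "__main__":
    root = find_scrapy_project_root(Path.cwd())
    if root is None:
        print("No scrapy.cfg found here or in any parent directory.")
    else:
        print(f"Scrapy project root: {root}")
```

If this prints that no scrapy.cfg was found, change the working directory to the project folder before running scrapy.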

squgeim answered Nov 12 '22 02:11


I had the same issue and the solution turned out to be trivial. I was running scrapy crawl name_of_project at the PyCharm project directory level, not at the Scrapy project directory level (the folder that contains scrapy.cfg). Maybe this will help someone who, like me, hit the same error even though the settings were correct.

wrozda answered Nov 12 '22 02:11