I'm new to Python and Scrapy and I'm walking through the Scrapy tutorial. I've been able to create my project from the DOS command prompt by typing:
scrapy startproject dmoz
The tutorial later refers to the Crawl command:
scrapy crawl dmoz.org
But each time I try to run that, I get a message that this is not a valid command. Looking around further, it seems I need to be inside a project, and that's what I can't figure out. I've tried changing directories into the "dmoz" folder that startproject created, but from there Scrapy is not recognized at all.
I'm sure I'm missing something obvious and I'm hoping someone can point it out.
You have to run it from inside the folder that startproject created. Scrapy offers additional commands, including crawl, when it finds your scrapy.cfg file. You can see the difference here:
$ scrapy startproject bar
$ cd bar/
$ ls
bar scrapy.cfg
$ scrapy
Scrapy 0.12.0.2536 - project: bar
Usage:
scrapy <command> [options] [args]
Available commands:
crawl Start crawling from a spider or URL
deploy Deploy project in Scrapyd target
fetch Fetch a URL using the Scrapy downloader
genspider Generate new spider using pre-defined templates
list List available spiders
parse Parse URL (using its spider) and print the results
queue Deprecated command. See Scrapyd documentation.
runserver Deprecated command. Use 'server' command instead
runspider Run a self-contained spider (without creating a project)
server Start Scrapyd server for this project
settings Get settings values
shell Interactive scraping console
startproject Create new project
version Print Scrapy version
view Open URL in browser, as seen by Scrapy
Use "scrapy <command> -h" to see more info about a command
$ cd ..
$ scrapy
Scrapy 0.12.0.2536 - no active project
Usage:
scrapy <command> [options] [args]
Available commands:
fetch Fetch a URL using the Scrapy downloader
runspider Run a self-contained spider (without creating a project)
settings Get settings values
shell Interactive scraping console
startproject Create new project
version Print Scrapy version
view Open URL in browser, as seen by Scrapy
Use "scrapy <command> -h" to see more info about a command
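The two listings above differ only in whether Scrapy could locate scrapy.cfg. As a rough illustration of that check (a sketch, not Scrapy's own code), the following looks for scrapy.cfg in a directory and then in each of its parents. The function name find_project_root is hypothetical, and the upward search through parent directories is an assumption about how the lookup behaves rather than something the output above confirms.

```python
import os


def find_project_root(start_dir):
    """Return the nearest directory at or above start_dir that contains
    scrapy.cfg, or None if there is none.

    Hypothetical helper for illustration; searching parent directories
    is an assumption, not confirmed Scrapy 0.12 behaviour.
    """
    current = os.path.abspath(start_dir)
    while True:
        if os.path.isfile(os.path.join(current, "scrapy.cfg")):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent


if __name__ == "__main__":
    root = find_project_root(os.getcwd())
    if root is None:
        print("no active project: project-only commands such as crawl are unavailable")
    else:
        print("active project found at " + root)
```

Running this from the directory where you intend to type scrapy crawl shows whether a scrapy.cfg is reachable from there.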
The PATH environment variable isn't set up for Python and Scrapy.
You can add both to PATH by opening System Properties (My Computer > Properties > Advanced System Settings), going to the Advanced tab, and clicking the Environment Variables button. In the new window, find the Path variable under System Variables and append the following entries, separated by semicolons:
C:\{path to python folder};C:\{path to python folder}\Scripts
For example:
C:\Python27;C:\Python27\Scripts
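To confirm the change took effect, one option is to compare the PATH value against the folders you expect. The sketch below is only an illustration: missing_from_path is a hypothetical helper, the C:\Python27 locations are the example paths from above and may differ on your machine, and the comparison ignores case and trailing slashes on the assumption that Windows treats such entries as equivalent.

```python
import os


def missing_from_path(required_dirs, path_value, sep=";"):
    """Return the entries of required_dirs that do not appear in path_value.

    Hypothetical helper: comparison ignores case and trailing slashes.
    """
    def norm(p):
        return p.strip().rstrip("\\/").lower()

    present = set(norm(p) for p in path_value.split(sep) if p.strip())
    return [d for d in required_dirs if norm(d) not in present]


if __name__ == "__main__":
    # Assumed install locations, taken from the example above.
    required = [r"C:\Python27", r"C:\Python27\Scripts"]
    missing = missing_from_path(required, os.environ.get("PATH", ""), os.pathsep)
    if missing:
        print("not on PATH: " + ", ".join(missing))
    else:
        print("both folders are on PATH")
```

Run it from a newly opened command prompt, since a prompt that was already open keeps the PATH value it started with.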