Before Scrapy 1.0, I could run the Scrapy shell against a local file quite simply:
$ scrapy shell index.html
After upgrading to 1.0.3, it started to throw an error:
$ scrapy shell index.html
2015-10-12 15:32:59 [scrapy] INFO: Scrapy 1.0.3 started (bot: scrapybot)
2015-10-12 15:32:59 [scrapy] INFO: Optional features available: ssl, http11, boto
2015-10-12 15:32:59 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0}
Traceback (most recent call last):
File "/Users/user/.virtualenvs/so/bin/scrapy", line 11, in <module>
sys.exit(execute())
File "/Users/user/.virtualenvs/so/lib/python2.7/site-packages/scrapy/cmdline.py", line 143, in execute
_run_print_help(parser, _run_command, cmd, args, opts)
File "/Users/user/.virtualenvs/so/lib/python2.7/site-packages/scrapy/cmdline.py", line 89, in _run_print_help
func(*a, **kw)
File "/Users/user/.virtualenvs/so/lib/python2.7/site-packages/scrapy/cmdline.py", line 150, in _run_command
cmd.run(args, opts)
File "/Users/user/.virtualenvs/so/lib/python2.7/site-packages/scrapy/commands/shell.py", line 50, in run
spidercls = spidercls_for_request(spider_loader, Request(url),
File "/Users/user/.virtualenvs/so/lib/python2.7/site-packages/scrapy/http/request/__init__.py", line 24, in __init__
self._set_url(url)
File "/Users/user/.virtualenvs/so/lib/python2.7/site-packages/scrapy/http/request/__init__.py", line 59, in _set_url
raise ValueError('Missing scheme in request url: %s' % self._url)
ValueError: Missing scheme in request url: index.html
Is this behavior intended or is this a bug in Scrapy Shell?
As a workaround, I can use an absolute path to the file in a "file" URL scheme:
$ scrapy shell file:////absolute/path/to/index.html
which is obviously much less convenient.
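Until the built-in support lands, one way to avoid typing the absolute path by hand is to build the file:// URL programmatically. A minimal sketch using only the standard library (the filename `index.html` here is just an example):

```python
from pathlib import Path

# as_uri() only works on absolute paths, so resolve the relative
# path first, then convert it to a file:// URL suitable for
# passing to `scrapy shell`.
url = Path("index.html").resolve().as_uri()
print(url)  # e.g. file:///current/working/dir/index.html
```

On a POSIX shell the same idea can be expressed inline, e.g. `scrapy shell "file://$PWD/index.html"`.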
Update: in Scrapy >= 1.1 this is a built-in feature, so you can do:
scrapy shell file:///path/to/file.html
Old answer:
As per the discussion in Running scrapy shell against a local file, the relevant change was introduced by this commit. A Pull Request was created to make the Scrapy shell open local files again, and it is planned to be part of Scrapy 1.1.