I want to scrape a job website, and I want to do some testing in the Scrapy shell. If I run:
scrapy shell http://www.seek.com.au
and then type:
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
it works fine. But if I run:
scrapy shell http://www.seek.com.au/JobSearch?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000
and then type:
from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
then bash complains that from is not a valid command, the Scrapy shell exits, and the job shows up on screen as stopped:
>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
-bash: from: command not found
[5]+ Stopped scrapy shell http://www.seek.com.au/JobSearch?DateRange=31
[7] Done Keywords=php
Apparently, you need to enclose your URL in double quotes; otherwise bash treats each & as a command separator and sends scrapy shell to the background:
scrapy shell "http://www.seek.com.au/JobSearch?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000"
>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
>>> lx = SgmlLinkExtractor()
Then everything works smoothly (the above is my actual shell output).
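If you want to sanity-check that the extractor actually sees the links on the fetched page, you can run it against the response object that scrapy shell pre-populates for you. This is just a quick sketch continuing the session above, not output from the original session:
>>> links = lx.extract_links(response)  # response is provided by scrapy shell
>>> len(links)
>>> [link.url for link in links[:5]]    # inspect the first few extracted URLs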
I tried it without the double quotes and it doesn't work: the fetch keeps running in the background, and the first key press drops me back to bash without changing the visual output, which gives the same error you have.
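As a side note for anyone on Scrapy 1.0 or later: the scrapy.contrib package has been deprecated, so the equivalent import today is the built-in LinkExtractor (the same quoting rule for the URL still applies):
>>> from scrapy.linkextractors import LinkExtractor
>>> lx = LinkExtractor()
>>> lx.extract_links(response)  # same usage as SgmlLinkExtractor above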