How can I use scrapy shell with parameters in the URL?

I want to scrape a job website, and I want to do some testing in the Scrapy shell first.

If I run this:

scrapy shell http://www.seek.com.au

and then type:

from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

it works fine.

But if I run this:

scrapy shell http://www.seek.com.au/JobSearch?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000

and then type:

from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor

bash reports that "from" is not a valid command. The Scrapy shell has dropped back to bash and is listed as a stopped job:

>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
-bash: from: command not found

[5]+  Stopped                 scrapy shell http://www.seek.com.au/JobSearch?DateRange=31
[7]   Done                    Keywords=php
user1894766 asked Dec 11 '12 15:12



1 Answer

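You need to enclose the URL in double quotes. Unquoted, bash treats each & as its background operator: it starts scrapy shell in the background with only the part of the URL before the first &, then interprets the remaining pieces (SearchFrom=quick, Keywords=python, nation=3000) as separate commands, which here are just variable assignments. With quotes, the whole URL is passed as a single argument: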

scrapy shell "http://www.seek.com.au/JobSearch?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000"
>>> from scrapy.contrib.linkextractors.sgml import SgmlLinkExtractor
>>> lx = SgmlLinkExtractor() 

Then everything works smoothly (the above is my actual shell output).

I tried it without the double quotes and it doesn't work: the fetch keeps running in the background, and the first key press drops me back to bash without any visible change on screen, which produces the same error you are seeing.
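As a rough illustration of why the quotes matter, here is a minimal Python sketch using only the standard library. It does not run Scrapy. It only models what the unquoted command hands over (the URL cut at the first &, matching the "Stopped" job line in the question) and shows one way to build a safely quoted command line. The variable names are made up for this example.

```python
import shlex
from urllib.parse import parse_qs, urlsplit

url = ("http://www.seek.com.au/JobSearch"
       "?DateRange=31&SearchFrom=quick&Keywords=python&nation=3000")

# Unquoted, bash ends the scrapy command at the first "&", so Scrapy is
# started with only this much of the URL (a model of the shell's behaviour).
seen_unquoted = url.split("&", 1)[0]

# Query parameters that survive in each case.
params_unquoted = parse_qs(urlsplit(seen_unquoted).query)
params_quoted = parse_qs(urlsplit(url).query)

# shlex.quote wraps the URL in single quotes, which bash accepts as well
# as double quotes for this purpose.
command = "scrapy shell " + shlex.quote(url)

print(seen_unquoted)
print(params_unquoted)
print(params_quoted)
print(command)
```

Only DateRange reaches Scrapy in the unquoted case, while the quoted form keeps all four parameters. The printed command can be pasted into bash as-is.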

Samuele Mattiuzzo answered Oct 13 '22 05:10
