How do you use Scrapy to scrape web requests that return JSON? For example, the JSON would look like this:
{ "firstName": "John", "lastName": "Smith", "age": 25, "address": { "streetAddress": "21 2nd Street", "city": "New York", "state": "NY", "postalCode": "10021" }, "phoneNumber": [ { "type": "home", "number": "212 555-1234" }, { "type": "fax", "number": "646 555-4567" } ] }
I would be looking to scrape specific items (e.g. name and fax in the above) and save them to CSV.
Scrapy uses Request and Response objects for crawling web sites. Typically, Request objects are generated in the spiders and pass across the system until they reach the Downloader, which executes the request and returns a Response object which travels back to the spider that issued the request.
When working with Scrapy, you first need to create a Scrapy project. Inside the project, create a spider to fetch the data: move to the spiders folder and add a Python file there, e.g. gfgfetch.py.
It's the same as using Scrapy's HtmlXPathSelector for HTML responses. The only difference is that you should use the json module to parse the response:

```python
import json

class MySpider(BaseSpider):
    ...  # name, start_urls, etc.

    def parse(self, response):
        jsonresponse = json.loads(response.text)
        item = MyItem()
        item["firstName"] = jsonresponse["firstName"]
        return item
```
Hope that helps.
You don't need the json module to parse the response object: recent Scrapy versions provide Response.json(), which does it for you.

```python
class MySpider(BaseSpider):
    ...  # name, start_urls, etc.

    def parse(self, response):
        jsonresponse = response.json()
        item = MyItem()
        item["firstName"] = jsonresponse.get("firstName", "")
        return item
```
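For the CSV part of the question, in a real spider you would normally just yield the items and let Scrapy's feed exports write the file (e.g. `scrapy crawl myspider -o items.csv`). The CSV step itself is only the stdlib csv module; a sketch with illustrative file and field names:

```python
import csv

# Items as they might be yielded from parse() (illustrative values)
items = [
    {"firstName": "John", "fax": "646 555-4567"},
]

with open("items.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["firstName", "fax"])
    writer.writeheader()
    writer.writerows(items)
```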