Passing extra arguments to scrapy.Request()

I want to store all the data (text, hrefs, images) from a specific website in a single folder. To do that, I need to pass the folder's path to each of the parsing functions, so I tried passing it as extra kwargs in scrapy.Request() like this:

yield scrapy.Request(url=url, dont_filter=True, callback=self.parse, errback=self.errback_function, kwargs={'path': '/path/to_folder'})

But it raises the error: TypeError: __init__() got an unexpected keyword argument 'kwargs'

How can I pass that path to the next parsing function?

asked Oct 05 '17 06:10

Amrit

1 Answer

For anyone who may need it: you can pass extra arguments through the meta argument of scrapy.Request() and read them back from response.meta in the callback, like this:

   yield scrapy.Request(url=url, dont_filter=True,
                        callback=self.parse, errback=self.errback_function,
                        meta={'filepath': filepath})

UPDATE:

Request.cb_kwargs was introduced in Scrapy 1.7. Prior to that, Request.meta was the recommended way to pass information between callbacks. Since 1.7, Request.cb_kwargs is the preferred way to handle user information, leaving Request.meta for communication with components like middlewares and extensions.

So for Scrapy >= 1.7, the following works:

   request = scrapy.Request('http://www.example.com/index.html',
                             callback=self.parse_page2,
                             cb_kwargs=dict(main_url=response.url))

You can refer to the documentation: https://doc.scrapy.org/en/latest/topics/request-response.html#passing-additional-data-to-callback-functions

answered Oct 06 '22 01:10

Amrit

Tags: python, scrapy, scrapy-spider
