Scrapy Shell - How to change USER_AGENT

I have a fully functioning scrapy script to extract data from a website. During setup, the target site banned me based on my USER_AGENT information. I subsequently added a RotateUserAgentMiddleware to rotate the USER_AGENT randomly. This works great.

However, now when I try to use the scrapy shell to test XPath and CSS selectors, I get a 403 error. I'm sure this is because the USER_AGENT of the scrapy shell is defaulting to some value the target site has blacklisted.

Question: is it possible to fetch a URL in the scrapy shell with a different USER_AGENT than the default?

fetch('http://www.test') [add something ?? to change USER_AGENT]

Thx

474

asked Aug 21 '14 15:08

dfriestedt

2 Answers

scrapy shell -s USER_AGENT='custom user agent' 'http://www.example.com'
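Real User-Agent strings usually contain spaces and parentheses, so the quoting in the command above matters. As a minimal sketch (the helper name scrapy_shell_command is hypothetical, not part of Scrapy), the same command line can be assembled in Python with the standard library's shlex module, which takes care of the quoting:

```python
import shlex


def scrapy_shell_command(url, user_agent):
    """Return a shell-safe `scrapy shell` command that overrides USER_AGENT.

    Hypothetical helper for illustration; it only builds the string,
    it does not run Scrapy.
    """
    # `-s NAME=VALUE` overrides a single Scrapy setting for this run.
    args = ["scrapy", "shell", "-s", f"USER_AGENT={user_agent}", url]
    # shlex.join quotes any argument containing spaces or shell metacharacters.
    return shlex.join(args)


if __name__ == "__main__":
    # Example User-Agent value chosen for illustration only.
    print(scrapy_shell_command("http://www.example.com",
                               "Mozilla/5.0 (X11; Linux x86_64)"))
```

The printed line can be pasted into a POSIX shell as-is. shlex.join requires Python 3.8 or later.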

164

answered Oct 07 '22 23:10

marven

Inside the scrapy shell, you can set the User-Agent in the request header.

url = 'http://www.example.com'
request = scrapy.Request(url, headers={'User-Agent': 'Mybot'})
fetch(request)
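If you want the shell to mimic the random rotation described in the question, one possible approach is to pick the header value at random before building the request. This is only a sketch: the USER_AGENTS list and the random_user_agent_headers helper are made up for illustration and are not part of Scrapy or of the asker's middleware.

```python
import random

# Hypothetical pool of User-Agent strings; replace with the ones you want to send.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
]


def random_user_agent_headers(agents=USER_AGENTS):
    """Build a headers dict containing one randomly chosen User-Agent."""
    return {"User-Agent": random.choice(agents)}


# Inside the scrapy shell, where `scrapy` and `fetch` are already available:
#   request = scrapy.Request('http://www.example.com',
#                            headers=random_user_agent_headers())
#   fetch(request)
```

Paste the definitions into the shell session first, then run the two commented lines without the leading "#".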

answered Oct 08 '22 00:10

salmanwahed


Tags:

python

shell

scrapy

agent
