How to make Scrapy show user agent per download request in log?

Tags: python, user-agent, web-scraping, scrapy, web-crawler

I am learning Scrapy, a web crawling framework.

I know I can set USER_AGENT in the settings.py file of a Scrapy project. When I run Scrapy, I can see the USER_AGENT value in the INFO logs.
This USER_AGENT is sent with every download request to the server I want to crawl.

However, I am choosing among multiple USER_AGENT values at random with the help of this solution. I assume the randomly chosen USER_AGENT is being applied, but I want to confirm it. How can I make Scrapy show the USER_AGENT for each download request, so that I can see its value in the logs?

538

asked Apr 18 '14 10:04

Alok

1 Answer

Just FYI:

I've implemented a simple RandomUserAgentMiddleware based on fake-useragent.

Thanks to fake-useragent, you don't need to configure a list of User-Agents yourself; it picks them based on browser usage statistics from a real-world database.
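To see which User-Agent each request actually carries, one option is to log it inside the downloader middleware that sets it. The following is only a minimal sketch, not the middleware mentioned above: the class name `LoggingRandomUserAgentMiddleware`, the `USER_AGENTS` list, and the `myproject.middlewares` path are hypothetical, and a hard-coded list stands in for fake-useragent so the example needs nothing beyond the standard library.

```python
import logging
import random

logger = logging.getLogger(__name__)

# Hypothetical pool of User-Agent strings, used here instead of fake-useragent.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Gecko/20100101 Firefox/121.0",
]


class LoggingRandomUserAgentMiddleware:
    """Downloader middleware sketch: set a random User-Agent and log it."""

    def __init__(self, user_agents=USER_AGENTS):
        self.user_agents = list(user_agents)

    def process_request(self, request, spider):
        # Choose a User-Agent for this request and attach it to the headers.
        user_agent = random.choice(self.user_agents)
        request.headers["User-Agent"] = user_agent
        # One log line per download request, showing the value being sent.
        logger.info("User-Agent for %s: %s", request.url, user_agent)
        # Returning None lets Scrapy continue handling the request as usual.
        return None


# In settings.py (the module path is an assumption about the project layout):
# DOWNLOADER_MIDDLEWARES = {
#     "scrapy.downloadermiddlewares.useragent.UserAgentMiddleware": None,
#     "myproject.middlewares.LoggingRandomUserAgentMiddleware": 400,
# }
```

With the middleware enabled and LOG_LEVEL set to INFO or DEBUG, each request should produce one log line with the chosen value, which makes it easy to confirm that the rotation is working.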

185

answered Oct 20 '22 22:10

alecxe
