I am trying to get the first non-ad result on a simple query on Google.
res = requests.get('https://www.google.com?q=' + query)
Assign any value to query and you will get an error. I have tried to add some headers, but nothing changes.
I have tried adding all the other parameters that Google typically attaches to a query, and again nothing changes.
No problems if you do the search with selenium.
The error code is 429, but this seems to be just a standard response for this query. It has nothing to do with my IP and I am not spamming Google, and this does not disappear after a while.
Do you know why this happens? Is there some header I can add, or any other solution, to just see the results as if I were searching for that keyword on Google?
The simplest way to fix an HTTP 429 error is to wait before sending another request. Often, this status code is sent with a “Retry-After” header that specifies how long to wait before sending another request.
A 429 "Too many requests" error can occur due to daily per-user limits, including mail sending limits, bandwidth limits, or a per-user concurrent request limit.
How Do I Fix a 429 Error? In some cases, the error will go away on its own if you wait a little while. In other instances, in which the error is due to a DDoS attack or issue with a plugin, you need to be proactive in fixing the problem.
You are getting status code 429, which means you have sent too many requests in a given amount of time ("rate limiting"). Read in more detail here.
Add Headers in your request just like this:
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) '
           'AppleWebKit/537.36 (KHTML, like Gecko) Safari/537.36'}
So the final request will be:
url = 'https://www.google.com?q=' + query
res = requests.get(url, headers=headers)
The HTTP 429 Too Many Requests response status code indicates that the user has sent too many requests in a given amount of time ("rate limiting"). The response representations SHOULD include details explaining the condition, and MAY include a Retry-After header indicating how long to wait before making a new request.
When a server is under attack or just receiving a very large number of requests from a single party, responding to each with a 429 status code will consume resources. Therefore, servers are not required to use the 429 status code; when limiting resource usage, it may be more appropriate to just drop connections, or take other steps.
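To make the Retry-After behaviour described above concrete, here is a minimal sketch of honouring that header before retrying. The helper names `retry_after_seconds` and `fetch_with_retry`, the 30-second default, and the limit of three tries are illustrative assumptions, not anything Google or requests prescribes.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=30.0):
    """Convert a Retry-After header value into a number of seconds to wait.

    `default` is an arbitrary fallback (an assumption) used when the header
    is missing or cannot be parsed.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # First form: a delay in whole seconds, e.g. "120".
        return float(value)
    try:
        # Second form: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    # A date in the past means there is nothing left to wait for.
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def fetch_with_retry(send, max_tries=3, sleep=time.sleep):
    """Call `send()` until it stops returning 429 or `max_tries` is reached.

    `send` is any zero-argument callable returning an object with
    `status_code` and `headers`, such as a requests response.
    """
    response = send()
    for _ in range(max_tries - 1):
        if response.status_code != 429:
            break
        sleep(retry_after_seconds(response.headers.get("Retry-After")))
        response = send()
    return response
```

With requests this could be called as `fetch_with_retry(lambda: requests.get(url, headers=headers))`. If the server keeps answering 429 regardless of the wait, as the question describes, waiting alone will not help and the last 429 response is returned unchanged.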
However, when I took your code and executed the same test, I got a successful response, as follows:
Code Block:
import requests
query = "selenium"
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}
url = 'https://www.google.com?q=' + query
res = requests.get(url, headers=headers)
print(res)
Console Output:
<Response [200]>
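One caveat about the test above: the URL has no path before `?q=`, so the 200 may simply be Google's homepage rather than a results page. The sketch below builds the URL with the `/search` path and URL-encodes the query. The `/search` endpoint and the `build_search_url` helper are assumptions that do not appear in the original post.

```python
from urllib.parse import urlencode

# Assumption: the results page lives under /search; the original post
# requested the bare domain with ?q= appended.
SEARCH_URL = "https://www.google.com/search"


def build_search_url(query):
    """Return the search URL with the query safely URL-encoded."""
    return SEARCH_URL + "?" + urlencode({"q": query})
```

The request would then be `requests.get(build_search_url(query), headers=headers)`, or equivalently `requests.get(SEARCH_URL, params={"q": query}, headers=headers)`. Google may still answer automated clients with a 429, so checking `res.status_code` and the returned HTML remains necessary.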
You can find a relevant discussion in "Failed to load resource: the server responded with a status of 429 (Too Many Requests) and 404 (Not Found) with ChromeDriver Chrome through Selenium".