
Error 429 with simple query on google with requests python

I am trying to get the first non-ad result on a simple query on Google.

res = requests.get('https://www.google.com?q=' + query)

Assign any value to query and you will get an error. I have tried to add some headers, but nothing changes.

I have also tried adding all the other parameters that Google typically attaches to a query, and again nothing changes.

No problems if you do the search with selenium.

The error code is 429, but this seems to be just a standard response for this query. It has nothing to do with my IP and I am not spamming Google, and this does not disappear after a while.

Do you know why this happens? Is there a header I can add, or any other solution, to see the results as if I were searching for that keyword on Google?

Adrian Nicoli asked Jun 25 '19 16:06



2 Answers

You are getting status code 429, which means you have sent too many requests in a given amount of time ("rate limiting"). Read in more detail here.

Add a User-Agent header to your request, like this:

# Adjacent string literals are joined, so no stray whitespace ends up in the header value
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) '
                         'AppleWebKit/537.36 (KHTML, like Gecko) Safari/537.36'}

So the final request will be:

url = 'https://www.google.com?q=' + query
res = requests.get(url, headers=headers)
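
Concatenating the raw query into the URL breaks as soon as the query contains spaces or characters such as `&`. A minimal sketch of building the URL with proper encoding follows. The helper name `build_search_url` is hypothetical, and the `/search` path is an assumption that differs from the URL used in the question: it is where Google normally serves its results pages. This sketch makes no claim about whether the change affects the 429 response.

```python
from urllib.parse import urlencode


def build_search_url(query, base="https://www.google.com/search"):
    """Return a search URL with the query percent-encoded.

    `base` is an assumption; pass a different base URL if needed.
    """
    # urlencode escapes spaces, '&', '=' and other reserved characters
    return base + "?" + urlencode({"q": query})


if __name__ == "__main__":
    print(build_search_url("first non-ad result"))
```

The result can be passed straight to the request, for example `requests.get(build_search_url(query), headers=headers)`. Alternatively, `requests.get` accepts a `params={'q': query}` argument and performs the same encoding itself.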
ParthS007 answered Nov 06 '22 15:11


429 Too Many Requests

The HTTP 429 Too Many Requests response status code indicates that the user has sent too many requests in a given amount of time ("rate limiting"). The response representations SHOULD include details explaining the condition, and MAY include a Retry-After header indicating how long to wait before making a new request.

When a server is under attack or just receiving a very large number of requests from a single party, responding to each with a 429 status code will consume resources. Therefore, servers are not required to use the 429 status code; when limiting resource usage, it may be more appropriate to just drop connections, or take other steps.
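
If the server does send a Retry-After header with its 429 response, a client can honor it before retrying. The following is a rough sketch under stated assumptions, not a definitive implementation: the helper names `retry_after_seconds` and `get_with_retry`, the one-second default wait, and the limit of three attempts are all hypothetical choices. The request function is passed in as `fetch` so that `requests.get` can be used directly.

```python
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=1.0):
    """Convert a Retry-After header value into seconds to wait.

    The header may hold a number of seconds or an HTTP date.
    `default` (an assumed fallback) is used when it is missing or unparsable.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


def get_with_retry(fetch, url, max_attempts=3, sleep=time.sleep, **kwargs):
    """Call fetch(url, **kwargs), waiting and retrying while the status is 429."""
    for attempt in range(max_attempts):
        res = fetch(url, **kwargs)
        if res.status_code != 429:
            return res
        if attempt < max_attempts - 1:
            sleep(retry_after_seconds(res.headers.get("Retry-After")))
    return res  # still 429 after the last attempt
```

With `requests` installed, this could be called as `res = get_with_retry(requests.get, url, headers=headers)`. It only helps when the 429 is a genuine rate limit; it will not change a response that the server returns for other reasons.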

However, when I took your code and ran the same test, I got a successful response:

  • Code Block:

    import requests
    
    query = "selenium"
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 6.2; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36'}
    url = 'https://www.google.com?q=' + query
    res = requests.get(url, headers=headers)
    print(res)
    
  • Console Output:

    <Response [200]>
    

You can find a relevant discussion in "Failed to load resource: the server responded with a status of 429 (Too Many Requests) and 404 (Not Found) with ChromeDriver Chrome through Selenium".

undetected Selenium answered Nov 06 '22 13:11