To avoid getting banned, you can add random delays between queries. Use Python's sleep() function, which suspends execution of the current thread for a given number of seconds, together with randint(), which returns a random integer.
First, Google is probably blocking you because it doesn't like you taking too many of its resources. The best fix is to slow down, not to delay randomly. Add a 1-second wait after every request and you'll probably stop having problems.
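A minimal sketch of the fixed-delay idea, assuming you already have some callable that performs one request. The names `throttled_fetch_all`, `fetch`, `delay` and `pause` are made up for illustration and do not come from any library.

```python
from time import sleep


def throttled_fetch_all(queries, fetch, delay=1.0, pause=sleep):
    """Call fetch(query) for each query, waiting `delay` seconds after every request.

    `fetch` is a hypothetical callable that performs one request.
    `pause` defaults to time.sleep and can be swapped out in tests.
    """
    results = []
    for query in queries:
        results.append(fetch(query))
        pause(delay)  # fixed wait after each request
    return results
```

Passing the pause function in as a parameter is only a convenience: it lets you check the loop without actually waiting.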
That said:
from random import randint
from time import sleep
sleep(randint(10,100))
will sleep for a random whole number of seconds between 10 and 100, inclusive.
Best to use:
from numpy import random
from time import sleep
sleeptime = random.uniform(2, 4)
print("sleeping for:", sleeptime, "seconds")
sleep(sleeptime)
print("sleeping is over")
as a start, then slowly decrease the range to see what works best (fastest).
Since you're not testing Google's speed, find a way to simulate its responses when doing your testing (as @bstpierre suggested in his comment). This should solve your problem and factor Google's variable response times out of your measurements at the same time.
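One way this could look is a small stand-in function that waits a random amount of time and returns canned data. This is only a sketch: `fake_search`, its latency bounds and the shape of the returned results are all assumptions, not anything Google provides.

```python
import random
import time


def fake_search(query, min_latency=0.05, max_latency=0.3):
    """Hypothetical stand-in for the real search call.

    Sleeps for a random time to mimic variable response times,
    then returns a fixed list of placeholder results.
    """
    time.sleep(random.uniform(min_latency, max_latency))
    return [f"result {i} for {query}" for i in range(3)]
```

In tests you would call fake_search() wherever your code would normally hit the real service, and set both latency bounds to 0 when you want the timing noise removed entirely.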
You can also use a few proxy servers to avoid being banned by IP address. urllib supports proxies through a special constructor parameter, and httplib can use a proxy too.
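As a rough sketch of the proxy idea in Python 3, the standard library's urllib.request offers ProxyHandler and build_opener. Note that this is the Python 3 route, not the constructor parameter mentioned above. The proxy addresses and the helper names `build_opener_with_proxy` and `pick_opener` are placeholders invented for this example.

```python
import random
import urllib.request

# Placeholder proxy addresses (assumption); replace with proxies you are allowed to use.
PROXIES = ["http://10.10.1.10:3128", "http://10.10.1.11:3128"]


def build_opener_with_proxy(proxy_url):
    """Return an opener that routes HTTP and HTTPS requests through proxy_url."""
    handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    return urllib.request.build_opener(handler)


def pick_opener(proxies=PROXIES):
    """Build an opener for a randomly chosen proxy from the list."""
    return build_opener_with_proxy(random.choice(proxies))
```

You would then call something like pick_opener().open(url, timeout=10) for each request, so consecutive requests do not all come from the same address.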