I am using the Google Drive API through pydrive to move files between two Google Drive accounts. I have been testing with a folder of 16 files. My code always raises an error on the sixth file:
"User rate limit exceeded"
I know that there is a limit on the number of requests (10/s or 1000/100s), but I have tried the exponential backoff suggested by the Google Drive API documentation to handle this error. Even after 248s it still raises the same error.
Here is an example of what I am doing:
def MoveToFolder(self, files, folder_id, drive):
    total_files = len(files)
    for cont in range(total_files):
        success = False
        n = 0
        while not success:
            try:
                drive.auth.service.files().copy(
                    fileId=files[cont]['id'],
                    body={"parents": [{"kind": "drive#fileLink", "id": folder_id}]}).execute()
                time.sleep(random.randint(0, 1000) / 1000)
                success = True
            except:
                wait = (2 ** n) + (random.randint(0, 1000) / 1000)
                time.sleep(wait)
                success = False
                n += 1
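A common refinement, offered here only as a sketch: retry on rate-limit responses specifically instead of a bare except, so unrelated failures are not silently swallowed and retried. This assumes the same v2 service object pydrive exposes; copy_with_backoff and the retry cap are my own names, not part of the original code.

import random
import time
from googleapiclient.errors import HttpError

def copy_with_backoff(service, file_id, folder_id, max_retries=8):
    # Retry the copy only when Drive answers with a rate-limit status code.
    for n in range(max_retries):
        try:
            return service.files().copy(
                fileId=file_id,
                body={"parents": [{"kind": "drive#fileLink", "id": folder_id}]}).execute()
        except HttpError as error:
            if error.resp.status not in (403, 429):
                raise  # not a rate-limit problem, surface it immediately
            time.sleep((2 ** n) + random.random())  # exponential backoff with jitter
    raise RuntimeError("copy still rate limited after %d retries" % max_retries)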
I tried to use batch requests to copy the files, but it raises the same errors for 10 files.
def MoveToFolderBatch(self, files, folder_id, drive):
    cont = 0
    batch = drive.auth.service.new_batch_http_request()
    for file in files:
        cont += 1
        batch.add(drive.auth.service.files().copy(
            fileId=file['id'],
            body={"parents": [{"kind": "drive#fileLink", "id": folder_id}]}))
    batch.execute()
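For what it is worth, new_batch_http_request accepts a callback, which at least shows which individual copies inside the batch were rejected. A minimal sketch, assuming the same drive.auth.service object as above; handle_copy_result is a name I made up.

def handle_copy_result(request_id, response, exception):
    # Called once per request in the batch; exception is an HttpError when that copy failed.
    if exception is not None:
        print("request %s failed: %s" % (request_id, exception))
    else:
        print("request %s copied file %s" % (request_id, response.get('id')))

batch = drive.auth.service.new_batch_http_request(callback=handle_copy_result)
for file in files:
    batch.add(drive.auth.service.files().copy(
        fileId=file['id'],
        body={"parents": [{"kind": "drive#fileLink", "id": folder_id}]}))
batch.execute()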
Does anyone have any tips?
EDIT: According to Google support:
Regarding your User rate limit exceeded error, it is not at all related to the per-user rate limit set in the console. Instead it is coming from internal Google systems that the Drive API relies on, and it is most likely to occur when a single account owns all the files within a domain. We don't recommend that a single account owns all the files; instead, have individual users in the domain own the files. For transferring files, you can check this link. Also, please check this link for recommendations on avoiding the error.
Resolve a 403 error: Project rate limit exceeded
To fix this error, try any of the following:
Raise the per-user quota in the Google Cloud project. For more information, request a quota increase.
Batch requests to make fewer API calls.
403: User Rate Limit Exceeded is basically flood protection.
{
  "error": {
    "errors": [
      {
        "domain": "usageLimits",
        "reason": "userRateLimitExceeded",
        "message": "User Rate Limit Exceeded"
      }
    ],
    "code": 403,
    "message": "User Rate Limit Exceeded"
  }
}
You need to slow down. Implementing exponential backoff as you have done is the correct course of action.
Google is not perfect at counting the requests, so counting them yourself isn't really going to help. Sometimes you can get away with 15 requests a second; other times you can only get 7.
You should also remember that you are competing with the other people using the server: if there is a lot of load on the server, one of your requests may take longer while another may not. Don't run on the hour; that's when most people have cron jobs set up to extract.
Note: If you go to the Google Developer Console, under the project where you have enabled the Drive API, go to the Quota tab and click the pencil icon next to
Queries per 100 seconds per user
and
Queries per 100 seconds
You can increase them both. One is user based, the other is project based. Each user can make X requests in 100 seconds; your project can make Y requests per 100 seconds.
Note: I have no idea how high you can set yours. This is my dev account, so it may have some beta access; I can't remember.
See 403 rate limit after only 1 insert per second and 403 rate limit on insert sometimes succeeds
The key points are:
Back off, but do not implement exponential backoff! This will simply kill your application throughput.
Instead, you need to proactively throttle your requests to avoid the 403s from occurring. In my testing I've found that the maximum sustainable throughput is about 1 transaction every 1.5 seconds.
Batching makes the problem worse because the batch is unpacked before the 403 is raised, i.e. a batch of 10 is interpreted as 10 rapid transactions, not 1.
Try this algorithm:
delay = 0                          // start with no backoff
while (haveFilesInUploadQueue) {
    sleep(delay)                   // backoff
    upload(file)                   // try the upload
    if (403 Rate Limit) {          // if rejected with a rate limit
        delay += 2s                // add 2s to delay to guarantee the next attempt will be OK
    } else {
        removeFileFromQueue(file)  // if not rejected, mark file as done
        if (delay > 0) {           // if we are currently backing off
            delay -= 0.2s          // reduce the backoff delay
        }
    }
}
// You can play with the 2s and 0.2s to optimise throughput. The key is to do all you can to avoid the 403's
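A rough Python translation of that loop, as a sketch only: it reuses the v2 copy call from the question, and throttled_copy_all plus its simple queue handling are my own illustrative choices rather than anything from the answer.

import time
from googleapiclient.errors import HttpError

def throttled_copy_all(service, files, folder_id):
    delay = 0.0                                # start with no backoff
    queue = list(files)
    while queue:                               # while there are files left in the queue
        time.sleep(delay)                      # backoff
        file = queue[0]
        try:
            service.files().copy(              # try the copy
                fileId=file['id'],
                body={"parents": [{"kind": "drive#fileLink", "id": folder_id}]}).execute()
        except HttpError as error:
            if error.resp.status == 403:       # rejected with a rate limit
                delay += 2.0                   # add 2s so the next attempt should be OK
                continue
            raise                              # any other error is a real failure
        queue.pop(0)                           # not rejected, mark file as done
        if delay > 0:                          # if we are currently backing off
            delay = max(0.0, delay - 0.2)      # reduce the backoff delay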
One thing to be aware of is that there was (is?) a bug with Drive that sometimes an upload gets rejected with a 403, but, despite sending the 403, Drive goes ahead and creates the file. The symptom will be duplicated files. So to be extra safe, after a 403 you should somehow check if the file is actually there. Easiest way to do this is use pre-allocated ID's, or to add your own opaque ID to a property.
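One way to illustrate the opaque-ID idea: tag each copy with your own UUID stored in a private property, and after a 403 search for that tag before retrying. This is a hedged sketch against the v2 API that pydrive wraps; copyToken is a made-up property name, and you should verify that properties can be set in the body of your copy request.

import uuid

def copy_already_exists(service, token):
    # If a file carrying our private property exists, the earlier "failed"
    # copy actually went through, so it must not be retried.
    query = ("properties has { key='copyToken' and value='%s' "
             "and visibility='PRIVATE' }" % token)
    result = service.files().list(q=query, maxResults=1).execute()
    return len(result.get('items', [])) > 0

token = str(uuid.uuid4())
body = {
    "parents": [{"kind": "drive#fileLink", "id": folder_id}],
    "properties": [{"key": "copyToken", "value": token, "visibility": "PRIVATE"}],
}
# service.files().copy(fileId=file['id'], body=body).execute()
# ...and after a 403, call copy_already_exists(service, token) before retrying.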