Quotas on the App Engine Search API for Java

I am testing the new App Engine Search API for Java, and I have the following code that tries to add ~3000 documents to an index:

List<Document> documents = new ArrayList<Document>();
for (FacebookAlbum album : user.listAllAlbums()) {
    Document doc = Document.newBuilder()
            .setId(album.getId())
            .addField(Field.newBuilder().setName("name").setText(album.getFullName()))
            .addField(Field.newBuilder().setName("albumId").setText(album.getAlbumId()))
            .addField(Field.newBuilder().setName("createdTime").setDate(Field.date(album.getCreatedTime())))
            .addField(Field.newBuilder().setName("updatedTime").setDate(Field.date(album.getUpdatedTime())))
            .build();
    documents.add(doc);
}

try {
    // Add all the documents.
    getIndex(facebookId).add(documents);
} catch (AddException e) {
    if (StatusCode.TRANSIENT_ERROR.equals(e.getOperationResult().getCode())) {
        // retry adding document
    }
}

However, I am getting the following exception:

Uncaught exception from servlet
java.lang.IllegalArgumentException: number of documents, 3433, exceeds maximum 200
at com.google.appengine.api.search.IndexImpl.addAsync(IndexImpl.java:196)
at com.google.appengine.api.search.IndexImpl.add(IndexImpl.java:380)
at photomemories.buildIndexServlet.doGet(buildIndexServlet.java:47)

Is the number of documents I can insert with a single add call limited to 200?

If I instead insert one document at a time into the index with the following code:

for (FacebookAlbum album : user.listAllAlbums()) {
    Document doc = Document.newBuilder()
            .setId(album.getId())
            .addField(Field.newBuilder().setName("name").setText(album.getFullName()))
            .addField(Field.newBuilder().setName("albumId").setText(album.getAlbumId()))
            .addField(Field.newBuilder().setName("createdTime").setDate(Field.date(album.getCreatedTime())))
            .addField(Field.newBuilder().setName("updatedTime").setDate(Field.date(album.getUpdatedTime())))
            .build();

    try {
        // Add the document.
        getIndex(facebookId).add(doc);
    } catch (AddException e) {
        if (StatusCode.TRANSIENT_ERROR.equals(e.getOperationResult().getCode())) {
            // retry adding document
        }
    }
}

I am getting the following exception:

com.google.apphosting.api.ApiProxy$OverQuotaException: The API call search.IndexDocument() required more quota than is available.
at com.google.apphosting.runtime.ApiProxyImpl$AsyncApiFuture.success(ApiProxyImpl.java:479)
at com.google.apphosting.runtime.ApiProxyImpl$AsyncApiFuture.success(ApiProxyImpl.java:382)
at com.google.net.rpc3.client.RpcStub$RpcCallbackDispatcher$1.runInContext(RpcStub.java:786)
at com.google.tracing.TraceContext$TraceContextRunnable$1.run(TraceContext.java:455)

I thought the quota on API calls was 20k/day (see https://developers.google.com/appengine/docs/java/search/overview#Quotas).

Any ideas on what is going on?

Ioannis Antonellis asked May 12 '12 20:05



3 Answers

There are a few things going on here. Most importantly, and this is something that will be clarified in the documentation very soon, the Search API Call quota also accounts for the number of documents being added/updated. So a single Add call that inserts 10 documents will reduce your daily Search API Call quota by 10.

Yes, the maximum number of documents that may be indexed in a single add call is 200. However, at this stage there is also a short-term burst quota in place that limits you to about 100 API calls per minute.

All of the above means that, for now at least, it's safest not to add more than 100 documents per Add request. Doing so via Task Queue, as recommended by Shay, is also a very good idea.
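
To illustrate the batching idea, here is a minimal sketch in plain Java that splits a list into chunks of at most 100 items. The class name DocumentBatcher and the method partition are hypothetical helpers, not part of the Search API, and integers stand in for the Document objects so the example runs without the App Engine SDK.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of the App Engine SDK): splits a list into
// consecutive batches so that no single add call exceeds a chosen size.
final class DocumentBatcher {

    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        // Integers stand in for the 3433 documents from the question.
        List<Integer> albums = new ArrayList<>();
        for (int i = 0; i < 3433; i++) {
            albums.add(i);
        }
        List<List<Integer>> batches = partition(albums, 100);
        // prints "35 batches, last has 33"
        System.out.println(batches.size() + " batches, last has "
                + batches.get(batches.size() - 1).size());
    }
}
```

In the question's code, each batch would presumably go through its own getIndex(facebookId).add(batch) call, with the calls spaced out so that the per-minute burst quota is not exceeded.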

Peter McKenzie answered Nov 14 '22 20:11



I think (though I can't find confirmation of it) that there is a per-minute quota limit. You should index your documents through a task queue so that they are indexed gradually.
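
As a rough sketch of what "gradually" could mean, the plain-Java helper below computes how long each batch should wait so that the overall rate stays under a documents-per-minute budget. The class IndexingSchedule, the method delayMillis and the budget of 100 documents per minute are all assumptions made for illustration; the actual limit may differ. On App Engine, a delay like this could be used as the countdown of a task queue task, but that wiring is not shown here.

```java
// Hypothetical helper: spaces batches out in time to respect an assumed
// documents-per-minute budget. Not part of any App Engine API.
final class IndexingSchedule {

    /**
     * Returns the delay in milliseconds before the batch with the given
     * zero-based index should start, assuming every batch holds batchSize
     * documents and at most documentsPerMinute may be indexed per minute.
     */
    static long delayMillis(int batchIndex, int batchSize, int documentsPerMinute) {
        if (batchIndex < 0 || batchSize <= 0 || documentsPerMinute <= 0) {
            throw new IllegalArgumentException("arguments must be non-negative/positive");
        }
        // Time one batch "uses up" under the budget, in milliseconds.
        long intervalMillis = 60_000L * batchSize / documentsPerMinute;
        return batchIndex * intervalMillis;
    }

    public static void main(String[] args) {
        // 35 batches of up to 100 documents, assumed budget of 100 per minute.
        for (int i = 0; i < 35; i++) {
            System.out.println("batch " + i + " starts after "
                    + delayMillis(i, 100, 100) + " ms");
        }
    }
}
```

Under these assumptions the batches are spaced one minute apart, so indexing roughly 3400 documents would take a little over half an hour.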

Shay Erlichmen answered Nov 14 '22 19:11



The docs also mention a per-minute quota; 20k per day averages out to only about 13.9 per minute (20,000 / 1,440 minutes).

https://developers.google.com/appengine/docs/quotas

Larry Cadden answered Nov 14 '22 20:11
