Slow performance on Azure DocumentDB

I'm currently facing quite slow response times from Azure DocumentDB (first time trying it).

There are 31 objects in a collection, which I am going to fetch and return to the caller. The code I am using is this:

public async Task<List<dynamic>> Get(string collectionName = null)
{
    // Lookup from Dictionary, takes literally no time
    var collection = await GetCollectionAsync(collectionName);

    var sw = Stopwatch.StartNew();

    var query = await
        _client.CreateDocumentQuery(collection.DocumentsLink, 
            new FeedOptions { MaxItemCount = 1000 })
            .AsDocumentQuery()
            .ExecuteNextAsync();

    Trace.WriteLine($"Get documents: {sw.ElapsedMilliseconds} ms");

    return query.ToList();
}

To instantiate the client, I'm using the following code:

_client = new DocumentClient(new Uri(endpoint), authKey, new ConnectionPolicy
{
    ConnectionMode = ConnectionMode.Direct,
    ConnectionProtocol = Protocol.Tcp
});

The response times I am getting from the Stopwatch are between 360 ms and 1200 ms to return 31 objects. For me, that is quite slow. Without the custom ConnectionPolicy, the average response time is about 950 ms.

Am I doing something wrong here? Is it possible to speed these requests up somehow?

Here is the output from the Trace, printing out the Stopwatch's elapsed time:

Get documents: 1984 ms
Get documents: 1252 ms
Get documents: 1246 ms
Get documents: 359 ms
Get documents: 356 ms
Get documents: 356 ms
Get documents: 351 ms
Get documents: 1248 ms
Get documents: 1314 ms
Get documents: 1250 ms
Andre Andersen asked Aug 29 '15 00:08


1 Answer

Updated to reflect latest service changes (1/22/2017): DocumentDB guarantees p99 read latency < 10 ms and p99 write latency < 15 ms with SLAs on the database side. The tips below still apply to achieve low-latency reads using the SDKs.

Updated to reflect latest service changes (6/14/2016): There is no need to cache self-links when using routing via user-defined ids. Also added a few more tips.

Reads typically take <1 ms on the DocumentDB storage partition itself, so the bottleneck is often the network latency between the application and the database. Thus, it is best to have the application running in the same datacenter as the database.

Here are some general tips on SDK usage:

Tip #1: Use a singleton DocumentDB client for the lifetime of your application

Note that each DocumentClient instance is thread-safe and performs efficient connection management and address caching when operating in Direct Mode. To allow efficient connection management and better performance by DocumentClient, it is recommended to use a single instance of DocumentClient per AppDomain for the lifetime of the application.
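One way to get a single shared instance is a lazily initialized static holder. This is only a minimal sketch: the class name DocumentDbClientHolder and the Endpoint and AuthKey placeholders are assumptions for illustration, and it presumes the Microsoft.Azure.DocumentDB NuGet package is referenced.

```csharp
using System;
using Microsoft.Azure.Documents.Client;

// Hypothetical holder class; the name and placeholder values are illustrative only.
public static class DocumentDbClientHolder
{
    // Placeholders: substitute your own account endpoint and key.
    private const string Endpoint = "https://<your-account>.documents.azure.com:443/";
    private const string AuthKey = "<your-auth-key>";

    // Lazy<T> is thread-safe by default, so the factory runs at most once
    // even if several threads ask for the client at the same time.
    private static readonly Lazy<DocumentClient> LazyClient =
        new Lazy<DocumentClient>(() => new DocumentClient(new Uri(Endpoint), AuthKey));

    // Every caller in the AppDomain shares this one instance.
    public static DocumentClient Client => LazyClient.Value;
}
```

Callers would then use DocumentDbClientHolder.Client everywhere instead of constructing a new DocumentClient per request. Registering the client as a singleton in a dependency injection container would achieve the same effect.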

Tip #2: Cache document and collection SelfLinks for lower read latency

In Azure DocumentDB, each document has a system-generated selfLink. These selfLinks are guaranteed to be unique and immutable for the lifetime of the document. Reading a single document using a selfLink is the most efficient way to get a single document. Due to the immutability of the selfLink, you should cache selfLinks whenever possible for best read performance.

Document document = await client.ReadDocumentAsync("/dbs/1234/colls/1234354/docs/2332435465");

Having said that, it may not always be possible for the application to work with a document's selfLink for read scenarios; in this case, the next most efficient way to retrieve a document is to query by the document's user-provided Id property. For example:

// Requires: using System.Linq; using Microsoft.Azure.Documents.Linq;
IDocumentQuery<Document> query =
    (from doc in client.CreateDocumentQuery(colSelfLink)
     where doc.Id == "myId"
     select doc).AsDocumentQuery();

Document myDocument = null;
while (query.HasMoreResults)
{
    // Keep fetching pages until the matching document turns up.
    FeedResponse<Document> res = await query.ExecuteNextAsync<Document>();
    if (res.Count != 0)
    {
        myDocument = res.Single();
        break;
    }
}
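With id-based routing (see the 6/14/2016 update above), a point read by user-defined ids might look like the sketch below. The PointRead class and ReadByIdAsync method are hypothetical names; UriFactory.CreateDocumentUri and ReadDocumentAsync come from the .NET SDK. It assumes a single-partition collection; a partitioned collection would also need a partition key passed through RequestOptions.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;

// Hypothetical helper; class and method names are illustrative only.
public static class PointRead
{
    public static async Task<Document> ReadByIdAsync(
        DocumentClient client, string databaseId, string collectionId, string documentId)
    {
        // Builds a link of the form dbs/{databaseId}/colls/{collectionId}/docs/{documentId}
        // from user-defined ids, so no selfLink has to be cached.
        Uri documentUri = UriFactory.CreateDocumentUri(databaseId, collectionId, documentId);

        try
        {
            ResourceResponse<Document> response = await client.ReadDocumentAsync(documentUri);
            return response.Resource;
        }
        catch (DocumentClientException e) when (e.StatusCode == HttpStatusCode.NotFound)
        {
            // This sketch treats a missing document as null rather than an error.
            return null;
        }
    }
}
```

Returning null on a 404 is a design choice made for this sketch; rethrowing may suit callers that treat a missing document as exceptional.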

Tip #3: Tune page size for queries/read feeds for better performance

When performing a bulk read of documents using read feed functionality (i.e. ReadDocumentFeedAsync) or when issuing a DocumentDB SQL query, the results are returned in a segmented fashion if the result set is too large. By default, results are returned in chunks of 100 items or 1 MB, whichever limit is hit first.

In order to reduce the number of network round trips required to retrieve all applicable results, you can increase the page size to up to 1000 using the x-ms-max-item-count request header. In cases where you need to display only a few results, e.g., if your user interface or application API returns only ten results at a time, you can also decrease the page size to 10 in order to reduce the throughput consumed for reads and queries.

You may also set the page size using the available DocumentDB SDKs. For example:

IQueryable<dynamic> authorResults = client.CreateDocumentQuery(
    documentCollection.SelfLink,
    "SELECT p.Author FROM Pages p WHERE p.Title = 'About Seattle'",
    new FeedOptions { MaxItemCount = 1000 });
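Setting MaxItemCount only caps the size of each page; a result set larger than one page still has to be drained page by page. The following is a rough sketch of such a loop, assuming the Microsoft.Azure.DocumentDB package. PagedQuery and ReadAllAsync are made-up names for illustration.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.Documents.Linq;

// Hypothetical helper; class and method names are illustrative only.
public static class PagedQuery
{
    public static async Task<List<dynamic>> ReadAllAsync(
        DocumentClient client, string collectionLink, string sql)
    {
        // Ask for up to 1000 items per page to reduce the number of round trips.
        IDocumentQuery<dynamic> query = client
            .CreateDocumentQuery(collectionLink, sql, new FeedOptions { MaxItemCount = 1000 })
            .AsDocumentQuery();

        var results = new List<dynamic>();
        while (query.HasMoreResults)
        {
            // Each call is one network round trip returning one page.
            FeedResponse<dynamic> page = await query.ExecuteNextAsync<dynamic>();
            results.AddRange(page);
        }
        return results;
    }
}
```

A single ExecuteNextAsync call, as in the question, returns only the first page, which happens to be enough for 31 documents.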

A few more tips (6/14/2016):

  • Use point-reads (e.g. read document instead of query document) for lookup by id
  • Configure the DocumentDB client (using ConnectionPolicy) to use direct connectivity over gateway
  • Collocate clients in the same Azure Region as your database
  • Call OpenAsync() to prevent higher first call latency
  • You can debug LINQ queries by calling ToString() on the queryable to see the SQL query sent over the wire
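As an illustration of the direct-connectivity and OpenAsync() points, the sketch below builds a client and warms it up once at startup, so that connection setup is not paid for by the first user-facing request. WarmUp and CreateWarmClientAsync are hypothetical names, and the endpoint and key are expected to come from your own configuration.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.Azure.Documents.Client;

// Hypothetical startup helper; class and method names are illustrative only.
public static class WarmUp
{
    public static async Task<DocumentClient> CreateWarmClientAsync(string endpoint, string authKey)
    {
        // Direct connectivity over TCP, as in the question's ConnectionPolicy.
        var client = new DocumentClient(new Uri(endpoint), authKey, new ConnectionPolicy
        {
            ConnectionMode = ConnectionMode.Direct,
            ConnectionProtocol = Protocol.Tcp
        });

        // Pay the connection and address-cache setup cost here, once,
        // instead of on the first query.
        var sw = Stopwatch.StartNew();
        await client.OpenAsync();
        Trace.WriteLine($"OpenAsync: {sw.ElapsedMilliseconds} ms");

        return client;
    }
}
```

When timing queries afterwards, it may still be worth discarding the first measurement, since the trace in the question shows the first call taking noticeably longer than the rest.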

For more performance tips, check out this blog post.

Andrew Liu answered Oct 03 '22 15:10