Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Azure Search .net SDK- How to use "FindFailedActionsToRetry"?

Using the Azure Search .net SDK, when you try to index documents you might get an exception IndexBatchException.

From the documentation here:

        try
        {
            var batch = IndexBatch.Upload(documents);
            indexClient.Documents.Index(batch);
        }
        catch (IndexBatchException e)
        {
            // Sometimes when your Search service is under load, indexing will fail for some of the documents in
            // the batch. Depending on your application, you can take compensating actions like delaying and
            // retrying. For this simple demo, we just log the failed document keys and continue.
            Console.WriteLine(
                "Failed to index some of the documents: {0}",
                String.Join(", ", e.IndexingResults.Where(r => !r.Succeeded).Select(r => r.Key)));
        }

How should e.FindFailedActionsToRetry be used to create a new batch to retry the indexing for failed actions?

I've created a function like this:

    public void UploadDocuments<T>(SearchIndexClient searchIndexClient, IndexBatch<T> batch, int count) where T : class, IMyAppSearchDocument
    {
        try
        {
            searchIndexClient.Documents.Index(batch);
        }
        catch (IndexBatchException e)
        {
            if (count == 5) //we will try to index 5 times and give up if it still doesn't work.
            {
                throw new Exception("IndexBatchException: Indexing Failed for some documents.");
            }

            Thread.Sleep(5000); //we got an error, wait 5 seconds and try again (in case it's an intermitent or network issue

            var retryBatch = e.FindFailedActionsToRetry<T>(batch, arg => arg.ToString());
            UploadDocuments(searchIndexClient, retryBatch, count++);
        }
    }

But I think this part is wrong:

var retryBatch = e.FindFailedActionsToRetry<T>(batch, arg => arg.ToString());
like image 716
richard Avatar asked Oct 13 '16 05:10

richard


2 Answers

The second parameter to FindFailedActionsToRetry, named keySelector, is a function that should return whatever property on your model type represents your document key. In your example, your model type is not known at compile time inside UploadDocuments, so you'll need to change UploadsDocuments to also take the keySelector parameter and pass it through to FindFailedActionsToRetry. The caller of UploadDocuments would need to specify a lambda specific to type T. For example, if T is the sample Hotel class from the sample code in this article, the lambda must be hotel => hotel.HotelId since HotelId is the property of Hotel that is used as the document key.

Incidentally, the wait inside your catch block should not wait a constant amount of time. If your search service is under heavy load, waiting for a constant delay won't really help to give it time to recover. Instead, we recommend exponentially backing off (e.g. -- the first delay is 2 seconds, then 4 seconds, then 8 seconds, then 16 seconds, up to some maximum).

like image 103
Bruce Johnston Avatar answered Oct 01 '22 18:10

Bruce Johnston


I've taken Bruce's recommendations in his answer and comment and implemented it using Polly.

  • Exponential backoff up to one minute, after which it retries every other minute.
  • Retry as long as there is progress. Timeout after 5 requests without any progress.
  • IndexBatchException is also thrown for unknown documents. I chose to ignore such non-transient failures since they are likely indicative of requests which are no longer relevant (e.g., removed document in separate request).
int curActionCount = work.Actions.Count();
int noProgressCount = 0;

await Polly.Policy
    .Handle<IndexBatchException>() // One or more of the actions has failed.
    .WaitAndRetryForeverAsync(
        // Exponential backoff (2s, 4s, 8s, 16s, ...) and constant delay after 1 minute.
        retryAttempt => TimeSpan.FromSeconds( Math.Min( Math.Pow( 2, retryAttempt ), 60 ) ),
        (ex, _) =>
        {
            var batchEx = ex as IndexBatchException;
            work = batchEx.FindFailedActionsToRetry( work, d => d.Id );

            // Verify whether any progress was made.
            int remainingActionCount = work.Actions.Count();
            if ( remainingActionCount == curActionCount ) ++noProgressCount;
            curActionCount = remainingActionCount;
        } )
    .ExecuteAsync( async () =>
    {
        // Limit retries if no progress is made after multiple requests.
        if ( noProgressCount > 5 )
        {
            throw new TimeoutException( "Updating Azure search index timed out." );
        }

        // Only retry if the error is transient (determined by FindFailedActionsToRetry).
        // IndexBatchException is also thrown for unknown document IDs;
        // consider them outdated requests and ignore.
        if ( curActionCount > 0 )
        {
            await _search.Documents.IndexAsync( work );
        }
    } );
like image 24
Steven Jeuris Avatar answered Oct 01 '22 18:10

Steven Jeuris