I have an Azure Function (v2) that accesses Cosmos DB, but not through a binding (we need to use custom serialization settings). I've followed the example here for setting up an object that should then be available to all instances of the activity function. Mine is a little different because our custom CosmosDb object requires an await for setup.
public static class AnalyzeActivityTrigger
{
    // Lazy<Task<T>> ensures the async CosmosDb setup runs only once, even if
    // multiple activity invocations race to access it.
    private static readonly Lazy<Task<CosmosDb>> LazyCosmosDb = new Lazy<Task<CosmosDb>>(InitializeDocumentClient);

    private static Task<CosmosDb> CosmosDb => LazyCosmosDb.Value;

    private static Task<CosmosDb> InitializeDocumentClient()
    {
        return StorageFramework.CosmosDb.GetCosmosDb(DesignUtilities.Storage.CosmosDbContainerDefinitions, DesignUtilities.Storage.CosmosDbMigrations);
    }

    [FunctionName(nameof(AnalyzeActivityTrigger))]
    public static async Task<Guid> Run(
        [ActivityTrigger] DurableActivityContext context,
        ILogger log)
    {
        var analyzeActivityRequestString = context.GetInput<string>();
        var analyzeActivityRequest = StorageFramework.Storage.Deserialize<AnalyzeActivityRequest>(analyzeActivityRequestString);
        var componentDesign = StorageFramework.Storage.Deserialize<ComponentDesign>(analyzeActivityRequest.ComponentDesignString);
        var (analysisSet, _, _) = await AnalysisUtilities.AnalyzeComponentDesignAndUploadArtifacts(componentDesign,
            LogVariables.Off, new AnalysisLog(), Stopwatch.StartNew(), analyzeActivityRequest.CommitName, await CosmosDb);
        return analysisSet.AnalysisReport.Guid;
    }
}
We fan out, calling this activity function in parallel. Our documents are fairly large, so updating them is expensive, and that happens as part of this code.
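(For context, the fan-out follows the standard Durable Functions pattern, roughly like the sketch below; the orchestrator name and input shape are illustrative, not our actual code.)

[FunctionName("AnalyzeOrchestrator")]
public static async Task RunOrchestrator(
    [OrchestrationTrigger] DurableOrchestrationContext context)
{
    var requests = context.GetInput<List<string>>();

    // Fan out: schedule every activity before awaiting, so they all run in parallel.
    var tasks = requests
        .Select(r => context.CallActivityAsync<Guid>(nameof(AnalyzeActivityTrigger), r))
        .ToList();

    // Fan in: wait for all parallel activities to complete.
    await Task.WhenAll(tasks);
}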
I sometimes get this error when container.ReplaceItemAsync is called:
Response status code does not indicate success: 408 Substatus: 0 Reason: (Message: Request timed out. ...
The obvious thing to do seems to be to increase the timeout, but could this be indicative of some other problem? Increasing the timeout feels like addressing the symptom rather than the underlying problem. We have code that scales up our RUs before all this happens, too. I'm wondering whether the Azure Functions fan-out is putting too much load on Cosmos DB. So I've also played around with adjusting the host.json settings for durableTask, like maxConcurrentActivityFunctions and maxConcurrentOrchestratorFunctions (the relevant section is sketched below), but to no avail so far.
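(For reference, those throttles live under extensions.durableTask in host.json on Functions v2; the values here are just examples of dialing concurrency down, not recommendations.)

{
  "version": "2.0",
  "extensions": {
    "durableTask": {
      "maxConcurrentActivityFunctions": 4,
      "maxConcurrentOrchestratorFunctions": 4
    }
  }
}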
How should I approach this 408 error? What steps can I consider to mitigate it other than increasing the request timeout?
Update 1: I increased the default request timeout to 5 minutes and now I'm getting 503 responses.
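(Where that knob lives, assuming the v3 .NET SDK; sketch only, with a placeholder connection string:)

var cosmosClient = new CosmosClient(
    "connection string", // placeholder
    new CosmosClientOptions { RequestTimeout = TimeSpan.FromMinutes(5) });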
Update 2: Pointing to a clone published to an Azure Function on the Premium plan seems to work after multiple tests.
Update 3: We weren't testing it hard enough. The problem is exhibited on the Premium plan as well. GitHub Issue forthcoming.
Update 4: We seem to have solved this by a combination of using Gateway mode in connecting to Cosmos and increasing RUs.
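(Assuming the v3 .NET SDK, Gateway mode is a one-line client option; sketch only:)

var cosmosClient = new CosmosClient(
    "connection string", // placeholder
    new CosmosClientOptions
    {
        // Gateway mode routes all requests through the Cosmos DB front end over
        // HTTPS/443, using far fewer TCP connections than Direct mode -- relevant
        // under the connection limits of Azure Functions plans.
        ConnectionMode = ConnectionMode.Gateway
    });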
A timeout can indeed signal issues regarding instance resources. Reference: https://learn.microsoft.com/azure/cosmos-db/troubleshoot-dot-net-sdk#request-timeouts
If you are running on Functions, take a look at the Connections guidance. Also verify CPU usage on the instances: if CPU is high, it can affect request latency and ultimately cause timeouts.
For Functions, you can certainly use DI to avoid the whole Lazy declaration: https://github.com/Azure/azure-cosmos-dotnet-v3/tree/master/Microsoft.Azure.Cosmos.Samples/Usage/AzureFunctions
Create a Startup.cs file with:
using System;
using Microsoft.Azure.Cosmos;
using Microsoft.Azure.Functions.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;

[assembly: FunctionsStartup(typeof(YourNameSpace.Startup))]

namespace YourNameSpace
{
    public class Startup : FunctionsStartup
    {
        public override void Configure(IFunctionsHostBuilder builder)
        {
            // Register a single CosmosClient for the whole Function App;
            // AddSingleton ensures one shared instance across invocations.
            builder.Services.AddSingleton((s) =>
            {
                CosmosClient cosmosClient = new CosmosClient("connection string");
                return cosmosClient;
            });
        }
    }
}
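(In a real deployment the connection string would normally come from configuration, e.g. an app setting, rather than being hard-coded.)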
And then you can make your Functions non-static and have the client injected:
public class AnalyzeActivityTrigger
{
    private readonly CosmosClient cosmosClient;

    public AnalyzeActivityTrigger(CosmosClient cosmosClient)
    {
        this.cosmosClient = cosmosClient;
    }

    [FunctionName(nameof(AnalyzeActivityTrigger))]
    public async Task<Guid> Run(
        [ActivityTrigger] DurableActivityContext context,
        ILogger log)
    {
        var analyzeActivityRequestString = context.GetInput<string>();
        var analyzeActivityRequest = StorageFramework.Storage.Deserialize<AnalyzeActivityRequest>(analyzeActivityRequestString);
        var componentDesign = StorageFramework.Storage.Deserialize<ComponentDesign>(analyzeActivityRequest.ComponentDesignString);
        var (analysisSet, _, _) = await AnalysisUtilities.AnalyzeComponentDesignAndUploadArtifacts(componentDesign,
            LogVariables.Off, new AnalysisLog(), Stopwatch.StartNew(), analyzeActivityRequest.CommitName, this.cosmosClient);
        return analysisSet.AnalysisReport.Guid;
    }
}