When to cache Tasks?

Tags: c#, caching, async-await, task-parallel-library, task

I was watching "The zen of async: Best practices for best performance", and Stephen Toub started to talk about Task caching, where instead of caching the results of task jobs you cache the tasks themselves. As far as I understood, starting a new task for every job is expensive and should be minimized as much as possible. At around 28:00 he showed this method:

```csharp
private static ConcurrentDictionary<string, string> s_urlToContents =
    new ConcurrentDictionary<string, string>();

public static async Task<string> GetContentsAsync(string url)
{
    string contents;
    if (!s_urlToContents.TryGetValue(url, out contents))
    {
        var response = await new HttpClient().GetAsync(url);
        contents = await response.EnsureSuccessStatusCode().Content.ReadAsStringAsync();
        s_urlToContents.TryAdd(url, contents);
    }
    return contents;
}
```

At first glance this looks like a well-thought-out method where you cache results; I didn't even think about caching the job of getting the contents.

And then he showed this method:

```csharp
private static ConcurrentDictionary<string, Task<string>> s_urlToContents =
    new ConcurrentDictionary<string, Task<string>>();

public static Task<string> GetContentsAsync(string url)
{
    Task<string> contents;
    if (!s_urlToContents.TryGetValue(url, out contents))
    {
        contents = GetContentsInternalAsync(url);
        contents.ContinueWith(
            t => { s_urlToContents.TryAdd(url, t); },
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnRanToCompletion |
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);
    }
    return contents;
}

// Private helper; it needs a name distinct from the public method above.
private static async Task<string> GetContentsInternalAsync(string url)
{
    var response = await new HttpClient().GetAsync(url);
    return await response.EnsureSuccessStatusCode().Content.ReadAsStringAsync();
}
```

I have trouble understanding how this actually helps more than just storing the results.

Does this mean that you're using fewer Tasks to get the data?

And also, how do we know when to cache tasks? As far as I understand, if you're caching in the wrong place you just add a load of overhead and stress the system too much.


asked Mar 18 '16 12:03

Nikola.Lukovic

2 Answers

I have trouble understanding how this actually helps more than just storing the results.

When a method is marked with the async modifier, the compiler automatically transforms it into a state machine, as Stephen demonstrates in previous slides. This means that every call to the first method triggers the creation of a Task, even on a cache hit.

In the second example, notice Stephen removed the async modifier, and the signature of the method is now public static Task<string> GetContentsAsync(string url). The responsibility for creating the Task is now on the implementer of the method and not the compiler. By caching Task<string>, the only "penalty" of creating the Task (actually two tasks, as ContinueWith also creates one) is paid when it's unavailable in the cache, and not for each method call.

In this particular example, IMO, the goal wasn't to re-use the network operation that is already ongoing when the first task executes; it was simply to reduce the number of allocated Task objects.
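To make the difference concrete, here is a minimal sketch rather than the code from the talk. `FetchAsync` is a hypothetical stand-in for the HTTP call, and `TaskIdentityDemo` and its method names are made up for illustration. It compares the Task instances handed out on cache hits by a result-caching async method and by a task-caching non-async method.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class TaskIdentityDemo
{
    private static readonly ConcurrentDictionary<string, string> s_results =
        new ConcurrentDictionary<string, string>();
    private static readonly ConcurrentDictionary<string, Task<string>> s_tasks =
        new ConcurrentDictionary<string, Task<string>>();

    // Hypothetical stand-in for the network call (assumption, not from the talk).
    private static async Task<string> FetchAsync(string url)
    {
        await Task.Yield();
        return "contents of " + url;
    }

    // Caches results. The method is async, so the compiler-generated code
    // hands back a Task<string> for every call, including cache hits.
    public static async Task<string> GetCachingResultsAsync(string url)
    {
        string contents;
        if (!s_results.TryGetValue(url, out contents))
        {
            contents = await FetchAsync(url);
            s_results.TryAdd(url, contents);
        }
        return contents;
    }

    // Caches tasks. The method is not async, so a cache hit returns the stored Task itself.
    public static Task<string> GetCachingTasksAsync(string url)
    {
        Task<string> task;
        if (!s_tasks.TryGetValue(url, out task))
        {
            task = FetchAsync(url);
            s_tasks.TryAdd(url, task);
        }
        return task;
    }

    // Item1: did two result-cache hits return the same Task instance?
    // Item2: did two task-cache hits return the same Task instance?
    public static async Task<Tuple<bool, bool>> CompareAsync(string url)
    {
        // Warm both caches so the calls below are all cache hits.
        await GetCachingResultsAsync(url);
        await GetCachingTasksAsync(url);

        Task<string> r1 = GetCachingResultsAsync(url);
        Task<string> r2 = GetCachingResultsAsync(url);
        Task<string> t1 = GetCachingTasksAsync(url);
        Task<string> t2 = GetCachingTasksAsync(url);
        await Task.WhenAll(r1, r2, t1, t2);

        return Tuple.Create(object.ReferenceEquals(r1, r2), object.ReferenceEquals(t1, t2));
    }
}
```

Calling `TaskIdentityDemo.CompareAsync("http://example.com")` should report that the result-caching hits are distinct Task objects while the task-caching hits are one and the same object, which is the allocation saving described above. Unlike the talk's version, this simplified sketch stores the task before it completes, so a failed fetch would stay cached.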

how do we know when to cache tasks?

Think of caching a Task as if it were anything else, and this question can be viewed from a broader perspective: when should I cache something? The answer to this question is broad, but I think the most common use case is when you have an expensive operation which is on the hot path of your application. Should you always be caching tasks? Definitely not. The overhead of the state-machine allocation is usually negligible. If needed, profile your app, and then (and only then) consider whether caching would be of use in your particular use case.
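As a rough way to check whether the allocations matter before reaching for a cache, one could measure them directly. The sketch below is only an illustration: `AllocationProbe` and its method names are made up, and it assumes a runtime that offers `GC.GetAllocatedBytesForCurrentThread` (.NET Core 3.0 and later, .NET Framework 4.8). A real profiler will give a more complete picture.

```csharp
using System;
using System.Threading.Tasks;

public static class AllocationProbe
{
    // Returns the bytes allocated on the current thread while running
    // `action` `iterations` times.
    public static long Measure(Action action, int iterations)
    {
        action(); // warm-up call so one-time setup work is not counted
        long before = GC.GetAllocatedBytesForCurrentThread();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        return GC.GetAllocatedBytesForCurrentThread() - before;
    }

    // Simulates a cache hit that wraps a cached value in a new Task each time.
    public static long FreshTaskPerHit(int iterations)
    {
        string value = "cached value";
        return Measure(() =>
        {
            Task<string> task = Task.FromResult(value);
            GC.KeepAlive(task);
        }, iterations);
    }

    // Simulates a cache hit that hands out an already stored Task.
    public static long CachedTaskPerHit(int iterations)
    {
        Task<string> cached = Task.FromResult("cached value");
        return Measure(() =>
        {
            Task<string> task = cached;
            GC.KeepAlive(task);
        }, iterations);
    }
}
```

Comparing `AllocationProbe.FreshTaskPerHit(100000)` with `AllocationProbe.CachedTaskPerHit(100000)` gives a feel for the per-hit cost; whether that cost is significant depends on how hot the path is in your application.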


answered Sep 19 '22 18:09

Yuval Itzchakov

Let's assume you are talking to a remote service which takes the name of a city and returns its zip codes. The service is remote and under load so we are talking to a method with an asynchronous signature:

```csharp
interface IZipCodeService
{
    Task<ICollection<ZipCode>> GetZipCodesAsync(string cityName);
}
```

Since the service needs a while for every request, we would like to implement a local cache for it. Naturally the cache will also have an asynchronous signature, maybe even implementing the same interface (see Facade pattern). A synchronous signature would break the best practice of never calling asynchronous code synchronously with .Wait(), .Result or similar. At the very least, the cache should leave that decision up to the caller.

So let's do a first iteration on this:

```csharp
class ZipCodeCache : IZipCodeService
{
    private readonly IZipCodeService realService;
    private readonly ConcurrentDictionary<string, ICollection<ZipCode>> zipCache =
        new ConcurrentDictionary<string, ICollection<ZipCode>>();

    public ZipCodeCache(IZipCodeService realService)
    {
        this.realService = realService;
    }

    public Task<ICollection<ZipCode>> GetZipCodesAsync(string cityName)
    {
        ICollection<ZipCode> zipCodes;
        if (zipCache.TryGetValue(cityName, out zipCodes))
        {
            // Already in cache. Returning cached value
            return Task.FromResult(zipCodes);
        }
        return this.realService.GetZipCodesAsync(cityName).ContinueWith((task) =>
        {
            this.zipCache.TryAdd(cityName, task.Result);
            return task.Result;
        });
    }
}
```

As you can see, the cache does not cache Task objects but the returned ZipCode collections. By doing so it has to construct a Task for every cache hit by calling Task.FromResult, and I think that is exactly what Stephen Toub tries to avoid. A Task object comes with overhead, especially for the garbage collector, because every cache hit allocates a new object that becomes garbage almost immediately.

The only option to work around this is by caching the whole Task object:

```csharp
class ZipCodeCache2 : IZipCodeService
{
    private readonly IZipCodeService realService;
    private readonly ConcurrentDictionary<string, Task<ICollection<ZipCode>>> zipCache =
        new ConcurrentDictionary<string, Task<ICollection<ZipCode>>>();

    public ZipCodeCache2(IZipCodeService realService)
    {
        this.realService = realService;
    }

    public Task<ICollection<ZipCode>> GetZipCodesAsync(string cityName)
    {
        Task<ICollection<ZipCode>> zipCodes;
        if (zipCache.TryGetValue(cityName, out zipCodes))
        {
            // Cache hit: hand out the stored Task, no new allocation.
            return zipCodes;
        }
        return this.realService.GetZipCodesAsync(cityName).ContinueWith((task) =>
        {
            // Cache only successful tasks so that a failed request can be retried.
            if (task.Status == TaskStatus.RanToCompletion)
            {
                this.zipCache.TryAdd(cityName, task);
            }
            return task.Result;
        });
    }
}
```

As you can see the creation of Tasks by calling Task.FromResult is gone. Furthermore it is not possible to avoid this Task creation when using the async/await keywords because internally they will create a Task to return no matter what your code has cached. Something like:

```csharp
public async Task<ICollection<ZipCode>> GetZipCodesAsync(string cityName)
{
    Task<ICollection<ZipCode>> zipCodes;
    if (zipCache.TryGetValue(cityName, out zipCodes))
    {
        return zipCodes;
    }
```

will not compile.

Don't get confused by Stephen Toub's ContinueWith flags TaskContinuationOptions.OnlyOnRanToCompletion and TaskContinuationOptions.ExecuteSynchronously. ExecuteSynchronously is (only) another performance optimization, and OnlyOnRanToCompletion makes sure that failed or canceled tasks are not put into the cache. Neither is related to the main objective of caching Tasks.
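The effect of OnlyOnRanToCompletion can be seen in a small sketch. This is an illustration under assumptions, not the talk's code: `ContinuationFlagsDemo` is a made-up name, and a `TaskCompletionSource<string>` stands in for the pending network call so that completion can be triggered by hand. Note that the `ContinueWith` overload taking both options and a scheduler also takes a `CancellationToken`.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

public static class ContinuationFlagsDemo
{
    // Completes a simulated fetch successfully or with an error, then reports
    // whether the continuation put the task into the cache.
    public static bool CachedAfterCompletion(bool succeed)
    {
        var cache = new ConcurrentDictionary<string, Task<string>>();

        // Stands in for the pending network call (assumption for this sketch).
        var pendingFetch = new TaskCompletionSource<string>();
        Task<string> task = pendingFetch.Task;

        Task continuation = task.ContinueWith(
            t => { cache.TryAdd("key", t); },
            CancellationToken.None,
            TaskContinuationOptions.OnlyOnRanToCompletion |
            TaskContinuationOptions.ExecuteSynchronously,
            TaskScheduler.Default);

        if (succeed)
        {
            pendingFetch.SetResult("contents");
        }
        else
        {
            pendingFetch.SetException(new InvalidOperationException("fetch failed"));
            // Reading Exception marks the failure as observed.
            AggregateException observed = task.Exception;
        }

        try
        {
            // Wait so the check below does not race with the continuation.
            continuation.Wait();
        }
        catch (AggregateException)
        {
            // The continuation was canceled because the fetch did not run to completion.
        }

        return cache.ContainsKey("key");
    }
}
```

With this sketch, `CachedAfterCompletion(true)` finds the task in the cache, while `CachedAfterCompletion(false)` leaves the cache empty, so a later call could retry the request instead of getting the failed task back.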

As with every cache, you should consider some mechanism which cleans the cache from time to time and removes entries which are too old or invalid. You could also implement a policy which limits the cache to n entries and tries to keep the most requested items by introducing some counting.
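One possible shape for such a clean-up mechanism is a time-to-live per entry. The following is a minimal sketch under assumptions, not a production cache: `ExpiringTaskCache`, its members, and the injectable clock are all hypothetical names chosen for illustration, and it ignores failed tasks and races between concurrent callers.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public sealed class ExpiringTaskCache<TValue>
{
    private sealed class Entry
    {
        public Task<TValue> Task;
        public DateTime ExpiresUtc;
    }

    private readonly ConcurrentDictionary<string, Entry> entries =
        new ConcurrentDictionary<string, Entry>();
    private readonly TimeSpan timeToLive;
    private readonly Func<DateTime> utcNow;

    // utcNow is injectable so the expiry logic can be exercised without waiting.
    public ExpiringTaskCache(TimeSpan timeToLive, Func<DateTime> utcNow = null)
    {
        this.timeToLive = timeToLive;
        this.utcNow = utcNow ?? (() => DateTime.UtcNow);
    }

    public Task<TValue> GetOrAdd(string key, Func<string, Task<TValue>> factory)
    {
        DateTime now = utcNow();
        Entry entry;
        if (entries.TryGetValue(key, out entry) && entry.ExpiresUtc > now)
        {
            // Fresh hit: return the stored Task instance.
            return entry.Task;
        }

        // Miss or stale entry: start a new operation and replace the entry.
        var fresh = new Entry { Task = factory(key), ExpiresUtc = now + timeToLive };
        entries[key] = fresh;
        return fresh.Task;
    }

    // Removes entries whose time-to-live has passed; returns how many were removed.
    public int RemoveExpired()
    {
        DateTime now = utcNow();
        int removed = 0;
        foreach (var pair in entries)
        {
            Entry ignored;
            if (pair.Value.ExpiresUtc <= now && entries.TryRemove(pair.Key, out ignored))
            {
                removed++;
            }
        }
        return removed;
    }
}
```

`RemoveExpired` could be called from a timer, and a size limit or request counter could be layered on top in the same way. The design choice here is to store the expiry next to the Task so that a hit still returns the cached Task without allocating.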

I did some benchmarking with and without caching of Tasks. You can find the code here http://pastebin.com/SEr2838A and the results look like this on my machine (w/ .NET4.6)

```
Caching ZipCodes: 00:00:04.6653104
Gen0: 3560 Gen1: 0 Gen2: 0
Caching Tasks: 00:00:03.9452951
Gen0: 1017 Gen1: 0 Gen2: 0
```

answered Sep 20 '22 18:09

Thomas Zeman
