
Problems with adding a `lazy` keyword to C#

I would love to write code like this:

    class Zebra
    {
        public lazy int StripeCount
        {
            get { return ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce(); }
        }
    }

EDIT: Why? I think it looks better than:

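    class Zebra
    {
        private Lazy<int> _StripeCount;

        public Zebra()
        {
            // Lazy<T> needs its type argument; a bare "new Lazy(...)" does not compile.
            this._StripeCount = new Lazy<int>(() => ExpensiveCountingMethodThatReallyOnlyNeedsToBeRunOnce());
        }

        // No "lazy" keyword here: this is the version that works today.
        public int StripeCount
        {
            get { return this._StripeCount.Value; }
        }
    }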

The first time you call the property, it would run the code in the get block, and afterward would just return the value from it.

My questions:

  1. What costs would be involved with adding this kind of keyword to the library?
  2. What situations would this be problematic in?
  3. Would you find this useful?

I'm not starting a crusade to get this into the next version of the library, but I am curious what kind of considerations a feature such as this should have to go through.

Nick Larsen asked May 11 '11 14:05


1 Answer

> I am curious what kind of considerations a feature such as this should have to go through.

First off, I write a blog about this subject, amongst others. See my old blog:

http://blogs.msdn.com/b/ericlippert/

and my new blog:

http://ericlippert.com

for many articles on various aspects of language design.

Second, the C# design process is now open to public view, so you can see for yourself what the language design team considers when vetting new feature suggestions. See https://github.com/dotnet/roslyn/ for details.

> What costs would be involved with adding this kind of keyword to the library?

It depends on a lot of things. There are, of course, no cheap, easy features. There are only less expensive, less difficult features. In general, the costs are those involving designing, specifying, implementing, testing, documenting and maintaining the feature. There are more exotic costs as well, like the opportunity cost of not doing a better feature, or the cost of choosing a feature that interacts poorly with future features we might want to add.

In this case the feature would probably amount to making the "lazy" keyword syntactic sugar for Lazy<T>. That's a pretty straightforward feature, not requiring a lot of fancy syntactic or semantic analysis.

> What situations would this be problematic in?

I can think of a number of factors that would cause me to push back on the feature.

First off, it is not necessary; it's merely a convenient sugar. It doesn't really add new power to the language. The benefits don't seem to be worth the costs.

Second, and more importantly, it enshrines a particular kind of laziness into the language. There is more than one kind of laziness, and we might choose wrong.

How is there more than one kind of laziness? Well, think about how it would be implemented. Properties are already "lazy" in that their values are not calculated until the property is called, but you want more than that; you want a property that is called once, and then the value is cached for the next time. By "lazy" essentially you mean a memoized property. What guarantees do we need to put in place? There are many possibilities:

Possibility #1: Not threadsafe at all. If you call the property for the "first" time on two different threads, anything can happen. If you want to avoid race conditions, you have to add synchronization yourself.
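As a rough illustration of Possibility #1, here is a minimal hand-written sketch of a memoized property with no synchronization. The class name `UnsafeLazyZebra`, the `CountingRuns` counter, and the stand-in `ExpensiveCount` method are hypothetical; they exist only so the behavior can be observed.

```csharp
using System;

// Possibility #1: memoized, but not threadsafe at all.
class UnsafeLazyZebra
{
    private int _stripeCount;       // cached value
    private bool _stripeCountReady; // true once the cache has been filled

    // Hypothetical counter so the number of expensive calls is observable.
    public int CountingRuns { get; private set; }

    public int StripeCount
    {
        get
        {
            // Two threads can both see "false" here and both run the
            // expensive method; nothing orders these reads and writes.
            if (!_stripeCountReady)
            {
                _stripeCount = ExpensiveCount();
                _stripeCountReady = true;
            }
            return _stripeCount;
        }
    }

    // Stand-in for the question's expensive method (an assumption).
    private int ExpensiveCount()
    {
        CountingRuns++;
        return 42;
    }
}
```

On a single thread the expensive method runs once; with concurrent first calls, the caller would have to add synchronization.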

Possibility #2: Threadsafe, such that two calls to the property on two different threads both call the initialization function, and then race to see who fills in the actual value in the cache. Presumably the function will return the same value on both threads, so the extra cost here is merely in the wasted extra call. But the cache is threadsafe and doesn't block any thread, because a threadsafe cache can be written with low-lock or no-lock code.
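One way Possibility #2 might be written by hand is sketched below, using `Interlocked.CompareExchange` to publish the first result and discard any others. The `RacyLazyZebra` class, the private `Box` wrapper, and the `CountingRuns` counter are assumptions of this sketch, not something from the question.

```csharp
using System;
using System.Threading;

// Possibility #2: threadsafe cache, but the initializer may run more than once.
class RacyLazyZebra
{
    // Wrapping the int in a reference type lets a single atomic
    // reference write publish the value (hypothetical helper).
    private sealed class Box
    {
        public readonly int Value;
        public Box(int value) { Value = value; }
    }

    private Box _stripeCount; // null until some thread publishes a result
    private int _countingRuns;

    // Hypothetical counter so the number of expensive calls is observable.
    public int CountingRuns => Volatile.Read(ref _countingRuns);

    public int StripeCount
    {
        get
        {
            Box cached = Volatile.Read(ref _stripeCount);
            if (cached == null)
            {
                var candidate = new Box(ExpensiveCount());
                // Store the candidate only if the cache is still empty.
                // CompareExchange returns the previous value: null means
                // this thread won the race, otherwise use the winner's box.
                cached = Interlocked.CompareExchange(ref _stripeCount, candidate, null) ?? candidate;
            }
            return cached.Value;
        }
    }

    // Stand-in for the question's expensive method (an assumption).
    private int ExpensiveCount()
    {
        Interlocked.Increment(ref _countingRuns);
        return 42;
    }
}
```

No thread ever blocks here, but under contention `ExpensiveCount` can run several times, and the wrapper costs one small allocation per attempt.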

Code to implement thread safety comes at a cost, even if it is low-lock code. Is that cost acceptable? Most people write what are effectively single-threaded programs; does it seem right to add the overhead of thread safety to every single lazy property call whether it's needed or not?

Possibility #3: Threadsafe such that there is a strong guarantee that the initialization function will only be called once; there is no race on the cache. The user might have an implicit expectation that the initialization function is only called once; it might be very expensive and two calls on two different threads might be unacceptable. Implementing this kind of laziness requires full-on synchronization where it is possible that one thread blocks indefinitely while the lazy method is running on another thread. It also means there could be deadlocks if there's a lock-ordering problem with the lazy method.
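A sketch of Possibility #3, assuming a plain `lock` with a double check, might look like the following. The names `LockedLazyZebra`, `_gate`, and `CountingRuns` are hypothetical, and this is only one of several ways to get the "exactly once" guarantee.

```csharp
using System;

// Possibility #3: the initializer is guaranteed to run at most once.
class LockedLazyZebra
{
    private readonly object _gate = new object();
    private int _stripeCount;
    private volatile bool _ready; // volatile so the flag publishes _stripeCount

    // Hypothetical counter; it is only written while holding the lock.
    public int CountingRuns { get; private set; }

    public int StripeCount
    {
        get
        {
            if (!_ready)          // fast path once the value exists
            {
                lock (_gate)      // other first-time callers block here
                {
                    if (!_ready)  // re-check: another thread may have finished
                    {
                        _stripeCount = ExpensiveCount();
                        _ready = true;
                    }
                }
            }
            return _stripeCount;
        }
    }

    // Stand-in for the question's expensive method (an assumption).
    private int ExpensiveCount()
    {
        CountingRuns++;
        return 42;
    }
}
```

If `ExpensiveCount` itself took another lock that a waiting caller already holds, this is exactly where the lock-ordering deadlock described above could appear.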

That adds even more cost to the feature, a cost that is borne equally by people who do not take advantage of it (because they are writing single-threaded programs).

So how do we deal with this? We could add three features: "lazy not threadsafe", "lazy threadsafe with races" and "lazy threadsafe with blocking and maybe deadlocks". And now the feature just got a whole lot more expensive and way harder to document. This produces an enormous user education problem. Every time you give a developer a choice like this, you present them with an opportunity to write terrible bugs.
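For comparison, the existing `Lazy<T>` class already exposes a three-way choice much like this through the `LazyThreadSafetyMode` enum, which leaves the decision to the caller. The sketch below assumes a hypothetical `LazyModes` holder class and a stand-in `Count` method.

```csharp
using System;
using System.Threading;

static class LazyModes
{
    // Hypothetical stand-in for an expensive computation.
    static int Count() => 42;

    // "lazy not threadsafe": no synchronization at all.
    public static readonly Lazy<int> NotThreadSafe =
        new Lazy<int>(Count, LazyThreadSafetyMode.None);

    // "lazy threadsafe with races": several threads may run the factory,
    // and the first result to be published wins.
    public static readonly Lazy<int> Racy =
        new Lazy<int>(Count, LazyThreadSafetyMode.PublicationOnly);

    // "lazy threadsafe with blocking": a lock ensures that only one
    // thread runs the factory while the others wait.
    public static readonly Lazy<int> Blocking =
        new Lazy<int>(Count, LazyThreadSafetyMode.ExecutionAndPublication);
}
```

Reading `.Value` on any of the three triggers the computation on first use. A `lazy` keyword would have to either pick one of these modes for everyone or grow extra syntax to select among them.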

Third, the feature seems weak as stated. Why should laziness be applied merely to properties? It seems like this could be applied generally through the type system:

    lazy int x = M(); // doesn't call M()
    lazy int y = x + x; // doesn't add x + x
    int z = y * y; // now M() is called once and cached.
                   // x + x is computed and cached
                   // y * y is computed

We try to not do small, weak features if there is a more general feature that is a natural extension of it. But now we're talking about really serious design and implementation costs.
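The hypothetical `lazy` locals above can be approximated today with `Lazy<T>`, which gives a feel for what the more general feature would have to generate. The `LazyChain` class, the `Calls` counter, and the body of `M` are assumptions made up for this sketch.

```csharp
using System;

static class LazyChain
{
    // Hypothetical counter recording how many times M() actually ran.
    public static int Calls;

    // Stand-in for the M() in the example above (an assumption).
    static int M()
    {
        Calls++;
        return 3;
    }

    public static int Run()
    {
        var x = new Lazy<int>(() => M());               // doesn't call M()
        var y = new Lazy<int>(() => x.Value + x.Value); // doesn't add x + x
        int z = y.Value * y.Value;                      // M() runs once and is cached;
                                                        // x + x is computed once and cached
        return z;
    }
}
```

Every lazy expression here becomes a delegate plus a `Lazy<int>` allocation, which hints at the design and implementation cost of doing this through the type system.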

> Would you find this useful?

Personally? Not really useful. I write lots of simple low-lock lazy code mostly using Interlocked.Exchange. (I don't care if the lazy method gets run twice and one of the results discarded; my lazy methods are never that expensive.) The pattern is straightforward, I know it to be safe, there are never extra objects allocated for the delegate or the locks, and if I have something a little more complex I can always use Lazy<T> to do the work for me. It would be a small convenience.

Eric Lippert answered Oct 04 '22 00:10