Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How can an immutable data structure NOT be thread safe?

In a post called What is this thing you call "thread safe"?, Eric Lippert says:

Thread safety of immutable data structures is all about ensuring that use of the data across all operations is logically consistent, at the expense of the fact that you're looking at an immutable snapshot that might be out-of-date.

I thought the whole point of immutable data structures is that they do not change and therefore cannot be out of date, and that therefore they are intrinsically thread-safe

What does Lippert mean here?

like image 673
Joshua Frank Avatar asked Dec 07 '20 16:12

Joshua Frank


People also ask

Is immutable data thread-safe?

To put it simply, a class instance is immutable when its internal state can't be modified after it has been constructed. A MessageService object is effectively immutable since its state can't change after its construction. So, it's thread-safe.

What is an immutable data structure?

Immutable data structure are data structures, like lists, arrays, sets etc., which cannot be changed, meaning that the values inside them can't be added, removed, moved or swapped. Instead of changing the data structure you make a new version of the data structure which is a separate value.

Why are immutable data structures good?

Immutable data structures provides referential transparency which makes it easier to reason about our program locally. Another way to think about it is that every time we execute a pure (referentially transparent) function with the same input, we get the same output.

When would you want to use an immutable data structure?

One of the advantages of immutability is that you can optimize your application by making use of reference and value equality. This makes it easy to identify if anything has changed. You can consider the example of state change in the React component.


2 Answers

What does Lippert mean here?

I agree that the way I wrote that particular bit was not as clear as it could be.

Back in 2009 we were designing the data structures for Roslyn -- the "C# and VB compiler as a service" -- and therefore were considering how to do analysis in the IDE, a world where the code is almost never correct -- if the code were correct, why are you editing it? -- and where it can be changing several times a second as you type.

I thought the whole point of immutable data structures is that they do not change and therefore cannot be out of date, and that therefore they are intrinsically thread-safe.

It is the fact that they do not change that makes them possibly out of date. Consider a common scenario in the IDE:

using System;
class P
{
  static void Main()
  {
    Console.E
  }
}

We have immutable data structures which represent the world at the moment before you typed "E", and we have an immutable data structure which represents the edit you've just made -- striking the letter E -- and now a whole bunch of stuff happens.

The lexer, knowing that the previous lex state is immutable and matches the world before the "E" re-lexes the world just around the E, rather than re-lexing the entire token stream. Similarly, the parser works out what the new (ill-formed!) parse tree is for this edit. That creates a new immutable parse tree that is an edit of the old immutable parse tree, and then the real fun starts. The semantic analyzer tries to figure out what Console means and then what you could possibly mean by E so that it can do an IntelliSense dropdown centered on members of System.Console that begin with E, if there are any. (And we're also starting an error-reporting workflow since there are now many semantic and syntactic errors in the program.)

Now what happens if while we are working all that out on a background thread, you hit "backspace" and then "W"?

All that analysis, which is still in flight will be correct, but it will be correct for Console.E and not Console.W. The analysis is out-of-date. It belongs to a world that is no longer relevant, and we have to start over again with analyzing the backspace and the W.

In short, it is perfectly safe to do analysis of immutable data structures on another thread, but stuff perhaps continues to happen on the UI thread that invalidates that work; this is one of the prices you pay for farming work on immutable data out to worker threads.

Remember, these invalidations can happen extremely quickly; we budgeted 30ms for a re-lex, re-parse and IntelliSense update because a fast typist can get in well over ten keystrokes per second; having an lexer and parser that re-used the immutable state of past lexes and parses was a key part of this strategy, but you then have to plan for an invalidation that throws away your current analysis to happen just as quickly.

Incidentally, the mechanisms we needed to invent to efficiently track those invalidations were themselves quite interesting, and led to some insights into cancellation-based workflows -- but that's a subject for another day.

like image 85
Eric Lippert Avatar answered Oct 20 '22 18:10

Eric Lippert


He means that you might be looking at a different snapshot than someone else. Consider how cons lists work: after adding another element to the head of a list, there are effectively two lists (snapshots). Both of them are immutable but not the same.

like image 31
michid Avatar answered Oct 20 '22 19:10

michid