Looking at the code for Contains in the HashSet<T> class in the .NET source code, I cannot find any reason why Contains would not be thread safe.
I am loading a HashSet<T> with values ahead of time, and then checking Contains in a multi-threaded .AsParallel() loop.
Is there any reason why this would not be safe?
I am loath to use ConcurrentDictionary when I don't actually require storing values.
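For concreteness, the setup described above might look roughly like the sketch below. The class name PreloadedSetDemo, the method CountKnown, and the sample values in the usage note are hypothetical, chosen only for illustration; the actual code in question is not shown here.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PreloadedSetDemo
{
    // Hypothetical helper: counts how many candidates are present in a set
    // that is fully populated before any parallel work starts.
    public static int CountKnown(IEnumerable<string> known, IEnumerable<string> candidates)
    {
        // All writes happen here, on the calling thread, before the query runs.
        var set = new HashSet<string>(known);

        // From here on the set is only read. Contains is called concurrently
        // from the PLINQ worker threads; nothing adds to or removes from the set.
        return candidates.AsParallel().Count(c => set.Contains(c));
    }
}
```

With known values "value1" and "value2" and candidates "value1", "value3", "value2", "value4", CountKnown returns 2.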
If a writer may be writing at the same time, List.Contains is definitely not thread safe. You'll need to wrap it and any other reads and writes with a lock.
The System.Collections.Concurrent namespace has several collection classes that are thread-safe and scalable. They are called concurrent collections because multiple threads can access them at the same time.
Normally (normally) collections that are used only for reading are "unofficially" thread safe (I know of no collection in .NET that modifies itself during reading). There are some caveats:
First, the items themselves might not be thread safe. With a HashSet<T> this problem should be minimized, because you can't extract items from it. Still, GetHashCode() and Equals() must be thread-safe. If, for example, they access lazy objects that are loaded on demand, or they cache/memoize some data to speed up subsequent operations, they might not be thread safe.
Second, after the last write there must be a Thread.MemoryBarrier() (done in the same thread as the write) or equivalent, otherwise a read on another thread could see incomplete data.
Third, each reading thread needs a Thread.MemoryBarrier() before its first read. Note that if the HashSet<T> was "prepared" (with the Thread.MemoryBarrier() at the end) before creating/starting the other threads, then this Thread.MemoryBarrier() isn't necessary, because the threads can't have a stale read of the memory (they didn't exist yet). Various operations cause an implicit Thread.MemoryBarrier(). For example, if the threads were created before the HashSet<T> was filled, entered a Wait(), and were un-Waited after the HashSet<T> was filled (plus its Thread.MemoryBarrier()), exiting a Wait() causes an implicit Thread.MemoryBarrier().
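The "threads created first, released after the fill" case can be sketched as follows. This is only an illustration under stated assumptions: PublishAfterFillDemo and CountHits are hypothetical names, ManualResetEventSlim is one possible way to implement the Wait(), and the sketch relies on the point above that signalling and leaving a wait act as the required barriers.

```csharp
using System.Collections.Generic;
using System.Threading;

public static class PublishAfterFillDemo
{
    // Hypothetical helper: starts the readers first, fills the set on the
    // calling thread, then releases the readers. Returns the total number
    // of successful Contains calls across all workers.
    public static int CountHits(string[] values, string[] probes, int workerCount)
    {
        var set = new HashSet<string>();
        int hits = 0;

        using (var filled = new ManualResetEventSlim(false))
        {
            var workers = new Thread[workerCount];
            for (int i = 0; i < workerCount; i++)
            {
                workers[i] = new Thread(() =>
                {
                    // No read of the set happens before this wait returns.
                    filled.Wait();
                    foreach (var probe in probes)
                    {
                        if (set.Contains(probe))
                        {
                            Interlocked.Increment(ref hits);
                        }
                    }
                });
                workers[i].Start();
            }

            // Single writer: every Add completes before the signal is raised.
            foreach (var value in values)
            {
                set.Add(value);
            }
            filled.Set();

            // Wait for all readers before the event is disposed.
            foreach (var worker in workers)
            {
                worker.Join();
            }
        }

        return hits;
    }
}
```

The set is never written after Set() is called, so the workers only ever perform concurrent reads.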
Here is a simple example of a class that uses memoization (or lazy loading, or whatever you want to call it) and in that way can break thread safety:
public class MyClass
{
    private long value2;

    public int Value1 { get; set; }

    // Value2 is lazily loaded in a very primitive
    // way (note that Lazy<T> *can* be used thread-safely!)
    public long Value2
    {
        get
        {
            if (value2 == 0)
            {
                // value2 is a long. If the .NET is running at 32 bits,
                // the assignment of a long (64 bits) isn't atomic :)
                value2 = LoadFromServer();

                // If thread1 checks and see value2 == 0 and loads it,
                // and then begin writing value2 = (value), but after
                // writing the first 32 bits of value2 we have that
                // thread2 reads value2, then thread2 will read an
                // "incomplete" data. If this "incomplete" data is == 0
                // then a second LoadFromServer() will be done. If the
                // operation was repeatable then there won't be any
                // problem (other than time wasted). But if the
                // operation isn't repeatable, or if the incomplete
                // data that is read is != 0, then there will be a
                // problem (for example an exception if the operation
                // wasn't repeatable, or different data if the operation
                // wasn't deterministic, or incomplete data if the read
                // was != 0)
            }

            return value2;
        }
    }

    private long LoadFromServer()
    {
        // This is a slow operation that justifies a lazy property
        return 1;
    }

    public override int GetHashCode()
    {
        // The GetHashCode doesn't use Value2, because it
        // wants to be fast
        return Value1;
    }

    public override bool Equals(object obj)
    {
        MyClass obj2 = obj as MyClass;

        if (obj2 == null)
        {
            return false;
        }

        // The equality operator uses Value2, because it
        // wants to be correct.
        // Note that probably the HashSet<T> doesn't need to
        // use the Equals method on Add, if there are no
        // other objects with the same GetHashCode
        // (and surely, if the HashSet is empty and you Add a
        // single object, that object won't be compared with
        // anything, because there isn't anything to compare
        // it with! :-) )
        // Clearly the Equals is used by the Contains method
        // of the HashSet
        return Value1 == obj2.Value1 && Value2 == obj2.Value2;
    }
}
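As the comment on Value2 notes, Lazy<T> can be used thread-safely. A minimal sketch of that variant follows, assuming the load should run at most once per instance. MyClassSafe, its constructor, the read-only Value1, and the LoadCount counter are hypothetical additions for illustration and are not part of the example above.

```csharp
using System;
using System.Threading;

public class MyClassSafe
{
    // With ExecutionAndPublication, the factory runs at most once and
    // every thread observes the fully written result.
    private readonly Lazy<long> value2;

    // Illustrative counter only, to make the "at most once" behavior visible.
    private int loadCount;

    public MyClassSafe(int value1)
    {
        Value1 = value1;
        value2 = new Lazy<long>(LoadFromServer, LazyThreadSafetyMode.ExecutionAndPublication);
    }

    // Read-only here (an assumption), so the hash code cannot change
    // while the object sits in a set.
    public int Value1 { get; }

    public long Value2 => value2.Value;

    public int LoadCount => Volatile.Read(ref loadCount);

    private long LoadFromServer()
    {
        // Stand-in for the slow operation in the original example.
        Interlocked.Increment(ref loadCount);
        return 1;
    }

    public override int GetHashCode() => Value1;

    public override bool Equals(object obj)
    {
        // Reading Value2 from several threads is safe because Lazy<long>
        // handles both the one-time load and the publication of the value.
        return obj is MyClassSafe other
            && Value1 == other.Value1
            && Value2 == other.Value2;
    }
}
```

With this shape, concurrent Contains calls that reach Equals trigger LoadFromServer at most once per instance, and no thread can observe a half-written long.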
Given that you are loading your set with values ahead of time, you can use the ImmutableHashSet<T> from the System.Collections.Immutable library. The immutable collections advertise themselves as thread safe, so we don't have to worry about the "unofficial" thread safety of the HashSet<T>.
var builder = ImmutableHashSet.CreateBuilder<string>(); // The builder is not thread safe
builder.Add("value1");
builder.Add("value2");
ImmutableHashSet<string> set = builder.ToImmutable();
...
if (set.Contains("value1")) // Thread safe operation
{
    ...
}