Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Is HashSet<T> thread safe as a value of ConcurrentDictionary<TKey, HashSet<T>>?

If I have the following code:

var dictionary = new ConcurrentDictionary<int, HashSet<string>>();

foreach (var user in users)
{
   if (!dictionary.ContainsKey(user.GroupId))
   {
       dictionary.TryAdd(user.GroupId, new HashSet<string>());
   }

   dictionary[user.GroupId].Add(user.Id.ToString());
}

Is the act of adding an item into the HashSet inherently thread safe because HashSet is a value property of the concurrent dictionary?

like image 864
Marko Avatar asked Nov 20 '18 00:11

Marko


2 Answers

No. Putting a container in a thread-safe container does not make the inner container thread safe.

dictionary[user.GroupId].Add(user.Id.ToString());

is calling HashSet's add after retrieving it from the ConcurrentDictionary. If this GroupId is looked up from two threads at once this would break your code with strange failure modes. I saw the result of one of my teammates making the mistake of not locking his sets, and it wasn't pretty.

This is a plausible solution. I'd do something different myself but this is closer to your code.

if (!dictionary.ContainsKey(user.GroupId))
{
    dictionary.TryAdd(user.GroupId, new HashSet<string>());
}
var groups = dictionary[user.GroupId];
lock(groups)
{
    groups.Add(user.Id.ToString());
}
like image 75
Joshua Avatar answered Sep 28 '22 05:09

Joshua


No, the collection (the dictionary itself) is thread-safe, not whatever you put in it. You have a couple of options:

  1. Use AddOrUpdate as @TheGeneral mentioned:

    dictionary.AddOrUpdate(user.GroupId,  new HashSet<string>(), (k,v) => v.Add(user.Id.ToString());
    
  2. Use a concurrent collection, like the ConcurrentBag<T>:

    ConcurrentDictionary<int, ConcurrentBag<string>>
    

Whenever you are building the Dictionary, as in your code, you should be better off accessing it as little as possible. Think of something like this:

var dictionary = new ConcurrentDictionary<int, ConcurrentBag<string>>();
var grouppedUsers = users.GroupBy(u => u.GroupId);

foreach (var group in grouppedUsers)
{
    // get the bag from the dictionary or create it if it doesn't exist
    var currentBag = dictionary.GetOrAdd(group.Key, new ConcurrentBag<string>());

    // load it with the users required
    foreach (var user in group)
    {
        if (!currentBag.Contains(user.Id.ToString())
        {
            currentBag.Add(user.Id.ToString());
        }
    }
}
  1. If you actually want a built-in concurrent HashSet-like collection, you'd need to use ConcurrentDictionary<int, ConcurrentDictionary<string, string>>, and care either about the key or the value from the inner one.
like image 21
Camilo Terevinto Avatar answered Sep 28 '22 06:09

Camilo Terevinto