
Why doesn't Dictionary have AddRange?

A comment to the original question sums this up pretty well:

because no one ever designed, specified, implemented, tested, documented and shipped that feature. - @Gabe Moothart

As to why? Well, likely because the behavior of merging dictionaries can't be reasoned about in a manner that fits with the Framework guidelines.

AddRange doesn't exist because a range has no meaning for an associative container: a range of data allows duplicate entries. For example, an IEnumerable<KeyValuePair<K,T>> does nothing to guard against duplicate keys.
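To make the point concrete, here is a minimal sketch showing that a plain sequence of pairs happily carries the same key twice, while Dictionary.Add rejects the second occurrence with an ArgumentException. DuplicateKeyDemo and TryCopy are hypothetical names used only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DuplicateKeyDemo
{
    // Hypothetical helper for illustration only.
    // Copies the pairs into a new dictionary with Add and reports
    // whether the copy completed without hitting a rejected key.
    public static bool TryCopy(
        IEnumerable<KeyValuePair<string, int>> pairs,
        out Dictionary<string, int> result)
    {
        result = new Dictionary<string, int>();
        try
        {
            foreach (var pair in pairs)
                result.Add(pair.Key, pair.Value); // throws on a duplicate (or null) key
            return true;
        }
        catch (ArgumentException)
        {
            // result still holds the pairs that were added before the failure.
            return false;
        }
    }
}
```

Passing a list that contains ("a", 1) followed by ("a", 2) makes TryCopy return false, with only the first pair left in the result.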

The behavior of adding a collection of key-value pairs, or even merging two dictionaries, is straightforward. The behavior for dealing with duplicate entries, however, is not.

What should be the behavior of the method when it deals with a duplicate?

There are at least three solutions I can think of:

  1. Throw an exception for the first entry that is a duplicate.
  2. Throw an exception that contains all the duplicate entries.
  3. Ignore duplicates.

When an exception is thrown, what should be the state of the original dictionary?

Add is almost always implemented as an atomic operation: it succeeds and updates the state of the collection, or it fails and leaves the state of the collection unchanged. As AddRange can fail due to duplicate keys, the way to keep its behavior consistent with Add would be to make it atomic as well: throw an exception on any duplicate and leave the original dictionary unchanged.

As an API consumer, it would be tedious to have to remove duplicate elements one failed call at a time, which implies that AddRange should throw a single exception that contains all the duplicate keys.
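A minimal sketch of such an atomic variant might look like the following. AddRangeAtomic is a hypothetical extension method, not a framework API. It assumes the target dictionary uses the default equality comparer for its keys; a dictionary built with a custom comparer would need that comparer passed to the HashSet as well.

```csharp
using System;
using System.Collections.Generic;

public static class AtomicDictionaryExtensions
{
    // Hypothetical helper, not part of the framework.
    // Adds every pair or none: all keys are checked before the target is touched.
    public static void AddRangeAtomic<TKey, TValue>(
        this IDictionary<TKey, TValue> target,
        IEnumerable<KeyValuePair<TKey, TValue>> items)
    {
        if (target == null) throw new ArgumentNullException(nameof(target));
        if (items == null) throw new ArgumentNullException(nameof(items));

        // Buffer the input so it is enumerated only once.
        var buffered = new List<KeyValuePair<TKey, TValue>>(items);
        var seen = new HashSet<TKey>(); // assumption: default key comparer
        var duplicates = new List<TKey>();

        foreach (var pair in buffered)
        {
            // A key is a duplicate if the target already has it
            // or if it appeared earlier in the incoming sequence.
            if (target.ContainsKey(pair.Key) || !seen.Add(pair.Key))
                duplicates.Add(pair.Key);
        }

        if (duplicates.Count > 0)
            throw new ArgumentException(
                "Duplicate keys: " + string.Join(", ", duplicates), nameof(items));

        // No duplicates were found, so none of these Add calls can fail on one.
        foreach (var pair in buffered)
            target.Add(pair.Key, pair.Value);
    }
}
```

The price of atomicity here is a full validation pass and a temporary copy of the input before anything is added.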

The choice then boils down to:

  1. Throw an exception with all duplicates, leaving the original dictionary alone.
  2. Ignore duplicates and proceed.

There are arguments for supporting both use cases. To do that, do you add an IgnoreDuplicates flag to the signature?

The IgnoreDuplicates flag (when set to true) would also provide a significant speedup, as the underlying implementation would bypass the code for duplicate checking.

So now, you have a flag that allows the AddRange to support both cases, but has an undocumented side effect (which is something that the Framework designers worked really hard to avoid).

Summary

As there is no clear, consistent and expected behavior when it comes to dealing with duplicates, it's easier not to deal with them altogether, and not to provide the method to begin with.

If you find yourself continually having to merge dictionaries, you can of course write your own extension method to merge dictionaries, which will behave in a manner that works for your application(s).


Here's one solution:

Dictionary<string, string> mainDic = new Dictionary<string, string>() { 
    { "Key1", "Value1" },
    { "Key2", "Value2.1" },
};
Dictionary<string, string> additionalDic = new Dictionary<string, string>() { 
    { "Key2", "Value2.2" },
    { "Key3", "Value3" },
};
mainDic.AddRangeOverride(additionalDic); // Overrides all existing keys
// or
mainDic.AddRangeNewOnly(additionalDic); // Adds new keys only
// or
mainDic.AddRange(additionalDic); // Throws an error if keys already exist
// or
if (!mainDic.ContainsKeys(additionalDic.Keys)) // True only if none of the keys exist yet
{
    mainDic.AddRange(additionalDic);
}

...

using System;
using System.Collections.Generic;

namespace MyProject.Helper
{
  public static class CollectionHelper
  {
    public static void AddRangeOverride<TKey, TValue>(this IDictionary<TKey, TValue> dic, IDictionary<TKey, TValue> dicToAdd)
    {
        dicToAdd.ForEach(x => dic[x.Key] = x.Value);
    }

    public static void AddRangeNewOnly<TKey, TValue>(this IDictionary<TKey, TValue> dic, IDictionary<TKey, TValue> dicToAdd)
    {
        dicToAdd.ForEach(x => { if (!dic.ContainsKey(x.Key)) dic.Add(x.Key, x.Value); });
    }

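    // Throws ArgumentException on the first key that already exists;
    // pairs added before that key remain in dic (the operation is not atomic).
    public static void AddRange<TKey, TValue>(this IDictionary<TKey, TValue> dic, IDictionary<TKey, TValue> dicToAdd)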
    {
        dicToAdd.ForEach(x => dic.Add(x.Key, x.Value));
    }

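    // Returns true as soon as dic contains at least one of the given keys,
    // and false if it contains none of them.
    public static bool ContainsKeys<TKey, TValue>(this IDictionary<TKey, TValue> dic, IEnumerable<TKey> keys)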
    {
        bool result = false;
        keys.ForEachOrBreak((x) => { result = dic.ContainsKey(x); return result; });
        return result;
    }

    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (var item in source)
            action(item);
    }

    public static void ForEachOrBreak<T>(this IEnumerable<T> source, Func<T, bool> func)
    {
        foreach (var item in source)
        {
            bool result = func(item);
            if (result) break;
        }
    }
  }
}

Have fun.


In case someone comes across this question as I did: it's possible to achieve "AddRange" by using the LINQ extension methods on IEnumerable:

var combined =
    dict1.Union(dict2)
        .GroupBy(kvp => kvp.Key)
        .Select(grp => grp.First())
        .ToDictionary(kvp => kvp.Key, kvp => kvp.Value);

The main trick when combining dictionaries is dealing with the duplicate keys. In the code above, that's the .Select(grp => grp.First()) part. In this case it simply takes the first element from the group of duplicates, but you can implement more sophisticated logic there if needed.
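As a rough sketch of what "more sophisticated logic" could look like, the hypothetical MergeWith helper below takes a caller-supplied resolve function that decides the final value for each key. Note that it uses Concat rather than Union, so that two identical key-value pairs both reach the resolver instead of being collapsed beforehand.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DictionaryMerge
{
    // Hypothetical helper, not part of the framework.
    // Merges two dictionaries; resolve receives every value found for a key
    // (one or two values here) and returns the value to keep.
    public static Dictionary<TKey, TValue> MergeWith<TKey, TValue>(
        IDictionary<TKey, TValue> first,
        IDictionary<TKey, TValue> second,
        Func<IEnumerable<TValue>, TValue> resolve)
    {
        return first.Concat(second)
            .GroupBy(kvp => kvp.Key, kvp => kvp.Value)
            .ToDictionary(grp => grp.Key, grp => resolve(grp));
    }
}
```

For example, passing vals => vals.Sum() adds up colliding integer values, while vals => vals.Last() lets the second dictionary win.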


My guess is the lack of a proper way to tell the user what happened. As you can't have repeating keys in a dictionary, how would you handle merging two dictionaries where some keys intersect? Sure, you could say "I don't care", but that breaks the convention of returning false or throwing an exception for repeating keys.