Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Tuple vs string as a Dictionary key in C#

I have a cache that I implement using a ConcurrentDictionary, The data that I need to keep depends on 5 parameters. So the Method to get it from the cache is: (I show only 3 parameters here for simplicity, and I changed the data type to represent CarData for clearity)

public CarData GetCarData(string carModel, string engineType, int year);

I wonder what type of key will be better to use in my ConcurrentDictionary, I can do it like this:

var carCache = new ConcurrentDictionary<string, CarData>();
// check for car key
bool exists = carCache.ContainsKey(string.Format("{0}_{1}_{2}", carModel, engineType, year);

Or like this:

var carCache = new ConcurrentDictionary<Tuple<string, string, int>, CarData>();
// check for car key
bool exists = carCache.ContainsKey(new Tuple(carModel, engineType, year));

I don't use these parameters together any other place, so there is no justification to create a class just to keep them together.

I want to know which approach is a better in terms of performance and maintainability.

like image 283
Shahar Avatar asked Jan 30 '17 09:01

Shahar


People also ask

Can tuples be used as dictionary keys?

A tuple containing a list cannot be used as a key in a dictionary. Answer: True. A list is mutable. Therefore, a tuple containing a list cannot be used as a key in a dictionary.

Which is faster tuple or dictionary in C#?

The Tuple method is similar to the above snippets, and while it is faster than the Dictionary<int, KeyValuePair<string, string>> , it is still nowhere near as fast as indexing directly into the collection to the desired value based on a hashed key, as is done in the MultiKeyDictionary class.

Can a tuple be a dictionary key C#?

NET 4's Tuple implements equals so it can be used in a dictionary. Your GetHashCode implementation isn't very good. It's invariant under permutation of the fields. Tuple should not be a struct.

What is the difference between dictionary and tuple?

A tuple is an ordered collection of data. A set is an unordered collection. A dictionary is an unordered collection of data that stores data in key-value pairs.


3 Answers

I want to know which approach is a better in terms of performance and maintainability.

As always, you have the tools to figure it out. Code both possible solutions and make them race. The one that wins is the winner, you don't need anyone here to answer this particular question.

About maintenance, the solution that autodocuments itself better and has better scalability should be the winner. In this case, the code is so trivial that autodocumentation isn't that much of an issue. From a scalability point of view, IMHO, the best solution is to use Tuple<T1, T2, ...>:

  • You get free equality semantics which you don't need to maintain.
  • Collisions are not possible, something that is not true if you choose the string concatenation solution:

    var param1 = "Hey_I'm a weird string";
    var param2 = "!"
    var param3 = 1;
    key = "Hey_I'm a weird string_!_1";
    
    var param1 = "Hey";
    var param2 = "I'm a weird string_!"
    var param3 = 1;
    key = "Hey_I'm a weird string_!_1";
    

    Yeah, far fetched, but, in theory, entirely possible and your question is precisely about unknown events in the future, so...

  • And last, but not least, the compiler helps you maintain the code. If, for example, tomorrow you have to add param4 to your key, Tuple<T1, T2, T3, T4> will strongly type your key. Your string concatenation algorithm on the other hand can live on blissfully happy generating keys without param4 and you wont know whats happening until your client calls you up because their software is not working as expected.

like image 180
InBetween Avatar answered Oct 11 '22 12:10

InBetween


If performance is really important, then the answer is that you shouldn't use either option, because both unnecessarily allocate an object on every access.

Instead, you should use a struct, either a custom one, or ValueTuple from the System.ValueTuple package:

var myCache = new ConcurrentDictionary<ValueTuple<string, string, int>, CachedData>();
bool exists = myCache.ContainsKey(ValueTuple.Create(param1, param2, param3));

C# 7.0 also contais syntax sugar to make this code easier to write (but you don't need to wait for C# 7.0 to start using ValueTuple without the sugar):

var myCache = new ConcurrentDictionary<(string, string, int), CachedData>();
bool exists = myCache.ContainsKey((param1, param2, param3));
like image 29
svick Avatar answered Oct 11 '22 12:10

svick


You could create a class (doesn't matter that its only used here) that overrides GetHashCode and Equals:

Thanks theDmi (and others) for improvements...

public class CarKey : IEquatable<CarKey>
{
    public CarKey(string carModel, string engineType, int year)
    {
        CarModel = carModel;
        EngineType= engineType;
        Year= year;
    }

    public string CarModel {get;}
    public string EngineType {get;}
    public int Year {get;}

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = (int) 2166136261;

            hash = (hash * 16777619) ^ CarModel?.GetHashCode() ?? 0;
            hash = (hash * 16777619) ^ EngineType?.GetHashCode() ?? 0;
            hash = (hash * 16777619) ^ Year.GetHashCode();
            return hash;
        }
    }

    public override bool Equals(object other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        if (other.GetType() != GetType()) return false;
        return Equals(other as CarKey);
    }

    public bool Equals(CarKey other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return string.Equals(CarModel,obj.CarModel) && string.Equals(EngineType, obj.EngineType) && Year == obj.Year;
    }
}

If you don't override those, ContainsKey does a reference equals.

Note: the Tuple class does have its own equality functions that would basically do the same as above. Using a bespoke class makes it clear that is what is intended to happen - and is therefore better for maintainability. It also has the advantage that you can name the properties so it is clear

Note 2: the class is immutable as dictionary keys need to be to avoid potential bugs with hashcodes changing after the object is added to the dictionary See here

GetHashCode taken from here

like image 13
Tim Rutter Avatar answered Oct 11 '22 12:10

Tim Rutter