I have a cache that I implement using a ConcurrentDictionary. The data that I need to keep depends on 5 parameters, so the method to get it from the cache looks like this (I show only 3 parameters here for simplicity, and I changed the data type to CarData for clarity):
public CarData GetCarData(string carModel, string engineType, int year);
I wonder what type of key would be better to use in my ConcurrentDictionary. I can do it like this:
var carCache = new ConcurrentDictionary<string, CarData>();
// check for car key
bool exists = carCache.ContainsKey(string.Format("{0}_{1}_{2}", carModel, engineType, year));
Or like this:
var carCache = new ConcurrentDictionary<Tuple<string, string, int>, CarData>();
// check for car key
bool exists = carCache.ContainsKey(Tuple.Create(carModel, engineType, year));
I don't use these parameters together any other place, so there is no justification to create a class just to keep them together.
I want to know which approach is better in terms of performance and maintainability.
"I want to know which approach is better in terms of performance and maintainability."
As always, you have the tools to figure it out. Code both possible solutions and make them race. The one that wins is the winner; you don't need anyone here to answer this particular question.
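A race like that could be sketched roughly as follows. This is only a crude Stopwatch loop, not a rigorous benchmark; the class name KeyRace, the sample values, and the iteration count are made up for illustration, and a dedicated tool such as BenchmarkDotNet would give more trustworthy numbers.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;

public static class KeyRace
{
    // Sample parameters (hypothetical values, not from the question).
    private const string Model = "Model3";
    private const string Engine = "Electric";
    private const int Year = 2017;

    // Builds a string key on every lookup and counts the hits.
    public static int LookupWithStringKey(ConcurrentDictionary<string, int> cache, int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (cache.ContainsKey(string.Format("{0}_{1}_{2}", Model, Engine, Year))) hits++;
        }
        return hits;
    }

    // Builds a Tuple key on every lookup and counts the hits.
    public static int LookupWithTupleKey(ConcurrentDictionary<Tuple<string, string, int>, int> cache, int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (cache.ContainsKey(Tuple.Create(Model, Engine, Year))) hits++;
        }
        return hits;
    }

    public static void Main()
    {
        const int iterations = 1000000;

        var stringCache = new ConcurrentDictionary<string, int>();
        stringCache[string.Format("{0}_{1}_{2}", Model, Engine, Year)] = 1;

        var tupleCache = new ConcurrentDictionary<Tuple<string, string, int>, int>();
        tupleCache[Tuple.Create(Model, Engine, Year)] = 1;

        var sw = Stopwatch.StartNew();
        LookupWithStringKey(stringCache, iterations);
        Console.WriteLine("string key: " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        LookupWithTupleKey(tupleCache, iterations);
        Console.WriteLine("Tuple key:  " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Run it in a Release build and repeat it a few times; the first pass includes JIT warm-up, so a single run can be misleading.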
About maintenance, the solution that documents itself better and scales better should be the winner. In this case, the code is so trivial that self-documentation isn't much of an issue. From a scalability point of view, IMHO, the best solution is to use Tuple<T1, T2, ...>:
Key collisions are not possible: two different parameter sets never produce equal tuples. That is not true if you choose the string concatenation solution:
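// First parameter set
var param1 = "Hey_I'm a weird string";
var param2 = "!";
var param3 = 1;
// resulting key: "Hey_I'm a weird string_!_1"

// Second, different parameter set
var param1 = "Hey";
var param2 = "I'm a weird string_!";
var param3 = 1;
// resulting key: "Hey_I'm a weird string_!_1" (same key, different inputs)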
Yeah, far-fetched, but, in theory, entirely possible, and your question is precisely about unknown events in the future, so...
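To make the collision concrete, here is a small sketch that builds both kinds of key from the two parameter sets above. The class name KeyCollisionDemo and the helper StringKey are hypothetical; the format string is the one from the question.

```csharp
using System;

public static class KeyCollisionDemo
{
    // Builds the concatenated key the same way the question does.
    public static string StringKey(string p1, string p2, int p3)
    {
        return string.Format("{0}_{1}_{2}", p1, p2, p3);
    }

    public static void Main()
    {
        // Two different parameter sets produce the same string key.
        string a = StringKey("Hey_I'm a weird string", "!", 1);
        string b = StringKey("Hey", "I'm a weird string_!", 1);
        Console.WriteLine(a == b); // True

        // The corresponding tuples compare component by component, so they differ.
        var ta = Tuple.Create("Hey_I'm a weird string", "!", 1);
        var tb = Tuple.Create("Hey", "I'm a weird string_!", 1);
        Console.WriteLine(ta.Equals(tb)); // False
    }
}
```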
And last, but not least, the compiler helps you maintain the code. If, for example, tomorrow you have to add param4 to your key, Tuple<T1, T2, T3, T4> will strongly type your key. Your string concatenation algorithm, on the other hand, can live on blissfully happy generating keys without param4, and you won't know what's happening until your client calls you up because their software is not working as expected.
If performance is really important, then the answer is that you shouldn't use either option, because both unnecessarily allocate an object on every access.
Instead, you should use a struct, either a custom one, or ValueTuple from the System.ValueTuple package:
var myCache = new ConcurrentDictionary<ValueTuple<string, string, int>, CachedData>();
bool exists = myCache.ContainsKey(ValueTuple.Create(param1, param2, param3));
C# 7.0 also contains syntax sugar to make this code easier to write (but you don't need to wait for C# 7.0 to start using ValueTuple without the sugar):
var myCache = new ConcurrentDictionary<(string, string, int), CachedData>();
bool exists = myCache.ContainsKey((param1, param2, param3));
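As a slightly fuller sketch of the same idea, the value-tuple key can be combined with GetOrAdd so that lookup and insertion happen in one call. This assumes C# 7.0 or later with ValueTuple available; the class name ValueTupleCacheDemo is hypothetical, and a plain string stands in for CachedData.

```csharp
using System;
using System.Collections.Concurrent;

public static class ValueTupleCacheDemo
{
    // The key is a struct, so building it for a lookup does not allocate a key object.
    private static readonly ConcurrentDictionary<(string, string, int), string> Cache =
        new ConcurrentDictionary<(string, string, int), string>();

    // Returns the cached value, creating it on first access.
    // Note: under contention GetOrAdd may invoke the factory more than once,
    // although only one result is stored.
    public static string GetCachedData(string param1, string param2, int param3)
    {
        return Cache.GetOrAdd(
            (param1, param2, param3),
            key => "data for " + key.Item1 + "/" + key.Item2 + "/" + key.Item3);
    }

    public static void Main()
    {
        string first = GetCachedData("Golf", "Diesel", 2015);
        string second = GetCachedData("Golf", "Diesel", 2015);

        // The second call finds the entry stored by the first one, because
        // value tuples with equal components are equal keys.
        Console.WriteLine(ReferenceEquals(first, second));
    }
}
```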
You could create a class (it doesn't matter that it's only used here) that overrides GetHashCode and Equals:
Thanks theDmi (and others) for improvements...
public class CarKey : IEquatable<CarKey>
{
    public CarKey(string carModel, string engineType, int year)
    {
        CarModel = carModel;
        EngineType = engineType;
        Year = year;
    }

    public string CarModel { get; }
    public string EngineType { get; }
    public int Year { get; }

    public override int GetHashCode()
    {
        unchecked // Overflow is fine, just wrap
        {
            int hash = (int) 2166136261;
            // Parentheses matter: ^ binds tighter than ??, so without them
            // a null field would reset the whole hash to 0.
            hash = (hash * 16777619) ^ (CarModel?.GetHashCode() ?? 0);
            hash = (hash * 16777619) ^ (EngineType?.GetHashCode() ?? 0);
            hash = (hash * 16777619) ^ Year.GetHashCode();
            return hash;
        }
    }

    public override bool Equals(object other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        if (other.GetType() != GetType()) return false;
        return Equals(other as CarKey);
    }

    public bool Equals(CarKey other)
    {
        if (ReferenceEquals(null, other)) return false;
        if (ReferenceEquals(this, other)) return true;
        return string.Equals(CarModel, other.CarModel) && string.Equals(EngineType, other.EngineType) && Year == other.Year;
    }
}
If you don't override those, ContainsKey does a reference equals.
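The reference-equality behaviour is easy to see with a key class that leaves Equals and GetHashCode alone. In this sketch, PlainKey and ReferenceEqualityDemo are hypothetical names used only for illustration; they are not part of the answer's code.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical key type that does NOT override Equals or GetHashCode.
public class PlainKey
{
    public PlainKey(string carModel, int year)
    {
        CarModel = carModel;
        Year = year;
    }

    public string CarModel { get; }
    public int Year { get; }
}

public static class ReferenceEqualityDemo
{
    // Looks up the entry with a new, field-for-field identical instance.
    public static bool FoundWithNewInstance()
    {
        var cache = new ConcurrentDictionary<PlainKey, string>();
        cache[new PlainKey("Golf", 2015)] = "cached";
        return cache.ContainsKey(new PlainKey("Golf", 2015));
    }

    // Looks up the entry with the very instance that was inserted.
    public static bool FoundWithSameInstance()
    {
        var cache = new ConcurrentDictionary<PlainKey, string>();
        var key = new PlainKey("Golf", 2015);
        cache[key] = "cached";
        return cache.ContainsKey(key);
    }

    public static void Main()
    {
        Console.WriteLine(FoundWithNewInstance());  // False
        Console.WriteLine(FoundWithSameInstance()); // True
    }
}
```

With the overrides in place, as in CarKey, a new instance holding the same values would find the entry.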
Note: the Tuple class does have its own equality functions that would basically do the same as above. Using a bespoke class makes it clear that this is what is intended to happen, and is therefore better for maintainability. It also has the advantage that you can name the properties, so it is clear what each one represents.
Note 2: the class is immutable, as dictionary keys need to be, to avoid potential bugs with hash codes changing after the object is added to the dictionary. See here
GetHashCode taken from here