C# HashSet with struct and string

Tags:

c#

I created the following code to verify the uniqueness of a series of "tuples":

struct MyTuple
{
   public MyTuple(string a, string b, string c)
   {
      ValA = a; ValB = b; ValC = c;
   }
   private string ValA;
   private string ValB;
   private string ValC;
}

...

HashSet<MyTuple> tupleList = new HashSet<MyTuple>();

If I'm correct, I will not end up with two tuples with the same values in my HashSet, thanks to the fact that I'm using a struct. I could not get the same behavior with a class without implementing IEquatable or something like that (I didn't dig much into how to do that).

I want to know if there is some gotcha in what I'm doing. Performance-wise, I wouldn't expect the use of a struct to be a problem, considering that the strings inside are reference types.

Edit: I want my HashSet to never contain two tuples whose strings have the same values. In other words, I want the strings to behave like value types.

Simon T. asked Jan 27 '10 20:01


2 Answers

The gotcha is that it will not work. If two strings are "a", they can still be different references. That case would break your implementation.

Implement Equals() and GetHashCode() properly (e.g. by delegating to the ones from the supplied strings, taking care with null references in your struct), and possibly IEquatable<MyTuple> to make it even nicer.
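As a minimal sketch of that advice, assuming ordinal (case-sensitive) string comparison is what is wanted: the struct below keeps the question's MyTuple name and constructor, but the readonly fields, the hash-combining constants (17 and 31), and the MyTupleDemo helper are illustrative choices, not something taken from the question.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical rewrite of the question's MyTuple with explicit value equality.
public struct MyTuple : IEquatable<MyTuple>
{
    // readonly so the hash code cannot change while the value sits in a set
    private readonly string valA;
    private readonly string valB;
    private readonly string valC;

    public MyTuple(string a, string b, string c)
    {
        valA = a; valB = b; valC = c;
    }

    // Strongly typed comparison used by HashSet<MyTuple>; avoids boxing.
    public bool Equals(MyTuple other)
    {
        // The static string.Equals is an ordinal comparison and accepts nulls.
        return string.Equals(valA, other.valA)
            && string.Equals(valB, other.valB)
            && string.Equals(valC, other.valC);
    }

    public override bool Equals(object obj)
    {
        return obj is MyTuple && Equals((MyTuple)obj);
    }

    public override int GetHashCode()
    {
        unchecked // let the arithmetic wrap around instead of throwing
        {
            int hash = 17;
            hash = hash * 31 + (valA != null ? valA.GetHashCode() : 0);
            hash = hash * 31 + (valB != null ? valB.GetHashCode() : 0);
            hash = hash * 31 + (valC != null ? valC.GetHashCode() : 0);
            return hash;
        }
    }
}

// Illustrative helper (not from the question) that exercises the struct.
public static class MyTupleDemo
{
    public static int CountDistinct()
    {
        var set = new HashSet<MyTuple>();
        set.Add(new MyTuple("a", "b", "c"));
        // Same characters, but a separately allocated string object.
        set.Add(new MyTuple(new string('a', 1), "b", "c"));
        // Null fields must not throw in Equals or GetHashCode.
        set.Add(new MyTuple(null, "b", "c"));
        return set.Count;
    }
}
```

Because the struct implements IEquatable<MyTuple>, the default comparer used by HashSet<MyTuple> calls the typed Equals directly. If a culture-aware or case-insensitive comparison is needed instead, both Equals and GetHashCode would have to be changed together so they stay consistent.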

Edit: The default implementation is explicitly not suitable for use in hash tables and sets. This is clearly stated in the documentation for ValueType.GetHashCode():

The GetHashCode method applies to types derived from ValueType. One or more fields of the derived type is used to calculate the return value. If you call the derived type's GetHashCode method, the return value is not likely to be suitable for use as a key in a hash table. Additionally, if the value of one or more of those fields changes, the return value might become unsuitable for use as a key in a hash table. In either case, consider writing your own implementation of the GetHashCode method that more closely represents the concept of a hash code for the type.

You should always implement Equals() and GetHashCode() as a pair, and this matters even more here because ValueType.Equals() is terribly inefficient and unreliable (it uses reflection, and the method of equality comparison it applies is not obvious). There is also a performance problem when those two are not overridden: structs get boxed when the default implementations are called.

Lucero answered Nov 05 '22 12:11


Your approach should work, but you should make the string values read-only, as Lucero said.

You could also take a look at the new .NET 4.0 Tuple types. Although they are implemented as classes (because they support a fairly large number of type parameters), they implement the new interface IStructuralEquatable, which is intended for exactly your purpose.
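A small sketch of that suggestion, assuming .NET 4.0 or later: System.Tuple<T1, T2, T3> overrides Equals and GetHashCode to compare its elements, so it can go into a HashSet with the default comparer. The TupleDemo class and its method name are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Illustrative helper (hypothetical name) showing Tuple in a HashSet.
public static class TupleDemo
{
    // Returns how many distinct triples end up in the set.
    public static int CountDistinct()
    {
        var set = new HashSet<Tuple<string, string, string>>();
        set.Add(Tuple.Create("a", "b", "c"));
        // Same characters, but a separately allocated string object.
        set.Add(Tuple.Create(new string('a', 1), "b", "c"));
        set.Add(Tuple.Create("a", "b", "d"));
        return set.Count;
    }
}
```

The second Add is rejected as a duplicate, so only two entries remain. The trade-off compared with a custom struct is that each Tuple is a heap-allocated object and its members are only reachable as Item1, Item2 and Item3.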

herzmeister answered Nov 05 '22 11:11