My understanding is that C# has, in some sense, HashSet and set types. I understand what HashSet is. But why is set a separate word? Why isn't every set a HashSet<Object>?
Why does C# have no generic Set type, similar to the Dictionary type? From my point of view, I would like to have a set with standard lookup/addition/deletion performance. I wouldn't care much whether it is implemented with hashes or something else. So why not make a Set class that is actually implemented as a HashSet in this version of C# but perhaps somewhat differently in a future version?
Or why not at least an ISet interface?
Learned thanks to everyone who answered below: ICollection declares a lot of what you'd expect from ISet. From my point of view, though, ICollection extends IEnumerable, while sets don't have to be enumerable. Example: the set of real numbers between 1 and 2 (even more, sets can be generated dynamically). I agree this is a minor rant, as 'normal programmers' rarely need uncountable sets.
Ok, I think I get it. HashSet was absolutely meant to be called Set, but the word Set is reserved in some sense. More specifically, the creators of the .NET architecture wanted to have a consistent set (sic!) of classes for different languages. This means that the name of a standard class must not coincide with any keyword in the .NET languages. The word Set, however, is used in VB.NET, which is actually case-insensitive (is it?), so unfortunately there is no room for manoeuvre there.
Mystery solved :)
The new answer by Alex Y. links to the MSDN page which describes the upcoming .NET 4.0 interface ISet, which behaves pretty much as I thought it should and is implemented by HashSet. Happy end.
A Set is a generic set of values with no duplicate elements. A TreeSet is a set where the elements are sorted. A HashSet is a set where the elements are not sorted or ordered.
A HashSet is usually used for high-performance operations involving a set of unique data. Because its elements are stored by hash code, its internal structure is optimized for fast searches.
A HashSet generally provides faster lookup of an element than a List. This is because the HashSet computes a hash code for each item and places the item in a bucket chosen by that hash, so a lookup only has to inspect one bucket instead of scanning every element.
ArrayList maintains insertion order, i.e., the order in which the objects were inserted. HashSet is an unordered collection and doesn't maintain any order. ArrayList allows duplicate values in its collection. On the other hand, duplicate elements are not allowed in a HashSet.
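The difference between a list and a hash-based set described above can be seen in a minimal sketch. It uses List<T> as the generic counterpart of ArrayList; the class name LookupDemo and the sample data are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class LookupDemo
{
    public static void Main()
    {
        // A List<T> keeps insertion order and allows duplicates.
        var list = new List<string> { "apple", "banana", "apple" };

        // A HashSet<T> keeps only one copy of each element;
        // Add returns false when the element is already present.
        var set = new HashSet<string>(list);
        bool addedAgain = set.Add("apple");

        Console.WriteLine(list.Count);  // 3
        Console.WriteLine(set.Count);   // 2
        Console.WriteLine(addedAgain);  // False

        // Both offer Contains. List<T>.Contains scans the elements one by one,
        // while HashSet<T>.Contains hashes the item and inspects one bucket.
        Console.WriteLine(list.Contains("banana")); // True
        Console.WriteLine(set.Contains("banana"));  // True
    }
}
```

For a handful of elements the speed difference is negligible; it only matters as the collection grows.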
(Your original question about set has been answered. IIRC, "set" is the word with the most different meanings in the English language... obviously this has an impact in computing too.)
I think it's fine to have HashSet<T> with that name, but I'd certainly welcome an ISet<T> interface. Given that HashSet<T> only arrived in .NET 3.5 (which in itself was surprising) I suspect we may eventually get a more complete collection of set-based types. In particular, the equivalent of Java's LinkedHashSet, which maintains insertion order, would be useful in some cases.
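As a rough idea of what an insertion-ordered set could look like in C#, here is a minimal sketch. InsertionOrderedSet<T> is a hypothetical type, not part of the BCL, and it is not how Java implements LinkedHashSet; it simply pairs a HashSet<T> for membership tests with a List<T> that records arrival order.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical type for illustration only; not part of the BCL.
public sealed class InsertionOrderedSet<T> : IEnumerable<T>
{
    private readonly HashSet<T> _members = new HashSet<T>(); // fast membership test
    private readonly List<T> _order = new List<T>();         // remembers arrival order

    public int Count => _order.Count;

    // Returns false (and changes nothing) if the item is already present.
    public bool Add(T item)
    {
        if (!_members.Add(item)) return false;
        _order.Add(item);
        return true;
    }

    public bool Contains(T item) => _members.Contains(item);

    // Enumerates items in the order in which they were first added.
    public IEnumerator<T> GetEnumerator() => _order.GetEnumerator();
    IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
}

public static class OrderedSetDemo
{
    public static void Main()
    {
        var set = new InsertionOrderedSet<string>();
        set.Add("c");
        set.Add("a");
        set.Add("c"); // duplicate, ignored
        set.Add("b");
        Console.WriteLine(string.Join(",", set)); // c,a,b
    }
}
```

Removal is left out on purpose: with this layout it would need a linear scan of the list, which is one reason a production-quality version would likely use a linked structure instead.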
To be fair, the ICollection<T> interface actually covers most of what you'd want in ISet<T>, so maybe that isn't required. However, you could argue that the core purpose of a set (which is mostly about containment, and only tangentially about being able to iterate over the elements) isn't quite the same as a collection. It's tricky. In fact, a truly mathematical set may not be iterable or countable - for instance, you could have "the set of real numbers between 1 and 2." If you had an arbitrary-precision numeric type, the count would be infinite and iterating over it wouldn't make any sense.
Likewise the idea of "adding" to a set doesn't always make sense. Mutability is a tricky business when naming collections :(
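To make the containment-only idea concrete, here is a small sketch of a set that can answer "is x a member?" but cannot be enumerated, counted, or added to. IMembershipSet<T> and PredicateSet<T> are hypothetical names invented for this example, and double only approximates the real numbers.

```csharp
using System;

// Hypothetical interface: a set defined purely by membership.
public interface IMembershipSet<T>
{
    bool Contains(T item);
}

// A set described by a predicate rather than by stored elements.
public sealed class PredicateSet<T> : IMembershipSet<T>
{
    private readonly Func<T, bool> _isMember;

    public PredicateSet(Func<T, bool> isMember)
    {
        _isMember = isMember;
    }

    public bool Contains(T item) => _isMember(item);
}

public static class PredicateSetDemo
{
    public static void Main()
    {
        // "The set of real numbers between 1 and 2", approximated with double.
        var between1And2 = new PredicateSet<double>(x => x > 1.0 && x < 2.0);

        Console.WriteLine(between1And2.Contains(1.5)); // True
        Console.WriteLine(between1And2.Contains(2.5)); // False
    }
}
```

Nothing here implements IEnumerable<T> or ICollection<T>, which is exactly the gap between a mathematical set and a collection.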
EDIT: Okay, responding to the comment: the keyword set is in no way a legacy to do with Visual Basic. It's the operation which sets the value of a property, vs get, which retrieves the value. This has nothing to do with the idea of a set as a collection.
Imagine that instead the keywords were actually fetch and assign, e.g.
    // Not real code!
    public int Foo
    {
        fetch
        {
            return fooField;
        }
        assign
        {
            fooField = value;
        }
    }
Is the purpose clear there? Now the real equivalent of that in C# is just
    public int Foo
    {
        get
        {
            return fooField;
        }
        set
        {
            fooField = value;
        }
    }
So if you write:
    x = y.Foo;
that will use the get part of the property. If you write:
    y.Foo = x;
that will use the set part.
Is that any clearer?
The only reason for this seems to be a lack of resources to implement it ideally in .NET 3.5.
.NET 4.0 will include ISet<T>, as well as a new implementation of it, SortedSet<T>, in addition to HashSet<T>. Check out the provided links to the MSDN library - both are already available in .NET 4.0 beta 1.
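A short sketch of what coding against ISet<T> allows: the same method works with either implementation, and the caller picks hashing or sorting. The class ISetDemo, the method AddEvens and the sample numbers are made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class ISetDemo
{
    // Accepts any ISet<int>, so the caller chooses the implementation.
    public static void AddEvens(ISet<int> target, IEnumerable<int> source)
    {
        foreach (int n in source)
        {
            if (n % 2 == 0)
            {
                target.Add(n); // duplicates are ignored by the set
            }
        }
    }

    public static void Main()
    {
        int[] input = { 8, 3, 4, 8, 2 };

        ISet<int> hashed = new HashSet<int>();   // no ordering guarantee
        ISet<int> sorted = new SortedSet<int>(); // enumerates in ascending order
        AddEvens(hashed, input);
        AddEvens(sorted, input);

        Console.WriteLine(hashed.Count);             // 3
        Console.WriteLine(string.Join(",", sorted)); // 2,4,8
        Console.WriteLine(hashed.SetEquals(sorted)); // True
    }
}
```

This is close to what the question asked for: code that depends only on set behaviour and leaves the underlying data structure open.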
There is no Set<T>. This BCL team blog post has lots of details on HashSet, including a not entirely conclusive discussion on including "hash" in the name. I suspect not everyone on the BCL team liked the decision to use the name HashSet<T>.