I've got a HashSet,
var universe = new HashSet<int>();
And a bunch of subsets,
var sets = new List<HashSet<int>>(numSets);
I want to subtract a chunk, which I can do like this:
var remaining = universe.ExceptWith(sets[0]);
But ExceptWith
works in-place. I don't want to modify the universe
. Should I clone it first, or is there a better way?
HashSet is a class that extends AbstractSet and implements the Set interface in Java. It is a very useful tool that allows you to store unique items and access them in constant time (on average). No duplicate values are stored.
A HashSet is usually used for high-performance operations involving a set of unique data. Since HashSet contains only unique elements, its internal structure is optimized for faster searches. Note that you can store a single null value in a HashSet.
util. HashSet. clone() method is used to return a shallow copy of the mentioned hash set. It just creates a copy of the set.
Creating a HashSet Once we import the package, here is how we can create hash sets in Java. // HashSet with 8 capacity and 0.75 load factor HashSet<Integer> numbers = new HashSet<>(8, 0.75); Here, we have created a hash set named numbers . Notice, the part new HashSet<>(8, 0.75) .
I guess I should clone it first? How do I do that?
var universe = new HashSet<int>();
var subset = new HashSet<int>();
...
// clone the universe
var remaining = new HashSet<int>(universe);
remaining.ExceptWith(subset);
Not as simple as with the Except
extension method, but probably faster (you should run a few performance tests to make sure)
How about Except()
?
var x = new HashSet<int>();
var y = new HashSet<int>();
var xminusy = new HashSet<int>(x.Except(y));
I benchmarked Linq's Except
method against cloning and using the HashSet-native function ExceptWith
. Here are the results.
static class Program
{
public static HashSet<T> ToSet<T>(this IEnumerable<T> collection)
{
return new HashSet<T>(collection);
}
public static HashSet<T> Subtract<T>(this HashSet<T> set, IEnumerable<T> other)
{
var clone = set.ToSet();
clone.ExceptWith(other);
return clone;
}
static void Main(string[] args)
{
var A = new HashSet<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
var B = new HashSet<int> { 2, 4, 6, 8, 10 };
var sw = new Stopwatch();
sw.Restart();
for (int i = 0; i < 1000000; ++i)
{
var C = A.Except(B).ToSet();
}
sw.Stop();
Console.WriteLine("Linq: {0} ms", sw.ElapsedMilliseconds);
sw.Restart();
for (int i = 0; i < 1000000; ++i)
{
var C = A.Subtract(B);
}
sw.Stop();
Console.WriteLine("Native: {0} ms", sw.ElapsedMilliseconds);
Console.ReadLine();
}
}
Linq: 1297 ms
Native: 762 ms
http://programanddesign.com/cs/subtracting-sets/
A hash set has to track its hash algorithm constants, and its overflow bins. The elements in the set are held by reference. Creating a new hash with the copy constructor, as Thomas Levesque suggests, creates a shallow copy of this overhead and should be quite fast. Using Except() in the way that James McNellis suggests first creates an anonymous copy and then passes that to the copy constructor which uses the fields in the anonymous to initialize its own fields. As Thomas said, you might do a few performance tests, but theoretically his answer should beat James' answer. And by the way, to my way of thinking, a shallow copy is not a clone since I believe a clone implies that the underlying elements are also copied. Hash sets with common elements use a copy when modified strategy.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With