
What is the difference between HashSet and Set, and when should each one be used?

Tags:

scala

What is the difference between HashSet and Set, and when should each one be used? (Compare the analogous Map vs HashMap question.) For example:

import scala.collection.immutable.HashSet

val hashSet = HashSet("Tomatoes", "Chilies")
val set = Set("Tomatoes", "Chilies")
set == hashSet // res: Boolean = true
igx asked Sep 12 '13 09:09


2 Answers

Set is a trait. You can create an instance of a Set by invoking the apply method of its companion object, which returns an instance of a default, immutable Set. For example:

val defaultSet = Set("A", "B")

HashSet is a concrete implementation of a Set which can be instantiated as follows:

import scala.collection.immutable.HashSet
val hashSet = HashSet("A", "B")
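
To make the trait-versus-class distinction concrete, here is a minimal sketch. The object name TraitVsClass and the helper describe are made up for illustration and are not part of the standard library. It only shows that a HashSet can be used wherever a Set is expected.

```scala
import scala.collection.immutable.HashSet

object TraitVsClass {
  // Hypothetical helper: accepts the Set trait, so any implementation fits.
  def describe(s: Set[String]): String = s"${s.size} element(s)"

  // Built by the companion object's apply method; the library picks the class.
  val defaultSet: Set[String] = Set("A", "B")

  // Built by naming the concrete class directly.
  val hashSet: HashSet[String] = HashSet("A", "B")

  // A HashSet is a Set, so it widens to the trait without any conversion.
  val asSet: Set[String] = hashSet

  def main(args: Array[String]): Unit = {
    println(describe(defaultSet))
    println(describe(hashSet))
    println(defaultSet == asSet) // sets with the same elements compare equal
  }
}
```

Typing parameters as Set rather than HashSet, as describe does, keeps callers free to pass whichever implementation they happen to have.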

Have a look at this quote from "Programming in Scala", which explains the differences between the various implementations:

The scala.collection.mutable.Set() factory method, for example, returns a scala.collection.mutable.HashSet, which uses a hash table internally. Similarly, the scala.collection.mutable.Map() factory returns a scala.collection.mutable.HashMap.

The story for immutable sets and maps is a bit more involved. The class returned by the scala.collection.immutable.Set() factory method, for example, depends on how many elements you pass to it, as shown in the table below. For sets with fewer than five elements, a special class devoted exclusively to sets of each particular size is used, to maximize performance. Once you request a set that has five or more elements in it, however, the factory method will return an implementation that uses hash tries.

Number of elements  Implementation
0                   scala.collection.immutable.EmptySet
1                   scala.collection.immutable.Set1
2                   scala.collection.immutable.Set2
3                   scala.collection.immutable.Set3
4                   scala.collection.immutable.Set4
5 or more           scala.collection.immutable.HashSet

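This means that for an immutable Set with five or more elements, both of your calls should return an instance of the same Set subclass. With fewer than five elements, as in your two-element example, Set(...) returns one of the size-specific classes while HashSet(...) still returns a HashSet. The two still compare equal because set equality is based on the elements, not on the implementing class.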
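
The size-dependent behavior in the table can be checked with a short sketch like the one below. The names SetFactoryDemo and isHashSet are invented for this example. It is written against the Scala 2.12/2.13 standard library, and the exact runtime class names may differ between versions, so it prints the class name rather than asserting it.

```scala
import scala.collection.immutable.HashSet

object SetFactoryDemo {
  // Hypothetical helper: true when the set is backed by immutable.HashSet.
  def isHashSet(s: Set[Int]): Boolean = s.isInstanceOf[HashSet[_]]

  // Fewer than five elements: the factory uses a size-specific class.
  val small: Set[Int] = Set(1, 2)

  // Five or more elements: the factory falls back to a HashSet.
  val large: Set[Int] = Set(1, 2, 3, 4, 5)

  // Asking for HashSet explicitly gives a HashSet regardless of size.
  val explicitSmall: HashSet[Int] = HashSet(1, 2)

  def main(args: Array[String]): Unit = {
    println(isHashSet(small))
    println(isHashSet(large))
    println(small == explicitSmall)  // equality compares elements, not classes
    println(small.getClass.getName)  // exact name depends on the Scala version
  }
}
```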

The same goes for Maps. See this link.

rarry answered Oct 28 '22 06:10


When you call the Set or Map functions, you're actually calling the .apply method of the Set or Map object. These are factory methods which choose appropriate types as documented in Rarry's answer. In contrast, when you directly instantiate a HashSet, you're making the choice yourself.
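
As a rough illustration of "letting the factory choose" versus "choosing yourself", here is a sketch. ChoosingImplementation and shoppingList are hypothetical names used only here, and the mutable case assumes the behavior described in the "Programming in Scala" quote above.

```scala
import scala.collection.{immutable, mutable}

object ChoosingImplementation {
  // Factory call: the library decides which class backs the set.
  val viaFactory: Set[String] = Set("Tomatoes", "Chilies")

  // Direct instantiation: you commit to the hash-based implementation.
  val viaHashSet: immutable.HashSet[String] =
    immutable.HashSet("Tomatoes", "Chilies")

  // The mutable factory, which per the quote returns a mutable.HashSet.
  val viaMutableFactory: mutable.Set[String] =
    mutable.Set("Tomatoes", "Chilies")

  // Hypothetical helper: depends only on the Set trait, so either
  // immutable value above can be passed in.
  def shoppingList(items: Set[String]): String =
    items.toList.sorted.mkString(", ")

  def main(args: Array[String]): Unit = {
    println(shoppingList(viaFactory))
    println(shoppingList(viaHashSet))
    println(viaMutableFactory.isInstanceOf[mutable.HashSet[_]])
  }
}
```

One reasonable rule of thumb, offered as an opinion rather than a hard rule: use Set(...) by default, and name HashSet only when you specifically need that implementation.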

itsbruce answered Oct 28 '22 07:10