Clojure set vs distinct vs dedupe?

Question

If we want a collection of unique items, we can use a set.

If we already have a collection of items that we want to dedupe, we could pass them to the set function, or alternatively we could use the distinct or dedupe functions.

What are the situations for using each of these (pros/cons)?

Thanks.

Piotrek Bzdyl · Accepted Answer

The differences are:

set eagerly creates a new set collection from the input.
distinct returns a lazy sequence of the input's elements with duplicates removed. It has an advantage over set when you process big collections and laziness can save you from evaluating the whole input (e.g. when combined with take).
dedupe removes only consecutive duplicates from the input collection, so its semantics differ from those of set and distinct. For example, it returns (1 2 3 1 2 3) when applied to (1 1 1 2 3 3 1 1 2 2 2 3 3).
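
The three behaviours above can be compared side by side at the REPL. This is a minimal sketch, assuming Clojure 1.7 or later (where dedupe was introduced); the name xs is only illustrative.

```clojure
;; Sample input: the same sequence used in the dedupe example above.
(def xs [1 1 1 2 3 3 1 1 2 2 2 3 3])

;; set: eager, returns a hash set containing 1, 2 and 3.
;; The printed order of a hash set is not guaranteed.
(set xs)

;; distinct: lazy seq, keeps the order of first occurrence.
(distinct xs)   ; => (1 2 3)

;; dedupe: collapses only runs of consecutive equal items.
(dedupe xs)     ; => (1 2 3 1 2 3)

;; Laziness in practice: distinct can be combined with take on an
;; infinite input, whereas (set (cycle [1 1 2 3])) would never return.
(take 3 (distinct (cycle [1 1 2 3])))   ; => (1 2 3)
```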

Sets and lazy seqs offer different APIs (e.g. disj and get for sets, nth for seqs) and have different performance characteristics (e.g. O(log32 n) lookup for a set versus O(n) for a lazy seq), so choose between them depending on how you intend to use the result.
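
To illustrate the API difference, here is a small sketch; the names s and d are hypothetical and chosen only for this example.

```clojure
(def s (set [:a :b :a :c]))        ; a set containing :a, :b and :c
(def d (distinct [:a :b :a :c]))   ; the lazy seq (:a :b :c)

;; Set operations: direct membership test, lookup and removal.
(contains? s :b)   ; => true
(get s :z)         ; => nil, because :z is not a member
(disj s :a)        ; => a set containing only :b and :c

;; Seq operations: positional access walks the seq from the start.
(nth d 1)          ; => :b

;; A membership check on a seq has to scan it linearly.
(some #{:c} d)     ; => :c
```

If the result is mainly used for membership checks, the set is usually the better fit; if the order of first occurrence matters or the input is consumed incrementally, distinct is.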

Additionally, distinct and dedupe each return a transducer when called with no arguments.
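
A short sketch of the transducer arity, assuming Clojure 1.7 or later; the input vector ys is made up for illustration.

```clojure
(def ys [1 1 2 2 3 1 1])

;; With no arguments, both functions return a transducer that can be
;; passed to into, sequence, transduce and similar functions.
(into [] (distinct) ys)   ; => [1 2 3]
(into [] (dedupe) ys)     ; => [1 2 3 1]

;; Transducers compose with comp; here each item is incremented
;; first, then consecutive duplicates are dropped.
(into [] (comp (map inc) (dedupe)) ys)   ; => [2 3 4 2]

;; sequence applies a transducer and returns a lazy seq.
(sequence (distinct) ys)  ; => (1 2 3)
```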

Tags:

clojure

Integralist
