
How to remove duplicate elements in a list while preserving order?

I just saw a short video from Seth Ladd on Collections.

A Set has only unique elements but is not ordered. Sometimes I need an ordered list with all duplicates removed (the second and any later occurrence of an element, e.g. a String, should be dropped from the list).

For example, the input list A, B, C, B, D, A should result in A, B, C, D. I need to keep the order; a result like B, A, D, C would not help me.

Gero asked Feb 02 '13 20:02



3 Answers

Justin Fagnani already gave a good answer. Here is another one:

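import 'dart:collection';

Iterable distinct(Iterable i) {
  // A LinkedHashMap iterates its keys in insertion order, and
  // assigning to an existing key does not change that key's position.
  var map = new LinkedHashMap();
  i.forEach((x) { map[x] = true; });
  return map.keys;  // map.keys.toList() would free the map for GC.
}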
Florian Loitsch answered Nov 15 '22 09:11


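Use toSet() and then toList(). This keeps the order because Dart's default Set implementation is a LinkedHashSet, which iterates in insertion order:

  var ids2 = ["A", "B", "C", "B", "D", "A"];
  var result = ids2.toSet().toList();

Result:

[A, B, C, D]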
atreeon answered Nov 15 '22 10:11


It's fairly easy to implement on your own:

Iterable distinct(Iterable i) {
  var set = new Set();
  // Keep an element only the first time it is seen.
  return i.where((e) {
    var isNew = !set.contains(e);
    set.add(e);
    return isNew;
  });
}

It'd be even nicer if Set.add() returned a bool that indicated whether the set was modified (in current Dart versions it does):

Iterable distinct(Iterable i) {
  var set = new Set();
  return i.where((e) => set.add(e));
}

You can file feature request bugs of course.

Edit: As Florian points out, the above solution only works if the returned Iterable is used once. Subsequent uses will return Iterators with no elements, because every element has already been seen on the first use.
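The single-use behaviour can be reproduced with a small sketch. It assumes a Dart version in which Set.add returns a bool, and the name distinctOnce is made up here for illustration; it is the same logic as the one-line version above with generic types added.

```dart
/// Returns a lazy view of [source] without repeated elements.
/// `distinctOnce` is a hypothetical name used only for this sketch.
Iterable<E> distinctOnce<E>(Iterable<E> source) {
  // One shared set for the returned Iterable: this is the flaw.
  // It is filled during the first pass and never reset.
  final seen = <E>{};
  // Set.add returns true only if the element was not already present.
  return source.where((e) => seen.add(e));
}
```

Given `final result = distinctOnce(['A', 'B', 'C', 'B', 'D', 'A']);`, the first `result.toList()` is `[A, B, C, D]`, while a second `result.toList()` is `[]`, because the shared set still holds every element from the first pass.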

To solve this we need to keep a visited set for every Iterator created from the returned Iterable, not just one for the Iterable. We can do that by creating Iterable and Iterator subclasses like with WhereIterable/WhereIterator:

Iterable distinct(Iterable i) => new DistinctIterable(i);

class DistinctIterable<E> extends Iterable<E> {
  final Iterable<E> _iterable;

  DistinctIterable(this._iterable);

  Iterator<E> get iterator {
    return new DistinctIterator<E>(_iterable.iterator);
  }
}

// Iterator is an interface class in Dart 3, so implement it rather than extend it.
class DistinctIterator<E> implements Iterator<E> {
  final Iterator<E> _iterator;
  final Set<E> _visited = new Set<E>();

  DistinctIterator(this._iterator);

  bool moveNext() {
    while (_iterator.moveNext()) {
      if (!_visited.contains(_iterator.current)) {
        _visited.add(_iterator.current);
        return true;
      }
    }
    return false;
  }

  E get current => _iterator.current;
}

Yes, this is much longer, but it works correctly with many-use finite Iterables and one-use infinite Iterables. The infinite case can easily run into memory problems, because the visited set grows with every distinct element seen. That is an argument for not including this in the core lib and instead forcing developers to decide what exactly they need.
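As a shorter alternative to writing the two classes by hand, a sync* generator can give each iteration its own visited set, since the generator body starts over whenever a new iterator is requested. This is only a minimal sketch and not part of the original answer: the name distinctLazy is made up for illustration, and it assumes a Dart version in which Set.add returns a bool.

```dart
/// Lazily yields the first occurrence of each element of [source].
/// `distinctLazy` is a hypothetical name used only for this sketch.
Iterable<E> distinctLazy<E>(Iterable<E> source) sync* {
  // The body runs from the top for every new iterator, so each
  // pass over the result gets a fresh `seen` set.
  final seen = <E>{};
  for (final element in source) {
    // Set.add returns true only if the element was not already present.
    if (seen.add(element)) {
      yield element;
    }
  }
}
```

Iterating `distinctLazy(['A', 'B', 'C', 'B', 'D', 'A'])` more than once yields A, B, C, D each time. The memory caveat for infinite sources applies here as well, because the `seen` set still grows with every distinct element.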

Justin Fagnani answered Nov 15 '22 09:11