
Iterator Object for Removing Duplicates in Python

Hi, I'm trying to figure out how to create an iterator object in Python that omits duplicates as it iterates.

For example, given the tuple (1, 2, 3, 3, 4, 4, 5), I want to get (1, 2, 3, 4, 5).

I understand that to get an iterator object I have to write a class for it. So far I have:

class Unique:
    def __init__(self, n):
        self.i = 0
        self.n = n

    def __iter__(self):
        return self

    def __next__(self):
        if self.i < self.n:

I'm actually not entirely sure what to do next in this problem. Thanks in advance for any comments or help!

d'chang asked Jun 04 '26 14:06


1 Answer

It is simpler to write a generator function, like this:

>>> def unique_values(iterable):
...     seen = set()
...     for item in iterable:
...         if item not in seen:
...             seen.add(item)
...             yield item
... 

You can then build a tuple of the unique values, like this:

>>> tuple(unique_values((1, 2, 3, 3, 4, 4, 5)))
(1, 2, 3, 4, 5)
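
Because the generator yields each item as soon as it is first seen, it never needs the whole input up front. As a rough illustration (the `cycle` input and the `islice` cutoff are my own choices, not something from the question), it works even on an endless stream. The one requirement is that every item must be hashable so that it can go into the set:

```python
from itertools import cycle, islice

def unique_values(iterable):
    """Yield each item the first time it appears, preserving order."""
    seen = set()  # items must be hashable to be stored here
    for item in iterable:
        if item not in seen:
            seen.add(item)
            yield item

# cycle() repeats 1, 2, 3 forever; islice() stops after 3 unique items.
# Asking for a 4th would loop forever, since no new value ever arrives.
first_three = list(islice(unique_values(cycle((1, 2, 3))), 3))
print(first_three)  # [1, 2, 3]
```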

If you know for sure that the data will always be sorted (or, more generally, that equal items are always next to each other), you can avoid creating the set and keep track of only the previous item, like this:

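>>> def unique_values(iterable):
...     it = iter(iterable)
...     try:
...         previous = next(it)
...     except StopIteration:
...         # Empty input: end quietly. Since Python 3.7 an uncaught
...         # StopIteration inside a generator becomes a RuntimeError.
...         return
...     yield previous
...     for item in it:
...         if item != previous:
...             previous = item
...             yield item
... 
>>> tuple(unique_values((1, 2, 3, 3, 4, 4, 5)))
(1, 2, 3, 4, 5)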
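
For input where equal items are always adjacent, the standard library's `itertools.groupby` does the same bookkeeping for you. A minimal sketch, using the made-up name `unique_adjacent` to keep it apart from the function above:

```python
from itertools import groupby

def unique_adjacent(iterable):
    """Yield one item per run of consecutive equal items."""
    # groupby() starts a new group each time the value changes,
    # so taking only the keys collapses each run to a single item.
    for key, _group in groupby(iterable):
        yield key

print(tuple(unique_adjacent((1, 2, 3, 3, 4, 4, 5))))  # (1, 2, 3, 4, 5)
print(tuple(unique_adjacent(())))                     # ()
print(tuple(unique_adjacent((1, 2, 1))))              # (1, 2, 1)
```

The last call shows the limitation of this approach: duplicates that are not next to each other are kept, so it is only a replacement for the set-based version when the data really is sorted or grouped.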

If you specifically want an iterator object, you can write one as a class, like this:

>>> class Unique:
...     def __init__(self, iterable):
...         self.__it = iter(iterable)
...         self.__seen = set()
... 
...     def __iter__(self):
...         return self
... 
...     def __next__(self):
...         while True:
...             next_item = next(self.__it)
...             if next_item not in self.__seen:
...                 self.__seen.add(next_item)
...                 return next_item
... 
>>> for item in Unique((1, 2, 3, 3, 4, 4, 5)):
...     print(item)
... 
1
2
3
4
5
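
One thing to keep in mind with the class above is that, like any iterator, it can be consumed only once. If you need to loop over the result several times, one option is to make the class an iterable rather than an iterator by writing `__iter__` as a generator. This is only a sketch, and the name `UniqueItems` is made up for illustration; it assumes the underlying data can itself be iterated more than once (a tuple or list, not a one-shot iterator):

```python
class UniqueItems:
    """Iterable (not an iterator): each loop starts a fresh pass."""

    def __init__(self, iterable):
        self._iterable = iterable

    def __iter__(self):
        # A generator method returns a new iterator on every call,
        # so each pass gets its own empty "seen" set.
        seen = set()
        for item in self._iterable:
            if item not in seen:
                seen.add(item)
                yield item

items = UniqueItems((1, 2, 3, 3, 4, 4, 5))
print(list(items))  # [1, 2, 3, 4, 5]
print(list(items))  # [1, 2, 3, 4, 5] again, because __iter__ restarts
```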

You can refer to this answer, and to the Iterator Types section of the Python 3 documentation.

thefourtheye answered Jun 07 '26 23:06