Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Remove adjacent duplicate elements from a list [duplicate]

Tags:

python

Google Python Class | List Exercise -

Given a list of numbers, return a list where all adjacent == elements have been reduced to a single element, so [1, 2, 2, 3] returns [1, 2, 3]. You may create a new list or modify the passed in list.

My solution using a new list is -

def remove_adjacent(nums):
  a = []
  for item in nums:
    if len(a):
      if a[-1] != item:
        a.append(item)
    else: a.append(item)        
  return a

The question even suggests that it could be done by modifying the passed in list. However, the python documentation warned against modifying elements while iterating a list using the for loop.

I am wondering what else can I try apart from iterating over the list, to get this done. I am not looking for the solution, but maybe a hint that can take me into a right direction.

UPDATE

-updated the above code with suggested improvements.

-tried the following with a while loop using suggested hints -

def remove_adjacent(nums):
  i = 1
  while i < len(nums):    
    if nums[i] == nums[i-1]:
      nums.pop(i)
      i -= 1  
    i += 1
  return nums
like image 833
Vaibhav Bajpai Avatar asked Aug 11 '10 15:08

Vaibhav Bajpai


2 Answers

Here's the traditional way, deleting adjacent duplicates in situ, while traversing the list backwards:

Python 1.5.2 (#0, Apr 13 1999, 10:51:12) [MSC 32 bit (Intel)] on win32
Copyright 1991-1995 Stichting Mathematisch Centrum, Amsterdam
>>> def dedupe_adjacent(alist):
...     for i in xrange(len(alist) - 1, 0, -1):
...         if alist[i] == alist[i-1]:
...             del alist[i]
...
>>> data = [1,2,2,3,2,2,4]; dedupe_adjacent(data); print data
[1, 2, 3, 2, 4]
>>> data = []; dedupe_adjacent(data); print data
[]
>>> data = [2]; dedupe_adjacent(data); print data
[2]
>>> data = [2,2]; dedupe_adjacent(data); print data
[2]
>>> data = [2,3]; dedupe_adjacent(data); print data
[2, 3]
>>> data = [2,2,2,2,2]; dedupe_adjacent(data); print data
[2]
>>>

Update: If you want a generator but (don't have itertools.groupby or (you can type faster than you can read its docs and understand its default behaviour)), here's a six-liner that does the job:

Python 2.3.5 (#62, Feb  8 2005, 16:23:02) [MSC v.1200 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> def dedupe_adjacent(iterable):
...     prev = object()
...     for item in iterable:
...         if item != prev:
...             prev = item
...             yield item
...
>>> data = [1,2,2,3,2,2,4]; print list(dedupe_adjacent(data))
[1, 2, 3, 2, 4]
>>>

Update 2: Concerning the baroque itertools.groupby() and the minimalist object() ...

To get the dedupe_adjacent effect out of itertools.groupby(), you need to wrap a list comprehension around it to throw away the unwanted groupers:

>>> [k for k, g in itertools.groupby([1,2,2,3,2,2,4])]
[1, 2, 3, 2, 4]
>>>

... or muck about with itertools.imap and/or operators.itemgetter, as seen in another answer.

Expected behaviour with object instances is that none of them compares equal to any other instance of any class, including object itself. Consequently they are extremely useful as sentinels.

>>> object() == object()
False

It's worth noting that the Python reference code for itertools.groupby uses object() as a sentinel:

self.tgtkey = self.currkey = self.currvalue = object()

and that code does the right thing when you run it:

>>> data = [object(), object()]
>>> data
[<object object at 0x00BBF098>, <object object at 0x00BBF050>]
>>> [k for k, g in groupby(data)]
[<object object at 0x00BBF098>, <object object at 0x00BBF050>]

Update 3: Remarks on forward-index in-situ operation

The OP's revised code:

def remove_adjacent(nums):
  i = 1
  while i < len(nums):    
    if nums[i] == nums[i-1]:
      nums.pop(i)
      i -= 1  
    i += 1
  return nums

is better written as:

def remove_adjacent(seq): # works on any sequence, not just on numbers
  i = 1
  n = len(seq)
  while i < n: # avoid calling len(seq) each time around
    if seq[i] == seq[i-1]:
      del seq[i]
      # value returned by seq.pop(i) is ignored; slower than del seq[i]
      n -= 1
    else:
      i += 1
  #### return seq #### don't do this
  # function acts in situ; should follow convention and return None
like image 142
John Machin Avatar answered Oct 22 '22 00:10

John Machin


Use a generator to iterate over the elements of the list, and yield a new one only when it has changed.

itertools.groupby does exactly this.

You can modify the passed-in list if you iterate over a copy:

for elt in theList[ : ]:
    ...
like image 23
Katriel Avatar answered Oct 22 '22 00:10

Katriel