How to remove list items depending on their predecessor in Python

Tags:

python

list

Given a Python list, I want to remove consecutive 'duplicates'. However, the value that determines a duplicate is an attribute of the list item (in this example, the tuple's first element).

Input:

[(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]

Desired Output:

[(1, 'a'), (2, 'b'), (3, 'd'), (2, 'e')]

I cannot use set or dict, because order is important.

I cannot use a list comprehension such as [x for x in somelist if not determine(x)], because the check depends on the predecessor.

What I want is something like:

mylist = [...]

for i in range(len(mylist)):
    if mylist[i-1].attr == mylist[i].attr:
        mylist.remove(i)

What is the preferred way to solve this in Python?

asked Apr 17 '19 08:04

Sparkofska

3 Answers

You can use itertools.groupby (demonstration with more data):

from itertools import groupby
from operator import itemgetter

data = [(1, 'a'), (2, 'a'), (2, 'b'), (3, 'a'), (4, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (3, 'a')]

[next(group) for key, group in groupby(data, key=itemgetter(0))]

Output:

[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (2, 'a'), (3, 'a')]
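
Since the question's loop compares an attribute (`.attr`) rather than a tuple index, the same idea can be sketched with `operator.attrgetter` as the key. The `Item` type and its `label` field below are hypothetical, introduced only for illustration; the values mirror the question's input.

```python
from collections import namedtuple
from itertools import groupby
from operator import attrgetter

# Hypothetical item type; only the `attr` field is compared.
Item = namedtuple("Item", ["attr", "label"])

items = [Item(1, "a"), Item(2, "b"), Item(2, "b"),
         Item(2, "c"), Item(3, "d"), Item(2, "e")]

# groupby starts a new group whenever the key changes, so only
# consecutive items with an equal `attr` are collapsed.
deduped = [next(group) for _, group in groupby(items, key=attrgetter("attr"))]
print(deduped)
```

Because `groupby` only groups adjacent items, the trailing `Item(2, "e")` is kept even though an earlier item has the same `attr`.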

For completeness, an iterative approach based on other answers:

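result = []

for first, second in zip(data, data[1:]):
    # Emit an item only when the next item starts a new run.
    if first[0] != second[0]:
        result.append(first)

# zip() never yields the final item as `first`, so add it explicitly.
if data:
    result.append(data[-1])

result

Output:

[(1, 'a'), (2, 'b'), (3, 'a'), (4, 'a'), (2, 'a'), (3, 'a')]

Note that this keeps the last item of each run of duplicates, instead of the first.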
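
If the first item of each run should be kept instead, one possible variant compares each item with its predecessor rather than its successor. This is only a sketch using the same `data` as above; it also shows that a list comprehension can work once `zip` supplies the predecessor.

```python
data = [(1, 'a'), (2, 'a'), (2, 'b'), (3, 'a'), (4, 'a'),
        (2, 'a'), (2, 'a'), (3, 'a'), (3, 'a')]

# Keep the first item unconditionally (data[:1] is empty for an empty list),
# then keep every item whose first element differs from its predecessor's.
result = data[:1] + [cur for prev, cur in zip(data, data[1:]) if prev[0] != cur[0]]
print(result)
```

Note that `data[1:]` copies the list, which is fine for small inputs but may matter for very large ones.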

answered Oct 16 '22 04:10

gmds

To remove consecutive duplicates where whole items are equal, you could use itertools.groupby:

from itertools import groupby

data = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
[k for k, _ in groupby(data)]
# [(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a')]
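
Without a `key`, `groupby` compares entire items, which differs from comparing only the first element. The following sketch contrasts the two on the question's input; the variable names `by_item` and `by_first` are illustrative.

```python
from itertools import groupby
from operator import itemgetter

data = [(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')]

# No key: items are compared as whole tuples, so (2, 'c') survives.
by_item = [k for k, _ in groupby(data)]

# key=itemgetter(0): only the first element is compared.
by_first = [next(g) for _, g in groupby(data, key=itemgetter(0))]

print(by_item)
print(by_first)
```

Only the second form produces the output the question asks for.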

answered Oct 16 '22 02:10

yatu

If I am not mistaken, you only need to look up the last value added to the result.

test = [(1, 'a'), (2, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (3, 'a'), (4, 'a'), (4, 'a')]

result = []

for item in test:
    # Compare only the first element, since the OP treats (1, 'a') and (1, 'b') as duplicates.
    # To compare whole items instead, use: item != result[-1]
    if not result or item[0] != result[-1][0]:
        result.append(item)

print(result)

Output:

[(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (3, 'a'), (4, 'a')]
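
The same look-back idea could also be wrapped in a generator, so that it works on any iterable and accepts a key function. This is a minimal sketch; the name `drop_consecutive` and its `key` parameter are assumptions made for illustration, not part of any library.

```python
def drop_consecutive(items, key=lambda item: item):
    """Yield items, skipping any whose key equals the previous item's key."""
    sentinel = object()  # marks "no previous item yet"
    previous = sentinel
    for item in items:
        current = key(item)
        if previous is sentinel or current != previous:
            yield item
        previous = current


# Works on a plain iterator too, since no indexing or slicing is needed.
source = iter([(1, 'a'), (2, 'b'), (2, 'b'), (2, 'c'), (3, 'd'), (2, 'e')])
result = list(drop_consecutive(source, key=lambda item: item[0]))
print(result)
```

Using a sentinel object instead of `None` keeps the function correct even when a key legitimately evaluates to `None`.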

answered Oct 16 '22 02:10

Henry Yik
