I want to keep only the non-unique elements of a list, but I can't figure out why the code below doesn't do that.
>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]
>>> for i in d:
...     if d.count(i) == 1:
...         d.remove(i)
...
>>> d
[1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b', 6, 3]
6 and 3 should have been removed. Whereas if I use
d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c']
I get the correct answer. Please explain what is happening; I am confused.
I am using Python 2.7.5.
Removing elements from a list while iterating over it is never a good idea. The appropriate way to do this would be to use a collections.Counter with a list comprehension:
>>> from collections import Counter
>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
>>> # Use items() instead of iteritems() in Python 3
>>> [k for (k,v) in Counter(d).iteritems() if v > 1]
['a', 1, 2, 'b', 4]
If you want to keep the duplicate elements in the order in which they appear in your list:
>>> keep = {k for (k,v) in Counter(d).iteritems() if v > 1}
>>> [x for x in d if x in keep]
[1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
I'll try to explain why your approach doesn't work. To understand why some elements aren't removed as they should be, imagine that we want to remove all bs from the list [a, b, b, c] while looping over it. It'll look something like this:
+-----------------------+
|  a  |  b  |  b  |  c  |
+-----------------------+
   ^ (first iteration)

+-----------------------+
|  a  |  b  |  b  |  c  |
+-----------------------+
         ^ (next iteration: we found a 'b' -- remove it)

+-----------------------+
|  a  |     |  b  |  c  |
+-----------------------+
         ^ (removed b)

+-----------------+
|  a  |  b  |  c  |
+-----------------+
         ^ (shift subsequent elements down to fill vacancy)

+-----------------+
|  a  |  b  |  c  |
+-----------------+
               ^ (next iteration)
Notice that we skipped the second b! Once we removed the first b, elements were shifted down and our for-loop consequently failed to touch every element of the list. The same thing happens in your code.
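To see the skipping for yourself, here is a minimal sketch that records which elements the loop actually visits. The helper name remove_while_iterating is made up for this illustration; the code should run the same way on Python 2.7 and Python 3.

```python
def remove_while_iterating(items, target):
    """Remove `target` from a copy of `items` while looping over that same copy.

    Returns the resulting list and the elements the loop actually visited.
    (Hypothetical helper, written only to illustrate the skipping.)
    """
    work = list(items)   # work on a copy so the caller's list is untouched
    visited = []
    for x in work:       # iterating over the very list we mutate below
        visited.append(x)
        if x == target:
            work.remove(x)   # shifts later elements one slot to the left
    return work, visited


result, visited = remove_while_iterating(['a', 'b', 'b', 'c'], 'b')
# result  is ['a', 'b', 'c']  (one 'b' survives)
# visited is ['a', 'b', 'c']  (the second 'b' was never looked at)
```

The loop only ran three times over a four-element list, which is exactly the situation in the diagram above.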
It is better to use collections.Counter():
>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]
>>> from collections import Counter
>>> [k for k, v in Counter(d).iteritems() if v > 1]
['a', 1, 2, 'b', 4]
Also see relevant thread:
I thought I would add my method, which uses a set comprehension, in case anyone is interested.
>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6,'f',3]
>>> d = list({x for x in d if d.count(x) > 1})
>>> print d
['a', 1, 2, 'b', 4]
Set comprehensions require Python 2.7 or later, I believe.
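One thing to keep in mind with this approach: d.count(x) rescans the whole list for every element, so the work grows quadratically with the length of the list. If that matters, a single pass with two sets is one possible alternative. This is only a sketch; the function name duplicated_values is made up here, and it assumes every element is hashable.

```python
def duplicated_values(items):
    """Return the set of values that occur more than once in `items`.

    Hypothetical helper; assumes all elements are hashable.
    """
    seen = set()    # values encountered at least once
    dupes = set()   # values encountered at least twice
    for x in items:
        if x in seen:
            dupes.add(x)
        else:
            seen.add(x)
    return dupes


d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
dupes = duplicated_values(d)
# dupes contains exactly 1, 2, 4, 'a' and 'b' (a set, so no particular order)
```

As with the set comprehension, the result is unordered; wrap it in list() or sorted() if you need a list.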
Thanks for all the answers and comments!
I thought about it for a while and found another solution that keeps the way I originally wrote the code, so I am posting it.
>>> d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]
>>> e = d[:]  # the trick: iterate over d, but remove from this copy
>>> for i in d:
...     if d.count(i) == 1:
...         e.remove(i)
...
>>> e
[1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
@arshajii, your explanation led me to this trick. Thanks!
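The same trick also works the other way around: loop over a snapshot of the list and remove from the original. A minimal sketch, assuming you want d itself modified in place rather than a second list; it should behave the same on Python 2.7 and Python 3.

```python
d = [1, 2, 1, 2, 4, 4, 5, 'a', 'b', 'a', 'b', 'c', 6, 'f', 3]

# d[:] is a shallow copy taken once, before the loop starts, so removing
# from d does not disturb the sequence being iterated over.
for i in d[:]:
    if d.count(i) == 1:
        d.remove(i)

# d is now [1, 2, 1, 2, 4, 4, 'a', 'b', 'a', 'b']
```

Removing an element that occurs only once never changes the count of any other element, so checking d.count(i) against the shrinking list is still safe here.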