operator.countOf(a, b) counts the occurrences of b in a and returns that count.
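A minimal sketch of the call, using the same sample values as the list in the answers on this page. The names `sample`, `count_of_five` and `count_of_nine` are only illustrative.

```python
import operator

# Illustrative data (assumption: any sequence of comparable items would do).
sample = [1, 2, 3, 5, 6, 7, 5, 2]

# operator.countOf(a, b) returns how many items of a are equal to b.
count_of_five = operator.countOf(sample, 5)
count_of_nine = operator.countOf(sample, 9)

print(count_of_five)  # 2
print(count_of_nine)  # 0
```

For a plain list this gives the same result as `sample.count(5)`; `operator.countOf` is mainly handy when a function object is needed or when the container has no `count` method.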
For reference, in Python 2.7+ we can use collections.Counter:
>>> import collections
>>> x = [1, 2, 3, 5, 6, 7, 5, 2]
>>> x
[1, 2, 3, 5, 6, 7, 5, 2]
>>> y=collections.Counter(x)
>>> y
Counter({2: 2, 5: 2, 1: 1, 3: 1, 6: 1, 7: 1})
Unique items:
>>> list(y)
[1, 2, 3, 5, 6, 7]
Items that occur more than once:
>>> [i for i in y if y[i]>1]
[2, 5]
Items that occur exactly once:
>>> [i for i in y if y[i]==1]
[1, 3, 6, 7]
Use the in operator instead of calling __contains__ directly.
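To illustrate the point, here is a small sketch comparing the two spellings. The list contents and variable names are made up for the example.

```python
list_a = [1, 2, 3, 5, 6, 7, 5, 2]  # illustrative data

# Preferred: the membership operator.
has_five = 5 in list_a

# Gives the same answer here, but is not idiomatic:
# calling the special method directly.
has_five_direct = list_a.__contains__(5)

# `in` also works on objects that define no __contains__ at all,
# such as generators, by falling back to plain iteration.
squares = (n * n for n in range(5))
has_nine = 9 in squares

print(has_five, has_five_direct, has_nine)
```

The last case is the practical reason to prefer `in`: the operator handles the fallback for you, while a direct `__contains__` call on a generator would raise AttributeError.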
What you have almost works (but is O(n**2)):
for i in range(len(list_a)):
    for j in range(i + 1, len(list_a)):
        if list_a[i] == list_a[j]:
            print("duplicate:", list_a[i])
But it's far easier to use a set (roughly O(n) due to the hash table):
seen = set()
for n in list_a:
    if n in seen:
        print("duplicate:", n)
    else:
        seen.add(n)
Or a dict, if you want to track locations of duplicates (also O(n)):
import collections

items = collections.defaultdict(list)
for i, item in enumerate(list_a):
    items[item].append(i)  # remember every index where item occurs
for item, locs in items.items():
    if len(locs) > 1:
        print("duplicates of", item, "at", locs)
Or even just detect a duplicate somewhere (also O(n)):
if len(set(list_a)) != len(list_a):
    print("duplicate")
You could always use a list comprehension (note that this is O(n**2), and a repeated value shows up once per occurrence):
dups = [x for x in list_a if list_a.count(x) > 1]
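As a rough sketch of what that comprehension returns, and of one way to list each duplicate only once: the Counter-based variant below is a suggestion, not part of the original answer, and `unique_dups` is an illustrative name.

```python
from collections import Counter

list_a = [1, 2, 3, 5, 6, 7, 5, 2]  # illustrative data

# The comprehension keeps every occurrence of a repeated value.
dups = [x for x in list_a if list_a.count(x) > 1]

# Counting once up front avoids the repeated list.count() scans,
# and iterating over the Counter's keys visits each value only once.
counts = Counter(list_a)
unique_dups = [x for x in counts if counts[x] > 1]

print(dups)
print(sorted(unique_dups))  # [2, 5]
```

Items must be hashable for the Counter version, which is the same restriction as the set and dict approaches.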
Before Python 2.3, use a dict:
>>> lst = [1, 2, 3, 5, 6, 7, 5, 2]
>>> stats = {}
>>> for x in lst:  # count occurrences of each item
...     stats[x] = stats.get(x, 0) + 1
...
>>> print(stats)
{1: 1, 2: 2, 3: 1, 5: 2, 6: 1, 7: 1}
>>> # keep the items that appear more than once
>>> duplicates = [dup for (dup, i) in stats.items() if i > 1]
>>> print(duplicates)
[2, 5]
As a function:
def getDuplicates(iterable):
    """
    Take an iterable and return a generator yielding its duplicate items.
    Items must be hashable.

    e.g :

    >>> sorted(list(getDuplicates([1, 2, 3, 5, 6, 7, 5, 2])))
    [2, 5]
    """
    stats = {}
    for x in iterable:
        stats[x] = stats.get(x, 0) + 1
    return (dup for (dup, i) in stats.items() if i > 1)
Python 2.3 introduced sets (in the sets module), and set() has been a built-in since Python 2.4:
def getDuplicates(iterable):
    """
    Take an iterable and return a generator yielding its duplicate items.
    Items must be hashable.

    e.g :

    >>> sorted(list(getDuplicates([1, 2, 3, 5, 6, 7, 5, 2])))
    [2, 5]
    """
    try:  # try using the built-in set
        found = set()
    except NameError:  # fall back on the sets module
        from sets import Set
        found = Set()

    for x in iterable:
        if x in found:  # already seen, so x is a duplicate
            yield x
        found.add(x)  # a set can't contain duplicates, so re-adding is harmless
With Python 2.7 and above, the collections module provides Counter, which does the same counting as the dict version, so we can make solution 1 shorter (and faster, since it's probably C under the hood):
import collections

def getDuplicates(iterable):
    """
    Take an iterable and return a generator yielding its duplicate items.
    Items must be hashable.

    e.g :

    >>> sorted(list(getDuplicates([1, 2, 3, 5, 6, 7, 5, 2])))
    [2, 5]
    """
    # Note the capital C: the class is collections.Counter.
    return (dup for (dup, i) in collections.Counter(iterable).items() if i > 1)
I'd stick with solution 2.
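One thing to keep in mind with solution 2: the generator yields an item every time it is seen again, so a value that occurs three times is yielded twice. If each duplicate should be reported only once, a second set can track what has already been reported. The sketch below is one possible variation, not part of the original answer; the name `get_duplicates_once` is hypothetical.

```python
def get_duplicates_once(iterable):
    """Yield each duplicated item once, at its second occurrence.

    Items must be hashable.
    """
    seen = set()      # every item encountered so far
    reported = set()  # duplicates that were already yielded
    for x in iterable:
        if x in seen and x not in reported:
            reported.add(x)
            yield x
        seen.add(x)


result = list(get_duplicates_once([1, 2, 2, 5, 2, 5, 5]))
print(result)  # [2, 5]
```

It is still a single pass and roughly O(n), at the cost of a second set.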