
How to find duplicate values in a list and merge them

For example, if you have a list like:

l = ['a','b','a','b','c','c']

The output should be:

[['a','a'],['b','b'],['c','c']]

In other words, group the duplicated values together, one sublist per distinct value.

I tried:

l = ['a','b','a','b','c','c']
it=iter(sorted(l))
next(it)
new_l=[]
for i in sorted(l):
   new_l.append([])
   if next(it,None)==i:
      new_l[-1].append(i)
   else:
      new_l.append([])

But it doesn't work, and even if it did, it would not be efficient.

U12-Forward asked Jan 28 '26 13:01


2 Answers

Use collections.Counter:

from collections import Counter

l = ['a','b','a','b','c','c']
c = Counter(l)

print([[x] * y for x, y in c.items()])
# [['a', 'a'], ['b', 'b'], ['c', 'c']]
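
For comparison, the sort-then-scan idea from the question can be made to work with itertools.groupby. This is only a sketch and not part of the original answer; the function name group_duplicates is made up for illustration. Because it sorts first, it runs in O(n log n) and returns the groups in sorted order, whereas Counter makes a single pass and keeps first-seen order.

```python
from itertools import groupby

def group_duplicates(values):
    """Group equal values into sublists (hypothetical helper name)."""
    # groupby only merges *adjacent* equal items, so sort first.
    return [list(group) for _, group in groupby(sorted(values))]

l = ['a', 'b', 'a', 'b', 'c', 'c']
print(group_duplicates(l))
# [['a', 'a'], ['b', 'b'], ['c', 'c']]
```

Unlike the Counter version, this needs the values to be orderable rather than hashable.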
Austin answered Jan 31 '26 03:01


You can use collections.Counter:

from collections import Counter

l = ['a','b','a','b','c','c']
[[k] * c for k, c in Counter(l).items()]

This returns:

[['a', 'a'], ['b', 'b'], ['c', 'c']]

%%timeit comparison

  • Given a sample dataset of 100000 values, this answer is the fastest approach.
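
A timing comparison along these lines could be reproduced roughly as follows. This is a sketch under assumptions: the original benchmark data and the competing approaches are not shown here, so the random sample, the sorted groupby variant used as the comparison, and the function names with_counter and with_groupby are all illustrative. No timings are claimed; run it to get numbers for your own machine.

```python
import random
import timeit
from collections import Counter
from itertools import groupby

def with_counter(values):
    # One pass to count, then expand each key into a sublist.
    return [[k] * c for k, c in Counter(values).items()]

def with_groupby(values):
    # Sort so equal items are adjacent, then group the runs.
    return [list(g) for _, g in groupby(sorted(values))]

# Assumed sample: 100000 values drawn from ten distinct letters.
random.seed(0)
sample = random.choices('abcdefghij', k=100_000)

# Sanity check: both approaches yield the same groups once ordered alike.
assert sorted(with_counter(sample)) == with_groupby(sample)

for fn in (with_counter, with_groupby):
    seconds = timeit.timeit(lambda: fn(sample), number=10)
    print(f"{fn.__name__}: {seconds / 10:.4f} s per call")
```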


blhsing answered Jan 31 '26 01:01


