Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How to group identical values from a list into their own lists?

Say I have a list [2, 3, 7, 2, 3, 8, 7, 3]

I would like to produce lists that contain identical values from the list above.

Expected Output something like:

[2, 2]
[3, 3, 3]
[7, 7]
[8]

The order that these lists are produced doesn't matter.

like image 486
Oamar Kanji Avatar asked Nov 29 '22 08:11

Oamar Kanji


1 Answers

Best approach is an O(n) solution with collections.defaultdict:

>>> l = [2, 3, 7, 2, 3, 8, 7, 3]
>>> d = defaultdict(list)
>>> for e in l:
...     d[e].append(e)
... 
>>> d
defaultdict(<class 'list'>, {2: [2, 2], 3: [3, 3, 3], 7: [7, 7], 8: [8]})
>>> d.values()
dict_values([[2, 2], [3, 3, 3], [7, 7], [8]])

Alternatively you can use itertools.groupby with sorted list:

>>> for _, l in itertools.groupby(sorted(l)):
...     print(list(l))
... 
[2, 2]
[3, 3, 3]
[7, 7]
[8]

Or a list comprehension with collections.Counter:

>>> from collections import Counter
>>> [[i]*n for i,n in Counter(l).items()]
[[2, 2], [3, 3, 3], [7, 7], [8]]

As I post, the defaultdict solution is O(n) and faster than the other aproaches. Here are the tests:

from timeit import timeit


setup = (
"from collections import Counter, defaultdict;"
"from itertools import groupby;"
"l = [2, 3, 7, 2, 3, 8, 7, 3];"
)

defaultdict_call = (
"d = defaultdict(list); "
"\nfor e in l: d[e].append(e);"
)
groupby_call = "[list(g) for _,g in groupby(sorted(l))]"
counter_call = "[[i]*n for i,n in Counter(l).items()]"


for call in (defaultdict_call, groupby_call, counter_call):
  print(call)
  print(timeit(call, setup))

Results:

d = defaultdict(list); 
for e in l: d[e].append(e);
7.02662614302244
[list(g) for _,g in groupby(sorted(l))]
10.126392606005538
[[i]*n for i,n in Counter(l).items()]
19.55539561196929

Here is the live test

like image 161
Netwave Avatar answered Dec 04 '22 00:12

Netwave