Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the faster way to count occurrences of equal sublists in a nested list?

Tags:

python

I have a list of lists in Python and I want to (as fastly as possible : very important...) append to each sublist the number of time it appear into the nested list.

I have done that with some pandas data-frame, but this seems to be very slow and I need to run this lines on very very large scale. I am completely willing to sacrifice nice-reading code to efficient one.

So for instance my nested list is here:

l = [[1, 3, 2], [1, 3, 2] ,[1, 3, 5]]

I need to have:

res = [[1, 3, 2, 2], [1, 3, 5, 1]]

EDIT

Order in res does not matter at all.

like image 980
Léo Joubert Avatar asked Jan 25 '19 10:01

Léo Joubert


People also ask

How do you count nested lists?

A nested list, l , is taken with sub-lists. Count variable is used to count the number of even numbers in the list and is initialized to zero. The nested for loop is used to access the elements in the list. Variable i is used to access sub-lists in the list, and variable j is used to access elements in sub-list i .

How do you count occurrences of a value in a list?

Operator. countOf() is used for counting the number of occurrences of b in a. It counts the number of occurrences of value. It returns the Count of a number of occurrences of value.


3 Answers

If order does not matter you could use collections.Counter with extended iterable unpacking, as a variant of @Chris_Rands solution:

from collections import Counter

l = [[1, 3, 2], [1, 3, 2] ,[1, 3, 5]]

result = [[*t, count] for t, count in Counter(map(tuple, l)).items()]
print(result)

Output

[[1, 3, 5, 1], [1, 3, 2, 2]]
like image 177
Dani Mesejo Avatar answered Oct 27 '22 09:10

Dani Mesejo


This is quite an odd output to want but it is of course possible. I suggest using collections.Counter(), no doubt others will make different suggestions and a timeit style comparison would reveal the fastest of course for particular data sets:

>>> from collections import Counter
>>> l = [[1, 3, 2], [1, 3, 2] ,[1, 3, 5]]
>>> [list(k) + [v] for k, v in Counter(map(tuple,l)).items()]
[[1, 3, 2, 2], [1, 3, 5, 1]]

Note to preserve the insertion order prior to CPython 3.6 / Python 3.7, use the OrderedCounter recipe.

like image 8
Chris_Rands Avatar answered Oct 27 '22 09:10

Chris_Rands


If numpy is an option, you could use np.unique setting axis to 0 and return_counts to True, and concatenate the unique rows and counts using np.vstack:

l = np.array([[1, 3, 2], [1, 3, 2] ,[1, 3, 5]])
x, c = np.unique(l, axis=0, return_counts=True)
np.vstack([x.T,c]).T

array([[1, 3, 2, 2],
       [1, 3, 5, 1]])
like image 1
yatu Avatar answered Oct 27 '22 08:10

yatu