How to get the index and occurrence count of each item using itertools.groupby()

Question

I have two lists:

list_one=[1,2,9,9,9,3,4,9,9,9,9,2]
list_two=["A","B","C","D","A","E","F","G","H","Word1","Word2"]

I want to find the indices of consecutive 9's in list_one so that I can get the corresponding strings from list_two. I've tried:

group_list_one = [(k, sum(1 for i in g), list_one.index(k)) for k, g in groupby(list_one)]

I was hoping to get the index of the first 9 in each tuple and go from there, but that did not work.

What can I do here? P.S.: I've looked at the itertools documentation, but it seems very vague to me. Thanks in advance.

EDIT: The expected output is (key, occurrences, index_of_first_occurrence), something like:

[(9, 3, 2), (9, 4, 7)]

tmrlvi · Accepted Answer

Okay, I have a one-liner solution. It is ugly, but bear with me.

Let's consider the problem. We have a list that we want to summarize using itertools.groupby. groupby gives us a sequence of keys, each paired with an iterator over its run of consecutive repetitions. At this stage we can't calculate the index, but we can easily find the number of occurrences.

[(key, len(list(it))) for (key, it) in itertools.groupby(list_one)]
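
For reference, here is a minimal runnable version of this counting step, applied to the list_one from the question. The name runs is only used here for illustration.

```python
from itertools import groupby

list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]

# Each group is an iterator over one run of equal, adjacent items;
# consuming it with list() lets us measure the length of the run.
runs = [(key, len(list(group))) for key, group in groupby(list_one)]
print(runs)
# [(1, 1), (2, 1), (9, 3), (3, 1), (4, 1), (9, 4), (2, 1)]
```

Note that 9 appears twice in the result: groupby only merges adjacent equal items, which is exactly what is needed for finding consecutive runs.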

Now, the real problem is that we want to calculate the indexes relative to earlier data. Most functions commonly used in one-liners only examine the current item. However, there is one function that lets us take a glimpse at the past: reduce.

What reduce does is go over the iterable and call a function with the previous result of that function and the next item. For example, reduce(lambda x,y: x*y, [2,3,4]) will calculate 2*3 = 6, then 6*4 = 24, and return 24. In addition, you can supply an initial value for x instead of using the first item. (In Python 3, reduce has to be imported from functools.)

Let's use it here: for each item, the index will be the previous index + the previous number of occurrences. In order to have a valid list to start from, we'll use [(0,0,0)] as the initial value. (We get rid of it at the end.)
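
A small sketch of both forms of reduce, with and without an initial value. The variable names are illustrative, and the import is written so the snippet also runs on Python 3.

```python
from functools import reduce  # a builtin in Python 2, also available from functools

# Without an initial value, the first item (2) seeds the accumulator.
product = reduce(lambda x, y: x * y, [2, 3, 4])

# With an initial value, the accumulator starts at 10 instead.
product_from_ten = reduce(lambda x, y: x * y, [2, 3, 4], 10)

print(product)           # 24
print(product_from_ten)  # 240
```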

reduce(lambda lst,item: lst + [(item[0], item[1], lst[-1][1] + lst[-1][-1])], 
       [(key, len(list(it))) for (key, it) in itertools.groupby(list_one)], 
       [(0,0,0)])[1:]

If we don't want to add an initial value, we can sum the numbers of occurrences that have appeared so far.

reduce(lambda lst,item: lst + [(item[0], item[1], sum(map(lambda i: i[1], lst)))],
       [(key, len(list(it))) for (key, it) in itertools.groupby(list_one)], [])

Of course, this gives us all the numbers. If we want only the 9's, we can wrap the whole thing in filter:

filter(lambda item: item[0] == 9, ... )
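
Putting the pieces together, here is a minimal runnable sketch of the first variant for Python 3, where reduce has to be imported from functools and filter returns a lazy iterator (hence the list() call). The names runs, indexed and nines are introduced here for readability and are not part of the original one-liner.

```python
from functools import reduce
from itertools import groupby

list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]

# Step 1: (key, run_length) for every run of adjacent equal items.
runs = [(key, len(list(group))) for key, group in groupby(list_one)]

# Step 2: append the start index, computed as the previous run's
# length plus the previous run's start index. The (0, 0, 0) seed
# is dropped again by the [1:] slice.
indexed = reduce(
    lambda lst, item: lst + [(item[0], item[1], lst[-1][1] + lst[-1][2])],
    runs,
    [(0, 0, 0)],
)[1:]

# Step 3: keep only the runs whose key is 9.
nines = list(filter(lambda item: item[0] == 9, indexed))
print(nines)
# [(9, 3, 2), (9, 4, 7)]
```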

Steinar Lima · Answer

Judging by your expected output, give this a try:

from itertools import groupby

list_one=[1,2,9,9,9,3,4,9,9,9,9,2]
list_two=["A","B","C","D","A","E","F","G","H","Word1","Word2"]
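data = zip(list_one, list_two)
i = 0  # index of the first element of the current group
out = []

for key, group in groupby(data, lambda x: x[0]):
    number, word = next(group)  # first (number, word) pair of the run
    elems = len(list(group)) + 1  # remaining items plus the one consumed above
    if number == 9 and elems > 1:
        out.append((key, elems, i))
    i += elems

print(out)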

Output:

[(9, 3, 2), (9, 4, 7)]

But if you really wanted an output like this:

[(9, 3, 'C'), (9, 4, 'G')]

then look at this snippet:

from itertools import groupby

list_one=[1,2,9,9,9,3,4,9,9,9,9,2]
list_two=["A","B","C","D","A","E","F","G","H","Word1","Word2"]
data = zip(list_one, list_two)
out = []

for key, group in groupby(data, lambda x: x[0]):
    number, word = next(group)
    elems = len(list(group)) + 1
    if number == 9 and elems > 1:
        out.append((key, elems, word))

print(out)
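
As a possible alternative (not part of the answer above), grouping over enumerate(list_one) carries each item's position into the group, so no running counter is needed. The sketch below also slices list_two to collect every word under a run, assuming that is what the question ultimately wants; the names positions, start and words are illustrative.

```python
from itertools import groupby

list_one = [1, 2, 9, 9, 9, 3, 4, 9, 9, 9, 9, 2]
list_two = ["A", "B", "C", "D", "A", "E", "F", "G", "H", "Word1", "Word2"]

out = []
# Pair every value with its position, then group on the value only.
for key, group in groupby(enumerate(list_one), key=lambda pair: pair[1]):
    positions = [index for index, _ in group]
    if key == 9 and len(positions) > 1:
        start = positions[0]
        # Slicing tolerates list_two being shorter than list_one.
        words = list_two[start:start + len(positions)]
        out.append((key, len(positions), start, words))

print(out)
# [(9, 3, 2, ['C', 'D', 'A']), (9, 4, 7, ['G', 'H', 'Word1', 'Word2'])]
```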

Tags:

python

functional-programming

list-comprehension

zip

itertools

Aous1000
