Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Split a list into increasing sequences using itertools

I have a list with mixed sequences like

[1,2,3,4,5,2,3,4,1,2]

I want to know how I can use itertools to split the list into increasing sequences cutting the list at decreasing points. For instance the above would output

[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]

this has been obtained by noting that the sequence decreases at 2 so we cut the first bit there and another decrease is at one cutting again there.

Another example is with the sequence

[3,2,1]

the output should be

[[3], [2], [1]]

In the event that the given sequence is increasing we return the same sequence. For example

[1,2,3]

returns the same result. i.e

[[1, 2, 3]]

For a repeating list like

[ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]

the output should be

[[1, 2, 2, 2], [1, 2, 3, 3], [1, 1, 1, 2, 3, 4], [1, 2, 3, 4, 5, 6]]

What I did to achieve this is define the following function

def splitter (L):
    result = []
    tmp = 0
    initialPoint=0
    for i in range(len(L)):
       if (L[i] < tmp):
           tmpp = L[initialPoint:i]
           result.append(tmpp)
           initialPoint=i
       tmp = L[i]
    result.append(L[initialPoint:])
    return result

The function is working 100% but what I need is to do the same with itertools so that I can improve efficiency of my code. Is there a way to do this with itertools package to avoid the explicit looping?

like image 961
Mafeni Alpha Avatar asked Feb 14 '17 22:02

Mafeni Alpha


3 Answers

With numpy, you can use numpy.split, this requires the index as split positions; since you want to split where the value decreases, you can use numpy.diff to calculate the difference and check where the difference is smaller than zero and use numpy.where to retrieve corresponding indices, an example with the last case in the question:

import numpy as np
lst = [ 1, 2,2,2, 1, 2, 3, 3, 1,1,1, 2, 3, 4, 1, 2, 3, 4, 5, 6]
np.split(lst, np.where(np.diff(lst) < 0)[0] + 1)

# [array([1, 2, 2, 2]),
#  array([1, 2, 3, 3]),
#  array([1, 1, 1, 2, 3, 4]),
#  array([1, 2, 3, 4, 5, 6])]
like image 196
Psidom Avatar answered Oct 01 '22 19:10

Psidom


Psidom already has you covered with a good answer, but another NumPy solution would be to use scipy.signal.argrelmax to acquire the local maxima, then np.split.

from scipy.signal import argrelmax
arr = np.random.randint(1000, size=10**6)
splits = np.split(arr, argrelmax(arr)[0]+1)
like image 41
miradulo Avatar answered Oct 01 '22 18:10

miradulo


Assume your original input array:

a = [1, 2, 3, 4, 5, 2, 3, 4, 1, 2]

First find the places where the splits shall occur:

p = [ i+1 for i, (x, y) in enumerate(zip(a, a[1:])) if x > y ]

Then create slices for each such split:

print [ a[m:n] for m, n in zip([ 0 ] + p, p + [ None ]) ]

This will print this:

[[1, 2, 3, 4, 5], [2, 3, 4], [1, 2]]

I propose to use more speaking names than p, n, m, etc. ;-)

like image 38
Alfe Avatar answered Oct 01 '22 17:10

Alfe