Extract subarray between certain value in Python

I have a list of values that is the result of merging many files. I need to pad some of the values. I know that each sub-section begins with the value -1. I am trying to extract each sub-array between the -1s in the main array by iterating over it.

For example, suppose this is the main list:

-1 1 2 3 4 5 7 -1 4 4 4 5 6 7 7 8 -1 0 2 3 5 -1

I would like to extract the values between the -1s:

list_a = 1 2 3 4 5 7
list_b = 4 4 4 5 6 7 7 8
list_c = 0 2 3 5 ...
list_n = a1 a2 a3 ... aM

I have extracted the indices for each -1 by searching through the main list:

from itertools import count, izip  # Python 2
minus_ones = [i for i, j in izip(count(), q) if j == -1]

I also assembled them as pairs using a common recipe:

from itertools import izip, tee  # Python 2

def pairwise(iterable):
    # Yield overlapping pairs: (s0, s1), (s1, s2), ...
    a, b = tee(iterable)
    next(b, None)
    return izip(a, b)

for index in pairwise(minus_ones):
    print index

The next step I am trying to do is grab the values between the index pairs, for example:

list_b: (7, 16) -> 4 4 4 5 6 7 7 8

so that I can then operate on those values (I will add a fixed integer to each value in each sub-array).

user2221667 asked Jan 30 '14 23:01


2 Answers

You mentioned numpy in the tags. If you're using it, have a look at np.split.

For example:

import numpy as np

x = np.array([-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2,
               3, 5, -1])
arrays = np.split(x, np.where(x == -1)[0])
arrays = [item[1:] for item in arrays if len(item) > 1]

This yields:

[array([1, 2, 3, 4, 5, 7]),
 array([4, 4, 4, 5, 6, 7, 7, 8]),
 array([0, 2, 3, 5])]

What's going on is that np.where yields an array (actually a tuple of arrays, hence the np.where(...)[0]) of the indices where the given expression is true. We can then pass these indices to np.split to get a sequence of arrays.

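However, each piece after the first will still begin with the -1 it was split at, and the first piece will be empty if the sequence starts with -1. Therefore, we need to filter these out: the list comprehension drops the leading -1 from each piece and skips any piece with fewer than two elements.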
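To make the filtering step concrete, here is a small sketch that exposes the intermediate results for the same x. The names idx and pieces are introduced here only for illustration.

```python
import numpy as np

x = np.array([-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2,
              3, 5, -1])

# With a single condition, np.where returns a tuple holding one index
# array per dimension; x is 1-D, so element [0] has the positions of -1.
idx = np.where(x == -1)[0]
print(idx)  # positions 0, 7, 16 and 21

# np.split cuts *before* each index, so every piece after the first
# starts with the -1 it was cut at.
pieces = np.split(x, idx)
for piece in pieces:
    print(piece)

# Drop the leading -1 from each piece and skip pieces that held
# nothing but the delimiter (or nothing at all).
arrays = [piece[1:] for piece in pieces if len(piece) > 1]
print(arrays)
```

With this input, pieces has five entries: an empty array, three arrays that each start with -1, and a final array containing only the trailing -1.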

If you're not already using numpy, though, your (or @DSM's) itertools solution is probably a better choice.
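For completeness, the index-pair approach from the question could probably be finished with plain slicing. This is only a sketch, written for Python 3 (where the built-in zip replaces izip); the names sub_arrays, offset, and padded, and the offset value itself, are assumptions made for illustration.

```python
from itertools import tee


def pairwise(iterable):
    """Yield overlapping pairs: (s0, s1), (s1, s2), ..."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)


q = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]
minus_ones = [i for i, value in enumerate(q) if value == -1]

# Slice strictly between each pair of delimiter positions.
sub_arrays = [q[start + 1:stop] for start, stop in pairwise(minus_ones)]

# Hypothetical fixed offset added to every value in every sub-array.
offset = 10
padded = [[value + offset for value in sub] for sub in sub_arrays]
print(padded)
```

Note that two adjacent -1s would produce an empty sub-list here, and any values after the last -1 would be dropped, so this relies on the list being delimited as in the question.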

Joe Kington answered Sep 18 '22 20:09


If you only need the groups themselves and don't care about the indices of the groups (you could always reconstruct them, after all), I'd use itertools.groupby:

>>> from itertools import groupby
>>> seq = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]
>>> groups = [list(g) for k,g in groupby(seq, lambda x: x != -1) if k]
>>> groups
[[1, 2, 3, 4, 5, 7], [4, 4, 4, 5, 6, 7, 7, 8], [0, 2, 3, 5]]
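If the indices do turn out to matter, one way to keep them is to group (index, value) pairs rather than bare values. This is a sketch of that idea, not the only way to do it; the name spans and the (start, values) layout are assumptions chosen for illustration.

```python
from itertools import groupby

seq = [-1, 1, 2, 3, 4, 5, 7, -1, 4, 4, 4, 5, 6, 7, 7, 8, -1, 0, 2, 3, 5, -1]

# Group (index, value) pairs instead of bare values so that each group
# remembers where it started in the original sequence.
spans = []
for is_data, group in groupby(enumerate(seq), lambda pair: pair[1] != -1):
    if is_data:
        items = list(group)
        start = items[0][0]  # index of the first value in this group
        values = [value for _, value in items]
        spans.append((start, values))

for start, values in spans:
    print(start, values)
```

Each entry in spans pairs the position of a group's first element with the group's values, so the end index can be recovered as start + len(values).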

I missed the numpy tags, though: if you're working with numpy arrays, using np.split/np.where is a better choice.

DSM answered Sep 18 '22 20:09
