Working in pandas, I have created a list of tuples representing a range of rows around a given set of index points:
mask = df.loc[df['Illustration']=='Example'].index
idxlist = [(i-1,i+10) for i in mask]
idxlist
[(2, 13), (48, 59), (120, 131),...]
I want to use the values from this list of tuples as the range slice indices to call np.r_, which takes ranges written like this:
df.iloc[np.r_[2:13, 48:59, 120:131,...]]
I can pass my list of tuples through the slice function:
slicelist = [slice(*(idxlist[j])) for j in range(len(idxlist))]
But slice and np.r_ are not (as far as I gather) mutually compatible.
So I'm looking for either a way to convert a list of tuples into a list of slice ranges or a way to generate a list of slice ranges using a list comprehension, similar to the comprehensions above. I know I can find some very inelegant ways of doing this, but I'm looking for the most Pythonic way, preferably without a loop. Thanks.
In [208]: alist = [(2, 13), (48, 59), (120, 131)]
r_ turns a list of slices into indices, using index notation (it's actually a class instance with a __getitem__ method). The interpreter converts the n:m into slice(n,m), but r_ then converts that into arange(n,m).
In [209]: np.r_[2:13, 48:59, 120:131]
Out[209]:
array([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 48, 49,
50, 51, 52, 53, 54, 55, 56, 57, 58, 120, 121, 122, 123,
124, 125, 126, 127, 128, 129, 130])
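To make that two-step conversion visible, here is a minimal sketch using two toy classes. ShowKey and MiniR are hypothetical names made up for illustration; they are not part of NumPy, and MiniR handles only plain slices, unlike the real r_.

```python
import numpy as np

class ShowKey:
    """Hypothetical helper: returns whatever the interpreter passes to __getitem__."""
    def __getitem__(self, key):
        return key

class MiniR:
    """Hypothetical, simplified imitation of np.r_ (plain slices only)."""
    def __getitem__(self, key):
        # A single slice arrives bare; several slices arrive as a tuple.
        if not isinstance(key, tuple):
            key = (key,)
        # Expand each slice with arange (missing start/step default to 0 and 1),
        # then join the pieces into one index array.
        return np.concatenate(
            [np.arange(s.start or 0, s.stop, s.step or 1) for s in key]
        )

show = ShowKey()
mini_r = MiniR()

key = show[2:13, 48:59]        # a tuple of slice objects built by the interpreter
result = mini_r[2:13, 48:59]   # the same slices expanded and concatenated
```

Here key is the tuple of slice objects that any __getitem__ receives, and result should match what np.r_[2:13, 48:59] returns.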
s_ can use the same input, but makes slice objects:
In [211]: np.s_[2:13, 48:59, 120:131]
Out[211]: (slice(2, 13, None), slice(48, 59, None), slice(120, 131, None))
which is the same as (and with the same iteration):
In [212]: [slice(i,j) for i,j in alist]
Out[212]: [slice(2, 13, None), slice(48, 59, None), slice(120, 131, None)]
Replacing slice with arange:
In [213]: [np.arange(i,j) for i,j in alist]
Out[213]:
[array([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]),
array([48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58]),
array([120, 121, 122, 123, 124, 125, 126, 127, 128, 129, 130])]
and joining them produces the same thing as r_:
In [214]: np.hstack(_)
Out[214]:
array([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 48, 49,
50, 51, 52, 53, 54, 55, 56, 57, 58, 120, 121, 122, 123,
124, 125, 126, 127, 128, 129, 130])
r_ is pretty, but computationally it does just the same thing. There's nothing inelegant or un-Pythonic about a list comprehension like this.
Since each range has the same length (11 values), we could also use linspace:
In [220]: np.linspace((2,48,120),(13,59,131),11,endpoint=False, dtype=int)
Out[220]:
array([[ 2, 48, 120],
[ 3, 49, 121],
[ 4, 50, 122],
[ 5, 51, 123],
[ 6, 52, 124],
[ 7, 53, 125],
[ 8, 54, 126],
[ 9, 55, 127],
[ 10, 56, 128],
[ 11, 57, 129],
[ 12, 58, 130]])
In [221]: np.hstack(_.T)
Out[221]:
array([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 48, 49,
50, 51, 52, 53, 54, 55, 56, 57, 58, 120, 121, 122, 123,
124, 125, 126, 127, 128, 129, 130])
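Another loop-free option for equal-length ranges is plain broadcasting: add a column of start values to a row of offsets. This is a sketch that assumes every tuple in alist spans the same number of values; the names starts, length and idx are introduced here for illustration.

```python
import numpy as np

alist = [(2, 13), (48, 59), (120, 131)]

# Start of each range, as a 1-D array
starts = np.array([i for i, _ in alist])

# Common range length (11 here); assumes all ranges are equally long
length = alist[0][1] - alist[0][0]

# Column of starts + row of offsets gives one range per row;
# ravel flattens row by row, so the ranges stay in order.
idx = (starts[:, None] + np.arange(length)).ravel()
```

Because each row already holds one complete range, no transpose is needed before flattening.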
You could still use r_ and alist (but using arange is more direct):
In [225]: np.r_.__getitem__(tuple([slice(i,j) for i,j in alist]))
Out[225]:
array([ 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 48, 49,
50, 51, 52, 53, 54, 55, 56, 57, 58, 120, 121, 122, 123,
124, 125, 126, 127, 128, 129, 130])
np.r_ is just a concatenate disguised as indexing (with some added bells):
np.r_[tuple([np.arange(i,j) for i,j in alist])]
np.hstack([np.arange(i,j) for i,j in alist])
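Tying this back to the DataFrame in the question, a minimal sketch might look like the following. The data here is made up, and it assumes df has a default RangeIndex so that the index labels in mask are also valid positions for iloc. The bounds check and np.unique are additions not in the question; they cover ranges that run off either end of the frame or overlap, and np.unique also returns the positions sorted.

```python
import numpy as np
import pandas as pd

# Hypothetical data with a default RangeIndex (labels equal positions)
df = pd.DataFrame({'Illustration': ['x'] * 40})
df.loc[[3, 20], 'Illustration'] = 'Example'

mask = df.loc[df['Illustration'] == 'Example'].index
idxlist = [(i - 1, i + 10) for i in mask]

# Expand each (start, stop) pair and join into one array of positions
rows = np.hstack([np.arange(i, j) for i, j in idxlist])

# Keep only positions inside the frame, and drop duplicates from overlaps
rows = np.unique(rows[(rows >= 0) & (rows < len(df))])

subset = df.iloc[rows]
```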