I'm fairly new to Python, and very new to Numpy.
So far I have an ndarray of data, which is a list of lists, and I have an array of indexes. How can I remove every row whose index is in the array of indexes and put that row into a new ndarray?
For example, my data looks like
[[1 1 1 1]
[2 3 4 5]
[6 7 8 9]
[2 2 2 2]]
and my index array is
[0 2]
I would want to get two arrays, one of
[[1 1 1 1]
[6 7 8 9]]
and
[[2 3 4 5]
[2 2 2 2]]
Extended example, for clarity: my data looks like
[[1 1 1 1]
[2 3 4 5]
[6 7 8 9]
[2 2 2 2]
[3 3 3 3]
[4 4 4 4]
[5 5 5 5]
[6 6 6 6]
[7 7 7 7]]
and my index array is
[0 2 3 5]
I would want to get two arrays, one of
[[1 1 1 1]
[6 7 8 9]
[2 2 2 2]
[4 4 4 4]]
and
[[2 3 4 5]
[3 3 3 3]
[5 5 5 5]
[6 6 6 6]
[7 7 7 7]]
I have looked into numpy.take() and numpy.choose() but I could not figure it out. Thanks!
edit: I should also add that my input data and index array are of variable length, depending on the data-sets. I would like a solution that would work for variable sizes.
Use the hsplit() function to split a 2-D array into multiple sub-arrays horizontally (column-wise). Note: vsplit() and dsplit() are the splitting counterparts of vstack() and dstack().
The numpy.array_split() function splits an array into multiple sub-arrays; unlike numpy.split(), it does not require the sub-arrays to be of equal size. A NumPy array is a data structure that stores multiple items of the same type together.
The vsplit() function splits an array into multiple sub-arrays vertically (row-wise). Note: vsplit is equivalent to split with axis=0 (the default); the array is always split along the first axis regardless of the array dimension.
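To make the difference between these helpers concrete, here is a small sketch on a made-up 6x4 array (the array and the variable names are only for illustration, they are not from the question):

```python
import numpy as np

a = np.arange(24).reshape((6, 4))

# vsplit cuts along rows (axis 0): three blocks of shape (2, 4).
row_blocks = np.vsplit(a, 3)

# hsplit cuts along columns (axis 1): two blocks of shape (6, 2).
col_blocks = np.hsplit(a, 2)

# array_split tolerates uneven division: 6 rows into 4 parts.
uneven = np.array_split(a, 4, axis=0)

row_shapes = [blk.shape for blk in row_blocks]
col_shapes = [blk.shape for blk in col_blocks]
uneven_sizes = [len(blk) for blk in uneven]
```

Note that all of these cut the array into contiguous blocks, so on their own they do not pick out arbitrary rows such as 0, 2, 3 and 5.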
Sorry, so you already have take and basically need the opposite of take; you can get that nicely with some indexing:
import numpy as np

a = np.arange(16).reshape((8, 2))
b = [2, 6, 7]
mask = np.ones(len(a), dtype=bool)  # start with every row marked as kept
mask[b] = False  # unmark the rows listed in b
x, y = a[b], a[mask]  # instead of a[b] you could also do a[~mask]
print(x)
[[ 4  5]
 [12 13]
 [14 15]]
print(y)
[[ 0  1]
 [ 2  3]
 [ 6  7]
 [ 8  9]
 [10 11]]
So you just create a boolean mask that is True wherever b would not select from a.
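Applied to the extended example from the question, the mask idea could be wrapped up roughly like this. The helper name split_rows is made up for illustration, and it assumes the indexes are valid row numbers:

```python
import numpy as np

def split_rows(data, indexes):
    """Return (selected, remaining) rows of a 2-D array.

    Hypothetical helper: `indexes` is assumed to hold valid row numbers.
    """
    data = np.asarray(data)
    indexes = np.asarray(indexes, dtype=np.intp)  # also handles an empty list
    mask = np.zeros(len(data), dtype=bool)
    mask[indexes] = True           # True for every row named in indexes
    return data[mask], data[~mask]

data = np.array([[1, 1, 1, 1],
                 [2, 3, 4, 5],
                 [6, 7, 8, 9],
                 [2, 2, 2, 2],
                 [3, 3, 3, 3],
                 [4, 4, 4, 4],
                 [5, 5, 5, 5],
                 [6, 6, 6, 6],
                 [7, 7, 7, 7]])
selected, remaining = split_rows(data, [0, 2, 3, 5])
```

Because both halves come from the mask, the rows come back in their original order even if the index list is unsorted or has duplicates. If you need the selected rows in the order given by the index list, use data[indexes] for that half instead.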
There is actually already np.split which handles this (it's pure Python code, but that should not really bother you):
>>> import numpy as np
>>> a = np.arange(16).reshape((8,2))
>>> b = [2, 6]
>>> print(np.split(a, b, axis=0)) # plus some extra formatting
[array([[0, 1],
[2, 3]]),
array([[ 4, 5],
[ 6, 7],
[ 8, 9],
[10, 11]]),
array([[12, 13],
[14, 15]])]
split always includes the slices 0:b[0] and b[-1]: in its result; I guess you can just slice them out of the results for simplicity. If you have regular splits of course (all the same size), you may just be better off using reshape.
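As a rough sketch of both remarks, using the same a and b as above (the names pieces, inner and blocks are just for illustration):

```python
import numpy as np

a = np.arange(16).reshape((8, 2))
b = [2, 6]

pieces = np.split(a, b, axis=0)   # rows 0:2, 2:6 and 6:8
inner = pieces[1:-1]              # drop the leading and trailing pieces

# With equally sized chunks, reshape gives the same grouping in one array.
blocks = a.reshape((4, 2, 2))     # four blocks of two rows each
```

Here inner holds only the rows between the first and last split point, and blocks[i] is the i-th pair of rows.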
Note also that np.split returns views. So if you change those arrays you change the original unless you call .copy first.
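A quick way to see the difference, assuming the same small array as before: np.shares_memory reports whether two arrays overlap in memory. The boolean-mask approach, by contrast, uses advanced indexing, which hands back copies.

```python
import numpy as np

a = np.arange(16).reshape((8, 2))

first = np.split(a, [2, 6], axis=0)[0]
first[0, 0] = 99                      # writes through to a
split_is_view = np.shares_memory(a, first)

safe = np.split(a, [2, 6], axis=0)[1].copy()
safe[0, 0] = -1                       # a is left untouched

mask = np.ones(len(a), dtype=bool)
mask[[2, 6, 7]] = False
masked_is_view = np.shares_memory(a, a[mask])
```

So with the mask solution you can modify x and y freely, while with np.split you need the explicit .copy if the original must stay intact.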