
np.delete and np.s_. What's so special about np.s_?

I don't really understand why regular indexing can't be used for np.delete. What makes np.s_ so special?

For example, take this code, which deletes some of the rows of this array:

inlet_names = np.delete(inlet_names, np.s_[1:9], axis = 0)

Why can't I simply use regular indexing and do..

inlet_names = np.delete(inlet_names, [1:9], axis = 0)

or

inlet_names = np.delete(inlet_names, inlet_names[1:9], axis = 0)

From what I can gather, np.s_ is the same as np.index_exp except it doesn't return a tuple, but both can be used anywhere in Python code.

Then when I look into the np.delete function, it indicates that you can use something like [1,2,3] to delete those specific indexes along the entire array. So what's preventing me from using something similar to delete certain rows or columns from the array?

I'm simply assuming that this type of indexing is read as something else in np.delete so you need to use np.s_ in order to specify, but I can't get to the bottom of what exactly it would be reading it as because when I try the second piece of code it simply returns "invalid syntax". Which is weird because this code works...

inlet_names = np.delete(inlet_names, [1,2,3,4,5,6,7,8,9], axis = 0)   

So I guess the answer could be that np.delete only accepts a list of the indexes that you would like to delete, and that np.s_ returns a list of the indexes that you specify for the slice.

Just could use some clarification and some corrections on anything I just said about the functions that may be wrong, because a lot of this is just my take, the documents don't exactly explain everything that I was trying to understand. I think I'm just overthinking this, but I would like to actually understand it, if someone could explain it.

Greg Castaldi asked Sep 20 '15 18:09




1 Answer

np.delete is not doing anything unique or special. It just returns a copy of the original array with some items missing. Most of the code just interprets the inputs in preparation to make this copy.

What you are asking about is the obj parameter:

obj : slice, int or array of ints

In simple terms, np.s_ lets you supply a slice using the familiar : syntax. The x:y notation cannot be used as a function parameter.

Let's try your alternatives (you allude to these in results and errors, but they are buried in the text):

In [213]: x=np.arange(10)*2   # some distinctive values

In [214]: np.delete(x, np.s_[3:6])
Out[214]: array([ 0,  2,  4, 12, 14, 16, 18])

So delete with s_ removes a range of values, namely 6 8 10, the items at indices 3 through 5.

In [215]: np.delete(x, [3:6])
  File "<ipython-input-215-0a5bf5cc05ba>", line 1
    np.delete(x, [3:6])
                   ^
SyntaxError: invalid syntax

Why the error? Because 3:6 is slice notation, which Python accepts only inside the square brackets of an indexing expression, where it automatically translates it into a slice object. [3:6] on its own is parsed as a list literal, and np.delete(...) is a function call, so neither is an indexing context. np.delete(x, 3:6) is bad for the same reason, and even s_[[3:6]] has problems. Note that this is a syntax error, something the interpreter catches before doing any calculations or function calls.
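To see that the failure happens at parse time, here is a small sketch that only compiles the expressions and never evaluates them. Neither np nor x needs to exist for the error to appear; the "<demo>" filename is just a placeholder.

```python
# Compile (but never run) the two spellings from the question.
bad_source = "np.delete(x, [3:6])"
good_source = "np.delete(x, np.s_[3:6])"

try:
    compile(bad_source, "<demo>", "eval")
    bad_is_syntax_error = False
except SyntaxError:
    # Raised by the parser; no NumPy code has been executed.
    bad_is_syntax_error = True

# The np.s_ version is valid syntax, so compiling it succeeds.
good_code = compile(good_source, "<demo>", "eval")

print(bad_is_syntax_error)  # True
```

Because nothing is evaluated, this confirms the problem lies in the Python grammar rather than in anything np.delete does.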

In [216]: np.delete(x, slice(3,6))
Out[216]: array([ 0,  2,  4, 12, 14, 16, 18])

A slice works instead of s_; in fact that is what s_ produces.
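As a rough illustration of why s_ can do this, the sketch below defines a tiny class whose __getitem__ simply returns what the brackets received. SliceMaker and make are hypothetical names for this example; this is not NumPy's actual source, only the same idea in miniature.

```python
import numpy as np

class SliceMaker:
    """Hypothetical stand-in for np.s_: hand back whatever the brackets received."""
    def __getitem__(self, item):
        return item

make = SliceMaker()
x = np.arange(10) * 2

made = make[3:6]                         # Python builds slice(3, 6, None) for us
same_as_s_ = (made == np.s_[3:6])        # compare with NumPy's helper
same_as_slice = (made == slice(3, 6))    # and with an explicit slice() call
result = np.delete(x, made)

print(made)      # slice(3, 6, None)
print(result)    # [ 0  2  4 12 14 16 18]
```

The only job of such an object is to put the 3:6 inside square brackets, where Python is willing to turn it into a slice.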

In [233]: np.delete(x, [3,4,5])
Out[233]: array([ 0,  2,  4, 12, 14, 16, 18])

A list also works, though it works in a different way (see below).

In [217]: np.delete(x, x[3:6])
Out[217]: array([ 0,  2,  4,  6,  8, 10, 14, 18])

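This works, but produces a different result, because x[3:6] is not the same as range(3,6): it holds the values 6, 8, 10, which delete then treats as indices. (Index 10 is out of bounds for this array; the NumPy version used here ignored it, while newer releases raise an IndexError instead.) Also, np.delete does not work like the list delete. It deletes by index, not by matching value.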
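If deleting by matching value is what is actually wanted, one possible approach (not something np.delete offers) is a boolean mask built with np.isin. The variable names below are illustrative.

```python
import numpy as np

x = np.arange(10) * 2
values_to_drop = x[3:6]        # the values 6, 8, 10

# Keep every element whose value is NOT among values_to_drop.
by_value = x[~np.isin(x, values_to_drop)]

# Deleting by position, for comparison.
by_index = np.delete(x, np.s_[3:6])

print(by_value)   # [ 0  2  4 12 14 16 18]
print(by_index)   # [ 0  2  4 12 14 16 18]
```

The two results agree here only because the dropped values happen to sit at positions 3 through 5.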

np.index_exp fails for the same reason that np.delete(x, (slice(3,6),)) does. 1, [1] and (1,) are all valid and remove one item. Even '1', the string, works (in the NumPy version used for this session). delete parses this argument and, at this level, expects something that can be turned into an integer array (obj.astype(intp)). (slice(None),) is not a slice; it is a 1-item tuple, so it is handled in a different spot in the delete code. The result is a TypeError produced by something that delete calls, which is very different from the SyntaxError above. In theory delete could extract the slice from the tuple and proceed as in the s_ case, but the developers did not choose to handle this variation.

A quick study of the code shows that np.delete uses 2 distinct copying methods - by slice and by boolean mask. If the obj is a slice, as in our example, it does (for 1d array):

out = np.empty(7, dtype=x.dtype)   # room for the 7 surviving items
out[0:3] = x[0:3]                  # items before the slice
out[3:7] = x[6:10]                 # items after the slice

But with [3,4,5] (instead of the slice) it does:

keep = np.ones((10,), dtype=bool)  # start by keeping everything
keep[[3,4,5]] = False              # mark the items to drop
out = x[keep]                      # boolean indexing makes the copy

Same result, but with a different construction method. x[np.array([1,1,1,0,0,0,1,1,1,1],bool)] does the same thing.
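Putting the two strategies side by side, here is a minimal 1-D sketch. delete_1d is a hypothetical helper written for this answer, not NumPy's implementation; it assumes obj is either a slice or something usable as an integer index.

```python
import numpy as np

def delete_1d(arr, obj):
    """Hypothetical 1-D illustration of the two copy strategies described above."""
    n = arr.shape[0]
    if isinstance(obj, slice):
        start, stop, step = obj.indices(n)
        if step == 1:
            # Slice path: copy the block before and the block after the gap.
            stop = max(start, stop)          # an empty slice removes nothing
            out = np.empty(n - (stop - start), dtype=arr.dtype)
            out[:start] = arr[:start]
            out[start:] = arr[stop:]
            return out
        # Other steps: fall back to explicit indices and the mask path.
        obj = np.arange(start, stop, step)
    # Mask path: flag the items to drop, then use boolean indexing.
    keep = np.ones(n, dtype=bool)
    keep[obj] = False
    return arr[keep]

x = np.arange(10) * 2
from_slice = delete_1d(x, np.s_[3:6])
from_list = delete_1d(x, [3, 4, 5])
from_step = delete_1d(x, np.s_[::2])
print(from_slice)   # [ 0  2  4 12 14 16 18]
```

Both paths produce a new array and leave x untouched.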

In fact boolean indexing or masking like this is more common than np.delete, and generally just as powerful.
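Applied to the row deletion in the question, masking might look like the sketch below. The inlet_names array here is a made-up 12x2 stand-in, since the original data is not shown.

```python
import numpy as np

# Hypothetical stand-in for the question's array: 12 rows, 2 columns.
inlet_names = np.arange(24).reshape(12, 2)

# Boolean mask over rows: True means "keep this row".
keep = np.ones(inlet_names.shape[0], dtype=bool)
keep[1:9] = False              # ordinary slicing works here, since this IS indexing

masked = inlet_names[keep]
deleted = np.delete(inlet_names, np.s_[1:9], axis=0)

print(masked.shape)                     # (4, 2)
print(np.array_equal(masked, deleted))  # True
```

Note that the plain 1:9 syntax is allowed when building the mask, because it appears inside square brackets.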


From the lib/index_tricks.py source file:

index_exp = IndexExpression(maketuple=True)
s_ = IndexExpression(maketuple=False)

They are slightly different versions of the same thing, and both are just convenience objects.

In [196]: np.s_[1:4]
Out[196]: slice(1, 4, None)
In [197]: np.index_exp[1:4]
Out[197]: (slice(1, 4, None),)
In [198]: np.s_[1:4, 5:10]
Out[198]: (slice(1, 4, None), slice(5, 10, None))
In [199]: np.index_exp[1:4, 5:10]
Out[199]: (slice(1, 4, None), slice(5, 10, None))

The maketuple business applies only when there is a single item, a slice or index.
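A short check of that single-item difference, and of the fact that both forms index an array the same way. The array a is just an example.

```python
import numpy as np

# With a single item, only index_exp wraps it in a tuple.
single_s = np.s_[3]              # 3
single_ie = np.index_exp[3]      # (3,)

a = np.arange(10) * 2

# Used as an index, both forms select the same elements as a[1:4].
via_s = a[np.s_[1:4]]
via_ie = a[np.index_exp[1:4]]

print(single_s, single_ie)   # 3 (3,)
print(via_s, via_ie)         # [2 4 6] [2 4 6]
```

So the tuple wrapper is harmless for indexing, but it matters for np.delete, whose obj parameter is documented as a slice, an int or an array of ints.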

hpaulj answered Sep 27 '22 18:09
