I've been looking for a way to efficiently check for duplicates in a numpy array and stumbled upon a question that contained an answer using this code.
What does this line mean in numpy?
s[s[1:] == s[:-1]]
I would like to understand the code before applying it. I looked in the NumPy docs but had trouble finding this information.
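In a reshape call, -1 stands for an "unknown dimension" which is inferred from the other dimensions, for example when reshaping a matrix set up like this: a = numpy.matrix([[1, 2, 3, 4], [5, 6, 7, 8]]). In a slice such as s[:-1], however, -1 is simply a negative index that refers to the last element.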
Slicing in Python means taking the elements from one given index up to (but not including) another given index. We pass a slice instead of a single index, like this: [start:end]. We can also define the step, like this: [start:end:step].
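To make the [start:end] and [start:end:step] forms concrete, here is a small sketch on a plain Python list. The list and variable names are made up for illustration.

```python
# Illustrative data; any list would do.
values = [10, 20, 30, 40, 50, 60]

first_three = values[0:3]     # indices 0, 1, 2 (end index is excluded)
every_second = values[0:6:2]  # indices 0, 2, 4
all_but_first = values[1:]    # omitted end means "up to the end"
all_but_last = values[:-1]    # negative end counts back from the end

print(first_three)
print(every_second)
print(all_but_first)
print(all_but_last)
```

The last two forms are the ones used in the expression from the question.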
Slicing an array: You can slice a numpy array in a similar way to slicing a list - except you can do it in more than one dimension. The array you get back when you slice a numpy array is a view of the original array: it is the same data, not a copy, so changes made through the slice show up in the original. (Indexing with a boolean or integer array, by contrast, returns a copy.)
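A minimal sketch of slicing in two dimensions and of the view behaviour, assuming a small example array (the names grid and block are illustrative only):

```python
import numpy as np

grid = np.arange(12).reshape(3, 4)  # 3 rows, 4 columns, values 0..11
block = grid[0:2, 1:3]              # rows 0-1 and columns 1-2

# A basic slice shares memory with the array it was taken from.
is_view = np.shares_memory(grid, block)

# Writing through the slice therefore changes the original array too.
block[0, 0] = 99

print(is_view)
print(grid[0, 1])
```

If an independent array is needed, calling .copy() on the slice avoids this sharing.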
The slices [1:] and [:-1] mean all but the first and all but the last elements of the array:
>>> import numpy as np
>>> s = np.array((1, 2, 2, 3)) # four element array
>>> s[1:]
array([2, 2, 3]) # last three elements
>>> s[:-1]
array([1, 2, 2]) # first three elements
Therefore the comparison generates an array of boolean comparisons between each element s[x] and its "neighbour" s[x+1], which will be one shorter than the original array (as the last element has no neighbour):
>>> s[1:] == s[:-1]
array([False, True, False], dtype=bool)
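Using that array as an index gets you the elements where the comparison is True, i.e. the elements that are the same as their neighbour. The mask is one element shorter than s, and recent NumPy versions raise an IndexError when a boolean index does not match the length of the array, so s[s[1:] == s[:-1]] as written only works on old versions. Indexing s[:-1] with the mask gives the same result:
>>> s[:-1][s[1:] == s[:-1]]
array([2])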
Note that this only identifies adjacent duplicate values.
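If the duplicates are not guaranteed to be adjacent, one common workaround is to sort first so that equal values end up next to each other. The following is only a sketch under that assumption; find_duplicates is a hypothetical helper name, not something from NumPy or from the original answer.

```python
import numpy as np

def find_duplicates(values):
    """Return the distinct values that occur more than once (hypothetical helper)."""
    s = np.sort(np.asarray(values))  # sorting puts equal values side by side
    mask = s[1:] == s[:-1]           # True where an element equals its right-hand neighbour
    # Index s[:-1] so the mask length matches, then collapse repeated hits.
    return np.unique(s[:-1][mask])

data = np.array([3, 1, 2, 3, 2, 3])
dupes = find_duplicates(data)
print(dupes)  # [2 3]
```

Sorting costs O(n log n), and the result lists each duplicated value once rather than once per extra occurrence.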