I want to remove some entries from a numpy array that is about a million entries long.
This code would do it but take a long time:
import numpy as np
a = np.array([1,45,23,23,1234,3432,-1232,-34,233])
for element in a:
    if element < -100 or element > 100:
        pass  # some delete command
Can I do this any other way?
To remove rows containing missing values from a 2D array, first build a boolean array that is True wherever a value is missing (for example with np.isnan()). Calling any() on it with the argument axis=1 tests each row and returns True for every row that contains at least one True. Apply the negation operator ~ to that result so that rows with no missing values are True, then index the array with it.
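As a minimal sketch of the row-filtering idea above: the array `data` and the names `rows_with_nan` and `clean` are made up for illustration, and NaN is assumed to be the marker for a missing value.

```python
import numpy as np

# Hypothetical 2D example; NaN marks a missing value.
data = np.array([[1.0, 2.0, 3.0],
                 [4.0, np.nan, 6.0],
                 [7.0, 8.0, 9.0]])

# True for each row that contains at least one NaN.
rows_with_nan = np.isnan(data).any(axis=1)

# Negate the mask to keep only rows with no missing values.
clean = data[~rows_with_nan]
print(clean)
```

Only the first and third rows should survive here, since the second row is the only one containing NaN.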
I'm assuming you mean a < -100 or a > 100, in which case the most concise way is to use logical (boolean) indexing.
a = a[(a >= -100) & (a <= 100)]
This is not exactly "deleting" the entries. Rather, it makes a copy of the array minus the unwanted values and assigns it to the variable that previously referred to the old array. Once that happens, the old array has no remaining references (assuming nothing else points to it) and is garbage collected, meaning its memory is freed.
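To make the copy-versus-delete point concrete, here is a small sketch. The extra name `original` is introduced only for illustration, to keep the old array alive so it can be inspected after the reassignment.

```python
import numpy as np

a = np.array([1, 45, 23, 23, 1234, 3432, -1232, -34, 233])
original = a  # second reference to the same array (illustration only)

# Boolean indexing builds a new array; the old one is left untouched.
a = a[(a >= -100) & (a <= 100)]

print(a)                              # the filtered copy
print(original.size)                  # still 9: nothing was deleted in place
print(np.shares_memory(a, original))  # False: the result has its own buffer
```

Without the extra `original` reference, the old array would become unreachable after the reassignment and its memory could be reclaimed.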
It's worth noting that this method does not use constant memory: since we make a copy of the array, it uses memory linear in the size of the array. This could be bad if your array is so huge that it reaches the limits of the memory on your machine. Actually going through and removing each element "in place", i.e. using constant extra memory, would be a very different operation, as elements in the array would need to be moved around and the block of memory resized. I'm not sure you can do this with a numpy array. However, one thing you can do to avoid copying the data is to use a numpy masked array:
import numpy.ma as ma
mx = ma.masked_array(a, mask = ((a < -100) | (a > 100)) )
All operations on the masked array will act as if the elements we "deleted" don't exist, but we didn't really delete them. They are still there in memory; there is just a record, now associated with the array, of which elements to skip, so we never need to make a copy of the array's data. Also, if we ever want our deleted values back, we can just remove the mask like so:
mx.mask = ma.nomask
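A short sketch of how the masked array behaves in practice, using the same example values. The names `kept_sum`, `kept_count` and `kept` are made up for illustration; `sum()`, `count()` and `compressed()` are standard masked-array methods.

```python
import numpy as np
import numpy.ma as ma

a = np.array([1, 45, 23, 23, 1234, 3432, -1232, -34, 233])
mx = ma.masked_array(a, mask=(a < -100) | (a > 100))

kept_sum = mx.sum()      # sums only the unmasked entries
kept_count = mx.count()  # number of unmasked entries
kept = mx.compressed()   # 1-D array holding just the unmasked values

print(kept_sum, kept_count, kept)

# Removing the mask brings the "deleted" values back.
mx.mask = ma.nomask
print(mx.sum())          # now sums every entry of the original data
```

Note that `compressed()` does produce a new array, so it is only worth calling when a plain, smaller array is actually needed.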
You can use boolean (mask) indexing with the inverted condition.
>>> a = np.array([1,45,23,23,1234,3432,-1232,-34,233])
>>> a[~((a < -100) | (a > 100))]
array([ 1, 45, 23, 23, -34])
>>> a[(a >= -100) & (a <= 100)]
array([ 1, 45, 23, 23, -34])
>>> a[abs(a) <= 100]
array([ 1, 45, 23, 23, -34])