
Fastest way to zero out low values in array?

So, let's say I have 100,000 float arrays with 100 elements each. I need the X highest values in each array, but only if they are greater than Y. Any element that does not meet both conditions should be set to 0. What would be the fastest way to do this in Python? Order must be maintained. Most of the elements are already set to 0.

sample variables:

    array = [.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0]
    highCountX = 3
    lowValY = .1

expected result:

array = [0, .25, 0, .15, .5, 0, 0, 0, 0, 0] 
David asked Oct 26 '09 09:10


2 Answers

This is a typical job for NumPy, which is very fast for these kinds of operations:

    import numpy

    array_np = numpy.asarray(array)
    low_values_flags = array_np < lowValY  # Where values are low
    array_np[low_values_flags] = 0  # All low values set to 0

Now, if you only need the highCountX largest elements, you can even "forget" the small elements (instead of setting them to 0 and sorting them) and only sort the list of large elements:

    import numpy

    array_np = numpy.asarray(array)
    # Note: the result is sorted in ascending order, not in the original order
    print(numpy.sort(array_np[array_np >= lowValY])[-highCountX:])
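Since the question asks for the original order to be kept, here is a minimal sketch of one way to combine both steps with numpy.argpartition, which finds the largest entries of each row without a full sort. The function name keep_top_values_2d is hypothetical, and the sketch assumes the 100,000 arrays can be stacked into one 2-D array with one row per array. As in the code above, values equal to lowValY are kept. If several values tie for the last kept position, which one survives is not specified.

```python
import numpy as np

def keep_top_values_2d(rows, high_count, low_val):
    """Hypothetical helper: for each row, keep the high_count largest values
    that are also >= low_val and set everything else to 0, in place order."""
    data = np.asarray(rows, dtype=float)
    n_cols = data.shape[1]
    k = min(high_count, n_cols)
    result = np.zeros_like(data)
    if k <= 0:
        return result
    # Column indices of the k largest values in each row (unordered).
    top_idx = np.argpartition(data, n_cols - k, axis=1)[:, n_cols - k:]
    # take_along_axis returns a copy, so the input is not modified.
    top_vals = np.take_along_axis(data, top_idx, axis=1)
    # Drop the candidates that do not reach the threshold.
    top_vals[top_vals < low_val] = 0
    # Write the surviving values back at their original positions.
    np.put_along_axis(result, top_idx, top_vals, axis=1)
    return result

array = [.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0]
print(keep_top_values_2d([array], 3, .1)[0])
```

For the sample data this reproduces the expected result from the question. numpy.take_along_axis and numpy.put_along_axis require NumPy 1.15 or later.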

Of course, sorting the whole array if you only need a few elements might not be optimal. Depending on your needs, you might want to consider the standard heapq module.
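A minimal sketch of the heapq idea, for a single plain Python list: heapq.nlargest picks the highCountX largest candidates without sorting everything, and carrying the index along lets the values go back to their original positions. The function name keep_top_values_heapq is hypothetical, and whether this beats NumPy for 100-element arrays would have to be measured.

```python
import heapq

def keep_top_values_heapq(values, high_count, low_val):
    """Hypothetical helper: keep the high_count largest values that are
    >= low_val, zero out the rest, and preserve the original order."""
    # (value, index) pairs for the elements that reach the threshold.
    candidates = ((v, i) for i, v in enumerate(values) if v >= low_val)
    # nlargest compares the tuples by value first, then by index on ties.
    top = heapq.nlargest(high_count, candidates)
    result = [0] * len(values)
    for v, i in top:
        result[i] = v
    return result

array = [.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0]
print(keep_top_values_heapq(array, 3, .1))
```

Because most elements are already 0, the generator filters them out before they ever reach the heap, so only a handful of candidates are compared per array.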

Eric O Lebigot answered Sep 30 '22 03:09


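    # Note: scipy.stats.threshold was deprecated and later removed from SciPy,
    # so this import only works with older releases.
    from scipy.stats import threshold

    # Values below 0.5 are replaced with 0 (the default replacement value)
    thresholded = threshold(array, 0.5)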

:)
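On SciPy releases where scipy.stats.threshold is no longer available, the same thresholding step can be sketched with numpy.where. This is a substitute technique, not the function used above. The variable names come from the question, and like threshold it only handles the lowValY cutoff, not the highCountX limit.

```python
import numpy as np

array = [.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0]
lowValY = .1

array_np = np.asarray(array)
# Replace every value below lowValY with 0 and keep the others unchanged.
thresholded = np.where(array_np < lowValY, 0, array_np)
print(thresholded)
```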

omygaudio answered Sep 30 '22 01:09