So, let's say I have 100,000 float arrays with 100 elements each. For each array, I need to keep the X highest values, but only those greater than Y. Every other element should be set to 0. What would be the fastest way to do this in Python? Order must be maintained. Most of the elements are already 0.
sample variables:
    array = [.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0]
    highCountX = 3
    lowValY = .1
expected result:
array = [0, .25, 0, .15, .5, 0, 0, 0, 0, 0]
This is a typical job for NumPy, which is very fast for these kinds of operations:
    import numpy

    array_np = numpy.asarray(array)
    low_values_flags = array_np < lowValY  # Where values are low
    array_np[low_values_flags] = 0  # All low values set to 0
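Since the question involves 100,000 arrays of equal length, the same masking can be applied to all of them at once by stacking them into one 2-D array. This is a minimal sketch, not a benchmarked solution; the function name `zero_below` and the second sample row are made up for illustration. Like the snippet above, it keeps values equal to the threshold.

```python
import numpy as np

def zero_below(arrays, low_val_y):
    """Return a copy of `arrays` with every value below `low_val_y` set to 0.

    `arrays` may be a single sequence or a list of equal-length sequences.
    (Hypothetical helper, not part of NumPy.)
    """
    arrays_np = np.asarray(arrays, dtype=float)
    # np.where builds a new array, so the input is left untouched
    return np.where(arrays_np < low_val_y, 0.0, arrays_np)

# Two rows standing in for the 100,000 arrays of the question
batch = [[.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0],
         [.3, 0, .02, 0, 0, .11, 0, 0, 0, .09]]
print(zero_below(batch, 0.1))
```

This step only handles the Y threshold; limiting each row to its X highest values is a separate step.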
Now, if you only need the highCountX largest elements, you can even "forget" the small elements (instead of setting them to 0 and sorting them) and only sort the list of large elements:
    array_np = numpy.asarray(array)
    print(numpy.sort(array_np[array_np >= lowValY])[-highCountX:])
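The sorted slice above returns the largest values but loses their positions, whereas the question asks for the original order with zeros everywhere else. One possible way to get that is to select indices instead of values. The sketch below uses `numpy.argpartition` for this; the function name `keep_top_x` is hypothetical, and ties at the cut-off are broken arbitrarily.

```python
import numpy as np

def keep_top_x(array, high_count_x, low_val_y):
    """Keep the `high_count_x` largest values that are >= `low_val_y`.

    All other positions are set to 0 and the original order is preserved.
    (Hypothetical helper written for this example.)
    """
    array_np = np.asarray(array, dtype=float)
    result = np.zeros_like(array_np)
    # Indices of the elements that pass the threshold
    candidates = np.flatnonzero(array_np >= low_val_y)
    if high_count_x <= 0 or candidates.size == 0:
        return result
    if candidates.size > high_count_x:
        # argpartition places the indices of the high_count_x largest
        # candidate values in the last high_count_x slots (unsorted)
        order = np.argpartition(array_np[candidates], -high_count_x)
        candidates = candidates[order[-high_count_x:]]
    # Copy the surviving values back to their original positions
    result[candidates] = array_np[candidates]
    return result

array = [.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0]
print(keep_top_x(array, 3, .1))
```

Because most elements are already 0, the candidate set is usually small, so the partition step only touches a handful of values per array.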
Of course, sorting the whole array if you only need a few elements might not be optimal. Depending on your needs, you might want to consider the standard heapq module.
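As a rough illustration of the `heapq` route, the sketch below uses `heapq.nlargest` on (value, index) pairs so that the selected values can be written back to their original positions. The function name `keep_top_x_heapq` is made up for this example, and whether it beats the NumPy approach for 100-element arrays would need to be measured.

```python
import heapq

def keep_top_x_heapq(array, high_count_x, low_val_y):
    """Pure-Python variant: keep the `high_count_x` largest values that are
    >= `low_val_y`, zero everything else, preserve order.
    (Hypothetical helper written for this example.)
    """
    # Pair each qualifying value with its index so it can be put back later
    candidates = ((value, index)
                  for index, value in enumerate(array)
                  if value >= low_val_y)
    # nlargest avoids sorting the full candidate list
    top = heapq.nlargest(high_count_x, candidates)
    result = [0] * len(array)
    for value, index in top:
        result[index] = value
    return result

array = [.06, .25, 0, .15, .5, 0, 0, 0.04, 0, 0]
print(keep_top_x_heapq(array, 3, .1))
```

Comparing tuples means that equal values are ranked by index, so on a tie the element with the larger index is kept.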
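Older SciPy releases also had a ready-made function for the thresholding step, scipy.stats.threshold. It was deprecated and is no longer available in SciPy 1.0 and later, so this only works on old installations:

    from scipy.stats import threshold

    # Values below lowValY are replaced by 0 (the default newval)
    thresholded = threshold(array, threshmin=lowValY)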
:)