
replace zeroes in numpy array with the median value

I have a numpy array like this:

foo_array = [38,26,14,55,31,0,15,8,0,0,0,18,40,27,3,19,0,49,29,21,5,38,29,17,16] 

I want to replace all the zeros with the median of the whole array, where the zeros themselves are excluded from the median calculation.

So far I have this going on:

import numpy as np
foo_array = [38,26,14,55,31,0,15,8,0,0,0,18,40,27,3,19,0,49,29,21,5,38,29,17,16]
foo = np.array(foo_array)
foo = np.sort(foo)
print("foo sorted:", foo)
# foo sorted: [ 0  0  0  0  0  3  5  8 14 15 16 17 18 19 21 26 27 29 29 31 38 38 40 49 55]
nonzero_values = foo[0::] > 0
nz_values = foo[nonzero_values]
print("nonzero_values?:", nz_values)
# nonzero_values?: [ 3  5  8 14 15 16 17 18 19 21 26 27 29 29 31 38 38 40 49 55]
size = np.size(nz_values)
middle = size // 2  # integer division, so the result can be used as an index
print("median is:", nz_values[middle])
# median is: 26

Is there a clever way to achieve this with numpy syntax?

Thank you

slashdottir asked Jun 12 '13 01:06




2 Answers

This solution takes advantage of numpy.median:

import numpy as np
foo_array = [38,26,14,55,31,0,15,8,0,0,0,18,40,27,3,19,0,49,29,21,5,38,29,17,16]
foo = np.array(foo_array)
# Compute the median of the non-zero elements
m = np.median(foo[foo > 0])
# Assign the median to the zero elements
foo[foo == 0] = m

A note of caution: the median of your array (excluding the zeros) is 23.5, but because foo has an integer dtype, the assignment as written stores 23.
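If the fractional median matters, one option is to build the array with a floating-point dtype so the assigned value is not truncated. This is a minimal sketch, assuming a float array is acceptable for your use; the name median_nonzero is just for illustration.

```python
import numpy as np

foo_array = [38,26,14,55,31,0,15,8,0,0,0,18,40,27,3,19,0,49,29,21,5,38,29,17,16]

# Assumption: float values are acceptable, so 23.5 can be stored without truncation
foo = np.array(foo_array, dtype=float)

# Median of the non-zero elements only
median_nonzero = np.median(foo[foo != 0])

# Replace the zeros in place
foo[foo == 0] = median_nonzero
```

With an even number of non-zero values, np.median averages the two middle values, which is why the result here is 23.5 rather than an element of the array.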

bbayles answered Oct 05 '22 02:10


foo2 = foo.copy()  # foo[:] would only create a view that shares data with foo
foo2[foo2 == 0] = nz_values[middle]

Instead of foo2, you could update foo in place if you prefer. NumPy's indexing syntax can also combine a few lines of your code. For example, instead of

nonzero_values = foo[0::] > 0
nz_values = foo[nonzero_values]

You can just do

nz_values = foo[foo > 0] 

You can find out more about "fancy indexing" in the documentation.
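As a rough illustration of how the indexing forms used here behave, the sketch below uses a small made-up array. Boolean-mask indexing produces a new array, whereas a plain slice such as foo[:] is a view onto the same data. The names view and independent are only for this example.

```python
import numpy as np

# Small made-up array, just for illustration
foo = np.array([38, 0, 14, 0, 55])

mask = foo > 0             # boolean array marking the non-zero entries
nz_values = foo[mask]      # boolean-mask indexing returns a new array (a copy)

view = foo[:]              # basic slicing returns a view that shares data with foo
independent = foo.copy()   # an explicit copy that does not share data

view[view == 0] = 99       # writing through the view also changes foo

print(nz_values)           # the non-zero values selected before the write
print(foo)                 # zeros replaced, because view shares foo's data
print(independent)         # unchanged
```

So if the original array needs to stay untouched, foo.copy() is the safer choice before assigning through a mask.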

Alex Szatmary answered Oct 05 '22 02:10