Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Efficient way to find the index of the max upper triangular entry in a numpy array?

More specifically, I have a list of rows/columns that need to be ignored when choosing the max entry. In other words, when choosing the max upper triangular entry, certain indices need to skipped. In that case, what is the most efficient way to find the location of the max upper triangular entry?

For example:

>>> a
array([[0, 1, 1, 1],
       [1, 2, 3, 4],
       [4, 5, 6, 6],
       [4, 5, 6, 7]])
>>> indices_to_skip = [0,1,2]

I need to find the index of the min element among all elements in the upper triangle except for the entries a[0,1], a[0,2], and a[1,2].

like image 965
methane Avatar asked Sep 01 '13 18:09

methane


People also ask

How do you find the index of a max value in a NumPy array?

There is argmin() and argmax() provided by numpy that returns the index of the min and max of a numpy array respectively. Note that these will only return the index of the first occurrence.

How do you find the maximum index of an array in Python?

Use the enumerate() function to find out the index of the maximum value in a list. Use the numpy. argmax() function of the NumPy library to find out the index of the maximum value in a list.

Which of the following functions helps find the maximum value number in the NumPy?

max() and numpy. min() functions we can find the maximum and minimum element. Here, we get the maximum and minimum value from the whole array.


1 Answers

You can use np.triu_indices_from:

>>> np.vstack(np.triu_indices_from(a,k=1)).T
array([[0, 1],
       [0, 2],
       [0, 3],
       [1, 2],
       [1, 3],
       [2, 3]])

>>> inds=inds[inds[:,1]>2] #Or whatever columns you want to start from.
>>> inds
array([[0, 3],
       [1, 3],
       [2, 3]])


>>> a[inds[:,0],inds[:,1]]
array([1, 4, 6])

>>> max_index = np.argmax(a[inds[:,0],inds[:,1]])
>>> inds[max_index]
array([2, 3]])

Or:

>>> inds=np.triu_indices_from(a,k=1)
>>> mask = (inds[1]>2) #Again change 2 for whatever columns you want to start at.
>>> a[inds][mask]
array([1, 4, 6])

>>> max_index = np.argmax(a[inds][mask])
>>> inds[mask][max_index]
array([2, 3]])

For the above you can use inds[0] to skip certains rows.

To skip specific rows or columns:

def ignore_upper(arr, k=0, skip_rows=None, skip_cols=None):
    rows, cols = np.triu_indices_from(arr, k=k)

    if skip_rows != None:
        row_mask = ~np.in1d(rows, skip_rows)
        rows = rows[row_mask]
        cols = cols[row_mask]

    if skip_cols != None:
        col_mask = ~np.in1d(cols, skip_cols)
        rows = rows[col_mask]
        cols = cols[col_mask]

    inds=np.ravel_multi_index((rows,cols),arr.shape)
    return np.take(arr,inds)

print ignore_upper(a, skip_rows=1, skip_cols=2) #Will also take numpy arrays for skipping.
[0 1 1 6 7]

The two can be combined and creative use of boolean indexing can help speed up specific cases.

Something interesting that I ran across, a faster way to take upper triu indices:

def fast_triu_indices(dim,k=0):

    tmp_range = np.arange(dim-k)
    rows = np.repeat(tmp_range,(tmp_range+1)[::-1])

    cols = np.ones(rows.shape[0],dtype=np.int)
    inds = np.cumsum(tmp_range[1:][::-1]+1)

    np.put(cols,inds,np.arange(dim*-1+2+k,1))
    cols[0] = k
    np.cumsum(cols,out=cols)
    return (rows,cols)

Its about ~6x faster although it does not work for k<0:

dim=5000
a=np.random.rand(dim,dim)

k=50
t=time.time()
rows,cols=np.triu_indices(dim,k=k)
print time.time()-t
0.913508892059

t=time.time()
rows2,cols2,=fast_triu_indices(dim,k=k)
print time.time()-t
0.16515994072

print np.allclose(rows,rows2)
True

print np.allclose(cols,cols2)
True
like image 194
Daniel Avatar answered Oct 29 '22 11:10

Daniel