I have a matrix
A = np.array([[0.2, 0.4, 0.6],
              [0.5, 0.5, 0.5],
              [0.6, 0.4, 0.2]])
I want a new matrix in which the entry in row i and column j is the product of all the entries in row i of A except the one in column j:
array([[ 0.24, 0.12, 0.08],
       [ 0.25, 0.25, 0.25],
       [ 0.08, 0.12, 0.24]])
The solution that first occurred to me was
np.repeat(np.prod(A, 1, keepdims = True), 3, axis = 1) / A
But this only works as long as none of the entries is zero.
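For example, if a row contains a zero, the whole row product is zero, so the division gives 0s (and nan at the zero itself) instead of the leave-one-out products:

B = np.array([[0.2, 0.4, 0.6],
              [0.0, 0.5, 0.5],
              [0.6, 0.4, 0.2]])
np.repeat(np.prod(B, 1, keepdims = True), 3, axis = 1) / B
# second row comes out [nan, 0., 0.] instead of the desired [0.25, 0., 0.]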
Any thoughts? Thank you!
Edit: I have developed
B = np.zeros((3, 3))
for i in range(3):
    for j in range(3):
        B[i, j] = np.prod(A[i, [x for x in range(3) if x != j]])
but surely there is a more elegant way to accomplish this, one that uses numpy's efficient C backend instead of slow Python loops?
If you're willing to tolerate a single loop:
B = np.empty_like(A)
for col in range(A.shape[1]):
    B[:,col] = np.prod(np.delete(A, col, 1), 1)
That computes what you need, a single column at a time. It is not as efficient as theoretically possible because np.delete() creates a copy; if you care a lot about memory allocation, use a mask instead:
B = np.empty_like(A)
mask = np.ones(A.shape[1], dtype=bool)
for col in range(A.shape[1]):
    mask[col] = False
    B[:,col] = np.prod(A[:,mask], 1)
    mask[col] = True
A variation on your repeat solution uses [:,None] broadcasting instead:
np.prod(A,axis=1)[:,None]/A
My first stab at handling 0s is:
In [21]: B
array([[ 0.2, 0.4, 0.6],
       [ 0. , 0.5, 0.5],
       [ 0.6, 0.4, 0.2]])

In [22]: np.prod(B,axis=1)[:,None]/(B+np.where(B==0,1,0))
array([[ 0.24, 0.12, 0.08],
       [ 0. , 0. , 0. ],
       [ 0.08, 0.12, 0.24]])
But as the comment pointed out, the [0,1] cell should be 0.25.
The following corrects that problem, but in turn has problems when there are multiple 0s in a row.
In [30]: I=B==0
In [31]: B1=B+np.where(I,1,0)
In [32]: B2=np.prod(B1,axis=1)[:,None]/B1
In [33]: B3=np.prod(B,axis=1)[:,None]/B1
In [34]: np.where(I,B2,B3)
Out[34]:
array([[ 0.24, 0.12, 0.08],
       [ 0.25, 0. , 0. ],
       [ 0.08, 0.12, 0.24]])
With a row containing more than one 0 (the intermediate steps, In [56]-[63], which define sum, sum1 and C1, are not shown):

In [55]: C
array([[ 0.2, 0.4, 0.6],
       [ 0. , 0.5, 0. ],
       [ 0.6, 0.4, 0.2]])

In [64]: np.where(I,sum1[:,None],sum[:,None])/C1
array([[ 0.24, 0.12, 0.08],
       [ 0.5 , 0. , 0.5 ],
       [ 0.08, 0.12, 0.24]])
Blaz Bratanic's epsilon approach is the best non-iterative solution (so far):
In [74]: np.prod(C+eps,axis=1)[:,None]/(C+eps)
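(In the timings further down, blaz is presumably just this epsilon trick wrapped in a function, with eps a tiny constant such as 1e-10; a minimal sketch of what I assume was timed:)

def blaz(A, eps=1e-10):
    # epsilon approach: shift every entry away from zero, then divide out
    Ae = A + eps
    return np.prod(Ae, axis=1)[:,None] / Ae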
A different solution iterating over the columns:
def paulj(A):
    P = np.ones_like(A)
    for i in range(1, A.shape[1]):
        # shifting by i brings column (j-i) % w into position j; over all
        # shifts every column except j gets multiplied into P[:, j]
        P *= np.roll(A, i, axis=1)
    return P
In [130]: paulj(A)
array([[ 0.24, 0.12, 0.08],
       [ 0.25, 0.25, 0.25],
       [ 0.08, 0.12, 0.24]])

In [131]: paulj(B)
array([[ 0.24, 0.12, 0.08],
       [ 0.25, 0. , 0. ],
       [ 0.08, 0.12, 0.24]])

In [132]: paulj(C)
array([[ 0.24, 0.12, 0.08],
       [ 0. , 0. , 0. ],
       [ 0.08, 0.12, 0.24]])
I tried some timings on a large matrix
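(zwinck1 and zwinck2 in the timings are, I assume, the np.delete loop and the mask loop from the first answer wrapped up as functions; a rough sketch of what was timed:)

def zwinck1(A):
    # column loop using np.delete (copies A minus one column each pass)
    B = np.empty_like(A)
    for col in range(A.shape[1]):
        B[:,col] = np.prod(np.delete(A, col, 1), 1)
    return B

def zwinck2(A):
    # column loop using a boolean mask instead of np.delete
    B = np.empty_like(A)
    mask = np.ones(A.shape[1], dtype=bool)
    for col in range(A.shape[1]):
        mask[col] = False
        B[:,col] = np.prod(A[:,mask], 1)
        mask[col] = True
    return B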
In [13]: A=np.random.randint(0,100,(1000,1000))*0.01
In [14]: timeit paulj(A)
1 loops, best of 3: 23.2 s per loop
In [15]: timeit blaz(A)
10 loops, best of 3: 80.7 ms per loop
In [16]: timeit zwinck1(A)
1 loops, best of 3: 15.3 s per loop
In [17]: timeit zwinck2(A)
1 loops, best of 3: 65.3 s per loop
The epsilon approximation is probably the best speed we can expect, but it has some rounding issues. Having to iterate over many columns hurts the speed. I'm not sure why the np.prod(A[:,mask], 1) approach is slowest.
eeclo (https://stackoverflow.com/a/22441825/901925) suggested using as_strided. Here's what I think he has in mind (adapted from an overlapping-block question, https://stackoverflow.com/a/8070716/901925):
def strided(A):
    h, w = A.shape
    A2 = np.hstack([A, A])
    x, y = A2.strides
    strides = (y, x, y)
    shape = (w, h, w-1)
    blocks = np.lib.stride_tricks.as_strided(A2[:,1:], shape=shape, strides=strides)
    P = blocks.prod(2).T  # faster to prod on last dim
    # alt: shape = (w-1, h, w), and P = blocks.prod(0)
    return P
Timing for the (1000,1000) array is quite an improvement over the column iterations, though still much slower than the epsilon approach.
In [153]: timeit strided(A)
1 loops, best of 3: 2.51 s per loop
Another indexing approach, while relatively straightforward, is slower and produces memory errors sooner.
def foo(A):
    h, w = A.shape
    I = (np.arange(w)[:,None] + np.arange(1, w))
    I1 = np.array(I) % w      # row j of I1 lists columns j+1 ... j+w-1 (mod w)
    P = A[:, I1].prod(2)
    return P
Here is an O(n^2) method without python loops or numerical approximation:
def double_cumprod(A):
    B = np.empty((A.shape[0], A.shape[1]+1), A.dtype)
    B[:,0] = 1
    B[:,1:] = A
    L = np.cumprod(B, axis=1)            # L[:,j] = product of A[:,:j]
    B[:,1:] = A[:,::-1]
    R = np.cumprod(B, axis=1)[:,::-1]    # R[:,j] = product of A[:,j:]
    return L[:,:-1] * R[:,1:]            # left-of-j product times right-of-j product
Note: it appears to be about twice as slow as the numerical approximation method, which is in line with expectation.
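A quick sanity check, on a small random matrix with one planted zero, against a brute-force leave-one-out product (the zero is handled exactly, no epsilon needed):

X = np.random.rand(5, 4)
X[1, 2] = 0.0          # plant a zero
ref = np.array([[np.prod(np.delete(row, j)) for j in range(X.shape[1])]
                for row in X])
print(np.allclose(double_cumprod(X), ref))   # True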
If you are willing to tolerate small error you could use the solution you first proposed.
A += 1e-10
np.around(np.repeat(np.prod(A, 1, keepdims = True), 3, axis = 1) / A, 9)