I have a complex indexing problem that I'm struggling to solve in an efficient way.

The goal is to calculate the matrix w_prime using a combination of values from the equally sized matrices w, dY, and dX. The value of w_prime(i,j) is calculated as mean( w( indY & indX ) ), where indY and indX are the indices of the entries of dY and dX that are equal to i and j respectively.

Here's a simple implementation in MATLAB of an algorithm to compute w_prime:
for i = 1:size(w_prime,1)
    indY = dY == i;
    for j = 1:size(w_prime,2)
        indX = dX == j;
        w_prime(i,j) = mean( w( indY & indX ) );
    end
end
This implementation is sufficient in the example case below; however, in my actual use case w, dY, and dX are ~3000x3000 and w_prime is ~60x900. That means each index calculation happens over ~9 million elements, so this implementation is far too slow to be usable. Additionally, I'll need to run this code a few dozen times.
If I want to compute w_prime(1,1):

1. Find the indices of dY that equal 1, save as indY
2. Find the indices of dX that equal 1, save as indX
3. AND indY and indX, save as ind
4. Assign mean( w(ind) ) to w_prime(1,1)
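For illustration, the four steps above can be sketched in NumPy on a tiny hypothetical input (the variable names mirror the MATLAB ones; the data is made up):

```python
import numpy as np

# Small stand-ins for the real ~3000x3000 matrices (hypothetical data).
dY = np.array([[1, 1], [2, 1]])
dX = np.array([[1, 2], [1, 1]])
w  = np.array([[10.0, 20.0], [30.0, 40.0]])

indY = (dY == 1)            # step 1: entries of dY equal to 1
indX = (dX == 1)            # step 2: entries of dX equal to 1
ind  = indY & indX          # step 3: elementwise AND of the two masks
w_prime_11 = w[ind].mean()  # step 4: average the selected weights -> 25.0
```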
I have a set of points defined by two vectors X and T, both 1xN where N is ~3000. The values of X and T are integers bound by the intervals (1 60) and (1 900) respectively.

The matrices dX and dT (dT corresponds to dY above) are simply distance matrices, meaning that they contain the pairwise distances between the points, i.e. dX(i,j) is equal to abs( X(i) - X(j) ). They are calculated using: dX = squareform(pdist(X'));
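The same kind of pairwise distance matrix can be built with plain broadcasting; here is a NumPy sketch on a small made-up vector:

```python
import numpy as np

x = np.array([1, 4, 6])               # hypothetical integer positions
dx = np.abs(x[:, None] - x[None, :])  # dx[i, j] == abs(x[i] - x[j])
# dx is symmetric with zeros on the diagonal
```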
The matrix w can be thought of as a weight matrix that describes how much influence one point has on another. The purpose of calculating w_prime(a,b) is to determine the average weight between the subset of point pairs that are separated by a in the X dimension and b in the T dimension. This can be expressed as:

    w_prime(a,b) = mean( w(i,j) : dX(i,j) == a and dT(i,j) == b )
This is straightforward with ACCUMARRAY:
nx = max(dX(:));
ny = max(dY(:));
w_prime = accumarray([dX(:),dY(:)],w(:),[nx,ny],@mean,NaN)
The output will be an nx-by-ny sized array with NaNs wherever there was no corresponding pair of indices. If you're sure that there will be a full complement of indices all the time, you can simplify the above calculation to

w_prime = accumarray([dX(:),dY(:)],w(:),[],@mean)

So, what does accumarray do? It looks at the rows of [dX(:),dY(:)]. Each row gives the (i,j) coordinate pair in w_prime to which that row contributes. For all pairs (1,1), it applies the function (@mean) to the corresponding entries in w(:), and writes the output into w_prime(1,1), and likewise for every other coordinate pair.
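NumPy has no direct accumarray equivalent, but the same grouped mean can be sketched with np.add.at (the data below is a tiny illustrative example, with 1-based subscripts as in the MATLAB call):

```python
import numpy as np

# Flattened subscripts and weights, mimicking [dX(:),dY(:)] and w(:).
dX = np.array([1, 1, 2, 2])
dY = np.array([1, 1, 1, 2])
w  = np.array([2.0, 4.0, 5.0, 7.0])

nx, ny = dX.max(), dY.max()
sums   = np.zeros((nx, ny))
counts = np.zeros((nx, ny))
np.add.at(sums,   (dX - 1, dY - 1), w)  # accumulate weights per (i,j) bin
np.add.at(counts, (dX - 1, dY - 1), 1)  # count contributions per bin
with np.errstate(invalid='ignore'):
    w_prime = sums / counts             # NaN where a bin received no entries
```

Bin (1,1) receives weights 2 and 4, so w_prime[0,0] is their mean, 3.0; the empty bin (1,2) comes out as NaN, matching accumarray's fillval behavior.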