I have a complex indexing problem that I'm struggling to solve in an efficient way.
The goal is to calculate the matrix w_prime using a combination of values from the equally sized matrices w, dY, and dX.
The value of w_prime(i,j) is calculated as mean( w( indY & indX ) ), where indY and indX are the indices of dY and dX that are equal to i and j respectively.
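Here's a simple implementation in MATLAB of an algorithm to compute w_prime:
for i = 1:size(w_prime,1)
    indY = dY == i;                                 % mask of elements where dY equals i
    for j = 1:size(w_prime,2)
        indX = dX == j;                             % mask of elements where dX equals j
        w_prime(i,j) = mean( w( indY & indX ) );    % average w where both masks hold
    end
end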
This implementation is sufficient for the example case below; however, in my actual use case w, dY, and dX are ~3000x3000 and w_prime is ~60x900, meaning that each index calculation runs over ~9 million elements. Needless to say, this implementation is too slow to be usable. Additionally, I'll need to run this code a few dozen times.
If I want to compute w_prime(1,1):

Find the elements of dY that equal 1, save as indY

Find the elements of dX that equal 1, save as indX

Take the intersection of indY and indX, save as ind

Save mean( w(ind) ) to w_prime(1,1)
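The steps for w_prime(1,1) can be sketched on a tiny case as follows. The 3x3 values of w, dY, and dX are made up purely for illustration and are not the asker's data; w11 is a hypothetical variable name.

```matlab
% Toy 3x3 inputs (made-up values, for illustration only)
w  = [0.1 0.2 0.3; 0.4 0.5 0.6; 0.7 0.8 0.9];
dY = [1 1 2; 1 2 2; 1 1 2];
dX = [1 2 1; 1 1 2; 2 1 1];

indY = (dY == 1);      % logical mask: elements of dY equal to 1
indX = (dX == 1);      % logical mask: elements of dX equal to 1
ind  = indY & indX;    % elements where both conditions hold
w11  = mean(w(ind));   % the value that would be stored in w_prime(1,1)
```

Here ind selects three elements of w (0.1, 0.4, and 0.8), so w11 is their average.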

I have a set of points defined by two vectors X and T, both of which are 1xN where N is ~3000. Additionally, the values of X and T are integers bounded by the intervals [1, 60] and [1, 900] respectively.
The matrices dX and dT are simply distance matrices, meaning that they contain the pairwise distances between the points, i.e. dx(i,j) is equal to abs( x(i) - x(j) ).
They are calculated using: dx = pdist(x);
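As a side note, pdist treats rows as observations and returns the distances as a condensed vector rather than a square matrix, so a 1xN input needs a transpose and a call to squareform to get the N-by-N form described above. A minimal sketch, assuming short made-up vectors X and T (not the asker's data) and a MATLAB release with implicit expansion (R2016b or later):

```matlab
% Hypothetical sample points (row vectors of positive integers)
X = [3 7 7 10];
T = [5 5 9 20];

% Pairwise absolute differences via implicit expansion
dX = abs(X.' - X);    % N-by-N, dX(i,j) = |X(i) - X(j)|
dT = abs(T.' - T);    % N-by-N, dT(i,j) = |T(i) - T(j)|

% Equivalent form using the Statistics and Machine Learning Toolbox:
% dX = squareform(pdist(X.'));
```

Note that the diagonal, and any pair of points sharing a coordinate, has distance 0. A distance of 0 cannot be used directly as a subscript, so such entries would need to be shifted or masked out before indexing with them.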
The matrix w can be thought of as a weight matrix that describes how much influence one point has on another.
The purpose of calculating w_prime(a,b) is to determine the average weight between the sub-set of points that are separated by a in the X dimension and b in the T dimension.
This can be expressed as follows:
w_prime(a,b) = mean( w(i,j) ) over all pairs (i,j) with dX(i,j) == a and dT(i,j) == b
This is straightforward with ACCUMARRAY:
ny = max(dY(:));
nx = max(dX(:));
w_prime = accumarray([dY(:),dX(:)],w(:),[ny,nx],@mean,NaN);
The output will be an ny-by-nx array, with rows indexed by dY and columns by dX as in the loop above, and with NaN wherever there was no corresponding pair of indices. If you're sure that every pair of indices occurs at least once, you can simplify the above calculation to
w_prime = accumarray([dY(:),dX(:)],w(:),[],@mean);
So, what does accumarray do? It looks at the rows of [dY(:),dX(:)]. Each row gives the (i,j) coordinate pair in w_prime to which the corresponding entry of w(:) contributes. For all rows equal to (1,1), it applies the function (@mean) to the corresponding entries in w(:) and writes the output into w_prime(1,1).
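To check the grouping behaviour, the sketch below compares accumarray against the double loop from the question on a tiny made-up case. The subscripts are ordered as [dY(:), dX(:)] so that rows follow dY and columns follow dX, matching the loop; the toy values and the names w_loop, w_acc, and max_diff are assumptions for illustration.

```matlab
% Toy inputs (made-up values); dY and dX hold positive integer indices
w  = [0.1 0.2 0.3; 0.4 0.5 0.6; 0.7 0.8 0.9];
dY = [1 1 2; 1 2 2; 1 1 2];
dX = [1 2 1; 1 1 2; 2 1 1];
ny = max(dY(:));
nx = max(dX(:));

% Reference result: the double loop from the question
w_loop = nan(ny, nx);
for i = 1:ny
    indY = (dY == i);
    for j = 1:nx
        w_loop(i,j) = mean(w(indY & (dX == j)));
    end
end

% accumarray: group the entries of w by (dY, dX) pair and average each group
w_acc = accumarray([dY(:), dX(:)], w(:), [ny, nx], @mean, NaN);

% Largest absolute difference between the two results
max_diff = max(abs(w_loop(:) - w_acc(:)));
```

On this input every (i,j) pair occurs at least once, so no NaN fill values appear and the two results should agree to within floating-point rounding. The loop scans all of w once per output cell, whereas accumarray builds all the groups from a single call, which is why it tends to scale better to the ~3000x3000 case.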