Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How is Loop skewing making loop parallelizable?

I am reading about loop tranformation techniques and i have a very hard time understanding how does loop skewing makes a loop parallelizable Here are are two loops, the initial one and the seconde one, what is the difference betwwen the two ? The two of them depends on previous iteration on both i and j, what make the second loop parallisable ? Or why can we make the interchange on the second one and not the first one ? Both of them have dependencies on i and j

for(int i =2; i < 5; i++){
            for(int j =2; j < 5; j++){
                A[i][j] = A[i-1][j] + A[i][j-1];
            }
        }
for(int i =2; i < 5; i++){
            for(int j =2+i; j < 5+i; j++){
                A[i][j-i] = A[i-1][j-i] + A[i][j-1-i];
            }
        }
like image 803
user2133558 Avatar asked Apr 16 '14 00:04

user2133558


1 Answers

I have no credit for this, I just formatted it for you and copied it from another source, and I hope it help you

[source: ECE 1754, Survey of Loop Transformation Techniques, Eric LaForest, March 19, 2010]

It is all about the distance between a two executive iterations, in the first the distance is 1 between one outer loop and inner loop, so there is a dependency between them.

Loop skewing does exactly what it says: it skews the execution of an inner loop relative to an outer one. This is useful if the inner loop has a dependence on the outer loop which prevents it from running in parallel. For example, the following code has a dependency vector of {(1, 0),(0, 1)} .Neither loop can be parallelized since they each carry a dependency. Simply interchanging the loops would merely interchange the indices holding the dependencies, accomplishing nothing.

do i = 2, n-1
do j = 2, m-1
a[i,j] =
      (a[i-1,j] + a[i,j-1] + a[i+1,j] + a[i,j+1]) / 4
end do
end do

Loop skewing is implemented by adding the index of the outer loop, times some skewing factor f, to the bounds of the inner loop and subtracting the same value from all the uses of the inner loop index. The subtraction keeps the indices within the new loop bounds, preserving the correctness of the program. The effect on the inner loop iterations is to shift their position in the array forwards by f relative to the current outer loop, increasing the dependency distance to the outer loop in the same manner. In other words, given a dependency vector (a, b), skewing transforms it to (a, f a + b). Since this transformation preserves the lexicographic order of the dependencies, it is always legal. Applying a skew factor of one to the above inner loop yields the following code:

do i = 2, n-1
do j = 2+i, m-1+i
a[i,j-i] =
(a[i-1,j-i] + a[i,j-1-i] + a[i+1,j-i] + a[i,j+1-i]) / 4
end do
end do

This new code executes in the same manner, but with dependencies of {(1, 1),(0, 1)}. Both loops still carry a dependency. However, interchanging the loops at this point yields a dependence vector {(1, 0),(1, 1)}, as shown in the following code:

do j = 4, m+n-2
do i = max(2, j-m+1), min(n-1, j-2)
a[i,j-i] =
(a[i-1,j-i] + a[i,j-1-i] + a[i+1,j-i] + a[i,j+1-i]) / 4
end do
end do

The inner loop can now be parallelized since it has now no loop-carried dependency on j, and the dependency to i is carried by the outer loop.Note that interchanging skewed loop bounds is no longer straightforward: each loop must take into account the upper and lower bounds of the other loop.

like image 137
mmain Avatar answered Jan 02 '23 08:01

mmain