Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cache Poisoning Issue for deep nested loop

I am writing a code for a mathematical method (Incomplete Cholesky) and I have hit a curious roadblock. Please see the following simplified code.

for(k=0;k<nosUnknowns;k++)
{
    //Pieces of code
    for(i=k+1;i<nosUnknowns;i++)
    {
        // more code
    }
    for(j=k+1;j<nosUnknowns;j++)
    {
        for(i=j;i<nosUnknowns;i++)
        {
            //Some more code 
            if(xOk && yOk && zOk)
            {
                if(xDF == 1 && yDF == 0 && zDF == 0)
                {
                    for(row=0;row<3;row++)
                    {
                        for(col=0;col<3;col++)
                        {
                            // All 3x3 static arrays This is the line
                            statObj->A1_[row][col] -= localFuncArr[row][col];
                        }
                    }
                }
            }
        }//Inner loop i ends here
    }//Inner loop j ends here
}//outer loop k ends here

For context,

statObj is an object containing a number of 3x3 static double arrays. I am initializing statObj by a call to new function. Then I am populating the arrays inside it using some mathematical functions. One such array is A1_. The value of variable nosUnknowns is around 3000. The array localFuncArr is previously generated by matrix multiplication and is a double array.

Now this is my problem:

  1. When I use the line as shown in the code, the code runs extremely sluggishly. Something like 245secs for the whole function.

  2. When I comment out the said line, the code performs extremely fast. It takes something like 6 secs.

  3. Now when I replace the said line with the following line : localFuncArr[row][col] += 3.0, again the code runs with the same speed as that of case(2) above.

Clearly something about the call to statObj->A1_ is making the code run slow.

My question(s):

  1. Is Cache Poisoning the reason why this is happening ?

  2. If so, what could be changed in terms of array initialization/object initialization/loop unrolling or for that matter any form of code optimization that can speed this up ?

Any insights to this from experienced folks is highly appreciated.

EDIT: Changed the description to be more verbose and redress some of the points mentioned in the comments.

like image 770
rajaditya_m Avatar asked Jul 17 '13 23:07

rajaditya_m


1 Answers

If the conditions are mostly true, your line of code is executed 3000x3000x3000x3x3 times. That's about 245 billion times. Depending on your hardware architecture 245 seconds might be a very reasonable timing (that's 1 iteration every 2 cycles - assuming 2GHz processor). In any case there isn't anything in the code that suggests cache poisoning.

like image 152
Come Raczy Avatar answered Oct 19 '22 05:10

Come Raczy