Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

can I optimize my java prog by transposing a 2d array?

Tags:

java

matrix

ram

I have:

final int ROWS = 100000;
final int COLS = 2000;
long[][] m = new long[COLS][ROWS];

and then:

public void xor(int row1, int row2) {
    for (int col=0; col<COLS; col++) {
        m[col][row1] ^= m[col][row2];
    }
}

The above function is, simplified, what takes most of the time in a run. I was wondering if I should invest time to refactor my whole program to read "m = new long[ROWS][COLS]" (instead of the other way around) for better RAM access. Or won't I win much time with it?

I am aware that I could parallellize it with perhaps GPU's, but that's for a later stage.

like image 774
Albert Hendriks Avatar asked Jul 01 '26 15:07

Albert Hendriks


1 Answers

In my opinion, it will definitely help to swap ROWS and COLS.

The layout of this arrays is (roughly) like this: [0][0], [0][1], [0][2],... [1][0], [1][1],... and so on. In your code, each column is a continuous chunk of memory, and a row is not.

Since each column is 800000 bytes, and in your xor method you access all of them, you are forcing more cache misses.

After transposing, each row becomes a continuous piece of memory, and since you tend to operate on rows, it should make it faster.

If you had long[][] m = new long[ROWS][COLS]; and for (int col=0; col<COLS; col++) m[row1][col] ^= m[row2][col];, you'd only need two 16000-byte-long rows to be in the cache during the execution of the xor method.

But since what I said is based mostly on theory, try to benchmark both variants and check which one is really faster.

like image 157
Karol S Avatar answered Jul 03 '26 04:07

Karol S



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!