can I optimize my java prog by transposing a 2d array?

Question

I have:

final int ROWS = 100000;
final int COLS = 2000;
long[][] m = new long[COLS][ROWS];

and then:

public void xor(int row1, int row2) {
    for (int col=0; col<COLS; col++) {
        m[col][row1] ^= m[col][row2];
    }
}

The above function is, simplified, what takes most of the time in a run. I was wondering if I should invest time to refactor my whole program to read "m = new long[ROWS][COLS]" (instead of the other way around) for better RAM access. Or won't I win much time with it?

I am aware that I could parallellize it with perhaps GPU's, but that's for a later stage.

Karol S · Accepted Answer

In my opinion, it will definitely help to swap ROWS and COLS.

The layout of this arrays is (roughly) like this: [0][0], [0][1], [0][2],... [1][0], [1][1],... and so on. In your code, each column is a continuous chunk of memory, and a row is not.

Since each column is 800000 bytes, and in your xor method you access all of them, you are forcing more cache misses.

After transposing, each row becomes a continuous piece of memory, and since you tend to operate on rows, it should make it faster.

If you had long[][] m = new long[ROWS][COLS]; and for (int col=0; col<COLS; col++) m[row1][col] ^= m[row2][col];, you'd only need two 16000-byte-long rows to be in the cache during the execution of the xor method.

But since what I said is based mostly on theory, try to benchmark both variants and check which one is really faster.

can I optimize my java prog by transposing a 2d array?

Tags:

java

matrix

ram

Albert Hendriks

1 Answers

Karol S

Recent Activity

Donate For Us

can I optimize my java prog by transposing a 2d array?

Tags:

java

matrix

ram

Albert Hendriks

1 Answers

Karol S

Related questions

Recent Activity

Donate For Us