A matrix A[i][j] is given. If we want to add up all the elements of the matrix, which traversal order is better, and why?
From my point of view, row-wise is better, because in the array representation the elements are stored in contiguous memory locations, so accessing them takes less time. But since fetching any location in RAM takes equal time, does it matter?
Take advantage of Spatial Locality
In C, matrices are stored in row-major order. So after an access to element a[i][j], an access to element a[i][j+1] is likely to hit the cache, and no access to main memory is needed. Caches are faster than main memory, so the access pattern does matter.
Of course, more factors must be taken into account, such as write vs. read access and write policy (write-through vs. write-back, write-allocate vs. no-write-allocate), multilevel caches, and so on. But that seems like overkill for this question.
Have some fun with a profiling tool, such as cachegrind, and see it for yourself.
For example, consider a dummy program accessing a 4 MB matrix. Check out the differences between the miss rates of the two access patterns.
column access
$ cat col_major.c
#include <stdio.h>

int main(void) {
    const size_t dim = 1024;
    int matrix[dim][dim];
    size_t i, j;

    for (i = 0; i < dim; i++) {
        for (j = 0; j < dim; j++) {
            matrix[j][i] = i * j;   /* column-order access */
        }
    }
    return 0;
}
$ valgrind --tool=cachegrind ./col_major
==3228== D refs: 10,548,665 (9,482,149 rd + 1,066,516 wr)
==3228== D1 misses: 1,049,704 ( 968 rd + 1,048,736 wr)
==3228== L2d misses: 1,049,623 ( 893 rd + 1,048,730 wr)
==3228== D1 miss rate: 9.9% ( 0.0% + 98.3% )
==3228== L2d miss rate: 9.9% ( 0.0% + 98.3% )
row access
$ cat row_major.c
#include <stdio.h>

int main(void) {
    const size_t dim = 1024;
    int matrix[dim][dim];
    size_t i, j;

    for (i = 0; i < dim; i++) {
        for (j = 0; j < dim; j++) {
            matrix[i][j] = i * j;   /* row-order access */
        }
    }
    return 0;
}
$ valgrind --tool=cachegrind ./row_major
==3524== D refs: 10,548,665 (9,482,149 rd + 1,066,516 wr)
==3524== D1 misses: 66,664 ( 968 rd + 65,696 wr)
==3524== L2d misses: 66,583 ( 893 rd + 65,690 wr)
==3524== D1 miss rate: 0.6% ( 0.0% + 6.1% )
==3524== L2d miss rate: 0.6% ( 0.0% + 6.1% )