Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why does C++ multiplication with dynamic array work better than std::vector version

I am implementing the C++ multiplication for matrices with different data structures and techniques (vectors , arrays and OpenMP) and I found a strange situation... My dynamic array version is working better:

times:

openmp mult_1: time: 5.882000 s

array mult_2: time: 1.478000 s

My compilation flags are:

/usr/bin/g++ -fopenmp -pthread -std=c++1y -O3

C++ vector version

typedef std::vector<std::vector<float>> matrix_f;
void mult_1 (const matrix_f &  matrixOne, const matrix_f & matrixTwo, matrix_f & result) {
    const int matrixSize = (int)result.size();
    #pragma omp parallel for simd
    for (int rowResult = 0; rowResult < matrixSize; ++rowResult) {
        for (int colResult = 0; colResult < matrixSize; ++colResult) {
            for (int k = 0; k < matrixSize; ++k) {
                result[rowResult][colResult] += matrixOne[rowResult][k] * matrixTwo[k][colResult];  
            }
        }
    }
}

Dynamic array version

void mult_2 ( float *  matrixOne, float * matrixTwo,  float * result, int size)  {
    for (int row = 0; row < size; ++row) {
        for (int col = 0; col < size; ++col) {
            for (int k = 0; k < size; ++k) {
                (*(result+(size*row)+col)) += (*(matrixOne+(size*row)+k)) * (*(matrixTwo+(size*k)+col));
            }
        }
    }
}

tests:

C++ vector version

utils::ChronoTimer timer;
/* set Up simple matrix */
utils::matrix::matrix_f matr1 = std::vector<std::vector<float>>(size,std::vector<float>(size));
fillRandomMatrix(matr1);

utils::matrix::matrix_f matr2 = std::vector<std::vector<float>>(size,std::vector<float>(size));
fillRandomMatrix(matr2);

utils::matrix::matrix_f result = std::vector<std::vector<float>>(size,std::vector<float>(size));    
timer.init();
utils::matrix::mult_1(matr1,matr2,result);
std::printf("openmp mult_1: time: %f ms\n",timer.now() / 1000);

Dynamic array version

utils::ChronoTimer timer;

float *p_matr1 = new float[size*size];
float *p_matr2 = new float[size*size];
float *p_result = new float[size*size];

fillRandomMatrixArray(p_matr1,size);
fillRandomMatrixArray(p_matr2,size);

timer.init();
utils::matrix::mult_2(p_matr1,p_matr2,p_result,size);
std::printf("array mult_2: time: %f ms\n",timer.now() / 1000);

delete [] p_matr1;
delete [] p_matr2;
delete [] p_result;

I was checking some previous posts, but I couldn't find any related with my problem link, link2, link3:

UPDATE: I refactorized tests with the answers, and vector works slighty better :

vector mult: time: 1.194000 s

array mult_2: time: 1.202000 s

C++ vector version

void mult (const std::vector<float> &  matrixOne, const std::vector<float> & matrixTwo, std::vector<float> & result, int size) {
    for (int row = 0; row < size; ++row) {
        for (int col = 0; col < size; ++col) {
            for (int k = 0; k <size; ++k) {
                result[(size*row)+col] += matrixOne[(size*row)+k] * matrixTwo[(size*k)+col];
            }
        }
    }
}

Dynamic array version

void mult_2 ( float *  matrixOne, float * matrixTwo,  float * result, int size)  {
    for (int row = 0; row < size; ++row) {
        for (int col = 0; col < size; ++col) {
            for (int k = 0; k < size; ++k) {
                (*(result+(size*row)+col)) += (*(matrixOne+(size*row)+k)) * (*(matrixTwo+(size*k)+col));
            }
        }
    }
}

Also, my vectorized version is working better(0.803 s);

like image 254
carlos.baez Avatar asked Feb 25 '16 11:02

carlos.baez


People also ask

Which is faster vector or dynamic array C++?

The conclusion is that arrays of integers are faster than vectors of integers (5 times in my example). However, arrays and vectors are arround the same speed for more complex / not aligned data. Save this answer.

Why would we use dynamically allocated arrays vs vectors?

Vector are implemented as dynamic arrays with list interface whereas arrays can be implemented as statically or dynamically with primitive data type interface. Size of arrays are fixed whereas the vectors are resizable i.e they can grow and shrink as vectors are allocated on heap memory.

Are C style arrays faster than vectors?

A std::vector can never be faster than an array, as it has (a pointer to the first element of) an array as one of its data members. But the difference in run-time speed is slim and absent in any non-trivial program. One reason for this myth to persist, are examples that compare raw arrays with mis-used std::vectors.

Is array better than vector C++?

Array is memory efficient data structure. Vector takes more time in accessing elements. Array access elements in constant time irrespective of their location as elements are arranged in a contiguous memory allocation.


2 Answers

A vector of vectors is analogous to a jagged array -- an array where each entry is a pointer, each pointer pointing at another array of floats.

In comparison, the raw array version is one block of memory, where you do math to find the elements.

Use a single vector, not a vector of vectors, and do the math manually. Or, use a vector of fix-sized std::arrays. Or write a helper type that takes the (one-dimensional) vector of float, and gives you a 2 dimensional view of it.

Data in a contiguous buffer is more cache and optimization friendly than data in a bunch of disconnected buffers.

template<std::size_t Dim, class T>
struct multi_dim_array_view_helper {
  std::size_t const* dims;
  T* t;
  std::size_t stride() const {
    return
      multi_dim_array_view_helper<Dim-1, T>{dims+1, nullptr}.stride()
      * *dims;
  }
  multi_dim_array_view_helper<Dim-1, T> operator[](std::size_t i)const{
    return {dims+1, t+i*stride()};
  }
};
template<class T>
struct multi_dim_array_view_helper<1, T> {
  std::size_t stride() const{ return 1; }
  T* t;
  T& operator[](std::size_t i)const{
    return t[i];
  }
  multi_dim_array_view_helper( std::size_t const*, T* p ):t(p) {}
};
template<std::size_t Dims>
using dims_t = std::array<std::size_t, Dims-1>;
template<std::size_t Dims, class T>
struct multi_dim_array_view_storage
{
  dims_t<Dims> storage;
};
template<std::size_t Dims, class T>
struct multi_dim_array_view:
  multi_dim_array_view_storage<Dims, T>,
  multi_dim_array_view_helper<Dims, T>
{
  multi_dim_array_view( dims_t<Dims> d, T* t ):
    multi_dim_array_view_storage<Dims, T>{std::move(d)},
    multi_dim_array_view_helper<Dims, T>{
      this->storage.data(), t
    }
  {}
};

now you can do this:

std::vector<float> blah = {
   01.f, 02.f, 03.f,
   11.f, 12.f, 13.f,
   21.f, 22.f, 23.f,
};

multi_dim_array_view<2, float> view = { {3}, blah.data() };
for (std::size_t i = 0; i < 3; ++i )
{
  std::cout << "[";
  for (std::size_t j = 0; j < 3; ++j )
    std::cout << view[i][j] << ",";
  std::cout << "]\n";
}

live example

No data is copied in the view class. It just provides a view of the flat array that is a multi-dimensional array.

like image 51
Yakk - Adam Nevraumont Avatar answered Oct 28 '22 10:10

Yakk - Adam Nevraumont


Your approaches are quite different:

  • In the "dynamic array" version you allocate a single chunk of memory for each matrix and map the rows of the matrices onto that one dimensional memory range.

  • In the "vector" version you use vectors of vectors which are "real" and "dynamically" two dimensional meaning that the storage position of each row of your matrices is unrelated with respect to the other rows.

What you probably want to do is:

  • Use vector<float>(size*size) and perform the very same mapping you're doing in the "dynamic array" example by hand or

  • Write a class that internally handles the mapping for you and provides a 2-dimensional access interface (T& operator()(size_t, size_t) or some kind of row_proxy operator[](size_t) where row_proxy in turn has T& operator[](size_t))

like image 23
Pixelchemist Avatar answered Oct 28 '22 11:10

Pixelchemist