My question is very simple, and hopefully has a nice answer too: when I have a constructed Eigen::MatrixXd matrix, can I use multiple threads to populate rows of the matrix at the same time (provided I ensure that no row is written to concurrently), or must I create temporary row objects in each thread and then copy (ugh...) them into the matrix as a reduce operation?
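For concreteness, here is a minimal sketch of the first option I have in mind (OpenMP is just one way to run the loop; the setConstant fill is a placeholder):

```cpp
#include <Eigen/Dense>

int main() {
    const Eigen::Index rows = 1024, cols = 64;
    Eigen::MatrixXd m(rows, cols);

    // Each iteration writes a distinct row, so no two threads ever write
    // the same element; the values themselves are just placeholders.
    #pragma omp parallel for
    for (Eigen::Index i = 0; i < rows; ++i)
        m.row(i).setConstant(static_cast<double>(i));
}
```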
While it may be thread safe in the sense that no two threads write to the same address, Eigen::MatrixXd uses column-major storage, so writing rows from different threads will likely wreak havoc on the cache (basically, it's false sharing). It may be faster to create a temporary row-major matrix and then copy it over to the column-major one.
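A sketch of that suggestion, again assuming OpenMP and a placeholder fill (fillViaRowMajorTemp is just an illustrative name):

```cpp
#include <Eigen/Dense>

using RowMajorXd =
    Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;

void fillViaRowMajorTemp(Eigen::MatrixXd& m) {
    RowMajorXd tmp(m.rows(), m.cols());
    const Eigen::Index n = tmp.rows();

    // In row-major storage each row is contiguous, so threads writing
    // different rows touch (mostly) disjoint cache lines.
    #pragma omp parallel for
    for (Eigen::Index i = 0; i < n; ++i)
        tmp.row(i).setConstant(static_cast<double>(i)); // placeholder fill

    m = tmp; // one sequential copy back into column-major storage
}
```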
Alternatively (and better, IMO), you can treat the columns of your existing matrix as rows (make sure the dimensions are swapped so they match) and then do m.transposeInPlace(). Depending on the matrix shape and alignment, this may be more efficient than m = m.transpose().eval().
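A sketch of this variant under the same assumptions (OpenMP, placeholder fill; buildTransposed is an illustrative name):

```cpp
#include <Eigen/Dense>

// Build the transposed problem: what will become row i is written as
// column i, which is contiguous in Eigen's default column-major storage.
Eigen::MatrixXd buildTransposed(Eigen::Index rows, Eigen::Index cols) {
    Eigen::MatrixXd m(cols, rows); // note: dimensions swapped

    #pragma omp parallel for
    for (Eigen::Index i = 0; i < rows; ++i)
        m.col(i).setConstant(static_cast<double>(i)); // placeholder fill

    m.transposeInPlace(); // now rows x cols, as originally intended
    return m;
}
```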
It may also be possible to use the threads' IDs, if the matrix is large enough and the IDs are zero-based and consecutive (as with OpenMP and similar, but not with std::thread unless you keep track of the IDs yourself).
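To illustrate the difference (the function names are just for illustration):

```cpp
#include <omp.h>
#include <thread>
#include <vector>

void withOpenMP() {
    // OpenMP hands out zero-based, consecutive IDs for free.
    #pragma omp parallel
    {
        const int tid = omp_get_thread_num(); // 0 .. num_threads - 1
        (void)tid; // partition the rows by tid here
    }
}

void withStdThread(int nthreads) {
    // With std::thread you have to pass the zero-based index in yourself.
    std::vector<std::thread> pool;
    for (int tid = 0; tid < nthreads; ++tid)
        pool.emplace_back([tid] { /* partition the rows by tid here */ });
    for (auto& t : pool)
        t.join();
}
```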
This also requires padding the matrix so that the number of rows is a multiple of the cache line size (in elements) and each column starts at an aligned block of memory. Assume the cache line is 64 bytes: if each thread works only on blocks whose size is an integer multiple of that, false sharing is avoided, because each thread touches only its "own" cache lines. If you can arrange this, there should be no extra temporaries or copies/swaps.
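A sketch of the blocked, padded scheme, assuming 64-byte cache lines and a sufficiently aligned allocation (fillPadded is an illustrative name):

```cpp
#include <Eigen/Dense>

// With 64-byte cache lines and 8-byte doubles, pad the row count to a
// multiple of 8 so each block of 8 rows fills whole cache lines within
// every (column-major) column. Caveat: Eigen aligns its buffers to
// EIGEN_MAX_ALIGN_BYTES, which may be less than 64 on some builds.
void fillPadded(Eigen::Index rows, Eigen::Index cols) {
    constexpr Eigen::Index kPerLine = 64 / sizeof(double); // 8 doubles
    const Eigen::Index padded = ((rows + kPerLine - 1) / kPerLine) * kPerLine;
    const Eigen::Index nblocks = padded / kPerLine;

    Eigen::MatrixXd m(padded, cols);

    #pragma omp parallel for
    for (Eigen::Index b = 0; b < nblocks; ++b)
        // Each iteration owns 8 consecutive rows: whole cache lines only.
        m.middleRows(b * kPerLine, kPerLine)
            .setConstant(static_cast<double>(b)); // placeholder fill

    auto realPart = m.topRows(rows); // view of the unpadded result
    (void)realPart;
}
```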