Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What is the relationship between vectorization and embarrasingly parallel?

The question says it all. It seems to me that vectorization is very closely related to embarrassingly parallel problems. In other words, all vectorizeable programs must be embarrassingly parallel programs. Is this correct?

like image 394
Dervin Thunk Avatar asked Jan 10 '13 14:01

Dervin Thunk


2 Answers

A quick summary for embarrassingly parallelism:

A code is embarrassingly parallel if the code can be parallelized without any efforts, especially handling data dependency. Note that embarrassingly parallelism only means that the code will be safely parallelized without effort; it doesn't guarantee any optimal performance.

A simple example would be summation of two vectors.

// A, B, and C are all distinct arrays.    
for (int i = 0; i < N; ++i)
  C[i] = A[i] + B[i];

This code is embarrassingly parallel because there is no data dependency on C. This code can be simply parallelized, for example, by using OpenMP:

#pragma omp parallel for 
for (int i = 0; i < N; ++i)
  C[i] = A[i] + B[i];

Vectorization is a particular form of how parallelism is achieved. In particular, vectorization mostly uses dedicated SIMD execution hardware units in processors using specialized instructions such as x86 SSE/AVX and ARM NEON. Compilers may automatically vectorize your code, or you can manually vectorize using intrinsics and direct assembly code.

I don't think that vectorization necessarily means that the code to be vectorized must be embarrassingly parallel. But, in practice, most vectorizable code is embarrassingly parallel because virtually all SIMD instructions assume that. I can't find any SIMD instructions that allow data dependent operations. In that sense, yes, you can say that vectorizable programs need to be embarrassingly parallel.

However, in a broad sense, vectorization could embrace GPGPU-style SIMD programming such as Nvidia's CUDA and Intel's MIC architecture. They allow more flexible SIMD operations by handling data-dependent operations and branches.


To summarize, in a narrow definition of the vectorization, i.e., vectorization for conventional CPU's SIMD operations, I believe that vectorizable programs should be embarrassingly parallel. However, an advanced form of SIMD/vectorization could enable data-dependent operations and arbitrary branches.

like image 63
minjang Avatar answered Sep 26 '22 15:09

minjang


Embarrassingly parallel problems are tasks that require no effort to write in parallel form. Vectorisation is the process by which a conventional procedure becomes parallel.

So, this isn't a case of one being a logical sub-type of another, but a trend. The closer a problem is to being embarrassingly parallel, the less vectorisation is required.

like image 27
Richard A. Avatar answered Sep 25 '22 15:09

Richard A.