Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does vectorization mean?

Is it a good idea to vectorize the code? What are good practices in terms of when to do it? What happens underneath?

like image 346
vehomzzz Avatar asked Oct 04 '09 15:10

vehomzzz


People also ask

What is meant by vectorization?

Vectorization is the process of converting an algorithm from operating on a single value at a time to operating on a set of values (vector) at one time. Modern CPUs provide direct support for vector operations where a single instruction is applied to multiple data (SIMD).

What is vectorization give an example?

Vectorization, in simple words, means optimizing the algorithm so that it can utilize SIMD instructions in the processors. AVX, AVX2 and AVX512 are the instruction sets (intel) that perform same operation on multiple data in one instruction. for eg. AVX512 means you can operate on 16 integer values(4 bytes) at a time.

What does it mean to vectorize an image?

Vectorization is the process of converting a raster image into a vector line by having a computer program "trace" the image and automatically create vector lines.

Why do we use vectorization?

Therefore, Vectorization or word embedding is the process of converting text data to numerical vectors. Later those vectors are used to build various machine learning models. In this manner, we say this as extracting features with the help of text with an aim to build multiple natural languages, processing models, etc.


2 Answers

Vectorization means that the compiler detects that your independent instructions can be executed as one SIMD instruction. Usual example is that if you do something like

for(i=0; i<N; i++){   a[i] = a[i] + b[i]; } 

It will be vectorized as (using vector notation)

for (i=0; i<(N-N%VF); i+=VF){   a[i:i+VF] = a[i:i+VF] + b[i:i+VF]; } 

Basically the compiler picks one operation that can be done on VF elements of the array at the same time and does this N/VF times instead of doing the single operation N times.

It increases performance, but puts more requirement on the architecture.

like image 170
Zed Avatar answered Sep 21 '22 06:09

Zed


As mentioned above, vectorization is used to make use of SIMD instructions, which can perform identical operations of different data packed into large registers.

A generic guideline to enable a compiler to autovectorize a loop is to ensure that there are no flow- and anti-dependencies b/w data elements in different iterations of a loop.

http://en.wikipedia.org/wiki/Data_dependency

Some compilers like the Intel C++/Fortran compilers are capable of autovectorizing code. In case it was not able to vectorize a loop, the Intel compiler is capable of reporting why it could not do that. There reports can be used to modify the code such that it becomes vectorizable (assuming it's possible)

Dependencies are covered in depth in the book 'Optimizing Compilers for Modern Architectures: A Dependence-based Approach'

like image 33
Gautham Ganapathy Avatar answered Sep 21 '22 06:09

Gautham Ganapathy