Is it a good idea to vectorize the code? What are good practices in terms of when to do it? What happens underneath?
Vectorization is the process of converting an algorithm from operating on a single value at a time to operating on a set of values (vector) at one time. Modern CPUs provide direct support for vector operations where a single instruction is applied to multiple data (SIMD).
Vectorization, in simple words, means optimizing the algorithm so that it can utilize SIMD instructions in the processors. AVX, AVX2 and AVX512 are the instruction sets (intel) that perform same operation on multiple data in one instruction. for eg. AVX512 means you can operate on 16 integer values(4 bytes) at a time.
Vectorization is the process of converting a raster image into a vector line by having a computer program "trace" the image and automatically create vector lines.
Therefore, Vectorization or word embedding is the process of converting text data to numerical vectors. Later those vectors are used to build various machine learning models. In this manner, we say this as extracting features with the help of text with an aim to build multiple natural languages, processing models, etc.
Vectorization means that the compiler detects that your independent instructions can be executed as one SIMD instruction. Usual example is that if you do something like
for(i=0; i<N; i++){ a[i] = a[i] + b[i]; }
It will be vectorized as (using vector notation)
for (i=0; i<(N-N%VF); i+=VF){ a[i:i+VF] = a[i:i+VF] + b[i:i+VF]; }
Basically the compiler picks one operation that can be done on VF elements of the array at the same time and does this N/VF times instead of doing the single operation N times.
It increases performance, but puts more requirement on the architecture.
As mentioned above, vectorization is used to make use of SIMD instructions, which can perform identical operations of different data packed into large registers.
A generic guideline to enable a compiler to autovectorize a loop is to ensure that there are no flow- and anti-dependencies b/w data elements in different iterations of a loop.
http://en.wikipedia.org/wiki/Data_dependency
Some compilers like the Intel C++/Fortran compilers are capable of autovectorizing code. In case it was not able to vectorize a loop, the Intel compiler is capable of reporting why it could not do that. There reports can be used to modify the code such that it becomes vectorizable (assuming it's possible)
Dependencies are covered in depth in the book 'Optimizing Compilers for Modern Architectures: A Dependence-based Approach'
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With