Can someone please tell if the 2 optimization techniques are same or different?
Also, is it responsibility of programmer or compiler to do it?
Loop tiling can target the different levels of memory (including the cache levels) and can be tuned to maximize the reuse of data at a specific level of the memory hierarchy.
Loop unrolling, also known as loop unwinding, is a loop transformation technique that attempts to optimize a program's execution speed at the expense of its binary size, which is an approach known as space–time tradeoff.
Improved floating-point performance - loop unrolling can improve performance by providing the compiler more instructions to schedule across the unrolled iterations. This reduces the number of NOPs generated and also provides the compiler with a greater opportunity to generate parallel instructions.
The two techniques are different. See descriptions for Loop unrolling and Loop tiling.
Loop unrolling is done to eliminate the overhead of looping. It is (usually) only useful for fairly small loops where the number of iterations is small and is known at compile time. It is mostly done by the compiler.
In older times when computers were slower and compilers were more primitive, programmers would do manual loop unrolling but now it would be unusual for a programmer to do it -- except possibly for a very restrictive embedded system.
Loop tiling is commonly done with very large data sets. The object is: to load some data into cache memory and perform all operations on it before paging in some new data.
Depending on the operations being performed and the internal organisation of the data, a simple loop might jump about into different data pages causing a lot of cache misses (and page loads). Careful planning of the order of execution can significantly improve run-times for certain problems.
While it is likely that a compiler might perform loop tiling, there are times when the programmer might do so manually and possibly do a better job than the compiler.
In general, don't try to do these types of optimisation as they add a lot of complexity (and bugs) to the code and usually provide only modest performance gains. However if your code is slow and profiling indicates particular types of bottlenecks, then something like loop tiling should be considered and may lead to large performance gains.
These are two totally different performance optimisations.
Loop unrolling is a code optimisation where code is replicated within a loop and the total number of loop iterations is reduced. The benefit is reduced loop overhead (normally only relevant for very small loops), and better instruction scheduling with reduced dependency stalls in superscalar CPUs. This can be done both manually and/or as a compiler optimisation.
Tiling is a memory optimisation which aims to make better use of cache by processing tiles (small blocks within a larger data structure), typically in the context of an image or other 2D data structure. This is normally implemented at the source code level, as part of the overall design of an algorithm implementation.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With