Is there a way to show where LLVM is auto vectorising?

Context: I have several loops in an Objective-C library I am writing which deal with processing large text arrays. I can see that right now it is running in a single threaded manner.

I understand that LLVM is now capable of auto-vectorising loops, as described at Apple's session at WWDC. It is however very cautious in the way it does it, one reason being the possibility of variables being modified due to CPU pipelining.

My question: how can I see where LLVM has vectorised my code and, more usefully, how can I get debug messages that explain why it can't vectorise a given loop? If the compiler knows why it can't auto-vectorise a loop, it should be able to tell me, and I could then make the manual adjustments needed to make the loop vectorisable.

I would be remiss if I didn't point out that this question has been more or less asked already, but quite obtusely, here.

Swizzlr asked Nov 10 '13 13:11



2 Answers

The standard LLVM toolchain provided by Xcode doesn't seem to support getting debug info from the optimizer. However, if you roll your own LLVM and use that, you should be able to pass flags as mishr suggested above. Here's the workflow I used:

1. Using homebrew, install llvm

    brew tap homebrew/versions
    brew install llvm33 --with-clang --with-asan

This should install the full and relatively current LLVM toolchain. It's linked into /usr/local/bin/*-3.3 (e.g. clang++-3.3). The actual on-disk location is available via brew info llvm33, and is probably /usr/local/Cellar/llvm33/3.3/bin.

2. Build the single file you're optimizing, with homebrew llvm and flags

If you've built in Xcode, you can copy-paste the build parameters and run them with clang++-3.3 instead of Xcode's own clang.

Appending -mllvm -debug-only=loop-vectorize will get you the auto-vectorization report. Note: this will likely NOT work with any remotely complex build, e.g. one that uses precompiled headers (PCHs), but it is a simple way to tweak a single .cpp file and check that it is vectorizing correctly.

3. Create a compiler plugin from the new llvm

I was able to build my entire project with homebrew llvm by:

  1. Grabbing this Xcode compiler plugin: http://trac.seqan.de/browser/trunk/util/xcode/Clang%20LLVM%20MacPorts.xcplugin.zip?order=name
  2. Modifying the clang-related paths to point to my homebrew llvm and clang bin names (by appending '-3.3')
  3. Placing it in /Library/Application Support/Developer/5.0/Xcode/Plug-ins/

Relaunching Xcode should show this plugin in the list of available compilers. At this point, the -mllvm -debug-only=loop-vectorize flag will show the auto-vectorization report.

I have no idea why this isn't exposed in the Apple builds.

UPDATE: This is exposed in current (8.x) versions of Xcode. All that is required is to enable one or more of the loop-vectorize remark flags, such as -Rpass=loop-vectorize, -Rpass-missed=loop-vectorize or -Rpass-analysis=loop-vectorize.

scztt answered Oct 12 '22 15:10


  • Identifies loops that were successfully vectorized:

    clang -Rpass=loop-vectorize
    
  • Identifies loops that failed vectorization and indicates if vectorization was specified:

    clang -Rpass-missed=loop-vectorize 
    
  • Identifies the statements that caused vectorization to fail:

    clang -Rpass-analysis=loop-vectorize
    

Source: http://llvm.org/docs/Vectorizers.html#diagnostics
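
To try the three flags on something small, here is a minimal sketch in C. The file name vec_demo.c and both function names are made up for illustration. The first loop has independent iterations and is the kind the vectorizer usually handles; the second carries a dependency from one iteration to the next, so I would expect it to show up under -Rpass-missed and -Rpass-analysis. The exact remarks depend on the clang version, target and optimization level, so treat the expectations in the comments as assumptions rather than guaranteed output.

```c
/* vec_demo.c (hypothetical file name)
 *
 * Compile without linking and ask for all three kinds of remark:
 *
 *   clang -O3 -c vec_demo.c \
 *       -Rpass=loop-vectorize \
 *       -Rpass-missed=loop-vectorize \
 *       -Rpass-analysis=loop-vectorize
 */
#include <stddef.h>

/* Every iteration is independent, and restrict promises the compiler
 * that dst and src do not overlap. Assumption: this loop is reported
 * as vectorized by -Rpass=loop-vectorize. */
void scale_add(float *restrict dst, const float *restrict src,
               float k, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = dst[i] + k * src[i];
    }
}

/* Each iteration reads the element written by the previous iteration
 * (a loop-carried dependency). Assumption: this loop is reported as
 * missed, with -Rpass-analysis pointing at the reason. */
void prefix_sum(float *data, size_t n)
{
    for (size_t i = 1; i < n; ++i) {
        data[i] = data[i] + data[i - 1];
    }
}
```

The remarks are printed with file, line and column, so once a loop is reported as missed you can go straight to the statement named in the analysis remark and rework it.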

DTharun answered Oct 12 '22 14:10