Adding Metadata to Instructions in LLVM IR

First up, I am a newbie to LLVM passes.

I am trying to add metadata to instructions in LLVM after a transformation pass (with the C++ API). I intend to store this information for use by another tool in a tool chain. I have two questions regarding this.

  1. I expect the information I store as metadata to feed into another tool that works on the LLVM IR. Is metadata a good idea for this? I intend to store strings as metadata on some instructions.

  2. If metadata is the right way to go, I need some help creating a metadata node. I plan to use the setMetadata() function to attach it to an instruction. Which variant of setMetadata() is the right one to use? I am not sure which MDKind my data should be. I want to create an MDString, attach it to an MDNode, and then call setMetadata() on an instruction. Which context should I use in setMetadata() if I want to attach the metadata to an instruction inside a function? What is the relevance of the context to metadata?

I have read a lot of forum discussions and the LLVM doxygen docs, but I did not find a clear and complete answer to all my questions. I would appreciate your help, or pointers to material that could help me understand this.

ash asked Nov 16 '12 23:11



1 Answer

In my opinion:

1. Is metadata the right mechanism to use?

If your "other tool" is not a pass in itself, then yes, I think metadata is the best approach - keeps everything in the IR, easy to identify by eye, simple to manually add for testing, and - perhaps most importantly - does not collide with anything else, as long as you don't reuse existing metadata kinds.

However, if your "other tool" is a pass by itself, there's an alternative: you can make one pass dependent on the other, and then use information from the earlier pass directly in the later one. The advantage is that you don't have to modify the IR.

2. How to use a custom metadata node?

Use the char* variant of setMetadata, like so:

LLVMContext& C = Inst->getContext();
MDNode* N = MDNode::get(C, MDString::get(C, "my md string content"));
Inst->setMetadata("my.md.name", N);

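If it's the first time the string is used in a setMetadata, it will automatically register my.md.name as a new kind. Kind names are registered in the LLVMContext rather than in a single module, so the same name maps to the same kind everywhere in that context; that is also why the metadata has to be created in the instruction's own context. You can later retrieve the string by using: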

cast<MDString>(Inst->getMetadata("my.md.name")->getOperand(0))->getString();

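Note that getMetadata returns null when the instruction carries no metadata of that kind, so check the result before calling getOperand on it. If you want to invoke getMetadata or setMetadata repeatedly from the same scope, you can also use Module::getMDKindID to get the actual kind ID once, and then use the variants of these methods that take the kind value instead of the name.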
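To make the name-versus-kind-ID distinction concrete, here is a small toy model in plain C++17. It is not LLVM code: ToyContext and ToyInst are hypothetical stand-ins invented for illustration, and real metadata nodes are richer than a single string. It only sketches the idea that a kind name is registered once per context on first use, and that the resulting ID can be looked up once and reused.

```cpp
#include <map>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a context that interns metadata kind names.
class ToyContext {
public:
    // Returns the ID for `name`, registering the name on first use.
    unsigned getKindID(const std::string& name) {
        auto it = ids_.find(name);
        if (it != ids_.end()) return it->second;
        unsigned id = static_cast<unsigned>(ids_.size());
        ids_.emplace(name, id);
        return id;
    }

private:
    std::unordered_map<std::string, unsigned> ids_;
};

// Hypothetical stand-in for an instruction that stores one string per kind.
class ToyInst {
public:
    explicit ToyInst(ToyContext& ctx) : ctx_(ctx) {}

    // Name-based variants: resolve (or register) the kind on every call.
    void setMetadata(const std::string& kind, std::string value) {
        setMetadata(ctx_.getKindID(kind), std::move(value));
    }
    std::optional<std::string> getMetadata(const std::string& kind) const {
        return getMetadata(ctx_.getKindID(kind));
    }

    // ID-based variants: skip the name lookup entirely.
    void setMetadata(unsigned kindID, std::string value) {
        md_[kindID] = std::move(value);
    }
    std::optional<std::string> getMetadata(unsigned kindID) const {
        auto it = md_.find(kindID);
        if (it == md_.end()) return std::nullopt;  // nothing attached
        return it->second;
    }

private:
    ToyContext& ctx_;
    std::map<unsigned, std::string> md_;
};
```

In this model, two instructions created from the same ToyContext agree on the ID for "my.md.name", and an instruction with nothing attached yields an empty optional, which plays the role of the null result you would need to check for in real code.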

Finally, be aware that you can limit the metadata node scope to be inside a function - use the MDNode::get(..., ..., true) variant for that - though I never used it myself.

Oak answered Oct 08 '22 01:10