
The GEP Instruction: i32 vs i64

Tags: llvm, llvm-ir

I've been trying to understand LLVM's GetElementPtr (GEP) instruction and came across this document:

http://llvm.org/docs/GetElementPtr.html

It's very helpful, but there are a few things that I find confusing. In particular, in the section 'What is dereferenced by GEP?' (http://llvm.org/docs/GetElementPtr.html#id6) the following code is discussed:

%MyVar = uninitialized global { [40 x i32 ]* }
...
%idx = getelementptr { [40 x i32]* }, { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17

%MyVar is a global variable that is a pointer to a structure containing a pointer to an array of 40 ints. This is clear. I understand that arguments after %MyVar are indices into it, but I don't see why some of them are declared as i64 and others as i32.

My understanding is that this code was written for a 64-bit machine and that pointers are assumed to be 64 bits wide. The elements of the array are 32 bits wide. Why then is the last index i64 17 rather than i32 17?

I should also point out that this example illustrates illegal usage of GEP (the pointer in the structure must be dereferenced in order to index into the array of 40 ints) and I am trying to get a very good grasp of why this is the case.

banach-space asked Oct 31 '22 03:10



1 Answer

The answer to the question "what is dereferenced by GEP?" is: nothing. GEP never dereferences pointers; it only computes a new address from the pointer that you pass it. It never reads any memory.

Look at the example:

%idx = getelementptr { [40 x i32]* }, { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17

We start with %MyVar, which is a { [40 x i32]* }*, a pointer to a struct containing a pointer to an array.

After indexing with i64 0, we have a reference to a struct { [40 x i32]* }. %MyVar already pointed to this, no dereferencing necessary.

After indexing with the second index, i32 0, we now refer to the [40 x i32]*, the only member of the struct. It has the same memory location as the struct itself, which is at %MyVar.

The third index, i64 0, would now refer to the [40 x i32] array itself. This is illegal: to compute that address, GEP would have to load the pointer obtained in the previous step, and GEP never reads memory. In general, GEP can never index "through" a pointer, with the obvious exception that the initial value you pass to it is always a pointer.
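
For comparison, here is a minimal sketch of a legal sequence, in which the pointer stored in the struct is loaded explicitly before a second GEP indexes into the array. It keeps the typed-pointer syntax used in the question (newer LLVM releases write every pointer type as ptr). The global definition with zeroinitializer, the function name @elem17, and the value names %field and %arr are assumptions made for illustration only.

```llvm
; Hypothetical global: a struct holding a pointer to an array of 40 i32s.
; zeroinitializer is only a placeholder; a real program would store a
; valid array pointer here before the load below is executed.
@MyVar = global { [40 x i32]* } zeroinitializer

define i32* @elem17() {
entry:
  ; Step 1: address of the struct's only field.
  ; Pure address arithmetic, no memory is read.
  %field = getelementptr { [40 x i32]* }, { [40 x i32]* }* @MyVar, i64 0, i32 0

  ; Step 2: the explicit dereference that GEP cannot perform itself.
  %arr = load [40 x i32]*, [40 x i32]** %field

  ; Step 3: a second GEP that starts from the loaded pointer and
  ; computes the address of element 17 of the array.
  %idx = getelementptr [40 x i32], [40 x i32]* %arr, i64 0, i64 17
  ret i32* %idx
}
```

The load in step 2 is the only instruction here that touches memory; both GEPs just add offsets to a pointer they were given.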

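I will also point out that i32 0 and i64 0 are the same for the purposes of indexing: both refer to the first element. The same holds for the constant 17 that you mentioned; array and pointer indices may be integers of any width, so i32 17 would select the same element. The struct index is written as i32 because LLVM only accepts i32 constants when indexing into a struct.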
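
As a small sketch of the index-width point, assume a hypothetical function @same_element that receives a pointer to the array (typed-pointer syntax as in the question; the function and value names are made up for illustration). Both GEPs compute the address of element 17, one with i64 indices and one with i32 indices.

```llvm
define i32* @same_element([40 x i32]* %arr) {
entry:
  ; Array and pointer indices may use any integer width.
  %a = getelementptr [40 x i32], [40 x i32]* %arr, i64 0, i64 17
  %b = getelementptr [40 x i32], [40 x i32]* %arr, i32 0, i32 17

  ; %a and %b hold the same address: %arr plus 17 * 4 bytes.
  ret i32* %a
}
```

The width of an index says nothing about the size of the elements being indexed; the element size comes from the type given to the GEP.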

cfh answered Nov 20 '22 11:11
