I'd love to have a clear explanation on, in a Windows environment (PE executables), how do CALL XXXXXXXXXXXXXXX instructions work. I've been studying the PE format but I'm quite confused about the relationship between the CALL ADDRESS instruction, the importing of a function from a dll and how does the CALL ADDRESS reach out the code in a DLL. Besides ASLR and other security functions may move around DLLs, how do executables cope with this?
The CALL instruction interrupts the flow of a program by passing control to an internal or external subroutine. An internal subroutine is part of the calling program. An external subroutine is another program.
When an x86 CALL instruction is executed, the contents of the program counter, i.e. the address of the instruction following the CALL, are pushed onto the stack and program control is transferred to the subroutine.
x86 calling conventions: CALL saves procedure linking information on the stack and branches to the called procedure specified with the destination (target) operand. The target operand specifies the address of the first instruction in the called procedure.
The CALL instruction is used to call a subroutine and saves a return address for it, whereas the JMP instruction only updates the program counter so that it points to another location inside the program.
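The difference between CALL, JMP and RET described above can be sketched with a toy model. This is only an illustration under stated assumptions: ToyCpu, doCall, doJmp and doRet are hypothetical names invented here, and the model ignores everything except the program counter and the stack.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy model of the only state that matters for this discussion.
struct ToyCpu {
    std::uint32_t eip = 0;             // address of the instruction being executed
    std::vector<std::uint32_t> stack;  // top of stack is back()
};

// CALL: push the address of the next instruction, then branch.
void doCall(ToyCpu& cpu, std::uint32_t target, std::uint32_t insnLength) {
    cpu.stack.push_back(cpu.eip + insnLength);
    cpu.eip = target;
}

// JMP: branch only; nothing is saved, so there is nothing to return to.
void doJmp(ToyCpu& cpu, std::uint32_t target) {
    cpu.eip = target;
}

// RET: pop the saved return address back into the program counter.
void doRet(ToyCpu& cpu) {
    assert(!cpu.stack.empty());
    cpu.eip = cpu.stack.back();
    cpu.stack.pop_back();
}
```

For example, with eip at 0x1000, doCall(cpu, 0x2000, 5) leaves 0x1005 on the stack. A following doJmp(cpu, 0x3000) does not touch the stack, so a doRet from there still returns to 0x1005.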
It (that is, directly calling an import with a normal relative call) doesn't work, and that's why that's not how it's done.
To call an imported function, you go through something called the Import Address Table (IAT). In short, entries in the IAT first point to function names (i.e. it starts out as a copy of the Import Name Table), and the loader then overwrites those pointers so that they point to the actual functions.
The IAT is at a fixed address, but can be relocated if the image has been rebased, so calling through it only involves a single indirection - so call r/m is used with a memory operand (which is just a simple constant) to call imported functions, for example call [0x40206C].
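The binding step described above can be modelled in a few lines. This is a minimal sketch, not how the Windows loader is implemented: IatEntry, bindImports, callThroughIat and fakeExport are hypothetical names, and a std::map stands in for a DLL's export table.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a function exported by a DLL.
using Fn = int (*)(int);
int fakeExport(int x) { return x + 1; }

// One IAT slot: before binding it only identifies the import by name,
// after binding it holds the resolved address.
struct IatEntry {
    std::string name;
    Fn address;
};

// Toy "loader": look each name up in the export map and patch the slot.
bool bindImports(std::vector<IatEntry>& iat,
                 const std::map<std::string, Fn>& exports) {
    for (auto& entry : iat) {
        auto it = exports.find(entry.name);
        if (it == exports.end()) return false;  // unresolved import
        entry.address = it->second;
    }
    return true;
}

// Rough equivalent of call [iat_slot]: one memory read, then an indirect call.
int callThroughIat(const std::vector<IatEntry>& iat, std::size_t slot, int arg) {
    assert(iat[slot].address != nullptr);
    return iat[slot].address(arg);
}
```

The calling code only ever refers to the slot, never to the function's address, which is why the same code works wherever the DLL ends up in memory.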
22 jan 2013: added additional more simple concrete examples and discussion, since (A) an incorrect answer has been selected as solution, and (B) my original answer was evidently not understood by some readers, including the OP. Sorry about that, mea culpa. I just posted an answer in a hurry then, adding a code example that I already had on hand.
You ask,
“I've been studying the PE format but I'm quite confused about the relationship between the CALL ADDRESS instruction, the importing of a function from a dll and how does the CALL ADDRESS reach out the code in a DLL.”
The term CALL ADDRESS does not make much sense at the C++ level, so I’m assuming that you mean CALL ADDRESS at the assembly language or machine code level.
The problem is then, when a DLL is loaded at some address other than the preferred one, how are the call instructions connected to the DLL functions?
A call with a specified address works by calling a minimal forwarding routine that consists of a single jmp instruction. The jmp instruction transfers control to the DLL function via a table lookup. Typically an import library for a DLL exports both the table entry that points to the DLL function itself, with an __imp__ name prefix, and the wrapper routine without such name prefix, e.g. __imp__MessageBoxA@16 and _MessageBoxA@16.
I.e., except that I’ve invented the names below, the assembler usually translates
call MessageBox
into
call MessageBox_forwarder
; whatever here
MessageBox_forwarder: jmp ds:[MessageBox_tableEntry]
When the DLL is loaded the loader places the relevant addresses in the table(s).
At the assembly language level a call with the routine specified as just an identifier can map to either a call to a forwarder, or a call directly to the DLL function via a table lookup, depending on the type declared for the identifier.
There can be more than one table of DLL function addresses, even for imports from the same DLL. But in general they’re thought of as one big table, then called “the” Import Address Table, or IAT for short. The IAT (or more precisely, each of the tables) is at a fixed place within the image rather than at a fixed address, i.e. it is moved along with the code when the image is loaded somewhere other than its preferred address.
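The "fixed place in the image, not fixed address" point is just address arithmetic, sketched below. The function names are invented for illustration, and the base addresses in the usage note are hypothetical values, not taken from any real image.

```cpp
#include <cstdint>

// Run-time (virtual) address of an item stored at a given RVA
// (offset from the start of the image) once the image is loaded.
std::uint32_t vaFromRva(std::uint32_t loadBase, std::uint32_t rva) {
    return loadBase + rva;
}

// What a base relocation does to an absolute address embedded in code or data:
// shift it by the difference between the actual and the preferred base.
std::uint32_t rebaseAbsolute(std::uint32_t address,
                             std::uint32_t preferredBase,
                             std::uint32_t actualBase) {
    return address + (actualBase - preferredBase);
}
```

Assuming a preferred base of 0x00400000, an IAT slot at RVA 0x206C sits at 0x0040206C, the address used in the call [0x40206C] example. If the image is instead loaded at a hypothetical base of 0x01160000, the same slot is at 0x0116206C, and an instruction operand that held 0x0040206C has to be adjusted by the same delta.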
The currently selected solution answer is incorrect in these ways:
The answer maintains that “It doesn't work, and that's why that's not how it's done.”, where presumably the “It” refers to a CALL ADDRESS. But using CALL ADDRESS, in assembly or at the machine code level, works just fine for calling a DLL function. Provided it’s done correctly.
The answer maintains that the IAT is at a fixed address. But it isn’t.
Let’s consider a concrete CALL ADDRESS instruction where the address is of a very well known DLL function, namely a call of the MessageBoxA Windows API function from the [user32.dll] DLL:
call MessageBoxA
There is no problem with using this instruction.
As you will see below, at the machine code level this call instruction itself just contains an offset that causes the call to go to a jmp instruction, which looks up the DLL routine address in an Import Address Table of function pointers, which is usually fixed up by the loader when it loads the DLL in question.
In order to be able to inspect the machine code, here’s a complete 32-bit x86 assembly language program using that concrete example instruction:
.model flat, stdcall
option casemap :none ; Case sensitive identifiers, please.
_as32bit textequ <DWord ptr>
public start
ExitProcess proto stdcall :DWord
MessageBoxA_t typedef proto stdcall :DWord, :DWord, :DWord, :DWord
extern MessageBoxA : MessageBoxA_t
extern _imp__MessageBoxA@16 : ptr MessageBoxA_t
MB_ICONINFORMATION equ 0040h
MB_SETFOREGROUND equ 00010000h
infoBoxOptions equ MB_ICONINFORMATION or MB_SETFOREGROUND
.const
boxtitle_1 db "Just FYI 1 (of 3):", 0
boxtitle_2 db "Just FYI 2 (of 3):", 0
boxtitle_3 db "Just FYI 3 (of 3):", 0
boxtext db "There’s intelligence somewhere in the universe", 0
.code
start:
push infoBoxOptions
push offset boxtitle_1
push offset boxtext
push 0
call MessageBoxA ; Call #1 - to jmp to DLL-func.
push infoBoxOptions
push offset boxtitle_2
push offset boxtext
push 0
call ds:[_imp__MessageBoxA@16] ; Call #2 - directly to DLL-func.
push infoBoxOptions
push offset boxtitle_3
push offset boxtext
push 0
call _imp__MessageBoxA@16 ; Call #3 - same as #2, due to type of identifier.
push 0 ; Exit code, 0 indicates success.
call ExitProcess
end
Assembling and linking using Microsoft’s toolchain, where the /debug linker option asks the linker to produce a PDB debug info file for use with the Visual Studio debugger:
[d:\dev\test\call] > ml /nologo /c asm_call.asm
 Assembling: asm_call.asm
[d:\dev\test\call] > link /nologo asm_call.obj kernel32.lib user32.lib /entry:start /subsystem:windows /debug
[d:\dev\test\call] > dir asm* /b
asm_call.asm
asm_call.exe
asm_call.ilk
asm_call.obj
asm_call.pdb
[d:\dev\test\call] > _
One easy way to debug this is now to fire up Visual Studio (the [devenv.exe] program) and in Visual Studio, click [Debug → Step into], or just press F11:
[d:\dev\test\call] > devenv asm_call.exe
[d:\dev\test\call] > _
In the figure above, showing the Visual Studio 2012 debugger in action, the leftmost big red arrow shows you the address information within the machine code instruction, namely 0000004E hex (note: the least significant byte is at lowest address, first in memory), and the other big red arrow shows you that, incredible as it may seem, this rather small magic number somehow designates the _MessageBoxA@16 function that, as far as the debugger knows, resides at address 01161064h.
The address data in the CALL ADDRESS instruction is an offset, which is relative to the address of the next instruction, and so it doesn't need any fixup for changed DLL placement.
The address that the call goes to just contains a jmp ds:[IAT_entry_for_MessageBoxA].
This forwarder code comes from the import library, not from the DLL, so it does not need fixups either (but apparently it does get some special treatment, as does the DLL function address).
The second call instruction does directly what the jmp does for the first, namely looking up the DLL function address in the IAT.
The third call instruction can now be seen to be identical to the second one at the machine code level. Apparently it is not well known how to emulate Visual C++ __declspec( dllimport ) in assembly. The above kind of declaration is one way, perhaps combined with a textequ.
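The first point above, that the address data in the call is an offset relative to the next instruction, can be checked with a small decoder. This is a sketch that handles only the 5-byte E8 rel32 form of call; callTarget is a name invented here. The call address 01161011h used in the usage note is not stated in this answer: it is inferred by subtracting the offset 4Eh and the instruction length 5 from the target 01161064h quoted above.

```cpp
#include <cassert>
#include <cstdint>

// Decode the target of a 5-byte near call:
// opcode E8 followed by a 32-bit little-endian relative offset.
std::uint32_t callTarget(std::uint32_t callAddress, const std::uint8_t (&insn)[5]) {
    assert(insn[0] == 0xE8);
    const std::uint32_t rel =
        static_cast<std::uint32_t>(insn[1]) |
        (static_cast<std::uint32_t>(insn[2]) << 8) |
        (static_cast<std::uint32_t>(insn[3]) << 16) |
        (static_cast<std::uint32_t>(insn[4]) << 24);
    // The offset is relative to the address of the NEXT instruction.
    // Unsigned wraparound makes negative offsets work without special cases.
    return callAddress + 5 + rel;
}
```

With the bytes E8 4E 00 00 00 at the inferred address 01161011h, this yields 01161064h. Moving caller and forwarder together by any amount changes both addresses equally, so the encoded 4Eh stays valid without a fixup.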
The following C++ program reports the address where it has been loaded, what DLL functions it imports from what modules, and where the various IAT tables reside.
When it’s built with a modern version of Microsoft’s toolchain, just using the defaults, it is generally loaded at a different address each time it’s run.
You can prevent this behavior by using the linker option /dynamicbase:no.
#include <assert.h> // assert
#include <stddef.h> // ptrdiff_t
#include <sstream>
using std::ostringstream;
#undef UNICODE
#define UNICODE
#include <windows.h>
template< class Result, class SomeType >
Result as( SomeType const p ) { return reinterpret_cast<Result>( p ); }
template< class Type >
class OffsetTo
{
private:
ptrdiff_t offset_;
public:
ptrdiff_t asInteger() const { return offset_; }
explicit OffsetTo( ptrdiff_t const offset ): offset_( offset ) {}
};
template< class ResultPointee, class SourcePointee >
ResultPointee* operator+(
SourcePointee* const p,
OffsetTo<ResultPointee> const offset
)
{
return as<ResultPointee*>( as<char const*>( p ) + offset.asInteger() );
}
int main()
{
auto const pImage =
as<IMAGE_DOS_HEADER const*>( ::GetModuleHandle( nullptr ) );
assert( pImage->e_magic == IMAGE_DOS_SIGNATURE );
auto const pNTHeaders =
pImage + OffsetTo<IMAGE_NT_HEADERS const>( pImage->e_lfanew );
assert( pNTHeaders->Signature == IMAGE_NT_SIGNATURE );
auto const& importDir =
pNTHeaders->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT];
auto const pImportDescriptors = pImage + OffsetTo<IMAGE_IMPORT_DESCRIPTOR const>(
importDir.VirtualAddress //+ importSectionHeader.PointerToRawData
);
ostringstream stream;
stream << "I'm loaded at " << pImage << ", and I'm using...\n";
for( int i = 0; pImportDescriptors[i].Name != 0; ++i )
{
auto const pModuleName = pImage + OffsetTo<char const>( pImportDescriptors[i].Name );
DWORD const offsetNameTable = pImportDescriptors[i].OriginalFirstThunk;
DWORD const offsetAddressTable = pImportDescriptors[i].FirstThunk; // The module "IAT"
auto const pNameTable = pImage + OffsetTo<IMAGE_THUNK_DATA const>( offsetNameTable );
auto const pAddressTable = pImage + OffsetTo<IMAGE_THUNK_DATA const>( offsetAddressTable );
stream << "\n* '" << pModuleName << "'";
stream << " with IAT at " << pAddressTable << "\n";
stream << "\t";
for( int j = 0; pNameTable[j].u1.AddressOfData != 0; ++j )
{
auto const pFuncName =
pImage + OffsetTo<char const>( 2 + pNameTable[j].u1.AddressOfData );
stream << pFuncName << " ";
}
stream << "\n";
}
MessageBoxA(
0,
stream.str().c_str(),
"FYI:",
MB_ICONINFORMATION | MB_SETFOREGROUND
);
}
Finally, from my original answer, here's a Microsoft assembler (MASM) program I made for another purpose that illustrates some of the issues, because by its nature (it produces as output source code that when assembled and run produces that same source code, and so on) it has to be completely relocatable code and with just the barest little help from the ordinary program loader:
.model flat, stdcall
option casemap :none ; Case sensitive identifiers, please.
dword_aligned textequ <4> ; Just for readability.
; Windows API functions:
extern ExitProcess@4: proc ; from [kernel32.dll]
extern GetStdHandle@4: proc ; from [kernel32.dll]
extern WriteFile@20: proc ; from [kernel32.dll]
extern wsprintfA: proc ; from [user32.dll]
STD_OUTPUT_HANDLE equ -11
; The main code.
GlobalsStruct struct dword_aligned
codeStart dword ?
outputStreamHandle dword ?
GlobalsStruct ends
globals textequ <(GlobalsStruct ptr [edi])>
.code
startup:
jmp code_start
; Trampolines to add references to these functions.
myExitProcess: jmp ExitProcess@4
myGetStdHandle: jmp GetStdHandle@4
myWriteFile: jmp WriteFile@20
mywsprintfA: jmp wsprintfA
;------------------------------------------------------------------
;
; The code below is reproduced, so it's all relative.
code_start:
jmp main
prologue:
byte ".model flat, stdcall", 13, 10
byte "option casemap :none", 13, 10
byte 13, 10
byte " extern ExitProcess@4: proc", 13, 10
byte " extern GetStdHandle@4: proc", 13, 10
byte " extern WriteFile@20: proc", 13, 10
byte " extern wsprintfA: proc", 13, 10
byte 13, 10
byte " .code", 13, 10
byte "startup:", 13, 10
byte " jmp code_start", 13, 10
byte 13, 10
byte "jmp ExitProcess@4", 13, 10
byte "jmp GetStdHandle@4", 13, 10
byte "jmp WriteFile@20", 13, 10
byte "jmp wsprintfA", 13, 10
byte 13, 10
byte "code_start:", 13, 10
prologue_nBytes equ $ - prologue
epilogue:
byte "code_end:", 13, 10
byte " end startup", 13, 10
epilogue_nBytes equ $ - epilogue
dbDirective byte 4 dup( ' ' ), "byte "
dbDirective_nBytes equ $ - dbDirective
numberFormat byte " 0%02Xh", 0
numberFormat_nBytes equ $ - numberFormat
comma byte ","
windowsNewline byte 13, 10
write:
push 0 ; space for nBytesWritten
mov ecx, esp ; &nBytesWritten
push 0 ; lpOverlapped
push ecx ; &nBytesWritten
push ebx ; nBytes
push eax ; &s[0]
push globals.outputStreamHandle
call myWriteFile
pop eax ; nBytesWritten
ret
displayMachineCode:
dmc_LocalsStruct struct dword_aligned
numberStringLen dword ?
numberString byte 16*4 DUP( ? )
fileHandle dword ?
nBytesWritten dword ?
byteIndex dword ?
dmc_LocalsStruct ends
dmc_locals textequ <[ebp - sizeof dmc_LocalsStruct].dmc_LocalsStruct>
mov ebp, esp
sub esp, sizeof dmc_LocalsStruct
; Output prologue that makes MASM happy (placing machine code data in context):
; lea eax, prologue
mov eax, globals.codeStart
add eax, prologue - code_start
mov ebx, prologue_nBytes
call write
; Output the machine code bytes.
mov dmc_locals.byteIndex, 0
dmc_lineLoop:
; loop start
; Output a db directive
;lea eax, dbDirective
mov eax, globals.codeStart
add eax, dbDirective - code_start
mov ebx, dbDirective_nBytes
call write
dmc_byteIndexingLoop:
; loop start
; Create string representation of a number
mov ecx, dmc_locals.byteIndex
mov eax, 0
;mov al, byte ptr [code_start + ecx]
mov ebx, globals.codeStart
mov al, [ebx + ecx]
push eax
;push offset numberFormat
mov eax, globals.codeStart
add eax, numberFormat - code_start
push eax
lea eax, dmc_locals.numberString
push eax
call mywsprintfA
add esp, 3*(sizeof dword)
mov dmc_locals.numberStringLen, eax
; Output string representation of number
lea eax, dmc_locals.numberString
mov ebx, dmc_locals.numberStringLen
call write
; Are we finished looping yet?
inc dmc_locals.byteIndex
mov ecx, dmc_locals.byteIndex
cmp ecx, code_end - code_start
je dmc_finalNewline
and ecx, 07h
jz dmc_after_byteIndexingLoop
; Output a comma
; lea eax, comma
mov eax, globals.codeStart
add eax, comma - code_start
mov ebx, 1
call write
jmp dmc_byteIndexingLoop
; loop end
dmc_after_byteIndexingLoop:
; New line
; lea eax, windowsNewline
mov eax, globals.codeStart
add eax, windowsNewline - code_start
mov ebx, 2
call write
jmp dmc_lineLoop;
; loop end
dmc_finalNewline:
; New line
; lea eax, windowsNewline
mov eax, globals.codeStart
add eax, windowsNewline - code_start
mov ebx, 2
call write
; Output epilogue that makes MASM happy:
; lea eax, epilogue
mov eax, globals.codeStart
add eax, epilogue - code_start
mov ebx, epilogue_nBytes
call write
mov esp, ebp
ret
main:
sub esp, sizeof GlobalsStruct
mov edi, esp
call main_knownAddress
main_knownAddress:
pop eax
sub eax, main_knownAddress - code_start
mov globals.codeStart, eax
push STD_OUTPUT_HANDLE
call myGetStdHandle
mov globals.outputStreamHandle, eax
call displayMachineCode
; Well behaved process exit:
push 0 ; Process exit code, 0 indicates success.
call myExitProcess
code_end:
end startup
And here's the self-reproducing output:
.model flat, stdcall
option casemap :none
extern ExitProcess@4: proc
extern GetStdHandle@4: proc
extern WriteFile@20: proc
extern wsprintfA: proc
.code
startup:
jmp code_start
jmp ExitProcess@4
jmp GetStdHandle@4
jmp WriteFile@20
jmp wsprintfA
code_start:
byte 0E9h, 03Bh, 002h, 000h, 000h, 02Eh, 06Dh, 06Fh
byte 064h, 065h, 06Ch, 020h, 066h, 06Ch, 061h, 074h
byte 02Ch, 020h, 073h, 074h, 064h, 063h, 061h, 06Ch
byte 06Ch, 00Dh, 00Ah, 06Fh, 070h, 074h, 069h, 06Fh
byte 06Eh, 020h, 063h, 061h, 073h, 065h, 06Dh, 061h
byte 070h, 020h, 03Ah, 06Eh, 06Fh, 06Eh, 065h, 00Dh
byte 00Ah, 00Dh, 00Ah, 020h, 020h, 020h, 020h, 065h
byte 078h, 074h, 065h, 072h, 06Eh, 020h, 020h, 045h
byte 078h, 069h, 074h, 050h, 072h, 06Fh, 063h, 065h
byte 073h, 073h, 040h, 034h, 03Ah, 020h, 070h, 072h
byte 06Fh, 063h, 00Dh, 00Ah, 020h, 020h, 020h, 020h
byte 065h, 078h, 074h, 065h, 072h, 06Eh, 020h, 020h
byte 047h, 065h, 074h, 053h, 074h, 064h, 048h, 061h
byte 06Eh, 064h, 06Ch, 065h, 040h, 034h, 03Ah, 020h
byte 070h, 072h, 06Fh, 063h, 00Dh, 00Ah, 020h, 020h
byte 020h, 020h, 065h, 078h, 074h, 065h, 072h, 06Eh
byte 020h, 020h, 057h, 072h, 069h, 074h, 065h, 046h
byte 069h, 06Ch, 065h, 040h, 032h, 030h, 03Ah, 020h
byte 070h, 072h, 06Fh, 063h, 00Dh, 00Ah, 020h, 020h
byte 020h, 020h, 065h, 078h, 074h, 065h, 072h, 06Eh
byte 020h, 020h, 077h, 073h, 070h, 072h, 069h, 06Eh
byte 074h, 066h, 041h, 03Ah, 020h, 070h, 072h, 06Fh
byte 063h, 00Dh, 00Ah, 00Dh, 00Ah, 020h, 020h, 020h
byte 020h, 02Eh, 063h, 06Fh, 064h, 065h, 00Dh, 00Ah
byte 073h, 074h, 061h, 072h, 074h, 075h, 070h, 03Ah
byte 00Dh, 00Ah, 020h, 020h, 020h, 020h, 06Ah, 06Dh
byte 070h, 020h, 020h, 020h, 020h, 020h, 063h, 06Fh
byte 064h, 065h, 05Fh, 073h, 074h, 061h, 072h, 074h
byte 00Dh, 00Ah, 00Dh, 00Ah, 06Ah, 06Dh, 070h, 020h
byte 045h, 078h, 069h, 074h, 050h, 072h, 06Fh, 063h
byte 065h, 073h, 073h, 040h, 034h, 00Dh, 00Ah, 06Ah
byte 06Dh, 070h, 020h, 047h, 065h, 074h, 053h, 074h
byte 064h, 048h, 061h, 06Eh, 064h, 06Ch, 065h, 040h
byte 034h, 00Dh, 00Ah, 06Ah, 06Dh, 070h, 020h, 057h
byte 072h, 069h, 074h, 065h, 046h, 069h, 06Ch, 065h
byte 040h, 032h, 030h, 00Dh, 00Ah, 06Ah, 06Dh, 070h
byte 020h, 077h, 073h, 070h, 072h, 069h, 06Eh, 074h
byte 066h, 041h, 00Dh, 00Ah, 00Dh, 00Ah, 063h, 06Fh
byte 064h, 065h, 05Fh, 073h, 074h, 061h, 072h, 074h
byte 03Ah, 00Dh, 00Ah, 063h, 06Fh, 064h, 065h, 05Fh
byte 065h, 06Eh, 064h, 03Ah, 00Dh, 00Ah, 020h, 020h
byte 020h, 020h, 065h, 06Eh, 064h, 020h, 073h, 074h
byte 061h, 072h, 074h, 075h, 070h, 00Dh, 00Ah, 020h
byte 020h, 020h, 020h, 062h, 079h, 074h, 065h, 020h
byte 020h, 020h, 020h, 020h, 020h, 020h, 020h, 030h
byte 025h, 030h, 032h, 058h, 068h, 000h, 02Ch, 00Dh
byte 00Ah, 06Ah, 000h, 08Bh, 0CCh, 06Ah, 000h, 051h
byte 053h, 050h, 0FFh, 077h, 004h, 0E8h, 074h, 0FEh
byte 0FFh, 0FFh, 058h, 0C3h, 08Bh, 0ECh, 083h, 0ECh
byte 050h, 08Bh, 007h, 005h, 005h, 000h, 000h, 000h
byte 0BBh, 036h, 001h, 000h, 000h, 0E8h, 0D7h, 0FFh
byte 0FFh, 0FFh, 0C7h, 045h, 0FCh, 000h, 000h, 000h
byte 000h, 08Bh, 007h, 005h, 057h, 001h, 000h, 000h
byte 0BBh, 00Fh, 000h, 000h, 000h, 0E8h, 0BFh, 0FFh
byte 0FFh, 0FFh, 08Bh, 04Dh, 0FCh, 0B8h, 000h, 000h
byte 000h, 000h, 08Bh, 01Fh, 08Ah, 004h, 019h, 050h
byte 08Bh, 007h, 005h, 066h, 001h, 000h, 000h, 050h
byte 08Dh, 045h, 0B4h, 050h, 0E8h, 02Ah, 0FEh, 0FFh
byte 0FFh, 083h, 0C4h, 00Ch, 089h, 045h, 0B0h, 08Dh
byte 045h, 0B4h, 08Bh, 05Dh, 0B0h, 0E8h, 08Fh, 0FFh
byte 0FFh, 0FFh, 0FFh, 045h, 0FCh, 08Bh, 04Dh, 0FCh
byte 081h, 0F9h, 068h, 002h, 000h, 000h, 074h, 02Bh
byte 083h, 0E1h, 007h, 074h, 013h, 08Bh, 007h, 005h
byte 06Eh, 001h, 000h, 000h, 0BBh, 001h, 000h, 000h
byte 000h, 0E8h, 06Bh, 0FFh, 0FFh, 0FFh, 0EBh, 0AAh
byte 08Bh, 007h, 005h, 06Fh, 001h, 000h, 000h, 0BBh
byte 002h, 000h, 000h, 000h, 0E8h, 058h, 0FFh, 0FFh
byte 0FFh, 0EBh, 086h, 08Bh, 007h, 005h, 06Fh, 001h
byte 000h, 000h, 0BBh, 002h, 000h, 000h, 000h, 0E8h
byte 045h, 0FFh, 0FFh, 0FFh, 08Bh, 007h, 005h, 03Bh
byte 001h, 000h, 000h, 0BBh, 01Ch, 000h, 000h, 000h
byte 0E8h, 034h, 0FFh, 0FFh, 0FFh, 08Bh, 0E5h, 0C3h
byte 083h, 0ECh, 008h, 08Bh, 0FCh, 0E8h, 000h, 000h
byte 000h, 000h, 058h, 02Dh, 04Ah, 002h, 000h, 000h
byte 089h, 007h, 06Ah, 0F5h, 0E8h, 098h, 0FDh, 0FFh
byte 0FFh, 089h, 047h, 004h, 0E8h, 023h, 0FFh, 0FFh
byte 0FFh, 06Ah, 000h, 0E8h, 084h, 0FDh, 0FFh, 0FFh
code_end:
end startup
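The way the program above finds its own data without help from the loader (the call main_knownAddress / pop eax pair followed by sub eax, main_knownAddress - code_start) amounts to the arithmetic below. It is a sketch only: the function names are invented, and the constant 24Ah is read off the dumped bytes 02Dh, 04Ah, 002h, 000h, 000h (sub eax, 24Ah) rather than from any assembler listing.

```cpp
#include <cstdint>

// Offset of main_knownAddress from code_start, as encoded in the
// sub eax, 24Ah instruction visible in the byte dump.
constexpr std::uint32_t kKnownLabelOffset = 0x24A;

// call pushes the run-time address of main_knownAddress and pop eax fetches it;
// subtracting the link-time offset gives the run-time address of code_start.
std::uint32_t recoverCodeStart(std::uint32_t poppedReturnAddress) {
    return poppedReturnAddress - kKnownLabelOffset;
}

// Every later data reference is then codeStart plus a link-time offset,
// e.g. add eax, prologue - code_start (offset 5 in the dump).
std::uint32_t runtimeAddressOf(std::uint32_t codeStart, std::uint32_t labelOffset) {
    return codeStart + labelOffset;
}
```

For instance, if the popped value were 0040124Ah (a hypothetical load address), code_start would be 00401000h and the prologue text would be at 00401005h. Only differences between labels appear in the instructions, so no absolute address needs fixing up.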