Offset for Objective-C selectors in ARM assembly (iOS)

I'm trying to understand how iOS Objective-C message calls are implemented in ARM assembly language. Looking at IDA disassembly output, I can see the class and selector references being loaded into registers before _objc_msgSend is called. This makes perfect sense, but the strange thing is that an offset is subtracted from each of these values.

selector ref = (selRef_arrayWithObject_ - 0x29B0) 
class ref = (classRef_NSArray - 0x29BC)

The 0x29BC value in the class ref seems to point to the instruction after the call to _objc_msgSend, which has a certain logic to it, but the 0x29B0 value in the selector ref points to a seemingly random MOVT instruction. To make matters worse, this offset seems to be different for each selector invocation.

Does anyone know where these offsets come from? Why aren't they just referenced to the address of the instruction + 8?

__text:00002998 E8 1F 01 E3                 MOV             R1, #(selRef_arrayWithObject_ - 0x29B0) ; selRef_arrayWithObject_
__text:0000299C 05 20 A0 E1                 MOV             R2, R5
__text:000029A0 00 10 40 E3                 MOVT            R1, #0
__text:000029A4 01 50 A0 E3                 MOV             R5, #1
__text:000029A8 01 10 9F E7                 LDR             R1, [PC,R1] ; selRef_arrayWithObject_ ; "arrayWithObject:"
__text:000029AC 74 00 02 E3                 MOV             R0, #(classRef_NSArray - 0x29BC) ; classRef_NSArray
__text:000029B0 00 00 40 E3                 MOVT            R0, #0
__text:000029B4 00 00 9F E7                 LDR             R0, [PC,R0] ;     _OBJC_CLASS_$_NSArray
__text:000029B8 8C 05 00 EB                 BL              _objc_msgSend

Update: Here is another case:

__text:00002744 50 12 02 E3                 MOV             R1, #(selRef_view - 0x2758) ;    selRef_view
__text:00002748 00 10 40 E3                 MOVT            R1, #0
__text:0000274C 00 50 A0 E1                 MOV             R5, R0
__text:00002750 01 10 9F E7                 LDR             R1, [PC,R1] ; selRef_view ; "view"


__objc_selrefs:000049A8 1A 39 00 00 selRef_view     DCD sel_view            ; DATA XREF:     __text:000025F8o

Thanks to Igor's explanation, I understand where the 0x2758 came from, but the math doesn't seem to work out here: selRef_view - 0x2758 = 0x49A8 - 0x2758 = 0x2250. But the data in the first instruction is 50 12, which I read as 0x1250, 0x1000 less than I would expect. Any ideas?

Locksleyu asked Jan 23 '12 14:01



1 Answer

In ARM, reading the PC yields an address two instruction slots ahead of the current instruction, i.e. . + 8 in ARM mode and . + 4 in Thumb mode. That's where the "random" values come from. For example:

__text:000029A8 LDR R1, [PC,R1]

Since we're in ARM mode, the PC value is 0x29A8 + 8 = 0x29B0. So, this code is equivalent to r1 = *(int*)(r1+0x29B0). IDA gives us a hint that R1 is loaded with the value (selRef_arrayWithObject_ - 0x29B0), so after simplification we get r1 = *(int*)(selRef_arrayWithObject_), which presumably resolves to the address of the string (selector) "arrayWithObject:".
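
The same arithmetic can be checked against the raw instruction bytes. Below is a rough Python sketch, not anything IDA itself runs: the helper names movw_imm16 and arm_pc are made up for illustration, and the bytes and addresses are copied from the listings in the question. It relies on the ARM-mode MOVW/MOVT encoding, where the 16-bit immediate is split into a 4-bit field (bits 19:16) and a 12-bit field (bits 11:0), with the destination register sitting between them in bits 15:12.

```python
def movw_imm16(insn_bytes: bytes) -> int:
    """Return the 16-bit immediate of an ARM-mode MOVW/MOVT instruction.

    Hypothetical helper: takes the 4 bytes in the order IDA lists them
    (little-endian memory order).
    """
    word = int.from_bytes(insn_bytes, "little")
    imm4 = (word >> 16) & 0xF   # bits 19:16, upper nibble of the immediate
    imm12 = word & 0xFFF        # bits 11:0, lower 12 bits of the immediate
    return (imm4 << 12) | imm12


def arm_pc(insn_addr: int) -> int:
    """Value seen when an ARM-mode instruction at insn_addr reads PC."""
    return insn_addr + 8


# Second listing: MOV R1, #(selRef_view - 0x2758) is encoded as 50 12 02 E3,
# and the LDR R1, [PC,R1] that uses it sits at 0x2750.
offset = movw_imm16(bytes.fromhex("50 12 02 E3"))
target = arm_pc(0x2750) + offset
print(hex(offset), hex(target))  # 0x2250 0x49a8

# First listing: E8 1F 01 E3, with the LDR at 0x29A8.
offset = movw_imm16(bytes.fromhex("E8 1F 01 E3"))
print(hex(offset), hex(arm_pc(0x29A8) + offset))  # 0x1fe8 0x4998
```

For the second listing this gives an offset of 0x2250 and a target of 0x49A8, which matches the address shown for selRef_view. Under this reading, the 1 in the byte pair 50 12 is the destination register field (R1) rather than part of the immediate, and the missing 0x2000 comes from the low nibble of the third byte (02).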

Igor Skochinsky answered Sep 19 '22 10:09
