I'm trying to understand how iOS Objective-C message calls are implemented in ARM assembly language. Looking at IDA disassembly output, I can see the class and selector references being loaded into registers before _objc_msgSend is called. This makes perfect sense, but the strange thing is that these values are encoded with an odd offset subtracted from them.
selector ref = (selRef_arrayWithObject_ - 0x29B0)
class ref = (classRef_NSArray - 0x29BC)
The 0x29BC value in the class ref seems to point to the instruction after the BL _objc_msgSend, which has a certain logic to it, but the 0x29B0 in the selector ref points to a seemingly unrelated MOVT instruction. To make matters worse, this offset is different for each selector invocation.
Does anyone know where these offsets come from? Why aren't they just referenced to the address of the instruction + 8?
__text:00002998 E8 1F 01 E3 MOV R1, #(selRef_arrayWithObject_ - 0x29B0) ; selRef_arrayWithObject_
__text:0000299C 05 20 A0 E1 MOV R2, R5
__text:000029A0 00 10 40 E3 MOVT R1, #0
__text:000029A4 01 50 A0 E3 MOV R5, #1
__text:000029A8 01 10 9F E7 LDR R1, [PC,R1] ; selRef_arrayWithObject_ ; "arrayWithObject:"
__text:000029AC 74 00 02 E3 MOV R0, #(classRef_NSArray - 0x29BC) ; classRef_NSArray
__text:000029B0 00 00 40 E3 MOVT R0, #0
__text:000029B4 00 00 9F E7 LDR R0, [PC,R0] ; _OBJC_CLASS_$_NSArray
__text:000029B8 8C 05 00 EB BL _objc_msgSend
Update: Here is another case:
__text:00002744 50 12 02 E3 MOV R1, #(selRef_view - 0x2758) ; selRef_view
__text:00002748 00 10 40 E3 MOVT R1, #0
__text:0000274C 00 50 A0 E1 MOV R5, R0
__text:00002750 01 10 9F E7 LDR R1, [PC,R1] ; selRef_view ; "view"
__objc_selrefs:000049A8 1A 39 00 00 selRef_view DCD sel_view ; DATA XREF: __text:000025F8o
Thanks to Igor's explanation, I understand where the 0x2758 came from, but the math doesn't seem to work out here: selRef_view - 0x2758 = 0x49A8 - 0x2758 = 0x2250. The data bytes of the first instruction are 50 12, which I read as 0x1250, 0x1000 less than I would expect. Any ideas?
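The arithmetic is consistent once the instruction encoding is taken into account. In the ARM-mode (A32) MOVW encoding, the 16-bit immediate is split into a 4-bit field in bits 19..16 and a 12-bit field in bits 11..0, with the destination register in bits 15..12. Reading the two low bytes 50 12 as 0x1250 therefore picks up the register nibble (R1) instead of the top nibble of the immediate, which sits in the third byte (02). The Python sketch below decodes the immediate from the bytes shown in the listing; the helper name decode_movw_imm16 is made up for illustration and is not part of IDA or any library.

```python
def decode_movw_imm16(le_bytes: bytes) -> int:
    """Return the 16-bit immediate of an ARM-mode MOVW (MOV Rd, #imm16).

    le_bytes holds the four instruction bytes in memory (little-endian) order,
    exactly as they appear in the IDA listing.
    """
    word = int.from_bytes(le_bytes, "little")
    # A32 MOVW layout: cond(4) 0011 0000 imm4(4) Rd(4) imm12(12)
    if (word >> 20) & 0xFF != 0x30:
        raise ValueError("not an A32 MOVW encoding")
    imm4 = (word >> 16) & 0xF   # top nibble of the immediate
    imm12 = word & 0xFFF        # low 12 bits of the immediate
    return (imm4 << 12) | imm12


# Bytes of the instruction at 0x2744 in the listing above.
imm = decode_movw_imm16(bytes.fromhex("50 12 02 E3"))
print(hex(imm))            # 0x2250
print(hex(0x2758 + imm))   # 0x49a8, the address of selRef_view
```

The same decoding applied to E8 1F 01 E3 (the instruction at 0x2998) gives 0x1FE8, and to 74 00 02 E3 (at 0x29AC) gives 0x2074.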
In ARM, the PC value points two instruction slots ahead, i.e. . + 8 in ARM mode and . + 4 in Thumb mode. That's where the "random" values come from. For example:
__text:000029A8 LDR R1, [PC,R1]
Since we're in ARM mode, the PC value is 0x29A8 + 8 = 0x29B0. So, this code is equivalent to r1 = *(int*)(r1+0x29B0). IDA gives us a hint that R1 is loaded with the value (selRef_arrayWithObject_ - 0x29B0), so after simplification we get r1 = *(int*)(selRef_arrayWithObject_), which presumably resolves to the address of the string (selector) "arrayWithObject:".
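The PC arithmetic can be checked with a short Python sketch. The function names arm_pc_value and pc_relative_target are hypothetical helpers written for this post, and the sketch ignores the word alignment that some Thumb PC-relative forms apply to the PC value.

```python
def arm_pc_value(insn_addr: int, thumb: bool = False) -> int:
    """Value the instruction at insn_addr sees when it reads PC.

    PC reads as two instruction slots ahead: +8 in ARM mode, +4 in Thumb mode.
    """
    return insn_addr + (4 if thumb else 8)


def pc_relative_target(insn_addr: int, offset: int, thumb: bool = False) -> int:
    """Address accessed by LDR Rd, [PC, Rm] when Rm holds offset."""
    return (arm_pc_value(insn_addr, thumb) + offset) & 0xFFFFFFFF


# LDR R1, [PC,R1] at 0x29A8: this is where the 0x29B0 in IDA's expression comes from.
print(hex(arm_pc_value(0x29A8)))   # 0x29b0
# LDR R0, [PC,R0] at 0x29B4: likewise for 0x29BC.
print(hex(arm_pc_value(0x29B4)))   # 0x29bc
# LDR R1, [PC,R1] at 0x2750 with R1 = selRef_view - 0x2758 = 0x2250.
print(hex(pc_relative_target(0x2750, 0x49A8 - 0x2758)))   # 0x49a8
```

This also shows why the subtracted constant differs at every call site: it is always the address of the LDR instruction that consumes the offset plus 8, not the address of the MOV that loads it.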