I'm experimenting with the Objective-C runtime, trying to compile Objective-C code without linking it against libobjc, and I'm getting a segmentation fault in one program, so I generated an assembly file from it. I don't think the whole assembly file is needed. At some point in my main function there is the following line (which, by the way, is the line after which I get the segfault):
callq *l_objc_msgSend_fixup_alloc
and here is the definition of l_objc_msgSend_fixup_alloc:
.hidden l_objc_msgSend_fixup_alloc # @"\01l_objc_msgSend_fixup_alloc"
.type l_objc_msgSend_fixup_alloc,@object
.section "__DATA, __objc_msgrefs, coalesced","aw",@progbits
.weak l_objc_msgSend_fixup_alloc
.align 16
l_objc_msgSend_fixup_alloc:
.quad objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_
.size l_objc_msgSend_fixup_alloc, 16
I've reimplemented objc_msgSend_fixup as a function (id objc_msgSend_fixup(id self, SEL op, ...)) which returns nil (just to see what happens), but this function isn't even being called (the program crashes before calling it).
So, my question is: what is callq *l_objc_msgSend_fixup_alloc supposed to do, and what is objc_msgSend_fixup (after l_objc_msgSend_fixup_alloc:) supposed to be (a function or an object)?
Edit
To explain better: I'm not linking my source file against the objc library. What I'm trying to do is implement some parts of the library myself, just to see how it works. Here is a sketch of what I've done:
#include <stdio.h>
#include <objc/runtime.h>

@interface MyClass {
}
+(id) alloc;
@end

@implementation MyClass
+(id) alloc {
    // alloc the object
    return nil;
}
@end

id objc_msgSend_fixup(id self, SEL op, ...) {
    printf("Calling objc_msgSend_fixup()...\n");
    // looks for the method implementation for SEL in self's method list
    return nil; // Since this is just a test, this function doesn't need to do that
}

int main(int argc, char *argv[]) {
    MyClass *m;
    m = [MyClass alloc]; // At this point, according to the assembly code generated
    // objc_msgSend_fixup should be called. So, the program should, at least, print
    // "Calling objc_msgSend_fixup()..." on the screen, but it crashes before
    // objc_msgSend_fixup() is called...
    return 0;
}
If the runtime needs to access the object's vtable or the method list of the object's class to find the correct method to call, which function actually does this? I think it is objc_msgSend_fixup in this case. So when objc_msgSend_fixup is called, it receives an object as one of its parameters, and if this object hasn't been initialized, the function fails.
So I've implemented my own version of objc_msgSend_fixup. According to the assembly source above, it should be called. It doesn't matter whether the function actually looks up the implementation of the selector passed as a parameter; I just want objc_msgSend_fixup to be called. But it isn't: the function that looks up the object's data is never even called, rather than being called and causing a fault (because it returns nil, which, by the way, doesn't matter). The program segfaults before objc_msgSend_fixup is called...
Edit 2
A more complete assembly snippet:
.globl main
.align 16, 0x90
.type main,@function
main: # @main
.Ltmp20:
.cfi_startproc
# BB#0:
pushq %rbp
.Ltmp21:
.cfi_def_cfa_offset 16
.Ltmp22:
.cfi_offset %rbp, -16
movq %rsp, %rbp
.Ltmp23:
.cfi_def_cfa_register %rbp
subq $32, %rsp
movl $0, %eax
leaq l_objc_msgSend_fixup_alloc, %rcx
movl $0, -4(%rbp)
movl %edi, -8(%rbp)
movq %rsi, -16(%rbp)
movq L_OBJC_CLASSLIST_REFERENCES_$_, %rsi
movq %rsi, %rdi
movq %rcx, %rsi
movl %eax, -28(%rbp) # 4-byte Spill
callq *l_objc_msgSend_fixup_alloc
movq %rax, -24(%rbp)
movl -28(%rbp), %eax # 4-byte Reload
addq $32, %rsp
popq %rbp
ret
For l_objc_msgSend_fixup_alloc, we have:
.hidden l_objc_msgSend_fixup_alloc # @"\01l_objc_msgSend_fixup_alloc"
.type l_objc_msgSend_fixup_alloc,@object
.section "__DATA, __objc_msgrefs, coalesced","aw",@progbits
.weak l_objc_msgSend_fixup_alloc
.align 16
l_objc_msgSend_fixup_alloc:
.quad objc_msgSend_fixup
.quad L_OBJC_METH_VAR_NAME_
.size l_objc_msgSend_fixup_alloc, 16
For L_OBJC_CLASSLIST_REFERENCES_$_:
.type L_OBJC_CLASSLIST_REFERENCES_$_,@object # @"\01L_OBJC_CLASSLIST_REFERENCES_$_"
.section "__DATA, __objc_classrefs, regular, no_dead_strip","aw",@progbits
.align 8
L_OBJC_CLASSLIST_REFERENCES_$_:
.quad OBJC_CLASS_$_MyClass
.size L_OBJC_CLASSLIST_REFERENCES_$_, 8
OBJC_CLASS_$_MyClass is a pointer to the MyClass struct definition, which is also generated by the compiler and is also present in the assembly code.
To understand what objc_msgSend_fixup is and what it does, it's necessary to know exactly how message sending is performed in Objective-C. Every ObjC programmer has heard at some point that the compiler transforms [obj message] statements into objc_msgSend(obj, sel_registerName("message")) calls. However, that's not entirely accurate.
To better illustrate my explanation, consider the following ObjC snippet:
[obj mesgA];
[obj mesgB];
[obj mesgA];
[obj mesgB];
In this snippet, two messages are sent to obj, each of them twice. So you might imagine that the following code is generated:
objc_msgSend(obj, sel_registerName("mesgA"));
objc_msgSend(obj, sel_registerName("mesgB"));
objc_msgSend(obj, sel_registerName("mesgA"));
objc_msgSend(obj, sel_registerName("mesgB"));
However, sel_registerName may be too costly, and calling it every time a specific method is invoked is not a smart thing to do. Instead, the compiler generates a structure like this for each message to be sent:
typedef struct message_ref {
id (*trampoline) (id obj, struct message_ref *ref, ...);
union {
const char *str;
SEL sel;
};
} message_ref;
So, in the example above, when the program starts, we have something like this:
message_ref l_objc_msgSend_fixup_mesgA = { &objc_msgSend_fixup, "mesgA" };
message_ref l_objc_msgSend_fixup_mesgB = { &objc_msgSend_fixup, "mesgB" };
When these messages need to be sent to obj, the compiler generates code equivalent to the following:
l_objc_msgSend_fixup_mesgA.trampoline(obj, &l_objc_msgSend_fixup_mesgA, ...); // [obj mesgA];
l_objc_msgSend_fixup_mesgB.trampoline(obj, &l_objc_msgSend_fixup_mesgB, ...); // [obj mesgB];
At program startup, the message reference trampolines are pointers to the objc_msgSend_fixup function. For each message_ref, when its trampoline pointer is invoked for the first time, objc_msgSend_fixup gets called, receiving the obj to which the message has to be sent and the message_ref structure through which it was called. What objc_msgSend_fixup must do, then, is get the selector for the message being sent. Since this has to be done only once per message reference, objc_msgSend_fixup must also replace the trampoline field of the ref with a pointer to another function that doesn't fix up the message's selector. This function is called objc_msgSend_fixedup (the selector has been fixed up). Now that the selector has been set and this doesn't have to be done again, objc_msgSend_fixup just calls objc_msgSend_fixedup, which in turn just calls objc_msgSend. After that, if a message ref's trampoline is called again, its selector is already fixed and objc_msgSend_fixedup is the one that gets called.
In short, we could write objc_msgSend_fixup and objc_msgSend_fixedup like this:
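// Pseudocode: "..." stands for forwarding the caller's arguments unchanged.
id objc_msgSend_fixedup(id obj, struct message_ref *ref, ...);

id objc_msgSend_fixup(id obj, struct message_ref *ref, ...) {
    ref->sel = sel_registerName(ref->str);   // resolve the selector once
    ref->trampoline = &objc_msgSend_fixedup; // later calls skip this function
    return objc_msgSend_fixedup(obj, ref, ...);
}

id objc_msgSend_fixedup(id obj, struct message_ref *ref, ...) {
    return objc_msgSend(obj, ref->sel, ...);
}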
This makes message sending a lot faster, since the appropriate selector is looked up only the first time the message is sent (by objc_msgSend_fixup). On later calls, the selector has already been found and the message is sent directly with objc_msgSend (by objc_msgSend_fixedup).
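The fix-up-once behaviour described above can be imitated in plain C. What follows is only a toy model, not the runtime's code: the real functions forward arbitrary arguments, which C cannot express directly. Every toy_* name, the two counters, and the fixed (receiver, ref) signature are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Toy stand-ins for id and SEL (hypothetical, not the runtime's types). */
typedef struct toy_object { const char *name; } toy_object;
typedef const char *toy_sel;

typedef struct toy_message_ref toy_message_ref;
typedef int (*toy_trampoline)(toy_object *obj, toy_message_ref *ref);

struct toy_message_ref {
    toy_trampoline trampoline;   /* starts out pointing at toy_fixup */
    union {
        const char *str;         /* before the fix-up: the method name */
        toy_sel sel;             /* after the fix-up: the selector */
    };
};

static int toy_registrations = 0;   /* calls to toy_sel_register */
static int toy_sends = 0;           /* calls to toy_msg_send */

/* Stand-in for sel_registerName: interns the name and counts calls. */
static toy_sel toy_sel_register(const char *name) {
    static const char *table[16];
    static int count = 0;
    toy_registrations++;
    for (int i = 0; i < count; i++)
        if (strcmp(table[i], name) == 0)
            return table[i];
    if (count < 16)
        table[count++] = name;
    return name;
}

/* Stand-in for objc_msgSend: only records that a message was sent. */
static int toy_msg_send(toy_object *obj, toy_sel sel) {
    printf("%s <- %s\n", obj->name, sel);
    return ++toy_sends;
}

/* Selector already resolved: go straight to the send. */
static int toy_fixedup(toy_object *obj, toy_message_ref *ref) {
    return toy_msg_send(obj, ref->sel);
}

/* First call through a ref: resolve the selector, then swap the trampoline. */
static int toy_fixup(toy_object *obj, toy_message_ref *ref) {
    ref->sel = toy_sel_register(ref->str);
    ref->trampoline = toy_fixedup;
    return toy_fixedup(obj, ref);
}

static toy_message_ref ref_mesgA = { toy_fixup, { "mesgA" } };
static toy_message_ref ref_mesgB = { toy_fixup, { "mesgB" } };

/* Mirrors the four message sends from the snippet above. */
static void toy_demo(void) {
    toy_object obj = { "obj" };
    ref_mesgA.trampoline(&obj, &ref_mesgA);
    ref_mesgB.trampoline(&obj, &ref_mesgB);
    ref_mesgA.trampoline(&obj, &ref_mesgA);
    ref_mesgB.trampoline(&obj, &ref_mesgB);
    assert(ref_mesgA.trampoline == toy_fixedup);
    assert(ref_mesgB.trampoline == toy_fixedup);
}
```

Calling toy_demo() from a main function performs four sends but only two selector registrations, one per message reference, which is the saving the trampoline swap is meant to provide.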
In the question's assembly code, l_objc_msgSend_fixup_alloc is the alloc method's message_ref structure, and the segmentation fault may have been caused by a problem in its first field (maybe it's not pointing to objc_msgSend_fixup...).
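In C terms, callq *l_objc_msgSend_fixup_alloc loads the pointer stored at the start of the structure and calls it. In the question's listing, %rdi holds the class and %rsi holds the address of the structure itself (leaq l_objc_msgSend_fixup_alloc, %rcx followed by movq %rcx, %rsi), so a replacement declared as (id self, SEL op, ...) would receive the message_ref address in op rather than a selector. A minimal sketch of that call shape, where toy_ref, toy_fixup_stub and toy_send_alloc are hypothetical names and void * stands in for id:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the two .quad fields in the assembly listing. */
typedef struct toy_ref {
    void *(*trampoline)(void *receiver, struct toy_ref *ref);
    const char *name;
} toy_ref;

static toy_ref *toy_last_ref = NULL;

/* Stand-in for a user-written objc_msgSend_fixup: remembers what it was given. */
static void *toy_fixup_stub(void *receiver, toy_ref *ref) {
    (void)receiver;
    toy_last_ref = ref;   /* the second argument is the ref, not a selector */
    return NULL;
}

static toy_ref toy_alloc_ref = { toy_fixup_stub, "alloc" };

/* C equivalent of: leaq toy_alloc_ref, %rsi ; callq *toy_alloc_ref */
static void *toy_send_alloc(void *cls) {
    /* The function pointer sits at offset 0, which is why an indirect
       call through the structure's own address reaches it. */
    assert(offsetof(toy_ref, trampoline) == 0);
    return toy_alloc_ref.trampoline(cls, &toy_alloc_ref);
}
```

If the first field does not hold a valid function address when the call executes, the crash happens at the call instruction itself, before any replacement function can print anything.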
Ok, your code is Objective-C, not C.
Edit / About objc_msgSend_fixup
objc_msgSend_fixup is internal Objective-C runtime machinery, used to manage calls using a C++-style method vtable.
You may read some articles about this here:
Edit / End
Now about your segfault.
Objective-C uses a runtime for message passing, allocations, etc.
Message passing (a method call) is usually done by the objc_msgSend function.
That's what is used when you do:
[ someObject someFunction: someArg ];
It's translated to:
objc_msgSend( someObject, @selector( someFunction: ), someArg );
So if you have a segfault in such a runtime function, such as objc_msgSend_fixup_alloc, it certainly means you are calling a method on an uninitialized pointer (if not using ARC), or on a freed object.
Something like:
NSObject * o;
[ o retain ]; // Will segfault somewhere in the Obj-C runtime in non ARC, as 'o' may point to anything.
Or:
NSObject * o;
o = [ [ NSObject alloc ] init ];
[ o release ];
[ o retain ]; // Will segfault somewhere in the Obj-C runtime as 'o' is no longer a valid object address.
So even if the segfault location is in the runtime, this is certainly a basic Objective-C memory management issue, in your own code.
Try enabling NSZombie, it should help.
Also try the static analyzer.
Edit 2
It's crashing in the runtime, because the runtime needs to access the object's vtable to find the correct method to call.
As the object is invalid, the vtable lookup results in the dereference of an invalid pointer.
This is why the segfault is located here.
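As a rough illustration of why an invalid receiver faults inside the lookup rather than in the caller, here is a toy dispatch table in C. It is not libobjc's data layout: toy_class, toy_instance, toy_lookup and the other names are hypothetical, and the NULL check only mirrors the fact that sending a message to nil is harmless.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct toy_class toy_class;
typedef struct toy_instance { toy_class *isa; } toy_instance;
typedef int (*toy_imp)(toy_instance *self);
typedef struct toy_method { const char *name; toy_imp imp; } toy_method;

struct toy_class {
    const char *name;
    toy_class *superclass;
    const toy_method *methods;
    size_t method_count;
};

/* Walks the class chain looking for a method by name. */
static toy_imp toy_lookup(toy_instance *obj, const char *name) {
    if (obj == NULL)
        return NULL;                      /* nil receiver: nothing to call */
    /* obj->isa is read here: a garbage or freed obj faults on this line,
       inside the lookup, long before any method body runs. */
    for (toy_class *cls = obj->isa; cls != NULL; cls = cls->superclass)
        for (size_t i = 0; i < cls->method_count; i++)
            if (strcmp(cls->methods[i].name, name) == 0)
                return cls->methods[i].imp;
    return NULL;                          /* unknown method */
}

static int toy_answer(toy_instance *self) { (void)self; return 42; }

static const toy_method toy_methods[] = { { "answer", toy_answer } };
static toy_class toy_root = { "Root", NULL, toy_methods, 1 };
static toy_instance toy_valid = { &toy_root };

/* Exercises the three well-defined cases: found, nil receiver, unknown name. */
static void toy_lookup_demo(void) {
    toy_imp imp = toy_lookup(&toy_valid, "answer");
    assert(imp != NULL && imp(&toy_valid) == 42);
    assert(toy_lookup(NULL, "answer") == NULL);
    assert(toy_lookup(&toy_valid, "missing") == NULL);
}
```

The demo deliberately avoids passing an uninitialized pointer, since doing so is undefined behaviour; the comment inside toy_lookup marks where such a pointer would be dereferenced.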
Edit 3
You say you're not linked with the objc library.
What do you call the «objc library»?
I'm asking this because, as we can see in your code, you are definitely using an Objective-C compiler.
You may not link with the «Foundation» framework, for instance, which provides the base objects, but since you're using an Objective-C compiler, the libobjc library (providing the runtime) will still be implicitly linked.
Are you sure it's not the case? Try a simple nm on your resulting binary.
Edit 4
If this is really the case, then objc_msgSend_fixup is not the first function to implement in order to recreate the runtime.
As you define a class, the runtime needs to know about it, so you need to implement things like objc_allocateClassPair and friends.
You'll also need to ensure the compiler won't use shortcuts.
I've seen things in your code like L_OBJC_CLASSLIST_REFERENCES_$_.
Does this symbol exist in your own version?