Why do these simple methods compile differently?

I'm slightly confused as to why clang emits different code for the following two methods:

@interface ClassA : NSObject
@end

@implementation ClassA
+ (ClassA*)giveMeAnObject1 {
    return [[ClassA alloc] init];
}
+ (id)giveMeAnObject2 {
    return [[ClassA alloc] init];
}
@end

This is the ARMv7 assembly emitted at -O3 with ARC enabled:

        .align  2
        .code   16
        .thumb_func     "+[ClassA giveMeAnObject1]"
"+[ClassA giveMeAnObject1]":
        push    {r7, lr}
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_-(LPC0_0+4))
        mov     r7, sp
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_-(LPC0_0+4))
        movw    r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC0_1+4))
        movt    r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC0_1+4))
LPC0_0:
        add     r1, pc
LPC0_1:
        add     r0, pc
        ldr     r1, [r1]
        ldr     r0, [r0]
        blx     _objc_msgSend
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC0_2+4))
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC0_2+4))
LPC0_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        pop.w   {r7, lr}
        b.w     _objc_autorelease

        .align  2
        .code   16
        .thumb_func     "+[ClassA giveMeAnObject2]"
"+[ClassA giveMeAnObject2]":
        push    {r7, lr}
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_-(LPC2_0+4))
        mov     r7, sp
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_-(LPC2_0+4))
        movw    r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC2_1+4))
        movt    r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC2_1+4))
LPC2_0:
        add     r1, pc
LPC2_1:
        add     r0, pc
        ldr     r1, [r1]
        ldr     r0, [r0]
        blx     _objc_msgSend
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC2_2+4))
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC2_2+4))
LPC2_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        pop.w   {r7, lr}
        b.w     _objc_autoreleaseReturnValue

The only difference is the tail call: objc_autorelease in the first method versus objc_autoreleaseReturnValue in the second. I would expect both to call objc_autoreleaseReturnValue. Because the first method does not, it is potentially slower than the second: the object is always autoreleased and then retained by the caller, whereas objc_autoreleaseReturnValue lets ARC skip that redundant autorelease/retain pair when the runtime supports it.

The LLVM IR that is emitted shows where the difference comes from, but not why:

define internal %1* @"\01+[ClassA giveMeAnObject1]"(i8* nocapture %self, i8* nocapture %_cmd) {
  %1 = load %struct._class_t** @"\01L_OBJC_CLASSLIST_REFERENCES_$_", align 4
  %2 = load i8** @"\01L_OBJC_SELECTOR_REFERENCES_", align 4
  %3 = bitcast %struct._class_t* %1 to i8*
  %4 = tail call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %3, i8* %2)
  %5 = load i8** @"\01L_OBJC_SELECTOR_REFERENCES_2", align 4
  %6 = tail call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %4, i8* %5)
  %7 = tail call i8* @objc_autorelease(i8* %6) nounwind
  %8 = bitcast i8* %6 to %1*
  ret %1* %8
}

define internal i8* @"\01+[ClassA giveMeAnObject2]"(i8* nocapture %self, i8* nocapture %_cmd) {
  %1 = load %struct._class_t** @"\01L_OBJC_CLASSLIST_REFERENCES_$_", align 4
  %2 = load i8** @"\01L_OBJC_SELECTOR_REFERENCES_", align 4
  %3 = bitcast %struct._class_t* %1 to i8*
  %4 = tail call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %3, i8* %2)
  %5 = load i8** @"\01L_OBJC_SELECTOR_REFERENCES_2", align 4
  %6 = tail call i8* bitcast (i8* (i8*, i8*, ...)* @objc_msgSend to i8* (i8*, i8*)*)(i8* %4, i8* %5)
  %7 = tail call i8* @objc_autoreleaseReturnValue(i8* %6) nounwind
  ret i8* %6
}

But I'm struggling to see why it decided to compile these two methods differently. Can anyone shed some light on it?

Update:

Even weirder are these other methods:

+ (ClassA*)giveMeAnObject3 {
    ClassA *a = [[ClassA alloc] init];
    return a;
}

+ (id)giveMeAnObject4 {
    ClassA *a = [[ClassA alloc] init];
    return a;
}

These compile to:

        .align  2
        .code   16
        .thumb_func     "+[ClassA giveMeAnObject3]"
"+[ClassA giveMeAnObject3]":
        push    {r4, r7, lr}
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_-(LPC2_0+4))
        add     r7, sp, #4
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_-(LPC2_0+4))
        movw    r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC2_1+4))
        movt    r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC2_1+4))
LPC2_0:
        add     r1, pc
LPC2_1:
        add     r0, pc
        ldr     r1, [r1]
        ldr     r0, [r0]
        blx     _objc_msgSend
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC2_2+4))
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC2_2+4))
LPC2_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        blx     _objc_retainAutoreleasedReturnValue
        mov     r4, r0
        mov     r0, r4
        blx     _objc_release
        mov     r0, r4
        pop.w   {r4, r7, lr}
        b.w     _objc_autoreleaseReturnValue

        .align  2
        .code   16
        .thumb_func     "+[ClassA giveMeAnObject4]"
"+[ClassA giveMeAnObject4]":
        push    {r4, r7, lr}
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_-(LPC3_0+4))
        add     r7, sp, #4
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_-(LPC3_0+4))
        movw    r0, :lower16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC3_1+4))
        movt    r0, :upper16:(L_OBJC_CLASSLIST_REFERENCES_$_-(LPC3_1+4))
LPC3_0:
        add     r1, pc
LPC3_1:
        add     r0, pc
        ldr     r1, [r1]
        ldr     r0, [r0]
        blx     _objc_msgSend
        movw    r1, :lower16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC3_2+4))
        movt    r1, :upper16:(L_OBJC_SELECTOR_REFERENCES_2-(LPC3_2+4))
LPC3_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        blx     _objc_retainAutoreleasedReturnValue
        mov     r4, r0
        mov     r0, r4
        blx     _objc_release
        mov     r0, r4
        pop.w   {r4, r7, lr}
        b.w     _objc_autoreleaseReturnValue

This time they are identical; however, there are a few things that could be optimised further:

1. There's a redundant mov r4, r0 followed by mov r0, r4.
2. There's a retain immediately followed by a release.

Surely the tail of both of those methods could be reduced to:

LPC3_2:
        add     r1, pc
        ldr     r1, [r1]
        blx     _objc_msgSend
        pop.w   {r4, r7, lr}
        b.w     _objc_autoreleaseReturnValue

We could then also omit saving and restoring r4, because it is no longer clobbered. The method would then be exactly the same as giveMeAnObject2, which is what we'd expect.

Why is clang not being clever and doing this?!

asked Feb 05 '12 16:02

mattjgalloway

1 Answer

This appears to be a bug in the optimizer and is being tracked as rdar://problem/10813093.

answered Oct 21 '22 12:10

bbum

Tags: objective-c, automatic-ref-counting, clang
