
Is string formation optimized by the compiler?

I was trying to answer another question about the == operator and I created this code:

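    NSString *aString = @"Hello";
    NSString *bString = aString;
    NSString *cString = @"Hello";

    // == compares pointers (object identity).
    if (aString == bString)
        NSLog(@"CHECK 1");

    if (bString == cString)
        NSLog(@"CHECK 2");

    // -isEqual: compares the string contents.
    if ([aString isEqual:bString])
        NSLog(@"CHECK 3");

    if ([aString isEqual:cString])
        NSLog(@"CHECK 4");

    // Print the pointer values. %i formats the pointer as an int;
    // %p is the correct specifier for an address.
    NSLog(@"%i", aString);
    NSLog(@"%i", bString);
    NSLog(@"%i", cString);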

But I was surprised at the results:

    Equal[6599:10b] CHECK 1
    Equal[6599:10b] CHECK 2
    Equal[6599:10b] CHECK 3
    Equal[6599:10b] CHECK 4
    Equal[6599:10b] 8240
    Equal[6599:10b] 8240
    Equal[6599:10b] 8240

Is there some compiler trickery going on here?

nevan king asked Dec 17 '22 06:12



2 Answers

There is clearly string uniquing going on, at least within a single compilation unit. I recommend a brief tour through man gcc, stopping at every occurrence of "string". You'll find a few options that are directly relevant to literal NSStrings and their toll-free-bridged counterparts, CFStrings:

  • -fconstant-string-class=class-name sets the name of the class used to instantiate @"..." literals. It defaults to NSConstantString unless you're using the GNU runtime. (If you don't know if you are, you aren't.)
  • -fconstant-cfstrings enables use of a builtin to create CFStrings when you write CFSTR(...).

You can disable uniquing for C string literals using -fwritable-strings, though this option is deprecated. I couldn't come up with a combination of options that would stop the uniquing of NSString literals in an Objective-C file. (Anyone want to speak to Pascal string literals?)
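
To make the identity-versus-contents distinction concrete at the C level, here is a minimal sketch. The helper names (same_address, same_contents, copy_into) are made up for illustration and are not part of any library. Whether two identical C literals share an address is left to the compiler, so the sketch only guarantees the copy case.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Identity: do both pointers refer to the same storage?
   This is what == does on two NSString * values. */
bool same_address(const char *a, const char *b) {
    return a == b;
}

/* Equality: do both strings hold the same characters?
   This is roughly what -isEqual: does for strings. */
bool same_contents(const char *a, const char *b) {
    return strcmp(a, b) == 0;
}

/* Copy src into caller-provided storage, truncating if needed, so the
   result lives at a different address than any literal. */
const char *copy_into(char *dst, size_t cap, const char *src) {
    assert(cap > 0);
    strncpy(dst, src, cap - 1);
    dst[cap - 1] = '\0';
    return dst;
}
```

With gcc, same_address("Hello", "Hello") typically returns true because identical literals are merged, but the C language does not promise this. A copy made with copy_into() into a local buffer always has a different address from the literal while same_contents() still returns true, which is the situation where == and isEqual: disagree.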

You see -fconstant-cfstrings coming into play in CFString.h's definition of the CFSTR() macro used to create CFString literals:

    #ifdef __CONSTANT_CFSTRINGS__
    #define CFSTR(cStr)  ((CFStringRef) __builtin___CFStringMakeConstantString ("" cStr ""))
    #else
    #define CFSTR(cStr)  __CFStringMakeConstantString("" cStr "")
    #endif

If you look at the implementation of the non-builtin __CFStringMakeConstantString() in CFString.c, you'll see that the function does indeed perform uniquing using a very large CFMutableDictionary:

    if ((result = (CFStringRef)CFDictionaryGetValue(constantStringTable, cStr))) {
        __CFSpinUnlock(&_CFSTRLock);
    }
    // . . .
    return result;
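
The lookup-then-insert idea behind that excerpt can be sketched in plain C. This is a simplified illustration, not CoreFoundation's code: the function name make_constant_string, the fixed-size array, and TABLE_CAPACITY are all assumptions, a linear scan stands in for the dictionary, and the locking seen in the excerpt is omitted.

```c
#include <stddef.h>
#include <string.h>

#define TABLE_CAPACITY 64   /* hypothetical fixed size, for illustration only */

/* Canonical pointers seen so far (stand-in for the dictionary). */
static const char *table[TABLE_CAPACITY];
static size_t table_count = 0;

/* Return one canonical pointer per distinct string content.
   cStr must outlive the table (true for string literals).
   Returns NULL if the table is full. Not thread-safe. */
const char *make_constant_string(const char *cStr) {
    for (size_t i = 0; i < table_count; i++) {
        if (strcmp(table[i], cStr) == 0)
            return table[i];          /* already seen: reuse that pointer */
    }
    if (table_count == TABLE_CAPACITY)
        return NULL;                  /* no room left in this toy table */
    table[table_count++] = cStr;      /* first occurrence becomes canonical */
    return cStr;
}
```

Two calls with equal contents return the same pointer even when the arguments live in different buffers, so callers can compare the results with ==.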

See also responses to the question, "What's the difference between a string constant and a string literal?"

Jeremy W. Sherman answered Dec 24 '22 00:12



NSString is defined as an immutable type, so the compiler is free to merge identical string literals into a single object, and it makes sense for it to do so whenever it can. As your code demonstrates, gcc clearly does perform this optimization for simple cases.

user57368 answered Dec 24 '22 01:12
