
How are the digits in ObjC method type encoding calculated?

This is a follow-up to my previous question: What are the digits in an ObjC method type encoding string?

Say there is an encoding:

v24@0:4:8@12B16@20

How are those numbers calculated? B is a bool, so it should occupy just 1 byte (not 4 bytes). Does it have something to do with alignment? What is the size of void?

Is it correct to calculate the numbers as follows: take sizeof of every item and round the result up to a multiple of 4, with the first number being the sum of all the others?
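For context, an encoding string like this can be retrieved at runtime. A minimal sketch (the class and selector names are made up for illustration; the exact numbers depend on the platform and pointer size):

    #import <Foundation/Foundation.h>
    #import <objc/runtime.h>
    #import <stdbool.h>

    @interface Demo : NSObject
    - (void)doThing:(SEL)action withObject:(id)obj flag:(bool)flag other:(id)other;
    @end

    @implementation Demo
    - (void)doThing:(SEL)action withObject:(id)obj flag:(bool)flag other:(id)other {}
    @end

    int main(void) {
        @autoreleasepool {
            Method m = class_getInstanceMethod([Demo class],
                           @selector(doThing:withObject:flag:other:));
            // On a 32-bit platform this prints something close to
            // "v24@0:4:8@12B16@20"; on 64-bit the numbers are larger.
            NSLog(@"%s", method_getTypeEncoding(m));
        }
        return 0;
    }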

asked Jul 17 '12 by Ecir Hana

3 Answers

The numbers were used in the m68K days to denote stack layout. That is, you could literally decode the method signature and, for just about all types, know exactly which bytes at what offset within the stack frame you could diddle to get/set arguments.

This worked because the m68K's ABI was entirely [IIRC -- it's been a long, long time] stack-based argument/return passing. There wasn't anything shoved into registers across call boundaries.

However, as Objective-C was ported to other platforms, always-on-the-stack was no longer the calling convention. Arguments and return values are often passed in registers.

Thus, those offsets are now useless. As well, the type encoding used by the compiler is no longer complete (because it never was terribly useful) and there will be types that won't be encoded. Not to mention that encoding some C++ templatized types yields method type encoding strings that can be many kilobytes in size (I think the record I ran into was around 30K of type information).

So, no, it isn't correct to use sizeof() to generate the numbers because they are effectively meaningless to everything. The only reason why they still exist is for binary compatibility; there are bits of esoteric code here and there that still parse the type encoding string with the expectation that there will be random numbers sprinkled here and there.

Note that there are vestiges of API in the ObjC runtime that still lead one to believe that it might be possible to encode/decode stack frames on the fly. It really isn't as the C ABI doesn't guarantee that argument registers will be preserved across call boundaries in the face of optimization. You'd have to drop to assembly and things get ugly really really fast (>shudder<).
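For illustration only, those vestiges include runtime calls such as method_copyArgumentType(), which hand you the per-argument types parsed out of the encoding string, minus the offset digits. Nothing here decodes a live stack frame; it just shows that the string itself can still be split per argument:

    #import <Foundation/Foundation.h>
    #import <objc/runtime.h>
    #import <stdlib.h>

    int main(void) {
        @autoreleasepool {
            // -description is used here only because every NSObject has it.
            Method m = class_getInstanceMethod([NSObject class], @selector(description));
            unsigned int count = method_getNumberOfArguments(m);
            for (unsigned int i = 0; i < count; i++) {
                char *type = method_copyArgumentType(m, i); // type only, no offset digits
                NSLog(@"argument %u: %s", i, type);
                free(type);
            }
        }
        return 0;
    }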

answered by bbum


The full encoding string is constructed (in clang) by the method ASTContext::getObjCEncodingForMethodDecl, which you can find in lib/AST/ASTContext.cpp.

The method that does the size rounding is ASTContext::getObjCEncodingTypeSize, in the same file. It forces each size to be at least the size of an int. On all of Apple's current platforms, an int is 4 bytes.
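As a rough illustration of that rule (this is not clang's code, just the arithmetic it implies): the offsets are running sums of the rounded sizes, and the leading number is the total.

    #include <stdio.h>
    #include <stddef.h>
    #include <stdbool.h>

    /* Illustrative only: each argument's size is bumped up to at least
     * sizeof(int); the offsets are running sums of the rounded sizes,
     * and the first number in the encoding is the total. */
    static size_t rounded(size_t raw) {
        return raw < sizeof(int) ? sizeof(int) : raw;
    }

    int main(void) {
        /* Sizes of self, _cmd, a SEL, an id, a bool, and an id on a 32-bit
         * platform, matching v24@0:4:8@12B16@20. On 64-bit, pointers are
         * 8 bytes and the numbers come out differently. */
        size_t sizes[] = { 4, 4, 4, 4, sizeof(bool), 4 };
        size_t offset = 0;
        for (size_t i = 0; i < sizeof sizes / sizeof sizes[0]; i++) {
            printf("argument %zu at offset %zu\n", i, offset);
            offset += rounded(sizes[i]);
        }
        printf("frame size: %zu\n", offset); /* 24 */
        return 0;
    }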

answered by rob mayoff


The stack frame size and argument offsets are calculated by the compiler. I'm actually trying to track this down in the Clang source myself this week; it possibly has something to do with CodeGenTypes::arrangeObjCMessageSendSignature. (Looks like Rob just made my life a lot easier!)

The first number is the total space occupied by the arguments (including self and _cmd), yes. To get the size of the type represented by an ObjC type encoding in your code, you should use NSGetSizeAndAlignment().
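For example, feeding single-type encodings produced by @encode to NSGetSizeAndAlignment (the exact sizes reported depend on the platform):

    #import <Foundation/Foundation.h>
    #import <stdbool.h>

    int main(void) {
        @autoreleasepool {
            NSUInteger size, align;

            // NSGetSizeAndAlignment reads one type out of an encoding string,
            // reports its size and alignment, and returns a pointer just past
            // the type it consumed.
            NSGetSizeAndAlignment(@encode(id), &size, &align);
            NSLog(@"id:   size=%lu align=%lu", (unsigned long)size, (unsigned long)align);

            NSGetSizeAndAlignment(@encode(bool), &size, &align);
            NSLog(@"bool: size=%lu align=%lu", (unsigned long)size, (unsigned long)align);
        }
        return 0;
    }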

answered by jscs