
Is bool check faster than null check?

Tags: performance, c#

E.g. do I need to extract

bool xIsNull = x == null

from the loop where I check x == null?

As far as I know, if (a == true) and if (x == null) both use the same IL instruction. But pointers consist of 32 or 64 bits. Does the CLR have to check every bit to compare with null?

UPDATE
A quick test shows no difference, but I would still like somebody to explain this.

UPDATE2
I emit the IL myself, so I can't expect the compiler to optimize my code. Only the JIT can.

Vlad asked Jan 23 '14 14:01


2 Answers

Remembering that "Premature Optimization is the Root of all Evil" and that the first rule of optimization is "Don't" (the second, only for pros, is "Don't Do it Yet"), here's what happens.

TL;DR
If you don't feel like diving into some assembly code I won't blame you ;) The results show that the temporary variable is not optimized away and costs a couple of extra instructions. In short, though, it won't make any difference unless you are writing very time-critical code.


Consider this code:

string x = null;
bool a = x == null;

if ( a == true ) { Console.WriteLine( ); }
if ( x == null ) { Console.WriteLine( ); }

This is the generated IL in Debug mode (I added some comments):

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       46 (0x2e)
  .maxstack  2
  .locals init ([0] string x,
           [1] bool a,
           [2] bool CS$4$0000)
  IL_0000:  nop
  IL_0001:  ldnull
  IL_0002:  stloc.0    // string x = null
  IL_0003:  ldloc.0
  IL_0004:  ldnull
  IL_0005:  ceq        // compare x and null
  IL_0007:  stloc.1    // and store the result in a
  IL_0008:  ldloc.1
  IL_0009:  ldc.i4.0 
  IL_000a:  ceq
  IL_000c:  stloc.2    // compare a and false
  IL_000d:  ldloc.2
  IL_000e:  brtrue.s   IL_0018   // if true (that is, a is false), skip
  IL_0010:  nop
  IL_0011:  call       void [mscorlib]System.Console::WriteLine()
  IL_0016:  nop
  IL_0017:  nop
  IL_0018:  ldloc.0
  IL_0019:  ldnull  
  IL_001a:  ceq        // compare x and null
  IL_001c:  ldc.i4.0
  IL_001d:  ceq        // and compare with false
  IL_001f:  stloc.2
  IL_0020:  ldloc.2
  IL_0021:  brtrue.s   IL_002b  // if true (that is, x != null), skip
  IL_0023:  nop
  IL_0024:  call       void [mscorlib]System.Console::WriteLine()
  IL_0029:  nop
  IL_002a:  nop
  IL_002b:  br.s       IL_002d
  IL_002d:  ret
} // end of method Program::Main

Overall, there are a lot of ldloc and stloc instructions, which read and write locals in memory; they are highly redundant and are there to help the debugger. But you can see that there is a hidden local variable (CS$4$0000) that serves exactly the same purpose as a: if you don't use a temporary variable, the compiler introduces one for you. Also note the use of a generic null (ldnull).
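To make the hidden local concrete, here is a rough C# rendering of what the Debug IL does, one statement per group of instructions. This is only a sketch: the names DebugShape, Run, hidden and hits are mine (the compiler calls the hidden local CS$4$0000), and I count the hits instead of calling Console.WriteLine so the result is easy to check.

```csharp
using System;

static class DebugShape
{
    // Hand-written C# mirroring the Debug IL above.
    // Returns how many of the two if-bodies would run.
    public static int Run(string x)
    {
        int hits = 0;

        bool a = x == null;              // ldloc.0, ldnull, ceq, stloc.1
        bool hidden = a == false;        // ldloc.1, ldc.i4.0, ceq, stloc.2
        if (!hidden) { hits++; }         // brtrue.s skips the body when hidden is true

        hidden = (x == null) == false;   // ldloc.0, ldnull, ceq, ldc.i4.0, ceq, stloc.2
        if (!hidden) { hits++; }         // same pattern, same hidden local

        return hits;
    }

    static void Main()
    {
        Console.WriteLine(Run(null));    // 2: both bodies run
        Console.WriteLine(Run("text"));  // 0: neither body runs
    }
}
```

Both ifs end up going through the same bool local, which is why the Debug build gains nothing from hoisting the check by hand.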

Now here's the Release IL with optimizations enabled:

.method private hidebysig static void  Main(string[] args) cil managed
{
  .entrypoint
  // Code size       24 (0x18)
  .maxstack  2
  .locals init ([0] string x,
           [1] bool a)
  IL_0000:  ldnull
  IL_0001:  stloc.0     // set x to null
  IL_0002:  ldloc.0
  IL_0003:  ldnull
  IL_0004:  ceq
  IL_0006:  stloc.1     // bool a = x == null
  IL_0007:  ldloc.1
  IL_0008:  brfalse.s  IL_000f  // if false skip
  IL_000a:  call       void [mscorlib]System.Console::WriteLine()
  IL_000f:  ldloc.0
  IL_0010:  brtrue.s   IL_0017  // if true (so x != null) skip
  IL_0012:  call       void [mscorlib]System.Console::WriteLine()
  IL_0017:  ret
} // end of method Program::Main

In the optimized version the compiler no longer compares against false and no longer uses the hidden temporary; the check on x becomes a plain brtrue on the reference itself. Still, as in the unoptimized version, it stores a and loads it again right away to test the condition; this is because stloc pops a off the stack, so it has to be pushed again.
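Since the question mentions emitting IL by hand, here is a minimal sketch of emitting the same branch-on-reference pattern with System.Reflection.Emit. It assumes a DynamicMethod is acceptable for your scenario; the names EmitNullCheck, Build, IsNull and notNull are made up for the example and are not from the question.

```csharp
using System;
using System.Reflection.Emit;

static class EmitNullCheck
{
    // Builds a delegate equivalent to: arg => arg == null,
    // branching on the reference itself (brtrue.s) as the Release IL above does,
    // instead of emitting ldnull + ceq.
    public static Func<object, bool> Build()
    {
        var method = new DynamicMethod("IsNull", typeof(bool), new[] { typeof(object) });
        ILGenerator il = method.GetILGenerator();
        Label notNull = il.DefineLabel();

        il.Emit(OpCodes.Ldarg_0);            // push the reference
        il.Emit(OpCodes.Brtrue_S, notNull);  // a non-null reference counts as "true"
        il.Emit(OpCodes.Ldc_I4_1);           // null: return true
        il.Emit(OpCodes.Ret);
        il.MarkLabel(notNull);
        il.Emit(OpCodes.Ldc_I4_0);           // not null: return false
        il.Emit(OpCodes.Ret);

        return (Func<object, bool>)method.CreateDelegate(typeof(Func<object, bool>));
    }

    static void Main()
    {
        Func<object, bool> isNull = Build();
        Console.WriteLine(isNull(null));     // True
        Console.WriteLine(isNull("text"));   // False
    }
}
```

If you emit the branch this way you skip the intermediate bool entirely, so there is nothing left for the JIT to clean up.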

Now, let's compare the code generated by the JITter (I set x = Console.ReadLine() to prevent the whole code from being optimized out). This is for the debug configuration (as seen in Visual Studio):

            string x = null;
00000043  xor         edx,edx 
00000045  mov         dword ptr [ebp-40h],edx 
            bool a = x == null;
00000048  cmp         dword ptr [ebp-40h],0 
0000004c  sete        al 
0000004f  movzx       eax,al 
00000052  mov         dword ptr [ebp-44h],eax 

            if ( a == true ) { Console.WriteLine( ); }
00000055  cmp         dword ptr [ebp-44h],0 
00000059  sete        al 
0000005c  movzx       eax,al 
0000005f  mov         dword ptr [ebp-48h],eax 
00000062  cmp         dword ptr [ebp-48h],0 
00000066  jne         00000070 
00000068  nop 
00000069  call        6027B57C 
0000006e  nop 
0000006f  nop 
            if ( x == null ) { Console.WriteLine( ); }
00000054  cmp         dword ptr [ebp-0Ch],0 
00000058  jne         00000065 
0000005a  mov         ecx,dword ptr ds:[0350208Ch] 
00000060  call        602DD5E0 
            return;
00000065  nop 
00000066  mov         esp,ebp 
00000068  pop         ebp 
00000069  ret 

As you can see, this code closely follows the corresponding unoptimized IL and uses a temporary variable when checking the condition for a. On the other hand, since null is implemented as 0 on my machine, comparing x with null is a single cmp against zero and is way quicker.

And here's the code for the release as seen through OllyDbg:

                string x = Console.ReadLine( );
002F0075    E8 EA808A60     CALL mscorlib_ni.60B98164
002F007A    8BC8            MOV ECX, EAX
002F007C    8B01            MOV EAX, DWORD PTR DS:[ECX]
002F007E    8B40 2C         MOV EAX, DWORD PTR DS:[EAX+2C]
002F0081    FF50 1C         CALL DWORD PTR DS:[EAX+1C]

                bool a = x == null;
002F0084    8BF0            MOV ESI, EAX
002F0086    85F6            TEST ESI, ESI
002F0088    0F94C0          SETE AL
002F008B    0FB6C0          MOVZX EAX, AL
002F008E    8BF8            MOV EDI, EAX

                System.Diagnostics.Debugger.Break( );
002F0090    E8 E37C8E60     CALL mscorlib_ni.60BD7D78

                if ( a == true ) { Console.ReadLine( ); }
002F0095    85FF            TEST EDI, EDI
002F0097    74 0E           JE SHORT 002F00A7
002F0099    E8 A6F92D60     CALL mscorlib_ni.605CFA44
002F009E    8BC8            MOV ECX, EAX
002F00A0    8B01            MOV EAX, DWORD PTR DS:[ECX]
002F00A2    8B40 38         MOV EAX, DWORD PTR DS:[EAX+38]
002F00A5    FF10            CALL DWORD PTR DS:[EAX]

                if ( x == null ) { Console.ReadLine( ); }
002F00A7    85F6            TEST ESI, ESI
002F00A9    75 0E           JNE SHORT 002F00B9
002F00AB    E8 94F92D60     CALL mscorlib_ni.605CFA44
002F00B0    8BC8            MOV ECX, EAX
002F00B2    8B01            MOV EAX, DWORD PTR DS:[ECX]
002F00B4    8B40 38         MOV EAX, DWORD PTR DS:[EAX+38]
002F00B7    FF10            CALL DWORD PTR DS:[EAX]

                return;
002F00B9    5E              POP ESI
002F00BA    5F              POP EDI
002F00BB    5D              POP EBP
002F00BC    C3              RETN

In this code, a is held in edi and x in esi, and there are some calls into mscorlib to retrieve the pointers to ReadLine and WriteLine. That being said, there actually is a difference between the two approaches: after comparing x with null (test esi, esi), the result is moved from the zero flag to al (sete al), extended to the whole of eax (movzx eax, al), and copied into edi (mov edi, eax). Those instructions exist only to materialize a; each if then costs the same test plus a conditional jump.

So, even in such a simple case, the JITter isn't doing a great job; you can therefore expect a minor performance gain without the temporary variable.
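If you want to measure this on your own machine, a quick test along these lines should do. It is a rough sketch rather than a proper benchmark: the names NullCheckTiming, CountInline and CountHoisted and the iteration count are arbitrary choices of mine, and the JIT is free to hoist the check out of the loop in both versions. Build in Release and run without a debugger attached.

```csharp
using System;
using System.Diagnostics;

static class NullCheckTiming
{
    // Checks x == null on every iteration.
    public static int CountInline(string x, int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (x == null) { hits++; }
        }
        return hits;
    }

    // Evaluates x == null once and tests the bool inside the loop.
    public static int CountHoisted(string x, int iterations)
    {
        bool xIsNull = x == null;
        int hits = 0;
        for (int i = 0; i < iterations; i++)
        {
            if (xIsNull) { hits++; }
        }
        return hits;
    }

    static void Main(string[] args)
    {
        const int iterations = 100000000;
        // Take x from the command line so it is not a compile-time constant.
        string x = args.Length > 0 ? args[0] : null;

        // Warm-up calls so JIT compilation is not included in the timings.
        CountInline(x, 1000);
        CountHoisted(x, 1000);

        Stopwatch sw = Stopwatch.StartNew();
        int inlineHits = CountInline(x, iterations);
        sw.Stop();
        Console.WriteLine("inline:  " + sw.ElapsedMilliseconds + " ms, hits = " + inlineHits);

        sw.Restart();
        int hoistedHits = CountHoisted(x, iterations);
        sw.Stop();
        Console.WriteLine("hoisted: " + sw.ElapsedMilliseconds + " ms, hits = " + hoistedHits);
    }
}
```

The timings depend heavily on the runtime version and CPU, so treat any difference of a few percent as noise.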

BlackBear answered Oct 24 '22 02:10


With modern compilers and hardware optimizations, I doubt there is enough of a performance difference (if any) to be worth worrying about.

Part of the point of higher-level languages like C# is to take these insignificant optimization details out of the hands of app devs and leave them to compiler devs, who will do it much better. That frees app devs to base their decisions on readability and maintainability, and to spend more time on high-level algorithmic efficiency rather than the low-level stuff. If you are having performance issues, this is probably the least of your worries.

Bottom line: I would recommend using whatever you feel makes your code most readable.

Matthew Beatty answered Oct 24 '22 02:10