
Strange IL code emitted by some compiler

I've been looking at some old, (Reflector) decompiled source code that I dug up. The DLL was originally compiled from Visual Basic .NET source, using .NET 2.0 - apart from that I have no information about the compiler anymore.

At some point something strange happened. There was a branch in the code that wasn't followed, even though the condition should have held. To be exact, this was the branch:

[...]
if (item.Found > 0)
{
    [...]

Now, the interesting part was that if item.Found was -1, the scope of the if statement was entered. The type of item.Found was int.

To figure out what was going on, I went looking in the IL code and found this:

ldloc.3 
ldfld int32 Info::Found
ldc.i4.0 
cgt.un
stloc.s flag3
ldloc.s flag3
brfalse.s L_0024

Obviously Reflector was wrong here. The correct decompiled code should have been:

if ((uint)item.Found > (uint)0) 
{ ... }
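
To make the difference concrete, here is a minimal C# sketch of what a cgt.un comparison against zero computes. It is not the original source (which is unknown), and the class and method names are made up for illustration. Because -1 reinterpreted as an unsigned 32-bit value is 4294967295, the unsigned test is true for every nonzero value, so it behaves like item.Found != 0.

```csharp
using System;

static class CgtUnDemo
{
    // Hypothetical helper mirroring "ldc.i4.0; cgt.un":
    // both operands are compared as unsigned 32-bit values.
    static bool GreaterThanZeroUnsigned(int value)
    {
        return unchecked((uint)value) > 0u;
    }

    static void Main()
    {
        foreach (int found in new[] { -1, 0, 1 })
        {
            Console.WriteLine("{0}: signed > 0 is {1}, unsigned > 0 is {2}, != 0 is {3}",
                found, found > 0, GreaterThanZeroUnsigned(found), found != 0);
        }
    }
}
```

For -1 the signed comparison is false while the unsigned one is true, which matches the behaviour observed at runtime.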

OK so far for context. Now for my question.

First off, I cannot imagine someone actually writing this code; IMO no-one with a sane mind makes the distinction between '-1' and '0' this way - which are the only two values that 'Found' can have.

So, that leaves me with the conclusion that the compiler does something I do not understand.

  • Why on earth / in what context would a compiler generate IL code like this? What's the benefit of this check (instead of ceq or bne.un - which is what I would have expected and is normally generated by C#)?
  • And related: what was the original source code most likely?
atlaste asked Feb 09 '23 21:02



2 Answers

Looks quirky but this is related to previous versions of Visual Basic, the generation that ended with VB6. It had a very different Boolean type representation, a VARIANT_BOOL. This still is a factor in VB.NET due to its need to support legacy code.

The value representation for True was different, it was -1. False is 0 like it is in .NET.

While that looks like a very quirky choice as well, since most other languages use 1 to represent True, there was a very good reason for it. With -1 (all bits set), the distinction between the logical and the mathematical (bitwise) And and Or operators disappears. Which is pretty nice, one more thing a programmer doesn't have to learn. That this is a learning obstacle is pretty evident from the kind of code almost any C# programmer writes: they blindly apply && or || in their if() statements, even when it is not a good idea to do so. These operators are expensive due to the required short-circuiting branch in the machine code. If the left operand is poorly predicted by the processor's branch prediction, you'll easily lose a bunch of CPU cycles to the pipeline stall.
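
A small C# sketch of the idea, using plain integers to stand in for the two truth-value conventions (the constant names are invented for this example). The difference between -1 and 1 shows up most clearly with bitwise NOT: applied to -1 it gives 0, applied to 1 it gives -2, which is still nonzero.

```csharp
using System;

static class TruthValueDemo
{
    static void Main()
    {
        const int VbTrue = -1;     // all 32 bits set (VB6 / VARIANT_BOOL convention)
        const int OneTrue = 1;     // only the lowest bit set
        const int FalseValue = 0;

        // With -1 as True, bitwise operators give the same answers as logical ones.
        Console.WriteLine(~VbTrue);              // 0  (Not True is False)
        Console.WriteLine(VbTrue & FalseValue);  // 0  (True And False is False)
        Console.WriteLine(VbTrue | FalseValue);  // -1 (True Or False is True)
        Console.WriteLine(VbTrue & VbTrue);      // -1 (True And True is True)

        // With 1 as True, bitwise NOT does not produce False.
        Console.WriteLine(~OneTrue);             // -2, which still tests as nonzero
    }
}
```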

Nice, but not without problems: And and Or always evaluate both the left and right operands. That has a knack for tripping exceptions; sometimes you really do need short-circuiting. VB.NET added the AndAlso and OrElse operators to fix that problem.
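
As an illustration in C#, where & on bool operands always evaluates both sides (like VB.NET's And) and && short-circuits (like AndAlso), the following sketch shows how the non-short-circuit form can trip an exception. The variable names are arbitrary.

```csharp
using System;

static class ShortCircuitDemo
{
    static void Main()
    {
        string text = null;

        // && (VB.NET: AndAlso) skips the right operand when the left one is false.
        bool shortCircuited = (text != null) && (text.Length > 0);
        Console.WriteLine(shortCircuited);

        try
        {
            // & (VB.NET: And) evaluates both operands, so text.Length runs on null.
            bool eager = (text != null) & (text.Length > 0);
            Console.WriteLine(eager);
        }
        catch (NullReferenceException)
        {
            Console.WriteLine("NullReferenceException from the non-short-circuit form");
        }
    }
}
```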

So cgt.un makes sense: it can handle both a .NET Boolean value and a legacy value. It doesn't care whether the True value is -1 or 1, and it doesn't care whether the variable or expression is actually Boolean, which VB.NET permits with Option Strict Off.

Hans Passant answered Feb 12 '23 11:02



As an experiment I compiled this VB code:

Dim test As Boolean
test = True
Dim x As Integer
x = test
If x Then Console.WriteLine("True")

The IL for the release version of this is:

.custom instance void [mscorlib]System.STAThreadAttribute::.ctor()
.entrypoint
.maxstack 2
.locals init (
    [0] bool test,
    [1] int32 x)
L_0000: ldc.i4.1 
L_0001: stloc.0 
L_0002: ldloc.0 
L_0003: ldc.i4.0 
L_0004: cgt.un 
L_0006: neg 
L_0007: stloc.1 
L_0008: ldloc.1 
L_0009: ldc.i4.0 
L_000a: cgt.un 
L_000c: brfalse.s L_0018
L_000e: ldstr "True"
L_0013: call void [mscorlib]System.Console::WriteLine(string)
L_0018: ret 

Note the use of cgt.un, both in the Boolean-to-Integer conversion (L_0004, followed by neg) and in the If test (L_000a).

Reflector's interpretation as C# is:

bool test = true;
int x = (int) -(test > false);
if (x > 0x0)
{
    Console.WriteLine("True");
}

And as VB:

Dim test As Boolean = True
Dim x As Integer = CInt(-(test > False))
If (x > &H0) Then
    Console.WriteLine("True")
End If

Therefore I conclude the generated code is related to the conversion of the VB Boolean to a numeric value.
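
For readers who find the IL hard to follow, here is a rough C# sketch of what the cgt.un / neg pair at L_0004 and L_0006 appears to compute. The helper name is hypothetical, and an int stands in for the raw Boolean value since C# does not allow comparing bool to zero directly.

```csharp
using System;

static class VbBooleanConversionDemo
{
    // Hypothetical helper mirroring "ldc.i4.0; cgt.un; neg":
    // normalize any nonzero truth value to 1, then negate it to get VB's -1.
    static int ToVbInteger(int rawTruthValue)
    {
        int normalized = unchecked((uint)rawTruthValue) > 0u ? 1 : 0;
        return -normalized;
    }

    static void Main()
    {
        Console.WriteLine(ToVbInteger(1));   // -1
        Console.WriteLine(ToVbInteger(-1));  // -1
        Console.WriteLine(ToVbInteger(0));   // 0
    }
}
```

Whether the incoming True is stored as 1 or -1, the result is -1, which is the value VB assigns to a Boolean True converted to Integer.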

Matthew Watson answered Feb 12 '23 10:02
