
Can a C# statement generate non-connected MSIL?

This question concerns the C# and CIL language specifications, as well as the behavior of Microsoft's and Mono's C# compilers.

I'm building some code analysis tools (the details don't matter here) that operate on CIL.

Looking at a few code samples, I notice that statements (try/catch, if/else, if/then, loops, ...) generate connected blocks of MSIL.

But I'd like to be sure that I can't write a C# construct that yields non-connected MSIL. More specifically, can I write any C# statement that translates to something like:

IL_0000: 
IL_0001: 
IL_0002: 

// hole

IL_001a: 
IL_001b:

I already tried some weird stuff using goto and nested loops, but maybe I'm not as mad as some users would be.

Regis Portalez asked Apr 23 '19 14:04



2 Answers

Sure, that's trivially possible. Something like:

static void M(bool x)
{
    if (x)
        return;
    else
        M(x);
    return;
}

If you compile that in debug mode you get

    IL_0000: nop
    IL_0001: ldarg.0
    IL_0002: stloc.0
    IL_0003: ldloc.0
    IL_0004: brfalse.s IL_0008
    IL_0006: br.s IL_0011
    IL_0008: ldarg.0
    IL_0009: call void A::M(bool)
    IL_000e: nop
    IL_000f: br.s IL_0011
    IL_0011: ret

The if statement goes from 0001 to 0009, and the consequence of the if is a goto to 0011; both return statements are the same code, so there is a "hole" containing a nop and an unconditional branch between the main body of the if and the consequence.
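To make the notion of a "hole" concrete, here is a minimal Python sketch (not part of the original answer). It assumes you already know which IL offsets belong to a statement; in practice that attribution would have to come from debug information. The `find_holes` helper and the offset attribution below are illustrative assumptions that simply follow the reading given above.

```python
def find_holes(method_offsets, statement_offsets):
    """Return offsets that lie inside the statement's span but are not part of it."""
    owned = set(statement_offsets)
    lo, hi = min(owned), max(owned)
    return [off for off in sorted(method_offsets) if lo < off < hi and off not in owned]


# Instruction offsets of the debug-mode listing above.
METHOD_OFFSETS = [0x00, 0x01, 0x02, 0x03, 0x04, 0x06, 0x08, 0x09, 0x0E, 0x0F, 0x11]

# Assumed attribution, following the answer: the if spans IL_0001..IL_0009,
# and its consequence lands on the shared ret at IL_0011.
IF_OFFSETS = [0x01, 0x02, 0x03, 0x04, 0x06, 0x08, 0x09, 0x11]

holes = find_holes(METHOD_OFFSETS, IF_OFFSETS)
print([f"IL_{off:04x}" for off in holes])  # ['IL_000e', 'IL_000f']
```

The two reported offsets are the nop and the unconditional branch mentioned above.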

More generally, you should never assume anything about the layout of the IL produced by the C# compiler. The compiler makes no guarantees whatsoever other than that the IL produced will be legal and, if safe, verifiable.


You say you are writing some code analysis tools; as the author of significant portions of the C# analyzer, and someone who worked on third-party analysis tools at Coverity, a word of advice: for the majority of questions you typically want answered about C# programs, the parse tree produced by Roslyn is the entity you wish to analyze, not the IL. The parse tree is a concrete syntax tree; it is one-to-one with every character in the source code. It can be very difficult to map optimized IL back to the original source code, and it can be very easy to produce false positives in an IL analysis.

Put another way: source-to-IL is semantics-preserving but also information-losing; you typically want to analyze the artifact that has the most information in it.

If you must, for whatever reason, operate your analyzer at the IL level, your first task should probably be to find the boundaries of the basic blocks, particularly if you are analyzing reachability properties.

A "basic block" is a contiguous chunk of IL where the end point of the block does not "carry on" to the following instruction -- because it is a branch, return or throw, for instance -- and there are no branches into the block to anywhere except the first instruction.

You can then form a graph of basic blocks for each method, indicating which ones can possibly transfer control to which other blocks. This "raises the level" of your analysis; instead of analyzing the effects of a sequence of IL instructions, now you're analyzing the effects of a graph of basic blocks.
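As a rough illustration of that step, the following Python sketch splits a simplified instruction list into basic blocks and records the possible successors of each block. It is a sketch under stated assumptions, not a complete IL analyzer: the `Instr` type, the `build_cfg` function, and the tiny opcode sets are hypothetical, and real IL also has `switch`, `leave`, exception handlers, and more.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Instr:
    offset: int
    opcode: str
    target: Optional[int] = None  # branch target offset, if any


# Hypothetical, deliberately tiny opcode classification.
UNCONDITIONAL = {"br", "br.s"}
CONDITIONAL = {"brfalse", "brfalse.s", "brtrue", "brtrue.s"}
TERMINATORS = {"ret", "throw"}
BLOCK_ENDERS = UNCONDITIONAL | CONDITIONAL | TERMINATORS


def build_cfg(instrs):
    """Return {block_start: (offsets_in_block, successor_block_starts)}."""
    instrs = sorted(instrs, key=lambda i: i.offset)
    if not instrs:
        return {}

    # A leader starts a block: the first instruction, every branch target,
    # and every instruction that follows a branch, ret or throw.
    leaders = {instrs[0].offset}
    for idx, ins in enumerate(instrs):
        if ins.target is not None:
            leaders.add(ins.target)
        if ins.opcode in BLOCK_ENDERS and idx + 1 < len(instrs):
            leaders.add(instrs[idx + 1].offset)

    # Group instructions into blocks, each running from one leader to the next.
    blocks, current = {}, None
    for ins in instrs:
        if ins.offset in leaders:
            current = ins.offset
            blocks[current] = []
        blocks[current].append(ins)

    # Work out where control can go after the last instruction of each block.
    starts = sorted(blocks)
    cfg = {}
    for n, start in enumerate(starts):
        last = blocks[start][-1]
        fallthrough = starts[n + 1] if n + 1 < len(starts) else None
        succs = set()
        if last.opcode in UNCONDITIONAL:
            succs.add(last.target)
        elif last.opcode in CONDITIONAL:
            succs.add(last.target)
            if fallthrough is not None:
                succs.add(fallthrough)
        elif last.opcode not in TERMINATORS and fallthrough is not None:
            succs.add(fallthrough)
        cfg[start] = ([i.offset for i in blocks[start]], sorted(succs))
    return cfg


# The debug-mode listing from earlier in this answer.
LISTING = [
    Instr(0x00, "nop"), Instr(0x01, "ldarg.0"), Instr(0x02, "stloc.0"),
    Instr(0x03, "ldloc.0"), Instr(0x04, "brfalse.s", 0x08),
    Instr(0x06, "br.s", 0x11), Instr(0x08, "ldarg.0"), Instr(0x09, "call"),
    Instr(0x0E, "nop"), Instr(0x0F, "br.s", 0x11), Instr(0x11, "ret"),
]

for start, (offsets, succs) in build_cfg(LISTING).items():
    print(f"IL_{start:04x}: {len(offsets)} instr(s) -> {[f'IL_{s:04x}' for s in succs]}")
```

On this listing the sketch yields four blocks starting at IL_0000, IL_0006, IL_0008 and IL_0011. Keeping the successors as block start offsets means later passes can treat the result as an ordinary directed graph.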

If you say more about what sorts of analysis you're doing I can advise further.

Eric Lippert answered Oct 21 '22 14:10


In theory, yes (this comes from my experience). Your analysis tool does not deal with C# directly; it works on IL code only. IL can be produced by anybody: not only by Visual Studio, but also by other language compilers such as Visual Basic or Python.NET... and by obfuscators! Obfuscators are the real culprit: while other compilers try to adhere to the specs, obfuscators do their best to exploit the specs and the target runtime.

Obfuscated code might violate certain common-sense patterns. Consider this case: certain smart obfuscators produce illegal MSIL, but the JIT compiler accepts it because the invalid portions are never actually executed.
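One cheap thing an IL-level tool can do here is flag blocks that no control path reaches, since that is where never-executed junk can sit. The Python sketch below is only an illustration under assumptions: the `reachable_blocks` helper and the `SUCCESSORS` graph are hypothetical, and a real tool would also have to follow exception handlers and other implicit edges before calling anything dead.

```python
from collections import deque


def reachable_blocks(successors, entry):
    """Breadth-first search over a {block: [successor blocks]} mapping."""
    seen = {entry}
    queue = deque([entry])
    while queue:
        block = queue.popleft()
        for nxt in successors.get(block, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Hypothetical method: the block at 0x05 is never branched to or fallen into.
SUCCESSORS = {0x00: [0x10], 0x05: [0x10], 0x10: []}

dead = sorted(set(SUCCESSORS) - reachable_blocks(SUCCESSORS, 0x00))
print([f"IL_{b:04x}" for b in dead])  # ['IL_0005']
```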

When building an analysis tool, you can't handle these cases unless your target is to build a deobfuscator.

Yennefer answered Oct 21 '22 13:10