
Understanding of Managed C++

I am having trouble understanding how managed C++ works and how it gets compiled.

In the .NET Framework, you can write code in C#, VB, F#, etc., and all of these languages are compiled to the same Common Intermediate Language (CIL), which is similar to Java bytecode. Theoretically, CIL can run on any platform (Mono made that practical). On Windows, the CLR compiles the CIL to native code Just-In-Time (JIT) and everything runs smoothly.

Now, how does Managed C++ get compiled? Does it compile to CIL code and wait for CLR to run it using JIT? I think not, because Managed C++ can use Standard C++ code (which isn't compiled to CIL). Moreover, how is it able to use .NET assemblies (which are CIL)?

I would appreciate any help. Thanks

EDIT:

I have seen this answer. It notes that in C++/CLI the managed code gets compiled to MSIL, and that you have the option to compile the unmanaged code either to native code or to MSIL. So I now understand how the calls to .NET assemblies are possible.

Anyway, I still do not understand how the C++ unmanaged code can run with the managed code in the same assembly if the unmanaged code was compiled to native code. Any ideas?

Everyone asked Feb 06 '23 00:02



1 Answer

It is a big topic with very gritty implementation details. It is hard to address them all, but there are some misconceptions in the question. Let's address those; that might help you get to the next stage.

"Moreover, how is it able to use .NET assemblies (which are CIL)?"

Not just CIL: the linker produces a mixed-mode assembly, which contains both .NET metadata + msil and native code. In fact, as far as the OS loader is concerned, it is the native code in the executable file that is normal. It is no different from the kind produced by a native C++ compiler, and it gets loaded and relocated just like a pure native executable image. It is the .NET metadata + msil that is the oddball. To the loader it just looks like a chunk of data, and the loader doesn't touch it at all. Only the CLR does.

"... use Standard C++ code (which isn't compiled to CIL)"

Not quite accurate: native C++ code can be compiled to msil or to machine code. What you get depends on whether the /clr compile option was used and on which #pragma managed or #pragma unmanaged setting is in effect at the function level. CIL does not compare that well to, say, the bytecode used by the Java JVM. It is more powerful and can support any C++03 compliant native C++ code. Sometimes you compile native C++ to msil on purpose, to take advantage of reverse pinvoke (native code calling managed code). Sometimes it happens by accident and entirely too much native C++ code gets compiled to msil. The machine code the jitter produces from that msil is not as optimal (the jitter optimizes under time constraints), and the code is not managed in any way: it is not verifiable and doesn't get garbage collector love.
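To make the per-function choice concrete, here is a minimal sketch, assuming the file is built with MSVC and /clr. The function names native_sum and sum_of_first_three are hypothetical, and so is the _MSC_VER guard, which is only there so the same file still builds cleanly as plain C++ with other compilers. MSVC ignores the managed pragma when /clr is not used.

```cpp
#include <cstddef>

// The managed pragma is MSVC-specific and only has an effect under /clr.
// The _MSC_VER guard is an assumption of this sketch: it keeps other
// compilers from complaining about an unknown pragma.
#ifdef _MSC_VER
#pragma managed(push, off)
#endif
// Hypothetical helper: compiled to machine code even in a /clr build.
int native_sum(const int* values, std::size_t count) {
    int total = 0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
#ifdef _MSC_VER
#pragma managed(pop)
#endif

// Hypothetical caller: compiled to msil in a /clr build, to machine code
// otherwise. The call to native_sum is an ordinary call expression in
// both cases.
int sum_of_first_three() {
    const int values[] = {1, 2, 3};
    return native_sum(values, 3);
}
```

Moving a function inside or outside the push/pop pair is all it takes to change what it compiles to in a /clr build, which is also how too much native code can end up as msil by accident.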

The best mental image for CIL is the intermediate representation that any native C++ compiler uses between its front-end (parser) and its back-end (code generator and optimizer). It is often an invisible implementation detail, but it gets more visible when you use a C++ compiler that is built on LLVM (like Clang). The .NET just-in-time compiler does at runtime what the LLVM back-end does at compile time.


Most programmers have the mental image of a giant mode switch being thrown when managed code calls native code (or the other way around). That is not accurate at all. You might want to take a look at this post, which shows the difference between the machine code produced by the C++ compiler's back-end and by the jitter. The key point is that it is almost identical, an essential feature to ensure that managed code is competitive with native code. It helps to clarify why managed code calling native code, or the other way around, is not that special.

Another misconception is that managed code is automatically safer. Not quite true: a language like C# lets you party with pointers and scribble around on the stack just like you can in C++, and you can corrupt memory just as easily that way. It is just partitioned better, since C# forces you to be explicit about it with the unsafe keyword. There are no such constraints in C++/CLI, anything goes.

The essential difference between managed and native code is a data structure that the jitter generates when it compiles msil. It is extra data that you don't get from a native compiler. That data is required by the garbage collector: it tells the GC how to find object roots. More about that data in this post. Having to conform to that data, and allowing the GC to get its job done, is what makes managed code a bit slower at runtime.

Hans Passant answered Feb 08 '23 16:02
