C# Native Interop - Why do most libraries use LoadLibrary and delegates instead of SetDllDirectory and simple DllImport?

There is a great answer on SO about how to set the search directory for DllImport at runtime. It works just fine with two lines of code.

However, many open source projects use the LoadLibrary function instead. There are "rumors" that calling native methods via delegates is slower. I call them "rumors" because I have seen this claim in only two places, and it is a micro-optimization anyway.

The most interesting place is this blog post: http://ybeernet.blogspot.com/2011/03/techniques-of-calling-unmanaged-code.html

There, the author measured performance of different techniques:

  • C# (informative) 4318 ms
  • PInvoke - suppressed security 5415 ms
  • Calli instruction 5505 ms
  • C++/CLI 6311 ms
  • Function delegate - suppressed security 7788 ms
  • PInvoke 8249 ms
  • Function delegate 11594 ms

NNanomsg uses function delegates but mentions the blog post with a comment "The performance impact of this over conventional P/Invoke is evidently not good" on this line.

Kestrel server from MSFT's ASP.NET vNext uses the same technique with the Libuv library: here is the code

I think that delegates are more cumbersome to use than a simple DllImport. Given the performance difference, I wonder why performance-oriented libraries use delegates instead of setting the DLL search folder.
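For concreteness, the two styles being compared can be sketched roughly as follows. This is a minimal, Windows-only sketch: kernel32.dll and its GetTickCount64 export merely stand in for a real native library, and the class name, method names, and the nativeDir/libraryPath parameters are made up for illustration. It is not taken from NNanomsg or Kestrel.

```csharp
using System;
using System.ComponentModel;
using System.Runtime.InteropServices;

public static class InteropStyles
{
    // Style 1: declarative P/Invoke. The library name is a compile-time
    // constant; SetDllDirectory adds a directory to the search path.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern bool SetDllDirectory(string lpPathName);

    [DllImport("kernel32.dll")]
    private static extern ulong GetTickCount64();

    // Style 2: load the module by hand and bind a delegate to the export.
    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    private static extern IntPtr LoadLibrary(string lpFileName);

    [DllImport("kernel32.dll", CharSet = CharSet.Ansi, ExactSpelling = true, SetLastError = true)]
    private static extern IntPtr GetProcAddress(IntPtr hModule, string procName);

    [UnmanagedFunctionPointer(CallingConvention.Winapi)]
    private delegate ulong GetTickCount64Delegate();

    // nativeDir is a hypothetical folder that would hold the native DLL.
    public static ulong ViaDllImport(string nativeDir)
    {
        if (!SetDllDirectory(nativeDir))
            throw new Win32Exception(Marshal.GetLastWin32Error());
        return GetTickCount64();
    }

    // libraryPath may be a bare file name or a full path chosen at runtime.
    public static ulong ViaDelegate(string libraryPath)
    {
        IntPtr module = LoadLibrary(libraryPath);
        if (module == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        IntPtr proc = GetProcAddress(module, "GetTickCount64");
        if (proc == IntPtr.Zero)
            throw new Win32Exception(Marshal.GetLastWin32Error());

        // The module is intentionally left loaded for the life of the process.
        var fn = (GetTickCount64Delegate)Marshal.GetDelegateForFunctionPointer(
            proc, typeof(GetTickCount64Delegate));
        return fn();
    }
}
```

Calling InteropStyles.ViaDelegate("kernel32.dll") and InteropStyles.ViaDllImport(Environment.CurrentDirectory) should both return the system tick count. The second style needs a delegate type and a lookup per export, which is the extra ceremony I am asking about.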

Are there any technical reasons, such as security or flexibility, or is this just a matter of taste? I do not understand the rationale. Is it possible that the authors just didn't search StackOverflow enough!?

Asked by V.B. on Jan 16 '15 10:01


1 Answer

Hmya, blog posts, that fundamentally flawed way to distribute technical information. The world would be a better place if we could vote for them. The author is comparing apples and oranges. Well, more like apples and bicycles.

There are two fundamentally different kinds of interop scenarios being compared here. The first one is the "normal" one, a managed program calling code in an unmanaged DLL. Using the [DllImport] attribute or C++/CLI are the weapons of choice. Very highly optimized inside the CLR, it dynamically generates machine code that translates the arguments and makes the call. Importantly, a managed program always runs lots of unmanaged code, given that it runs on top of a purely unmanaged operating system.

What you are talking about, the "slow" version, is going the other way. Calling managed code from an unmanaged program. Some people call this "reverse pinvoke". It is much more convoluted because before you can call managed code, you first have to get the CLR loaded and initialized. And create an appdomain. And locate and load the .NET assembly that contains the code. And JIT compile it.

There are three basic ways to do this:

  • Custom-host the CLR. This is by far the most powerful version. You use the hosting interfaces to create the CLR instance explicitly and have full control over its configuration. The CLRRuntimeHost COM coclass is the primary vehicle to get that ball rolling.

  • Expose .NET classes as COM components by giving them the [ComVisible(true)] attribute. Very simple to get going, the unmanaged code is completely unaware that it is actually using .NET code. The default CLR host gets loaded, the registry entry for the COM component points to mscoree.dll which bootstraps the CLR as necessary. Only disadvantage is the unmanaged code author needs to write COM client code, a skill that's getting lost.

  • What you are talking about, taking advantage of the ability of the C++/CLI compiler to generate DLL exports. Notable also for being used by Robert Giesecke's Unmanaged Exports tool, using the exact same technique but injecting these DLL exports by rewriting the assembly.
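The second option in the list above can be sketched like this. It is a minimal sketch, assuming .NET Framework and that COM registration is done separately (for example with regasm); the interface name, class name, and both GUID values are hypothetical placeholders.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical COM-visible contract. The GUIDs are placeholders;
// a real component needs its own freshly generated values.
[ComVisible(true)]
[Guid("6B1F0C2E-3D4A-4E5B-9C7D-1A2B3C4D5E6F")]
[InterfaceType(ComInterfaceType.InterfaceIsDual)]
public interface ICalculator
{
    int Add(int left, int right);
}

// Instance methods are fine here, so an object model is possible.
// ClassInterfaceType.None exposes only the explicit interface above.
[ComVisible(true)]
[Guid("0F9E8D7C-6B5A-4C3D-8E2F-A1B2C3D4E5F6")]
[ClassInterface(ClassInterfaceType.None)]
public class Calculator : ICalculator
{
    public int Add(int left, int right)
    {
        return left + right;
    }
}
```

The unmanaged client then creates the object through the usual COM activation path and never references the CLR directly.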

There are very significant disadvantages to doing it the 3rd way, beyond the expense of the call. It scales poorly, every single method must be exported explicitly and it must be static so you cannot implement an object model. And the super-duper, horrible, nasty, impossible-to-deal-with problem that you cannot get any diagnostic when the call fails. Managed code likes to throw exceptions, if not from the code itself then from the CLR that tries to tell you that you passed the wrong arguments or cannot get the code prepped. You cannot see those exceptions, there is no way to tell that the function failed nor a way to tell why it failed. If the unmanaged code doesn't catch SEH exceptions with the non-standard __try/__except keywords then the program bombs. No diagnostic whatsoever. Even if it does catch the SEH, you only get an "it didn't work" signal.

Managed code that's invoked this way must be written to deal with this problem. It must contain a try/catch-em-all in the public method, log the exception, and provide a way to return an error code so the caller can detect failure. Gross problems, like missing dependent DLLs or versioning issues, are however not diagnosable at all. It looks easy for the unmanaged code author, a simple LoadLibrary + GetProcAddress, but it is a long-term support nightmare.
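A rough sketch of that defensive pattern, under stated assumptions: the class name, the error codes, and the log file location are all invented for illustration, and the export marker itself (C++/CLI or the Unmanaged Exports attribute) is left out so the snippet compiles on its own.

```csharp
using System;
using System.IO;

public static class ExportedApi
{
    // Hypothetical error codes agreed upon with the unmanaged caller.
    public const int Ok = 0;
    public const int Failed = -1;

    // Hypothetical log location; a real project would pick its own.
    private static readonly string LogPath =
        Path.Combine(Path.GetTempPath(), "exported-api.log");

    // In a real project this static method would carry the export marker.
    // No exception is allowed to escape across the native boundary.
    public static int Divide(int numerator, int denominator, out int quotient)
    {
        quotient = 0;
        try
        {
            quotient = DivideCore(numerator, denominator);
            return Ok;
        }
        catch (Exception ex)
        {
            Log(ex);
            return Failed;
        }
    }

    // The actual work; free to throw like any managed code.
    private static int DivideCore(int numerator, int denominator)
    {
        return numerator / denominator;
    }

    private static void Log(Exception ex)
    {
        try
        {
            File.AppendAllText(LogPath,
                DateTime.UtcNow.ToString("o") + " " + ex + Environment.NewLine);
        }
        catch
        {
            // Logging is best effort; it must not throw either.
        }
    }
}
```

With this shape, Divide(1, 0, out q) returns Failed and leaves a DivideByZeroException in the log instead of crashing the unmanaged host. It does nothing for failures that happen before the method body runs, such as a missing dependent DLL.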

Answered by Hans Passant on Oct 17 '22 22:10