
How does running a C# app inside a code profiler differ from running it outside one?

Tags: c#, .net, profiler

I'm trying to understand how a code profiler (in this case the Drone Profiler) runs a .NET app differently from just running it directly. I need to know because my dev computer's .NET install has a very strange problem (some kind of corruption) that shows up outside the profiler but, oddly, not inside it. If I can understand why, I can probably fix my computer's issue.

The issue ONLY seems to affect calls into System.Net.NetworkInformation (and only under .NET 2.0 through 3.5; if I build something against 4, all is well). I built a little test app that does just one thing: it calls System.Net.NetworkInformation.NetworkInterface.GetIsNetworkAvailable(). Outside of the profiler I get a "Fatal Execution Engine Error" that occurred in System.dll, and that's all the info it gives. From what I understand, that error usually results from native method calls, which presumably happen when System.dll hands the GetIsNetworkAvailable() logic off to some native DLL.

  • I tried to figure out the difference between running inside and outside the profiler using Process Monitor, recording events from both runs and comparing them. The two logs matched until just after iphlpapi.dll and winnsi.dll were called. At that point the profiler-run code went on to call dnsapi.dll, while the non-profiler code began loading crash-reporting components. At the moment things seemed to go wrong, the profiler-run code created 4-6 new threads and the non-profiler (crashing) code created only 1 or 2. I don't know what that means, if anything.

Arguably unnecessary background

The .NET installation that ships with Windows 7 (2.0 through 3.5) was working fine until my hard drive suffered some corruption and chkdsk began finding bad clusters. I imaged the drive to a new one, and everything works fine except this one issue with .NET.

I need to resolve this problem without reinstalling Windows or reverting to image backups.

Here are some of the things I've looked into:

  • I have diffed the files/directories which seemed most relevant (the .NET stuff under Windows and Program Files) pre- and post- disk trouble and seen no changes where I didn't expect any (no obvious file corruption).
  • I have diffed the software and system registry hives pre- and post- disk trouble and seen no changes which seemed relevant.
  • I have created a new user account and cleaned up any environment variables in case environment was related. No change.
  • I did "sfc /scannow" and it found no integrity problems.
  • I tried "ngen update" to regenerate pre-compiled code in case I missed something that might be damaged and nothing changed.
  • I removed my virus scanner to see if it was interfering, no difference.
  • I tried running the test code in Safe Mode, same crash issue.

I assume I need to repair my .NET installation, but because .NET 2.0 through 3.5 ships as part of Windows 7, you can't just re-run a .NET installer to redo it. I do not have the Windows disks to re-install Windows over itself (the computer has a recovery partition, but it is unusable), and the drive uses a whole-disk encryption solution, so re-installing would be difficult.

I absolutely do not want to start from scratch here and install a fresh Windows, reinstall dozens of software packages, try and remember dozens of development-related customizations/etc.

Given all that... does anyone have any helpful advice? I need .NET 3.5 - 2.0 working as I am a developer and need to build and test against it.

Thanks!

Quinxy

Quinxy von Besiex asked Nov 12 '22 23:11


1 Answer

The short answer is that my System.ni.dll file was damaged; I replaced it and all is well.

The long answer might help someone else by way of its approach to the solution...

My problem was that .NET was damaged in such a way that apps wouldn't run except through a profiler. I downloaded the source for the open source SlimTune profiler, built it locally, and set a breakpoint right before the call to Process.Start(). I then compared all the parameters involved in starting the app successfully through the profiler versus manually. The only meaningful difference I found was that the profiler added the .NET profiling parameters to the environment variables:

  • COR_ENABLE_PROFILING=1
  • COR_PROFILER={38A7EA35-B221-425a-AD07-D058C581611D}

I then tried setting these in my own user's environment, and voila! Now any app I ran manually would work. (I had actually tried the same thing a few hours earlier, but I used a GUID from an example that didn't point to a real profiler. Apparently .NET knew I had given it a bogus GUID and didn't run in profiling mode.)
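For anyone who wants to try the same experiment without touching their user-wide environment, here is a minimal sketch in Python that sets the two variables for a single child process only. The helper names (`profiling_env`, `run_with_profiling`) are made up for illustration, and the CLSID is the one quoted above, which only helps if that profiler is actually registered on the machine.

```python
import os
import subprocess
import sys

# CLSID quoted in the answer above. Assumption: this profiler is registered
# on the machine; a CLSID that does not resolve to a real profiler will not
# put the CLR into profiling mode.
PROFILER_CLSID = "{38A7EA35-B221-425a-AD07-D058C581611D}"


def profiling_env(base_env, clsid=PROFILER_CLSID):
    """Return a copy of base_env with the CLR profiling variables added."""
    env = dict(base_env)  # copy, so the caller's mapping is left untouched
    env["COR_ENABLE_PROFILING"] = "1"
    env["COR_PROFILER"] = clsid
    return env


def run_with_profiling(exe_path, args=()):
    """Run exe_path with the profiling variables set for that process only."""
    return subprocess.run([exe_path, *args], env=profiling_env(os.environ))


if __name__ == "__main__":
    # Hypothetical usage: python run_profiled.py TestApp.exe
    if len(sys.argv) > 1:
        result = run_with_profiling(sys.argv[1], sys.argv[2:])
        print("exit code:", result.returncode)
```

Running the test app once this way and once normally gives the same inside/outside comparison without leaving the variables set for every other program.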

I then went back and began reading about just how a PE file is executed by the CLR, hoping to figure out why it mattered that my app was run with profiling enabled. I learned a lot, but nothing that seemed to apply.

I did, however, remember that I should recheck the chkdsk log I had kept, which listed the files damaged by the drive failure. After the failure I had turned all the listed file IDs into file paths/names and replaced all the 100+ files I could from backup. Sure enough, when I went back and looked, I found a note that while I had replaced 4 or 5 .NET-related files successfully, there was one such file I wasn't able to replace because it was "in use". That file? System.ni.dll!!! I was now able to replace this file from backup and voila, my .NET install is back to normal: apps work whether profiled or not.

The frustrating thing is that when this incident first occurred I fully expected the problem to relate to a damaged file, and specifically to a file called System.dll, which housed the methods that failed. And so I diffed and rediffed all files named System.dll. But I did not realize at the time that System.ni.dll was the natively compiled form of System.dll (its NGen native image). And because I had diffed and rediffed the .NET-related directories without noticing this (no idea how I missed it), I'd given up on that approach.
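In hindsight, a content comparison over every file in the tree, rather than only files with the name I expected, would have flagged System.ni.dll right away. This is a rough sketch of that idea, not what I actually ran. The function names are made up, and it assumes you have a known-good copy of the same directory tree (from a backup image, say) to compare against.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def differing_files(current_root, backup_root):
    """List relative paths under current_root whose content differs from
    the file at the same relative path under backup_root, or which are
    missing from backup_root entirely."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(current_root):
        for name in filenames:
            current = os.path.join(dirpath, name)
            rel = os.path.relpath(current, current_root)
            backup = os.path.join(backup_root, rel)
            if not os.path.isfile(backup):
                changed.append(rel)
            elif file_digest(current) != file_digest(backup):
                changed.append(rel)
    return sorted(changed)
```

Expect some legitimate differences in the output (files changed by updates, for example), so the result is a list of candidates to inspect rather than a list of damaged files.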

Anyway... long story short, it was a damaged System.ni.dll that caused my problems. One or more clusters within it had their contents replaced with zeros (0x00), and that just so happened to manifest as the odd problem I observed.
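If you suspect the same kind of damage, something like the following sketch can point at the zeroed regions. It compares a suspect file against a known-good copy one block at a time and reports blocks that are entirely zero in the suspect but not in the reference. The 4096-byte block size is an assumption (the usual NTFS default cluster size), and `zeroed_blocks` is a made-up name. A reference copy is needed because healthy binaries contain legitimate runs of zero padding.

```python
def zeroed_blocks(suspect_path, reference_path, block_size=4096):
    """Return byte offsets of blocks that are all zeros in the suspect file
    but contain non-zero data in the reference file.

    block_size is assumed to match the volume's cluster size (4096 bytes is
    the common NTFS default); adjust it if your volume differs.
    """
    offsets = []
    with open(suspect_path, "rb") as suspect, open(reference_path, "rb") as reference:
        offset = 0
        while True:
            suspect_block = suspect.read(block_size)
            reference_block = reference.read(block_size)
            if not suspect_block and not reference_block:
                break  # both files exhausted
            # any() over bytes is true if at least one byte is non-zero
            if suspect_block and not any(suspect_block) and any(reference_block):
                offsets.append(offset)
            offset += block_size
    return offsets
```

A non-empty result for a file like System.ni.dll, checked against the copy from backup, would be a strong hint that the file lost one or more clusters.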

Quinxy von Besiex answered Nov 15 '22 12:11