Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cleaning up pathologically-nested "if { } else { if { } else { if { ... } } }"

I currently have the misfortune of working on Somebody Else's C# code which has truly blown my mind. I have no idea how the person before me ever maintained this code, as its various pathologies have crashed the IDE, the compiler, the runtime environment...

The problem I'm facing today involves a 15 megabyte source file that features a truly mindblowing degree of pathological nesting. Code like:

if(var == 0) {
  // do stuff
}
else {
  if(var == 1) {
    // do stuff
  }
  else {
    if(var == 2) {
      // do stuff, identical word for word to the `var == 1` case
    }
    else {
      // etc.
    }
  }
}

This is a questionable stylistic choice at the best of times. However, this combines with another pathology of the code: some of these blocks are nearly a thousand levels deep. (The deepest I bothered to measure was well over 700.) I sincerely hope the person before me, as one of their final acts before being forcibly separated from this code, ran a styling tool that resulted in the abomination before me. I cannot conceive that they could possibly have written this code as it is now, especially since every third or fourth edit to the code crashes the IDE. (And sometimes deletes my copy of the source file, as a bonus.)

I wrote a simple regex-based tool to try to condense the simpler cases, but it seems to half-process and then corrupt this particular code. (I'm not sure whether it fails because this code also uses preprocessor conditionals from time to time, or because the longest of the matches would be almost 10MB long and Lua's regular expression matcher just can't cope.) I'm hoping there's a widely-used tool or technique that can reverse this problem. I already had to use astyle to clean up some other stylistic "issues" the code had. The --remove-brackets option for astyle almost does what I want, but requires that the bracketed statement be a single statement on a single line, which is very much not the case here... (And just to cross my "t"s, I checked; astyle did not create this particular problem.)

Edit: Deeper examination of the problem code reveals things like this:

#if OneThing
int num2296 = otherThing();
#endif
#if AnotherThing
int num44 = otherThing()
int num45 = 0;
#endif
int num72 = 0;
#if OneThing
int num45 = 0; // note: multiple equivalent declarations of num45
#endif
#if OneThing
for(int num2297 = 0; num2297 < num2296; ++num2297) {
  num45 = doSomething(num2297);
#endif
#if AnotherThing
for(int num43 = 0; num43 < num44; ++num43) {
  num45 = doSomething(num43);
#endif
  if(somethingElse(num45)) {
    ++num72;
  }
} // note: only one closing brace for the two protected by #ifs

Two versions of this code are compiled for different purposes, one with OneThing defined and one with AnotherThing defined. However, most of the differences between the two are just variable names, with logic identical. (Most, not all.)

Cases like the brace at the end of the above snippet explains why my simple tool was breaking. This is also looking more and more like job security by design, and less like innocent incompetence. (If the code was once at a point where a variable name like num2276 would be generated by a decompiler, it isn't currently at that point.)

Unfortunately, this means an automated tool will probably not cut it alone. I'll just have to slog through, slowly undoing the damage the last programmer did. I leave this question here on the off chance there is a miraculous tool that, I dunno, can convert both versions to SSA and identify and collapse their logical equivalences and then convert them back...

like image 659
Solra Bizna Avatar asked May 30 '16 18:05

Solra Bizna


1 Answers

You could use Roslyn to rewrite the code. It's not a good approach to modify source code as text. With Roslyn you can modify it as a syntax tree.

Maybe it helps you to flatten everything?

if (a)
 if (b) F2()
 else F3();
else
 F4();

Could become:

if (a && b) F2();
else if (a && !b) F3();
else F4();

That way the source code becomes a flat list and it's more obvious under what conditions a branch is entered.

like image 193
usr Avatar answered Sep 26 '22 02:09

usr