
Reverse Engineering poorly documented Java from source [closed]

I'm a systems engineer, recent college grad, and I've just been given a project that is exceptionally daunting.

We have a legacy system; we legally own the entire code and all rights to it. The problem is that the code is poorly documented: what little documentation exists is incomplete and sometimes wrong, and the original devs are unavailable.

It uses a custom Perl build script that requires a thousand modules from CPAN to work, and I do not know Perl. Reverse engineering into UML has failed except with Doxygen, and that is limited to just inheritance diagrams and call graphs.

I've obtained a massive chalkboard and I'm slowly trawling through the code, modeling packages and then the nested packages within.

My question is whether or not I'm approaching this reverse engineering from the right direction. I'm working from close to the bottom, trying to figure out what calls what while developing UML and writing a Design Document. I did a package diagram, but it's hard to figure out what's going on at that high a level.

An academic paper I pulled up suggests I also make a new Requirements Document, which would slow me down even more, and I don't know if it's a good idea, as the other developers are always busy trying to keep the legacy system up.

Are there any books out there that can help me, and am I approaching this from the right angle? Should I hire a contract worker who knows Perl and JMX to assist me?

asked Jun 14 '11 14:06 by sqn



1 Answer

The book "Working Effectively with Legacy Code" by Michael Feathers will probably help you more than anything we can tell you here.

However, the most important thing you need to clarify for yourself (and from your question it sounds like it's not completely clear) is this: what is your goal? What do you want to achieve with this codebase?

If the answer is (as it sounds) "being able to effectively maintain the existing project", then trying to directly build a complete high-level model of the system may not be the most effective path. It's probably just too much at once to keep in mind.

In this case, I would try to understand only the use cases of the system that you currently need to modify; follow method calls through the code (possibly using a debugger on the running system) to see what parts are involved. Do this for a few different use cases and you'll start to see patterns, then document those and gradually fit them together into a high-level image of the system.
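If attaching a debugger to the running system is awkward, one rough alternative for following calls is to temporarily record the call chain at a method you care about. The following is only a minimal sketch using the standard `Thread.getStackTrace()` call; the class `CallTrace` and the methods `processOrder` and `validateOrder` are hypothetical stand-ins for your own code, not anything from the system in the question.

```java
import java.util.ArrayList;
import java.util.List;

public class CallTrace {

    /**
     * Returns the "Class.method" frames that led to the caller of this
     * method, innermost frame first.
     */
    public static List<String> callChain() {
        StackTraceElement[] frames = Thread.currentThread().getStackTrace();
        List<String> chain = new ArrayList<String>();
        boolean seenSelf = false;
        for (StackTraceElement frame : frames) {
            if (!seenSelf) {
                // Skip JVM-internal frames and callChain() itself.
                if (frame.getClassName().equals(CallTrace.class.getName())
                        && frame.getMethodName().equals("callChain")) {
                    seenSelf = true;
                }
                continue;
            }
            chain.add(frame.getClassName() + "." + frame.getMethodName());
        }
        return chain;
    }

    // Hypothetical stand-ins for one use case in the legacy code.
    public static List<String> processOrder() {
        return validateOrder();
    }

    public static List<String> validateOrder() {
        // Temporary instrumentation: capture who called us.
        return callChain();
    }

    public static void main(String[] args) {
        for (String frame : processOrder()) {
            System.out.println(frame);
        }
    }
}
```

Dropping a call like this into one method per use case, then removing it again, gives you a written record of the path through the code that you can copy into your notes for that use case.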

answered Oct 06 '22 00:10 by Michael Borgwardt