
How can I check if two Java classes are semantically identical?

I need to merge two similar huge projects (1000+ classes). The second one is a fork of the first one, and it contains some country-specific behavior. The two projects diverge a lot, because svn versioning was handled very poorly.

It often happens that two classes are semantically identical. Their source codes only differ in terms of warnings, import statements, the order of some methods or variables, code formatting, comments, etc.

Is there a way to automatically check if two classes are semantically identical?

agdev84 asked Jul 18 '18 21:07



1 Answer

Consider using a program analysis tool such as Soot. It has APIs for analyzing code that are well suited to this purpose. For example, to check whether two classes are "semantically identical", you can check whether (1) both classes have the same (or similar) fields and (2) both classes have the same (or similar) methods.

Soot represents fields as SootField objects. A SootField carries the information you need for comparison, such as the field's name, type and modifiers. To check the semantic similarity of two methods, you can check whether their control flow graphs (CFGs) are similar (details are in Section 5.7 of this guide).
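To make the "same fields, same methods" check concrete, here is a minimal sketch that compares two classes while ignoring the order of their members. It deliberately swaps Soot for plain Java reflection so that it runs without extra dependencies, which means it works on compiled classes and only looks at declarations, not at method bodies. The names ClassShape, sameShape, First, Second and Third are hypothetical and exist only for illustration.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

final class ClassShape {

    /** Order-independent set of field descriptions: "modifiers type name". */
    static Set<String> fieldSignatures(Class<?> c) {
        Set<String> out = new TreeSet<>();
        for (Field f : c.getDeclaredFields()) {
            if (f.isSynthetic()) continue; // skip compiler- or tool-generated fields
            out.add(Modifier.toString(f.getModifiers()) + " "
                    + f.getType().getName() + " " + f.getName());
        }
        return out;
    }

    /** Order-independent set of method descriptions: "modifiers returnType name[paramTypes]". */
    static Set<String> methodSignatures(Class<?> c) {
        Set<String> out = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            if (m.isSynthetic()) continue; // skip bridge/accessor methods and the like
            out.add(Modifier.toString(m.getModifiers()) + " "
                    + m.getReturnType().getName() + " " + m.getName()
                    + Arrays.toString(m.getParameterTypes()));
        }
        return out;
    }

    /** True when both classes declare the same fields and methods, in any order. */
    static boolean sameShape(Class<?> a, Class<?> b) {
        return fieldSignatures(a).equals(fieldSignatures(b))
                && methodSignatures(a).equals(methodSignatures(b));
    }

    // Hypothetical sample classes: Second only reorders First's members,
    // Third changes the type of x.
    static class First {
        int x; String name;
        int getX() { return x; }
        String getName() { return name; }
    }
    static class Second {
        String name; int x;
        String getName() { return name; }
        int getX() { return x; }
    }
    static class Third {
        long x; String name;
        long getX() { return x; }
        String getName() { return name; }
    }
}
```

Here ClassShape.sameShape(ClassShape.First.class, ClassShape.Second.class) is true, while comparing First with Third is false. A matching shape is only a first filter: the method bodies still have to be compared, for example through their CFGs as described above.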

Tips on how to use Soot:

If your source directory is srcDir, your Java home is javaHome, and the list of class names is classNames, you can load your classes into Soot programmatically with the following snippet:

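    // Imports needed: java.io.File, soot.Scene, soot.options.Options
    // Note: rt.jar and jce.jar exist only in Java 8 and earlier.
    String sootClassPath = srcDir + File.pathSeparator
            + javaHome + "/jre/lib/rt.jar" + File.pathSeparator
            + javaHome + "/jre/lib/jce.jar";

    Options.v().set_output_format(Options.output_format_jimple);
    Options.v().set_src_prec(Options.src_prec_java); // prefer .java sources over .class files

    for (String className : classNames) { // "className" is fully qualified, like a.b.Myclass
        Options.v().classes().add(className);
    }

    Options.v().set_keep_line_number(true);
    Options.v().set_allow_phantom_refs(true); // tolerate references to classes that cannot be resolved
    Scene.v().setSootClassPath(sootClassPath);

    Scene.v().loadBasicClasses();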

Once the classes are loaded, you can access a class like this:

    SootClass sClass = Scene.v().loadClassAndSupport(className); // "className" is like a.b.Myclass

You can then access the fields and methods of sClass:

    Chain<SootField> fieldList = sClass.getFields(); // import soot.util.Chain;
    List<SootMethod> methods = sClass.getMethods();

You can iterate over the CFG of a method to collect its list of instructions, like this:

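    // Imports needed: java.util.ArrayList, java.util.List, soot.Body, soot.Unit,
    // soot.toolkits.graph.DirectedGraph, soot.toolkits.graph.ExceptionalUnitGraph
    List<Unit> instructionList = new ArrayList<>();

    if (method.isConcrete()) { // abstract and native methods have no body
        Body b = method.retrieveActiveBody();
        DirectedGraph<Unit> g = new ExceptionalUnitGraph(b);
        for (Unit unit : g) {
            instructionList.add(unit);
        }
    }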
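
Once you have the instruction list of each method, one rough way to score how similar two methods are is to compare the text of their instructions. The sketch below is only an approximation under stated assumptions: it takes plain strings (which you could fill from unit.toString() on each Unit), ignores instruction order and CFG structure, and does not normalize local variable names. The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class InstructionCompare {

    /** Counts how often each whitespace-normalized instruction text occurs. */
    static Map<String, Integer> histogram(List<String> instructions) {
        Map<String, Integer> counts = new HashMap<>();
        for (String ins : instructions) {
            counts.merge(ins.trim().replaceAll("\\s+", " "), 1, Integer::sum);
        }
        return counts;
    }

    /** Similarity in [0, 1]: number of shared instructions divided by the larger method size. */
    static double similarity(List<String> a, List<String> b) {
        if (a.isEmpty() && b.isEmpty()) {
            return 1.0; // two empty bodies are trivially identical
        }
        Map<String, Integer> ha = histogram(a);
        Map<String, Integer> hb = histogram(b);
        int shared = 0;
        for (Map.Entry<String, Integer> e : ha.entrySet()) {
            shared += Math.min(e.getValue(), hb.getOrDefault(e.getKey(), 0));
        }
        return (double) shared / Math.max(a.size(), b.size());
    }
}
```

A score of 1.0 means both methods contain the same instructions the same number of times, so such pairs are good candidates for a closer CFG comparison or a manual review; it does not prove that the methods behave identically.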

Sazzadur Rahaman answered Oct 20 '22 19:10
