I need to merge two similar huge projects (1000+ classes). The second one is a fork of the first one, and it contains some country-specific behavior. The two projects diverge a lot, because svn versioning was handled very poorly.
It often happens that two classes are semantically identical. Their source code differs only in warnings, import statements, the order of some methods or variables, code formatting, comments, etc.
Is there a way to automatically check if two classes are semantically identical?
You should consider using a program analysis tool such as Soot. Soot has APIs for analyzing code that are well suited to this purpose. For example, to check whether two classes are "semantically identical", you can check (1) whether both classes have the same (or similar) fields and (2) whether both classes have the same (or similar) methods.
Fields are represented as SootField in Soot. A SootField object carries the information you need for comparison. To check the semantic similarity of two methods, you can check whether their control flow graphs (CFGs) are similar (details are in Section 5.7 of this guide).
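As a rough illustration of the "same fields, same methods" check, the sketch below compares two collections of member signatures while ignoring declaration order. It does not depend on Soot: `MemberDiff`, `sameMembers`, and `onlyIn` are hypothetical names, and the input strings are assumed to come from something like `SootField.getSubSignature()` and `SootMethod.getSubSignature()` (verify these against your Soot version).

```java
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: order-insensitive comparison of member signatures. */
final class MemberDiff {

    private MemberDiff() {}

    /** True when both sides declare exactly the same set of signatures, in any order. */
    static boolean sameMembers(Collection<String> left, Collection<String> right) {
        return new TreeSet<>(left).equals(new TreeSet<>(right));
    }

    /** Signatures present in left but missing from right, in sorted order. */
    static Set<String> onlyIn(Collection<String> left, Collection<String> right) {
        Set<String> diff = new TreeSet<>(left);
        diff.removeAll(right);
        return diff;
    }
}
```

Calling `onlyIn` in both directions gives the members that exist on only one side of the fork, which is usually more useful for a merge than a plain yes/no answer.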
Tips on how you can use Soot.
If your source directory is srcDir, the Java home is javaHome, and the list of classes is classNames, then you can use the following snippet to load your classes into Soot programmatically.
import java.io.File;
import soot.Scene;
import soot.options.Options;

// Classpath: the sources plus the JDK runtime jars (rt.jar and jce.jar exist only up to Java 8).
String sootClassPath = srcDir + File.pathSeparator
        + javaHome + "/jre/lib/rt.jar" + File.pathSeparator
        + javaHome + "/jre/lib/jce.jar";
Options.v().set_output_format(Options.output_format_jimple);
Options.v().set_src_prec(Options.src_prec_java);
for (String className : classNames) { // "className" is like a.b.Myclass
    Options.v().classes().add(className);
}
Options.v().set_keep_line_number(true);
Options.v().set_allow_phantom_refs(true);
Scene.v().setSootClassPath(sootClassPath);
Scene.v().loadBasicClasses();
Once your classes are loaded, you can access a class like this:
SootClass sClass = Scene.v().loadClassAndSupport(className); // "className" is like a.b.Myclass
Now you can access the fields and methods of sClass like this:
Chain<SootField> fieldList = sClass.getFields(); // import soot.util.Chain;
List<SootMethod> methods = sClass.getMethods();
You can iterate over the CFG of a method, as below, to get its list of instructions:
if (method.isConcrete()) {
    List<Unit> instructionList = new ArrayList<>();
    Body b = method.retrieveActiveBody();
    // Build the CFG, including exceptional control flow edges.
    DirectedGraph<Unit> g = new ExceptionalUnitGraph(b);
    for (Unit unit : g) {
        instructionList.add(unit);
    }
}
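With the instructions in hand, one crude way to compare the CFGs of two methods is to reduce each graph to a sorted list of "from -> to" edges and compare the lists. The sketch below is a heuristic under stated assumptions, not a graph isomorphism test: `CfgFingerprint` and `edges` are hypothetical names, nodes are labelled by their `toString()` text, and with Soot the successor function would presumably be `g::getSuccsOf` on the graph built above.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper: a rough, order-insensitive fingerprint of a control flow graph. */
final class CfgFingerprint {

    private CfgFingerprint() {}

    /**
     * Flattens a graph into a sorted list of "from -> to" edge labels.
     * Sorting removes any dependence on node or successor iteration order;
     * duplicates are kept, so repeated identical edges still count.
     */
    static <N> List<String> edges(Collection<N> nodes, Function<N, List<N>> successorsOf) {
        List<String> result = new ArrayList<>();
        for (N node : nodes) {
            for (N target : successorsOf.apply(node)) {
                result.add(node + " -> " + target);
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

Two methods would then count as similar when `edges(...)` returns equal lists for both. Because the labels are plain instruction text, a renamed local variable is enough to make two otherwise identical methods look different, so treat a mismatch as a prompt for manual review rather than proof of a semantic change.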