
Java tool to find copy/pasted code across projects

We inherited some legacy code that has a whole lot of code copy/pasted across projects. Is there a way to find these duplicates? PMD can do this for a single project.

shikarishambu asked Sep 09 '11 22:09


2 Answers

Summary

Besides PMD's CPD, there are also CloneDetective, Simian and SimScan. A paper from the International Conference on Software Engineering 2009 compares all four.

In detail

One tool that can handle several languages is CloneDetective (based on ConQAT, the Continuous Quality Assessment Toolkit): ABAP, Ada, Java, C#, C/C++, Visual Basic, COBOL, PL/I.

Another tool is Simian, the Similarity Analyser, which identifies duplication in Java, C#, C, C++, COBOL, Ruby, JSP, ASP, HTML, XML, Visual Basic, Groovy source code and even plain text files. It runs on JVM and .NET.

Actually, if you look at .NET, there are a lot of copy paste detection tools...

SimScan, the SimilarityScanner, is an Eclipse/IDEA/JBuilder plugin that finds duplicated or similar fragments of code in large Java source code bases. I haven't used it and don't know what exactly counts as "similar fragments". It sounds like it might also only look within a single project, but the IntelliJ screenshots look nifty.

This paper from the International Conference on Software Engineering 2009 compares CloneDetective, PMD's CPD, Simian and SimScan.

PMD's copy & paste finder is called CPD, for "copy paste detector"; using that as the technical term when googling helps. Another term often used is "clone detection".

DaveFar answered Dec 06 '22 21:12


You could try using the command line version of PMD CPD:

http://pmd.sourceforge.net/cpd.html

You should be able to specify multiple source trees to check.

Simian, the other prominent copy/paste detector, has similar command-line capabilities.
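To illustrate what checking several source trees at once involves, here is a toy sketch in Java. It is not how CPD or Simian work internally (as far as I know they compare token streams, not raw lines), and the class name `CrossProjectDuplicates`, the method names and the window size of 6 lines are all made up for this example. It treats each directory given on the command line as one project, slides a fixed-size window over the whitespace-normalized lines of every `.java` file, and reports windows that occur in at least two projects.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical toy example, not part of PMD, Simian or any other tool.
public class CrossProjectDuplicates {

    // Trim and collapse whitespace so that pure formatting changes do not hide a copy.
    static String normalize(String line) {
        return line.trim().replaceAll("\\s+", " ");
    }

    // Maps every block of `window` consecutive non-blank lines that occurs in
    // two or more projects to the names of the projects containing it.
    static Map<String, Set<String>> findShared(
            Map<String, List<List<String>>> filesByProject, int window) {
        Map<String, Set<String>> seen = new LinkedHashMap<>();
        for (Map.Entry<String, List<List<String>>> project : filesByProject.entrySet()) {
            for (List<String> file : project.getValue()) {
                List<String> lines = file.stream()
                        .map(CrossProjectDuplicates::normalize)
                        .filter(l -> !l.isEmpty())
                        .collect(Collectors.toList());
                for (int i = 0; i + window <= lines.size(); i++) {
                    String block = String.join("\n", lines.subList(i, i + window));
                    seen.computeIfAbsent(block, k -> new TreeSet<>()).add(project.getKey());
                }
            }
        }
        // Keep only blocks found in at least two different projects.
        seen.values().removeIf(projects -> projects.size() < 2);
        return seen;
    }

    // Reads all .java files below `root`; assumes the sources are valid UTF-8.
    static List<List<String>> readProject(Path root) throws IOException {
        List<Path> sources;
        try (Stream<Path> paths = Files.walk(root)) {
            sources = paths.filter(p -> p.toString().endsWith(".java"))
                    .collect(Collectors.toList());
        }
        List<List<String>> files = new ArrayList<>();
        for (Path source : sources) {
            files.add(Files.readAllLines(source));
        }
        return files;
    }

    public static void main(String[] args) throws IOException {
        Map<String, List<List<String>>> projects = new LinkedHashMap<>();
        for (String arg : args) {
            projects.put(arg, readProject(Paths.get(arg)));
        }
        findShared(projects, 6).forEach((block, where) ->
                System.out.println("Found in " + where + ":\n" + block + "\n"));
    }
}
```

You would run it with one argument per project, e.g. `java CrossProjectDuplicates projectA/src projectB/src` (placeholder paths). Overlapping windows are reported separately and renamed variables defeat it completely, which is exactly the kind of thing the real tools handle better.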

Brian Smith answered Dec 06 '22 20:12