I'm looking for the best approach to dealing with duplicate code in a legacy PHP project with about 150k lines of code.
Is this something best approached manually or are there standalone duplicate code detectors that will ease the pain?
A solution: use lambda expressions (C++11). Move the duplicated code into a lambda expression, which defines in one place a callable object local to the function that can be reused each of the four times.
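As a rough illustration of the lambda approach, here is a minimal sketch. The `Rect` type and the `clamp_to_canvas` function are hypothetical names invented for this example, not taken from any particular code base. The clamping logic, which would otherwise be pasted once per coordinate, lives in a single local lambda that is then called four times.

```cpp
#include <cassert>

// Hypothetical rectangle type, used only for illustration.
struct Rect {
    int left, top, right, bottom;
};

// Clamps all four edges of r into the canvas [0, width] x [0, height].
Rect clamp_to_canvas(Rect r, int width, int height) {
    assert(width > 0 && height > 0);  // precondition: a non-empty canvas

    // The duplicated logic is defined once, local to this function.
    auto clamp = [](int v, int hi) { return v < 0 ? 0 : (v > hi ? hi : v); };

    // ...and reused each of the four times it is needed.
    r.left   = clamp(r.left, width);
    r.right  = clamp(r.right, width);
    r.top    = clamp(r.top, height);
    r.bottom = clamp(r.bottom, height);
    return r;
}
```

Because the lambda is local, it does not leak a helper into the surrounding namespace, which can be convenient when the shared logic only makes sense inside this one function.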
Code duplication makes software less maintainable and reduces our ability to iterate fast.
Duplication is bad, but… “It isn't a question of whether you'll remember: it's a question of when you'll forget.” Which makes perfect sense. It's time well spent when you try to make your code streamlined and readable. You'll end up with a cleaner, easier-to-maintain, and more extensible code base as a result.
As the other answers already mention, this should be approached manually, because you may want to change other things as you go along to make the code base cleaner. Maybe the actual invocation is already superfluous, or similar fragments can be combined.
Also, in practice people usually change the copied code slightly, so there will often not be direct duplicates, but close variants. I fear automatic copy-and-paste detection will mostly fail you there.
There are, however, refactoring tools that can help you with actually performing the changes (and sometimes also with finding likely candidates). Google for "php refactoring"; there are quite a few tools available, both standalone and as part of IDEs.