Duplication is bad, but… It isn't a question of whether you'll remember: it's a question of when you'll forget.” Which makes perfect sense. It's time well spent when you try to make your code streamlined and readable. You'll end up with a cleaner, easier-to-maintain, and more extensible code base as a result.
The Issue With Code DuplicationDuplication greatly decreases the maintainability of your code. Ideally, introducing a change in business logic should require your team to change one class or function, and no more. Otherwise, a developer has to spend extra effort hunting down all these extra occurrences.
In computer programming, duplicate code is a sequence of source code that occurs more than once, either within a program or across different programs owned or maintained by the same entity. Duplicate code is generally considered undesirable for a number of reasons.
Refactoring: Improving the Design of Existing Code
The Rule of Three
The first time you do something, you just do it. The second time you do
something similar, you wince at the duplication, but you do the duplicate
thing anyway. The third time you do something similar, you refactor.
Three strikes and you refactor.
Coders at Work
Seibel: So for each of these XII calls you're writing an implementation.
Did you ever find that you were accumulating lots of bits of very similar code?Zawinski: Oh, yeah, definitely. Usually by the second or third time you've cut and pasted
that piece of code it's like, alright, time to stop cutting and pasting and put it in a subroutine.
I tolerate none. I may end up having some due to time constraints or whatnot. But I still haven't found a case where duplicated code is really warranted.
Saying that it'll hurt readability only suggests that you are bad at picking names :-)
Personally, I prefer keeping code understandable, first and foremost.
DRY is about easing the maintenance in code. Making your code less understandable in order to remove repeated code hurts the maintainability more, in many cases, than having some repeated lines of code.
That being said, I do agree that DRY is a good goal to follow, when practical.
If the code in question has a clear business or technology-support purpose P, you should generally refactor it. Otherwise you'll have the classic problem with cloned code: eventually you'll discover a need to modify code supporting P, and you won't find all the clones that implement it.
Some folks suggest 3 or more copies is the threshold for refactoring. I believe that if you have two, you should do so; finding the other clone(s) [or even knowing they might exist] in a big system is hard, whether you have two or three or more.
Now this answer is provided in the context of not having any tools for finding the clones. If you can reliably find clones, then the original reason to refactor (avoiding maintenance errors) is less persausive (the utility of having a named abstraction is still real). What you really want is a way to find and track clones; abstracting them is one way to ensure you can "find" them (by making finding trivial).
A tool that can find clones reliably can at least prevent you from making failure-to-update-clone maintenance errors. One such tool (I'm the author) is the CloneDR. CloneDR finds clones using the targeted langauge structure as guidance, and thus finds clones regardless of whitespace layout, changes in comments, renamed variables, etc. (It is implemented for a number a languages including C, C++, Java, C#, COBOL and PHP). CloneDR will find clones across large systems, without being given any guidance. Detected clones are shown, as well as the antiunifier, which is essentially the abstraction you might have written instead. Versions of it (for COBOL) now integrate with Eclipse, and show you when you are editing inside a clone in a buffer, as well as where the other clones are, so that you may inspect/revise the others while you are there. (One thing you might do is refactor them :).
I used to think cloning was just outright wrong, but people do it because they don't know how the clone will vary from the original and so the final abstraction isn't clear at the moment the cloning act is occurring. Now I believe that cloning is good, if you can track the clones and you attempt to refactor after the abstraction becomes clear.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With