Chapter 5
There are several reasons, but they all boil down to readability and maintainability.
When we're writing a new piece of code that is similar to an earlier piece, the easiest
thing to do is copy the old code and change whatever needs to be changed (variable
names, logic, comments) to make it work in the new location. Alternatively, if we're
writing new code that seems similar, but not identical, to code elsewhere in the project,
it is often easier to write fresh code with similar behavior, rather than figure out how
to extract the overlapping functionality.
But as soon as someone has to read and understand the code and they come across
duplicate blocks, they are faced with a dilemma. Code that might have made sense
on its own suddenly has to be understood in relation to its near-duplicate. How is
one section different from the other? How
are they the same? Under what conditions is one section called? When do we call
the other? You might argue that you're the only one reading your code, but if you
don't touch that code for eight months, it will be as incomprehensible to you as it is
to a fresh coder. When we're trying to read two similar pieces of code, we have to
understand why they're different, as well as how they're different. This wastes the
reader's time; code should always be written to be readable first.
I once had to try to understand someone's code that had three seemingly identical
copies of the same 300 lines of very poorly written code. I had been
working with the code for a month before I finally comprehended that
the three "identical" versions were actually performing slightly different
tax calculations. Some of the subtle differences were intentional, but
there were also obvious areas where someone had updated a calculation
in one function without updating the other two. The number of subtle,
incomprehensible bugs in the code could not be counted. I eventually
replaced all 900 lines with an easy-to-read function of 20 lines or so.
Reading such duplicate code can be tiresome, but code maintenance is even more
tormenting. As the preceding story suggests, keeping two similar pieces of code up
to date can be a nightmare. We have to remember to update both sections whenever
we update one of them, and we have to remember how the multiple sections differ so
we can modify our changes when we are editing each of them. If we forget to update
both sections, we will end up with extremely annoying bugs that usually manifest
themselves as, "But I fixed that already, why is it still happening?"
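To make the maintenance problem concrete, here is a minimal sketch in Python. Every
name and tax rate in it is hypothetical and invented for illustration; it is not taken
from the code in the story. It assumes two copy-pasted functions where a rounding
fix reached only one copy, and then shows one possible nonrepetitive version in
which the same fix has to be made only once.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

# Duplicated version: two near-identical functions (hypothetical names).
def standard_tax_copy(amount):
    # This copy received the rounding fix.
    return (amount * Decimal("0.20")).quantize(CENT, rounding=ROUND_HALF_UP)

def reduced_tax_copy(amount):
    # This copy was forgotten, so it still returns unrounded values.
    return amount * Decimal("0.05")

# Nonrepetitive version: the only real difference (the rate) becomes data.
# The rates are made up for this example.
RATES = {
    "standard": Decimal("0.20"),
    "reduced": Decimal("0.05"),
}

def tax_due(amount, category):
    """Return the tax owed on amount, rounded to the nearest cent."""
    return (amount * RATES[category]).quantize(CENT, rounding=ROUND_HALF_UP)

print(reduced_tax_copy(Decimal("10.10")))    # 0.5050 (the forgotten copy)
print(tax_due(Decimal("10.10"), "reduced"))  # 0.51
```

With the single function, a reader no longer has to work out how the copies differ,
because the difference is spelled out in the RATES table, and a change to the
rounding rule cannot be applied to one category and missed in another.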
The result is that people who are reading or maintaining our code have to spend
astronomical amounts of time understanding and testing it compared with what they would need if we
had written the code in a nonrepetitive manner in the first place. It's even more
frustrating when we are the ones doing the maintenance; we find ourselves saying,
"Why didn't I do this right the first time?" The time we save by copy-pasting existing
code is lost the very first time we have to maintain it. Code is read and modified
far more often than it is written. Comprehensible code
should always be paramount.