Should we write unit tests or integration tests?
“There is hardly anything in the world that someone cannot make a little worse and sell a little cheaper, and the people who consider price alone are that person’s lawful prey. It’s unwise to pay too much, but it’s worse to pay too little. When you pay too much, you lose a little money — that is all. When you pay too little, you sometimes lose everything, because the thing you bought was incapable of doing the thing it was bought to do. The common law of business balance prohibits paying a little and getting a lot — it can’t be done. If you deal with the lowest bidder, it is well to add something for the risk you run, and if you do that you will have enough to pay for something better.”
We are working on an existing software solution. We regularly have to change code. Should we write tests for the code? And should we write isolated tests or integrated tests?
As strange as this statement may sound, this is not a technical problem. It’s a project management problem.
All code is legacy code, meaning that for all software that is in production there’s a misalignment between the objectives, the context and the heuristics that have been put to use in order to obtain the current solution.
It could be that the initial project was insufficiently funded. Is the system’s documentation up to date? Are there documented test cases? See? It was insufficiently funded.
It could be that the codebase outgrew the initial project context and objectives in such a way that efforts were always dedicated to adding features and never to adapting the architecture or maintaining code modularity.
It could be that the initial contractor/team took a project that was way out of their league technically but that fact could not be established soon enough in the project to change course. Let those who never lied on a résumé cast the first stone.
It could be technical debt, a term which generally denotes a combination of the problems mentioned above.
What is the first thing to do then, when you have to change some legacy code?
Most of the time: to make sure you have tests that you can execute before and after changing the code.
The primary value of the code lies in its correctness. Or, as Jerry Weinberg said:
“If the code doesn’t have to work, it can satisfy every other quality.”
Don’t jump from a legacy solution to a broken solution.
Most of the time, a safe edit of the code is cheaper than an unsafe one. This is very counterintuitive, so let’s take an example.
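| | safe edit (with tests) | unsafe edit (without tests) |
|---|---|---|
| cost of editing the code | 0.1 | 0.1 |
| cost of writing tests | 0.5 | 0.0 |
| probability of regression | 0.01 | 0.9 |
| cost of fixing regression | 1.0 | 1.0 |
| risk of regression (probability × cost) | 0.01 | 0.9 |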
In the table above, we see that risks are being accounted for. A probability of 0.01 of a regression costing 1 means that over 100 code edits you will incur a cost of 1 in bug fixes. A probability of 0.9 means that over the same number of edits you will incur a cost of at least 90. That figure is a lower bound: some fixes will require making important changes to the code, themselves leading to more regressions, and so on.
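The arithmetic behind the table can be sketched in a few lines. This is only an illustration under stated assumptions: the `EditPolicy` class and its field names are hypothetical, the cost of writing tests is assumed to be paid on every edit, and regressions are assumed not to cascade into further regressions (which, if anything, flatters the unsafe policy).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditPolicy:
    """One column of the table (hypothetical names, figures from the table)."""
    edit_cost: float
    test_cost: float
    regression_probability: float
    regression_fix_cost: float

    def risk(self) -> float:
        # Risk is the probability of a regression times the cost of fixing it.
        return self.regression_probability * self.regression_fix_cost

    def expected_cost_per_edit(self) -> float:
        # Assumption: tests are paid for on every single edit.
        return self.edit_cost + self.test_cost + self.risk()


def total_expected_cost(policy: EditPolicy, edits: int) -> float:
    """Expected cost of a series of edits, ignoring cascading regressions."""
    return edits * policy.expected_cost_per_edit()


safe = EditPolicy(edit_cost=0.1, test_cost=0.5,
                  regression_probability=0.01, regression_fix_cost=1.0)
unsafe = EditPolicy(edit_cost=0.1, test_cost=0.0,
                    regression_probability=0.9, regression_fix_cost=1.0)

if __name__ == "__main__":
    for name, policy in (("safe", safe), ("unsafe", unsafe)):
        print(name, round(total_expected_cost(policy, 100), 2))
```

With these figures, 100 safe edits cost about 61 in total while 100 unsafe edits cost about 100, even though the unsafe edit looks six times cheaper when only the visible costs (0.1 against 0.6) are compared.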
This is why we write tests: not because we don’t know how to code but because we are ordinary humans, not ninjas or cowboys, and we understand risks, albeit confusedly. Writing tests in this context is a defect prevention strategy.
Still, writing tests costs time. How can we reduce the cost of writing tests?
There are two ways: the self-deceptive way, and a courageous way.
The self-deceptive way of reducing the costs of writing tests:
[Write integrated tests instead of isolated tests.]
Instead of writing tests that check each specific behavior of each component in the system, you write tests for the general behavior of the system as a whole.
(Important note: the heuristic [Write integrated tests.] is not, in itself, a self-deceptive strategy. It is indeed a very good defect prevention strategy that complements isolated tests and helps find those defects that are born from integration issues.)
In the table above we summarized the number of test cases that a development team wrote on a (small) software system. In the courageous project, the policy was to write a test case for every distinct behavior of each component of the system. In the self-deceptive project, the policy was to create a (rather large and impressive) set of integrated tests.
Why is the self-deceptive way called self-deceptive? Because in this plan, the initial expectations about the total coverage of the tests are being subtly (and sometimes subconsciously) lowered.
If the distinct behaviors of all the components add up to 300, checking these behaviors through integrated tests should require at least as many distinct test cases, minus the test cases where two or more specific component behaviors cannot happen in the same execution path. Most probably the number of test cases will undergo a combinatorial explosion. Given that integrated tests are harder to write and maintain, are slower to execute and require a larger amount of test data preparation, the natural human reaction, especially if tests were not really planned at the outset of the project, will be to silently and subtly reduce the coverage expectation: simply have fewer test cases than required.
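To give a feel for that explosion, here is a rough sketch. The split of the 300 behaviors into 6 components of 50 behaviors each is purely hypothetical, and so is the worst-case assumption that every behavior of one component can combine with every behavior of the others along an execution path. Real systems fall somewhere between the two numbers.

```python
import math


def isolated_test_count(behaviors_per_component: list[int]) -> int:
    """One isolated test per distinct behavior of each component."""
    return sum(behaviors_per_component)


def integrated_combination_count(behaviors_per_component: list[int]) -> int:
    """Worst case: every behavior combines with every behavior of the others."""
    return math.prod(behaviors_per_component)


# Hypothetical split: 6 components, 50 distinct behaviors each (300 in total).
behaviors = [50] * 6

if __name__ == "__main__":
    print("isolated tests:", isolated_test_count(behaviors))
    print("integrated combinations:", integrated_combination_count(behaviors))
```

Under these assumptions the isolated suite needs 300 test cases, while the space of behavior combinations that integrated tests would have to sample from is 50 to the power of 6, more than fifteen billion. Nobody writes that many tests, which is exactly how the coverage expectation gets lowered without anyone saying so.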
Looking at your whole career, in how many project plans did you ever have to estimate the number of test cases that the system test suite would comprise? I’m guessing not a lot. In how many project plans did you have to come up with a detailed budget estimate? I would bet: all of them.
It’s easier to lower an expectation that was never clarified or even stated, than to announce that we’re officially behind the due date and need more money.
Let’s take the path of the courageous way of writing tests instead.
The courageous way of reducing the costs of writing tests:
[Write isolated tests for the parts of the system’s behavior that are conceptually isolated.]
Isn’t it surprising, ironic even, that when we examine the process of making a change to existing software, we see that:
- The product owner (or client) was able to formulate precisely, and to isolate by means of accurate business terms, the very part of the behavior that has to change.
- The developer team was perfectly able to understand the nature, range and impact of the change business-wise. They also could in a matter of hours figure out precisely what part of the code should change, and how to implement that change.
- No unit test could be written: the cost of writing unit tests was too high, and the drastic lack of modularity in the code led to dependency hell as soon as anyone tried to build a simple test, then to a plethora of mock objects, and so on, amounting to just another fine mess. The team went ahead with the change in the code base and redeployed. Push and pray.
In every legacy code base we find this situation for almost every change. I would like to dub it the DECEIT state of legacy software:
DECEIT: Described Easily, Changed Easily, Impossible to Test.
Thus, in contexts where the code base is large, preventing defects through integrated tests alone is a costly strategy that eventually leads to code that is DECEIT, which is why it is called the self-deceptive way of writing tests.
The courageous way, the remedy to a DECEIT code base, is to write isolated tests. It is a design remedy as well as a testing strategy.
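As a minimal sketch of what "design remedy" can mean in practice: suppose the behavior that the client described so easily is a late-fee rule buried in code that also talks to a database. Everything here is hypothetical (the `late_fee` function, its parameters, the rule itself); the point is only that once the rule is pulled out into a function with no database, clock or network dependency, each distinct behavior gets its own small test and no mock object is needed.

```python
from decimal import Decimal


def late_fee(days_overdue: int, daily_rate: Decimal, cap: Decimal) -> Decimal:
    """Hypothetical business rule, extracted so it depends only on its inputs."""
    if days_overdue <= 0:
        return Decimal("0")
    # The fee grows linearly with the delay but never exceeds the cap.
    return min(daily_rate * days_overdue, cap)


# One isolated test per distinct behavior of the rule.
def test_no_fee_when_not_overdue():
    assert late_fee(0, Decimal("1.50"), Decimal("20")) == Decimal("0")


def test_fee_grows_with_days_overdue():
    assert late_fee(4, Decimal("1.50"), Decimal("20")) == Decimal("6.00")


def test_fee_is_capped():
    assert late_fee(30, Decimal("1.50"), Decimal("20")) == Decimal("20")


if __name__ == "__main__":
    test_no_fee_when_not_overdue()
    test_fee_grows_with_days_overdue()
    test_fee_is_capped()
    print("all isolated tests passed")
```

The test functions follow the naming convention that runners such as pytest pick up, but they are plain functions and can be run directly as shown. The extraction itself is the design work: the part of the behavior that was conceptually isolated in the conversation with the client is now isolated in the code as well.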
Why is the courageous way called courageous? Because it takes courage to pursue the method that consists in checking every distinct behavior of every component of the system, especially when the code does not lend itself to such a modular approach, and also when so many actors around the project are advising that we take shortcuts to quality, given that they probably won’t be exposed to the consequences of that decision.
So we definitely should check that we do have these test cases, and if not, write them. Right now, the immediate cost (in time, budget, and energy) of doing that may seem intimidating to the point where we are tempted to minimize the cost, and forego that part of the process. But we have to think.
When you go to the mall to buy shoes and try to reduce the cost of the shoes, you have to think about the cost of wearing bad shoes.
When trying to reduce the cost of writing tests, you have to take into account the cost of not writing enough tests.
Or, to paraphrase Ruskin’s Common Law of Business Balance:
It is unwise to pay too much, but it is folly to not pay enough for what you need.
So when facing the prospect of having to change some legacy code, what do you need?
You need to make sure that the code you are about to change has tests, and that you can run them before and after making the change.
Then again, adding tests to a legacy system can be a lot of hard work. The temptation is strong to just go for this simple one-line change, and redeploy.
Maybe regressions in the code behavior don’t matter? How much are you willing to pay (in budget, effort, time, reputation) to discover if they matter or not?
Once again, it’s a difficult situation: while very simple project information like the total cost and the deadline is written on the wall, the figures for risk assessment and actual coverage expectations are generally not made explicit, not shared, not even known. This means that some of the actors who take part in, or influence, the decision to write tests or not are not fully aware of the consequences of that decision.
Obvious short-term cost overruns and delays versus unspoken or long-term risks, communication gaps and misaligned expectations: as mentioned earlier, this is anything but a technical problem.
This is a project management problem.