 

TDD - When is it okay to write a non-failing test?

From what I understand, in TDD you have to write a failing test first, then write the code to make it pass, then refactor. But what if your code already accounts for the situation you want to test?

For example, let's say I'm TDD'ing a sorting algorithm (this is just hypothetical). I might write unit tests for a couple of cases:

input = 1, 2, 3
output = 1, 2, 3

input = 4, 1, 3, 2
output = 1, 2, 3, 4
etc...

To make the tests pass, I wind up using a quick 'n dirty bubble-sort. Then I refactor and replace it with the more efficient merge-sort algorithm. Later, I realize that we need it to be a stable sort, so I write a test for that too. Of course, the test will never fail because merge-sort is a stable sorting algorithm! Regardless, I still need this test in case someone refactors it again to use a different, possibly unstable sorting algorithm.
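
For instance, the stability test might look something like this sketch (Item and sort_items are just hypothetical stand-ins for the code under test; here sort_items delegates to std::stable_sort so the example is self-contained):

    #include <algorithm>
    #include <cassert>
    #include <vector>

    // Hypothetical record type: sorted by 'key' only; 'original_index' records
    // where each item started, so we can check that equal keys keep their order.
    struct Item { int key; int original_index; };

    // Stand-in for the function under test (delegates to std::stable_sort here
    // just so the sketch runs; imagine it is the hand-rolled merge-sort).
    void sort_items(std::vector<Item>& items) {
        std::stable_sort(items.begin(), items.end(),
                         [](const Item& a, const Item& b) { return a.key < b.key; });
    }

    void test_sort_is_stable() {
        // Two items share key 2; a stable sort must keep original_index 1 before 3.
        std::vector<Item> items = {{2, 1}, {1, 2}, {2, 3}, {0, 4}};
        sort_items(items);
        assert(items[2].key == 2 && items[2].original_index == 1);
        assert(items[3].key == 2 && items[3].original_index == 3);
    }

    int main() { test_sort_is_stable(); }

With merge-sort in place, this test goes green the moment it is written.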

Does this break the TDD mantra of always writing failing tests? I doubt anyone would recommend I waste the time to implement an unstable sorting algorithm just to test the test case, then reimplement the merge-sort. How often do you come across a similar situation and what do you do?

asked Dec 11 '08 by Cybis


3 Answers

There are two reasons for writing failing tests first and then making them pass:

The first is to check that the test is actually testing what you wrote it for. You first check that it fails, you change the code to make the test pass, and then you check that it passes. It sounds trivial, but I've had several occasions where I added a test for code that already worked, only to find out later that I had made a mistake in the test that made it always pass.

The second and most important reason is to prevent you from writing too many tests. Tests reflect your design and your design reflects your requirements, and requirements change. You don't want to have to rewrite lots of tests when this happens. A good rule of thumb is to have every test fail for only one reason and to have only one test fail for that reason. TDD tries to enforce this by repeating the standard red-green-refactor cycle for every test, every feature and every change in your code-base.

But of course, rules are made to be broken. If you keep in mind why these rules were made in the first place, you can be flexible with them. For example, when you find a test that checks more than one thing, you can split it up; effectively you have written two new tests that you haven't seen fail before. Breaking and then fixing your code to see each new test fail is a good way to double-check things.
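
For instance, a rough sketch of such a split (sort_copy here is a made-up stand-in for whatever the tests exercise):

    #include <algorithm>
    #include <cassert>
    #include <vector>

    // Made-up function under test.
    std::vector<int> sort_copy(std::vector<int> v) {
        std::sort(v.begin(), v.end());
        return v;
    }

    // Before: one test asserting two unrelated behaviours.
    void test_sort() {
        assert((sort_copy({3, 1, 2}) == std::vector<int>{1, 2, 3}));  // ordering
        assert(sort_copy({}).empty());                                // empty input
    }

    // After: each behaviour gets its own test, so a failure points at one cause
    // and you can break the code to watch each new test fail on its own.
    void test_sort_orders_elements() {
        assert((sort_copy({3, 1, 2}) == std::vector<int>{1, 2, 3}));
    }

    void test_sort_handles_empty_input() {
        assert(sort_copy({}).empty());
    }

    int main() {
        test_sort();
        test_sort_orders_elements();
        test_sort_handles_empty_input();
    }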

answered Oct 04 '22 by Mendelt


I doubt anyone would recommend I waste the time to implement an unstable sorting algorithm just to test the test case, then reimplement the merge-sort. How often do you come across a similar situation and what do you do?

Let me be the one to recommend it, then. :)

All of this is a trade-off between the time you spend on the one hand, and the risks you reduce or mitigate, plus the understanding you gain, on the other.

Continuing the hypothetical example...

If "stableness" is an important property/feature, and you don't "test the test" by making it fail, you save the time of doing that work, but incur risk that the test is wrong and will always be green.

If, on the other hand, you do "test the test" by breaking the feature and watching the test fail, you reduce that risk.

And the wildcard is that you might gain some important bit of knowledge. For example, while trying to code a 'bad' sort to get the test to fail, you might think more deeply about the comparison constraints on the type you're sorting, and discover that you were using "x==y" as the equivalence-class predicate for sorting when in fact "!(x<y) && !(y<x)" is the better predicate for your system (e.g. you might uncover a bug or design flaw).
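
To make that concrete, here is a rough sketch (Record and less_by_key are made-up names) of the difference between member-wise equality and the comparator-based equivalence the sort actually uses:

    #include <cassert>

    // Illustrative type: the sort orders Records by 'key' only.
    struct Record { int key; int payload; };

    bool less_by_key(const Record& a, const Record& b) { return a.key < b.key; }

    // Equivalence as the sort sees it: neither record is "less" than the other.
    bool equivalent_for_sort(const Record& a, const Record& b) {
        return !less_by_key(a, b) && !less_by_key(b, a);
    }

    int main() {
        Record x{1, 10};
        Record y{1, 20};
        assert(equivalent_for_sort(x, y));  // same key: equivalent to the sort...
        assert(x.payload != y.payload);     // ...yet clearly not equal member-wise
    }

A stability test has to reason about records that are equivalent to the comparator but still distinguishable, which is exactly the kind of distinction you notice while trying to make the test fail.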

So I say err on the side of 'spend the extra time to make it fail, even if that means intentionally breaking the system just to get a red dot on the screen for a moment', because while each of these little "diversions" incurs some time cost, every once in a while one will save you a huge bundle (e.g. oops, a bug in the test means that I was never testing the most important property of my system, or oops, our whole design for inequality predicates is messed up). It's like playing the lottery, except the odds are in your favor in the long run; every week you spend $5 on tickets and usually you lose but once every three months you win a $1000 jackpot.

answered Oct 04 '22 by Brian


The one big advantage to making the test fail first is that it ensures that your test is really testing what you think. You can have subtle bugs in your test that cause it to not really test anything at all.

For example, I once saw someone check this test into our C++ code base:

assertTrue(x = 1);

Clearly they didn't write the test to fail first, since this doesn't test anything at all: the '=' assigns 1 to x, so the assertion always passes.
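
Presumably what they meant was an equality comparison, something like:

    assertTrue(x == 1);  // compares x instead of assigning to it

Running the test against unfinished (or deliberately broken) code and watching it go red would have caught the typo immediately, because the checked-in version can never fail.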

answered Oct 04 '22 by David Norman