I am trying to learn more about JUnit and TDD, but I am running into some issues with coupling between test cases.
When I am writing a test case for a particular data type's API, say a Deque<T>, how can I limit the coupling between the test cases? For instance, if I were writing a test case for the method insertFirst(T item), it seems straightforward to assume that I should be able to assert two things after calling the method on a properly initialized object:
- The number of items in the Deque object should have increased by one.
- If I then call the T removeFirst() method, it should return a reference to the object I inserted with the initial call.
However, this creates an undesirable coupling between at least two of my test cases, where one test case passing depends on the correct implementation of another API method. For instance, in order for this test case to pass, I would need a correct implementation for checking the number of items in the Deque and also for removing items. If my test for either of those methods was incorrect or incomplete for whatever reason, then my test for the insertFirst method would automatically be suspect.
What are best practices for avoiding this scenario? Is my approach to writing test cases wrong in some way?
When writing a test for one method, you have to assume that the rest of the class is working correctly. If you didn't make this assumption, the only option would be a single, massive test per class, and that's not what we do.
You can make the assumption that the other parts of the class work correctly, because there will be tests for those other parts, too, ensuring their correctness.
If one part is not working correctly, a test will fail, showing you that something is not correct.
As soon as a test of your test suite fails, there is an error you have to fix. You no longer can make any assumptions.
Example:
You have a simple list implementation with only three methods:
- insert
- remove
- count
You have three tests:
1. insert:
   - insert item (Act)
   - count equals 1 (Assert)
2. remove:
   - insert item (Arrange)
   - remove item (Act)
   - count equals 0 (Assert)
3. count:
   - insert n items (Arrange)
   - count (Act)
   - count equals n (Assert)
Now, if any of the above tests fail, you can't be sure of the correctness of a single member of your class:
- You can't be sure about remove if insert is broken, because there was nothing to remove.
- You can't be sure that insert and count are working correctly, because the second test will fail if any of the three members doesn't work correctly.
The failing tests tell you something though:
Depending on which tests fail, you can often deduce where the error has to be.
Example: If only the second test fails but not the first or third, the error is most likely in the remove method.
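The three tests above could be sketched roughly as follows. SimpleList and the check helper are hypothetical names invented for this illustration, and the tests are plain static methods so the file runs without the JUnit dependency; in a JUnit project each would carry @Test and use assertEquals instead.

```java
import java.util.ArrayList;
import java.util.List;

// Standalone sketch of the three Arrange/Act/Assert tests described above.
class SimpleListTests {

    // Hypothetical list under test, offering only insert, remove and count.
    static class SimpleList<T> {
        private final List<T> items = new ArrayList<>();

        void insert(T item) { items.add(item); }
        void remove(T item) { items.remove(item); }
        int count() { return items.size(); }
    }

    // Stand-in for JUnit's assertEquals (an assumption of this sketch).
    static void check(int expected, int actual, String testName) {
        if (expected != actual) {
            throw new AssertionError(testName + ": expected " + expected + " but was " + actual);
        }
    }

    static void insertTest() {
        SimpleList<String> list = new SimpleList<>();
        list.insert("a");                       // Act
        check(1, list.count(), "insert");       // Assert (relies on count)
    }

    static void removeTest() {
        SimpleList<String> list = new SimpleList<>();
        list.insert("a");                       // Arrange (relies on insert)
        list.remove("a");                       // Act
        check(0, list.count(), "remove");       // Assert (relies on count)
    }

    static void countTest() {
        int n = 3;
        SimpleList<Integer> list = new SimpleList<>();
        for (int i = 0; i < n; i++) {
            list.insert(i);                     // Arrange (relies on insert)
        }
        int actual = list.count();              // Act
        check(n, actual, "count");              // Assert
    }

    public static void main(String[] args) {
        insertTest();
        removeTest();
        countTest();
        System.out.println("all three tests passed");
    }
}
```

Breaking remove alone (for example, making it a no-op) makes only removeTest throw, which is the failure pattern the answer uses to locate the bug.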
It is generally more productive to think of unit tests as testing particular features rather than particular methods. Any given test will check that some collection of methods works correctly to implement the feature that is the subject of the test, and the pattern of failure in a well-designed set of tests will tend to tell you which method broke fairly quickly.
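As a rough illustration of a feature-oriented test, the sketch below checks features such as "an inserted item can be retrieved again" rather than a single method. SketchDeque, the check helper, the test names and the choice of NoSuchElementException for an empty deque are all assumptions made for this example. The tests are plain static methods so the file runs without JUnit; in a JUnit project each would be an @Test method.

```java
import java.util.ArrayDeque;
import java.util.NoSuchElementException;

// Standalone sketch: each test exercises a feature, not one method.
class DequeFeatureTests {

    // Hypothetical deque under test, backed by java.util.ArrayDeque.
    static class SketchDeque<T> {
        private final ArrayDeque<T> items = new ArrayDeque<>();

        boolean isEmpty() { return size() == 0; }
        int size() { return items.size(); }
        void insertFirst(T item) { items.addFirst(item); }

        T removeFirst() {
            // Check the size before doing anything else.
            if (size() == 0) {
                throw new NoSuchElementException("deque is empty");
            }
            return items.removeFirst();
        }
    }

    // Stand-in for JUnit's assertTrue (an assumption of this sketch).
    static void check(boolean condition, String testName) {
        if (!condition) {
            throw new AssertionError(testName + " failed");
        }
    }

    // Feature: what goes in comes back out. Uses insertFirst and removeFirst.
    static void insertedItemIsReturned() {
        SketchDeque<String> deque = new SketchDeque<>();
        String item = "a";
        deque.insertFirst(item);
        check(deque.removeFirst() == item, "insertedItemIsReturned");
    }

    // Feature: a deque emptied again reports itself as empty.
    static void reEmptiedDequeIsEmpty() {
        SketchDeque<String> deque = new SketchDeque<>();
        deque.insertFirst("a");
        deque.removeFirst();
        check(deque.isEmpty() && deque.size() == 0, "reEmptiedDequeIsEmpty");
    }

    // Feature: removing from an empty deque is reported as an error.
    static void removingFromEmptyDequeThrows() {
        boolean threw = false;
        try {
            new SketchDeque<String>().removeFirst();
        } catch (NoSuchElementException expected) {
            threw = true;
        }
        check(threw, "removingFromEmptyDequeThrows");
    }

    public static void main(String[] args) {
        insertedItemIsReturned();
        reEmptiedDequeIsEmpty();
        removingFromEmptyDequeThrows();
        System.out.println("all feature tests passed");
    }
}
```

If removeFirst breaks, the first two tests fail together while a size-only test would still pass, so the pattern of failures points at the culprit.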
A good collection of tests tends to fall naturally out of doing TDD; that's one of the things that makes the technique so powerful. If I'm writing a Deque, the tests I write will tend to be the following, generally appearing in this order.
1. empty_Deque_isEmpty -- implement isEmpty to always return true
2. non_empty_Deque_isntEmpty -- implement insertFirst to set the instance variable behind isEmpty to false
3. re_emptied_Deque_isEmpty -- change the instance variable used by isEmpty to be a number that responds to insertFirst and removeFirst
4. is_empty_Deque_size_correct -- implement size to always return 0
5. is_nonempty_Deque_size_correct -- add an instance variable to track size; realize it's doing the same thing needed by isEmpty; refactor
6. is_re_emptied_Deque_size_correct -- have the test just pass because of what we did to make 5 happen
7. does_removing_from_empty_Deque_throw -- removeFirst needs to check size before doing anything else
8. is_inserted_item_returned -- insertFirst and removeFirst now populate a T instance variable
9. is_inserted_item_returned_from_end -- add removeLast that is a copy of removeFirst; refactor
10. is_rear_inserted_item_returned -- add insertLast that copies insertFirst; refactor
11. are_all_inserted_items_returned -- change insertFirst and removeFirst to act on SomeKindOfCollection<T>; make a point of not checking order of retrieval
12. does_removeFirst_retrieve_items_in_correct_order -- insert two things, make sure the second one is returned by removeFirst. Might already be true.
13. does_removeLast_retrieve_items_in_correct_order -- ditto for removeLast, except pretty certain not to already pass.
That's a whole bunch of tests, but as you look through them you should notice the pattern. None of these tests is really "the test for size" or "the test for removeFirst". But by the time we're through, the entire interface of the class is being exercised and all the internals necessary to that interface have been developed. Some of the tests depend on more than one method, and if one of those methods should fail, every test that depends on it will break. But the pattern of breaks will tend to be very helpful in determining where the bug is.
Also interesting is how many of these tests we can make pass without ever committing to actually having a collection in the object. That suggests this set of tests could be factored out into a more general test suite, which will be useful when developing PriorityQueue.
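One possible way to factor out the collection-agnostic tests is sketched below, under several assumptions: java.util.ArrayDeque and java.util.PriorityQueue stand in for the hand-written Deque and PriorityQueue, java.util.Queue serves as their shared interface, and runAgainst and check are hypothetical helpers. With JUnit, the same idea is often expressed as an abstract test class with a factory method that each concrete test class overrides.

```java
import java.util.ArrayDeque;
import java.util.PriorityQueue;
import java.util.Queue;
import java.util.function.Supplier;

// Standalone sketch: emptiness and size tests shared by several containers.
class EmptinessAndSizeContract {

    // Stand-in for JUnit's assertTrue (an assumption of this sketch).
    static void check(boolean condition, String testName) {
        if (!condition) {
            throw new AssertionError(testName + " failed");
        }
    }

    // Runs the shared tests; the factory must return a fresh, empty container.
    static void runAgainst(String name, Supplier<Queue<Integer>> factory) {
        Queue<Integer> empty = factory.get();
        check(empty.isEmpty(), name + ": empty container isEmpty");
        check(empty.size() == 0, name + ": empty container has size 0");

        Queue<Integer> nonEmpty = factory.get();
        nonEmpty.add(42);
        check(!nonEmpty.isEmpty(), name + ": non-empty container isn't empty");
        check(nonEmpty.size() == 1, name + ": non-empty container has size 1");

        Queue<Integer> reEmptied = factory.get();
        reEmptied.add(42);
        reEmptied.remove();
        check(reEmptied.isEmpty(), name + ": re-emptied container isEmpty");
        check(reEmptied.size() == 0, name + ": re-emptied container has size 0");
    }

    public static void main(String[] args) {
        runAgainst("ArrayDeque", ArrayDeque::new);
        runAgainst("PriorityQueue", PriorityQueue::new);
        System.out.println("contract holds for both containers");
    }
}
```

Ordering tests stay with each concrete class, because a deque and a priority queue return items in different orders.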