The database is Hot Lava!
Unless you’ve been through [appendix_purist_unit_tests], almost all of the "unit" tests in the book should perhaps have been called 'integrated' tests, because they either rely on the database or use the Django Test Client, which does too much magic with the middleware layers that sit between requests, responses, and view functions.
There is an argument that a true unit test should always be isolated, because it’s meant to test a single unit of software. If it touches the database, it can’t be a unit test. The database is hot lava!
Some TDD veterans say you should strive to write "pure", isolated unit tests wherever possible, instead of writing integrated tests. It’s one of the ongoing (occasionally heated) debates in the testing community.
Being merely a young whippersnapper myself, I’m only partway towards all the subtleties of the argument. But in this chapter, I’d like to talk about why people feel strongly about it, and try to give you some idea of when you can get away with muddling through with integrated tests (which I confess I do a lot of!), and when it’s worth striving for more "pure" unit tests.
Note
|
I revisited some of these issues in my second book on architecture patterns, which outlines some strategies for getting the right balance between different types of test. |
- Isolated tests ("pure" unit tests) vs. integrated tests
-
The primary purpose of a unit test should be to verify the correctness of the logic of your application. An 'isolated' test is one that tests exactly one chunk of code, and whose success or failure does not depend on any other external code. This is what I call a "pure" unit test: a test for a single function, for example, written in such a way that only that function can make it fail. If the function depends on another system, and breaking that system breaks our test, we have an 'integrated' test. That system could be an external system, like a database, but it could also be another function which we don’t control. In either case, if breaking the system makes our test fail, our test is not properly isolated; it is not a "pure" unit test. That’s not necessarily a bad thing, but it may mean the test is doing two jobs at once.
- Integration tests
-
An integration test checks that the code you control is integrated correctly with some external system which you don’t control. 'Integration' tests are typically also 'integrated' tests.
- System tests
-
If an integration test checks the integration with one external system, a system test checks the integration of multiple systems in your application—for example, checking that we’ve wired up our database, static files, and server config together in such a way that they all work.
- Functional tests and acceptance tests
-
An acceptance test is meant to test that our system works from the point of view of the user ("would the user accept this behaviour?"). It’s hard to write an acceptance test that’s not a full-stack, end-to-end test. We’ve been using our functional tests to play the role of both acceptance tests and system tests.
If you’ll forgive the pretentious philosophical terminology, I’d like to structure our discussion of these issues like a Hegelian dialectic:
-
The Thesis: the case for "pure", fast unit tests.
-
The Antithesis: some of the risks associated with a (naive) pure unit testing approach.
-
The Synthesis: a discussion of best practices like "Ports and Adapters" or "Functional Core, Imperative Shell", and of just what it is that we want from our tests, anyway.
One of the things you often hear about unit tests is that they’re much faster. I don’t think that’s actually the primary benefit of unit tests, but it’s worth exploring the theme of speed.
Other things being equal, the faster your unit tests run, the better. To a lesser extent, the faster 'all' your tests run, the better.
I’ve outlined the TDD test/code cycle in this book. You’ve started to get a feel for the TDD workflow, the way you flick between writing tiny amounts of code and running your tests. You end up running your unit tests several times a minute, and your functional tests several times a day.
So, on a very basic level, the longer they take, the more time you spend waiting for your tests, and that will slow down your development. But there’s more to it than that.
Thinking sociology for a moment, we programmers have our own culture, and our own tribal religion in a way. It has many congregations within it, such as the cult of TDD to which you are now initiated. There are the followers of vi and the heretics of emacs. But one thing we all agree on, one particular spiritual practice, our own transcendental meditation, is the holy flow state. That feeling of pure focus, of concentration, where hours pass like no time at all, where code flows naturally from our fingers, where problems are just tricky enough to be interesting but not so hard that they defeat us…
There is absolutely no hope of achieving flow if you spend your time waiting for a slow test suite to run. Anything longer than a few seconds and you’re going to let your attention wander, you context-switch, and the flow state is gone. And the flow state is a fragile dream. Once it’s gone, it takes at least 15 minutes to live again.
If your test suite is slow and ruins your concentration, the danger is that you’ll start to avoid running your tests, which may lead to bugs getting through. Or, it may lead to our being shy of refactoring the code, since we know that any refactor will mean having to wait ages while all the tests run. In either case, bad code can be the result.
You might be thinking, OK, but our test suite has lots of integrated tests in it—over 50 of them, and it only takes 0.2 seconds to run.
But remember, we’ve got a very simple app. Once it starts to get more complex, as your database grows more and more tables and columns, integrated tests will get slower and slower. Having Django reset the database between each test will take longer and longer.
Gary Bernhardt, a man with far more experience of testing than me, put these points eloquently in a talk called Fast Test, Slow Test. I encourage you to watch it.
But perhaps more importantly than any of this, remember the lesson from [appendix_purist_unit_tests]. Going through the process of writing good, isolated unit tests can help us drive out better designs for our code, by forcing us to identify dependencies, and encouraging us towards a decoupled architecture in a way that integrated tests don’t.
All of this comes with a huge "but". Writing isolated united tests comes with its own hazards, particularly if, like you or me, we are not yet advanced TDD’ers.
Cast your mind back to the first isolated unit test we wrote. Wasn’t it ugly? Admittedly, things improved when we refactored things out into the forms, but imagine if we hadn’t followed through? We’d have been left with a rather unreadable test in our codebase. And even the final version of the tests we ended up with contain some pretty mind-bending bits.
As we saw a little later on, isolated tests by their nature only test the unit under test, in isolation. They won’t test the integration between your units.
This problem is well known, and there are ways of mitigating it. But, as we saw, those mitigations involve a fair bit of hard work on the part of the programmer—you need to remember to keep track of the interfaces between your units, to identify the implicit contract that each component needs to honour, and to write tests for those contracts as well as for the internal functionality of your unit.
Unit tests will help you catch off-by-one errors and logic snafus, which are the kinds of bugs we know we introduce all the time, so in a way we are expecting them. But they don’t warn you about some of the more unexpected bugs. They won’t remind you when you forgot to create a database migration. They won’t tell you when the middleware layer is doing some clever HTML-entity escaping that’s interfering with the way your data is rendered…something like Donald Rumsfeld’s unknown unknowns?
And finally, mocky tests can become very tightly coupled with the implementation.
If you choose to use List.objects.create()
to build your objects but your
mocks are expecting you to use List()
and .save()
, you’ll get failing tests
even though the actual effect of the code would be the same. If you’re not
careful, this can start to work against one of the supposed benefits of having
tests, which was to encourage refactoring. You can find yourself having to
change dozens of mocky tests and contract tests when you want to change an
internal API.
Notice that this may be more of a problem when you’re dealing with an API
you don’t control. You may remember the contortions we had to go through
to test our form, mocking out two Django model classes and using side_effect
to check on the state of the world. If you’re writing code that’s totally
under your own control, you’re likely to design your internal APIs so that
they are cleaner and require fewer contortions to test.
Let’s step back and have a think about what benefits we want our tests to deliver. Why are we writing them in the first place?
We want our application to be free of bugs—both low-level logic errors, like off-by-one errors, and high-level bugs like the software not ultimately delivering what our users want. We want to find out if we ever introduce regressions which break something that used to work, and we want to find that out before our users see something broken. We expect our tests to tell us our application is correct.
We want our code to obey rules like YAGNI and DRY. We want code that clearly expresses its intentions, which is broken up into sensible components that have well-defined responsibilities and are easily understood. We expect our tests to give us the confidence to refactor our application constantly, so that we’re never scared to try to improve its design, and we would also like it if they would actively help us to find the right design.
Finally, we want our tests to help enable a fast and productive workflow. We want them to help take some of the stress out of development, and we want them to protect us from stupid mistakes. We want them to help keep us in the "flow" state not just because we enjoy it, but because it’s highly productive. We want our tests to give us feedback about our work as quickly as possible, so that we can try out new ideas and evolve them quickly. And we don’t want to feel like our tests are more of a hindrance than a help when it comes to evolving our codebase.
I don’t think there are any universal rules about how many tests you should write and what the correct balance between functional, integrated, and isolated tests should be. Circumstances vary between projects. But, by thinking about all of your tests and asking whether they are delivering the benefits you want, you can make some decisions.
Objective | Some considerations |
---|---|
'Correctness' |
|
'Clean, maintainable code' |
|
'Productive workflow' |
|
There are also some architectural solutions that can help to get the most out of your test suite, and particularly that help avoid some of the disadvantages of isolated tests.
Mainly these involve trying to identify the boundaries of your system—the points at which your code interacts with external systems, like the database or the filesystem, or the internet, or the UI—and trying to keep them separate from the core business logic of your application.
Integrated tests are most useful at the 'boundaries' of a system—at the points where our code integrates with external systems, like a database, filesystem, or UI components.
Similarly, it’s at the boundaries that the downsides of test isolation and mocks are at their worst, because it’s at the boundaries that you’re most likely to be annoyed if your tests are tightly coupled to an implementation, or to need more reassurance that things are integrated properly.
Conversely, code at the 'core' of our application—code that’s purely concerned with our business domain and business rules, code that’s entirely under our control—has less need for integrated tests, since we control and understand all of it.
So one way of getting what we want is to try to minimise the amount of our code that has to deal with boundaries. Then we test our core business logic with isolated tests and test our integration points with integrated tests.
Steve Freeman and Nat Pryce, in their book Growing Object-Oriented Software, Guided by Tests, call this approach "Ports and Adapters" (see Ports and Adapters (diagram by Nat Pryce)).
We actually started moving towards a ports and adapters architecture in [appendix_purist_unit_tests], when we found that writing isolated unit tests was encouraging us to push ORM code out of the main application, and hide it in helper functions from the model layer.
This pattern is also sometimes known as the "clean architecture" or "hexagonal architecture". See Further Reading for more info.
Gary Bernhardt pushes this further, recommending an architecture he calls "Functional Core, Imperative Shell", whereby the "shell" of the application, the place where interaction with boundaries happens, follows the imperative programming paradigm, and can be tested by integrated tests, acceptance tests, or even (gasp!) not at all, if it’s kept minimal enough. But the core of the application is actually written following the functional programming paradigm (complete with the "no side effects" corollary), which actually allows fully isolated, "pure" unit tests, 'entirely without mocks'.
Check out Gary’s presentation titled "Boundaries" for more on this approach.
I’ve tried to give an overview of some of the more advanced considerations that come into the TDD process. Mastery of these topics is something that comes from long years of practice, and I’m not there yet, by any means. So I heartily encourage you to take everything I’ve said with a pinch of salt, to go out there, try various approaches, listen to what other people have to say too, and find out what works for you.
Here are some places to go for further reading.
- Fast Test, Slow Test and Boundaries
-
Gary Bernhardt’s talks from Pycon 2012 and 2013. His screencasts are also well worth a look.
- Ports and Adapters
-
Steve Freeman and Nat Pryce wrote about this in their book. You can also catch a good discussion in this talk. See also Uncle Bob’s description of the clean architecture, and Alistair Cockburn coining the term "hexagonal architecture".
- Hot Lava
-
Casey Kinsey’s memorable phrase encouraging you to avoid touching the database, whenever you can.
- Inverting the Pyramid
-
The idea that projects end up with too great a ratio of slow, high-level tests to unit tests, and a visual metaphor for the effort to invert that ratio.
- Integrated tests are a scam
-
J.B. Rainsberger has a famous rant about the way integrated tests will ruin your life. Then check out a couple of follow-up posts, particularly this defence of acceptance tests (what I call functional tests), and this analysis of how slow tests kill productivity.
- The Test-Double testing wiki
-
Justin Searls’s online resource is a great source of definitions and discussions of testing pros and cons, and arrives at its own conclusions of the right way to do things: testing wiki.
- A pragmatic view
-
Martin Fowler (author of 'Refactoring') presents a reasonably balanced, pragmatic approach.
- Start out by being pragmatic
-
Spending a long time agonising about what kinds of test to write is a great way to prevaricate. Better to start by writing whichever type of test occurs to you first, and change it later if you need to. Learn by doing.
- Focus on what you want from your tests
-
Your objectives are 'correctness', 'good design', and 'fast feedback cycles'. Different types of test will help you achieve each of these in different measures. How do different types of test help us achieve our objectives? has some good questions to ask yourself.
- Architecture matters
-
Your architecture to some extent dictates the types of tests that you need. The more you can separate your business logic from your external dependencies, and the more modular your code, the closer you’ll get to a nice balance between unit tests, integration tests and end-to-end tests.