
Automated tests ‐ overview


Why tests

Automated tests are great, and we should write more of them. You may have seen similar statements when learning to code, and I think they are true. However, if you have ever searched for "automated tests" or "how to write tests", you probably got very abstract, hard-to-understand answers. Here I try to give concrete reasons to motivate us to write more tests, and some hints on how to do it.

There are probably more reasons than those listed below, but here are the few that came to my mind.

They are comprehensive

While we develop a feature, we usually test it manually, for example by clicking through our new view or calling our new function from a Django shell. However, when a feature gets bigger, manually testing everything can take quite a long time. It becomes very tempting to skip a part because we're sure it works, or to simply forget to test it. Tests never get tired and never forget: once written, they will always run, without the possibility of a human mistake.

Similarly, changes that we think affect only one part of the code may affect other, unexpected parts. If those other parts are tested, we are safer.

They enable us to work with unfamiliar parts of the code

If we have to update a part of the code that we don't know, we are often scared to do it. Even if the code we change is clearly written, it is difficult to be confident that our changes cover all the same bases as the original version did. Automated tests can give us that confidence, and with it the courage to update, refactor, improve...

They are great documentation

Even if code is written clearly, it can be hard to deduce from the code itself what the intended effects are. By writing test functions, we can give those test functions names that make explicit what the code implies. For example, TestFrozenStatusService.test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse tells us that members that are exempted from shifts should not be frozen. By reading the names of the test functions that touch our target code, we can get an understanding of what that code does.

This is also great against regressions: fixing a bug may require us to organize our code in a way that looks more complicated than necessary. If we don't make explicit why the code is written like it is, someone could be tempted to simplify the code, thus re-introducing the bug that we fixed. By writing a test that ensures that the bug is not present, we also ensure that it won't be re-introduced in the future.

The different kinds of tests

There are many kinds of tests. If you're new to the concept overall, you may try searching for "unit vs integration vs system tests" or similar. In my experience, the answers given are very abstract and hard to understand. Here I try to give short definitions with examples. They may not be formally correct, but they will hopefully let you understand the overall concept.

Unit vs Integration

Within a Django project, unit tests are tests that check one single function for one specific case. An example would be TestFrozenStatusService.test_shouldFreezeMember_memberAlreadyFrozen_returnsFalse. An integration test instead checks an entire view, for example TestMemberSelfUnregisters.test_member_self_unregisters_threshold.

Unit tests

Isolation

When writing unit tests, we test a piece of code "in isolation": we want to test that piece of code and nothing else. For example, if in FrozenStatusService.should_freeze_member() we remove the check for ShiftExpectationService.is_member_expected_to_do_shifts() and then run the tests, the only test that should fail is test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse. All other test_shouldFreezeMember_* tests should pass, thus letting us quickly identify what the problem is.
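
To illustrate, here is a rough outline of how that isolation shows up in practice; the method bodies are omitted and the class is only sketched, the real tests live in the shifts app:

class TestFrozenStatusService:
    # Outline only: each test method covers exactly one case of should_freeze_member.
    def test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse(self):
        ...  # the only test that should fail if the ShiftExpectationService check is removed

    def test_shouldFreezeMember_memberAlreadyFrozen_returnsFalse(self):
        ...  # keeps passing, so the failure above points straight at the missing check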

Database access

Formally, unit tests should not interact with the database, since that would mean that we test the database connection on top of our code. When possible, you can ensure that by having your test class inherit directly from SimpleTestCase. This also has the advantage of leading to faster tests. It may however require you to write a lot of mocking code to fake the database accesses. This is why, while formally not correct, Tapir unit tests often do access the database and inherit from TestCase (through TapirFactoryTestBase).
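
As a minimal sketch of the difference (the test classes and bodies below are placeholders, not actual Tapir tests):

from django.test import SimpleTestCase, TestCase

class TestPureLogic(SimpleTestCase):
    # SimpleTestCase refuses database queries, so any accidental access fails
    # loudly, and the test stays fast.
    def test_some_pure_computation(self):
        self.assertEqual(2 + 2, 4)

class TestWithDatabase(TestCase):
    # TestCase wraps each test in a transaction that is rolled back afterwards.
    # Tapir tests usually get this behaviour by inheriting from TapirFactoryTestBase.
    def test_something_that_touches_the_database(self):
        ...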

Integration tests

Where unit tests check that a single, isolated piece of code works as intended, integration tests try to make sure that the pieces fit together well. This usually means testing a view. A simple example is TestTapirUserSelfUpdate, which makes sure that a member can update their displayed pronouns but not the pronouns of other members.

The test client

In TestTapirUserSelfUpdate.try_update(), a request is sent with self.client.post. This produces the same effects as if someone had opened their browser, filled in the form fields as defined in the data parameter, and submitted the form. We can then check what the answer from the server looks like, and whether the expected changes have been applied correctly.

The responses given by the test client also contain the context that was sent from the view to the template. This allows us to check that context without looking at the HTML, which would be tricky. You can see an example in TestShareOwnerList.visit_view().
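
As a rough sketch of the pattern (the URL names, form field and context key below are made up for illustration, they are not real Tapir names):

from django.test import TestCase
from django.urls import reverse

class TestSomeView(TestCase):
    def test_visit_view(self):
        # A real Tapir test would first log a member in (e.g. with self.client.force_login(...)).
        response = self.client.get(reverse("coop:shareowner_list"))  # hypothetical URL name
        self.assertEqual(response.status_code, 200)
        # The response carries the template context, so we can assert on it
        # directly instead of parsing the HTML.
        self.assertIn("table", response.context)  # hypothetical context key

    def test_submit_form(self):
        response = self.client.post(
            reverse("accounts:user_update"),  # hypothetical URL name
            data={"pronouns": "they/them"},   # hypothetical form field
        )
        # A successful form submission typically redirects.
        self.assertEqual(response.status_code, 302)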

Selenium tests or E2E-tests

There is another kind of test that we use rarely: selenium tests. The generic name would be system tests or E2E-tests (for end-to-end). In these tests, an actual Django server is started, a browser is opened, and the browser is manipulated by selenium. The test is thus as close to actual user behaviour as possible, which is nice. However, those tests take long to write and are quite fragile, because they depend on the structure of our HTML pages. You can see for example how a user can log in with selenium here: TapirSeleniumTestBase.login.
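
To give a flavour of what manipulating the browser looks like, here is a small made-up snippet; the element ids are assumptions, and Tapir's own helpers like TapirSeleniumTestBase.login hide most of these details:

from selenium.webdriver.common.by import By

def fill_login_form(driver, username, password):
    # "driver" is the selenium WebDriver pointed at the running test server.
    driver.find_element(By.ID, "id_username").send_keys(username)  # hypothetical element id
    driver.find_element(By.ID, "id_password").send_keys(password)  # hypothetical element id
    driver.find_element(By.CSS_SELECTOR, "button[type='submit']").click()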

Running tests

You can run all tests using the following command:

docker compose run --rm web poetry run pytest

Since we have quite a few tests now, running them all can take several minutes. You can run the tests from a single folder or a single file by specifying the corresponding path. For example, this command will run only the tests from the shift app, or from the test_FrozenStatusService.py file:

docker compose run --rm web poetry run pytest tapir/shifts/tests[/test_FrozenStatusService.py]

You can also run a single test by specifying its name, the test name being the name of the test function:

docker compose run --rm web poetry run pytest -k 'test_name'
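
For example, to run only the frozen status test mentioned earlier:

docker compose run --rm web poetry run pytest -k 'test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse'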

Shorter test output

Our test runner, pytest, is configured to measure code coverage, and the coverage report is printed at the end of each test run. While developing, it may be annoying to scroll past that report every time, so you can remove the addopts = --cov=tapir line from pytest.ini.

GitHub workflows

GitHub will automatically run all the tests when commits are pushed. This behaviour is defined, among other places, in .github/workflows/tests.yml.

This normally costs "workflow minutes", which are limited to 2000 per month on a free plan, but since our organization has the open source plan, we have unlimited minutes. You may run into that limit if you fork the repository to your own organization.

Factories

When writing tests, we often need to create instances of our models: a shift to register to, a user to check the permissions of... We could do it with Model.objects.create, but this has several drawbacks:

  • We may need to define field values that are required by the model but that are irrelevant to our test, making it less clear which fields actually matter for the test.
  • If a required field is added to a model, we would need to update all the calls to create() that have been written so far.

That's why we use factories instead. They are defined in tests/factories.py for each app. Once a factory is defined, we can create valid objects with Factory.create(). The created object has random but "life-like" field values. For example, ShiftFactory.create() will create a Shift object with a random name and date, and a random duration between 1 and 4 hours.

You can set a field to an explicit value instead of letting it be randomly generated by calling Factory.create(field=value). This lets us define only the fields that are relevant for our test. For example, in TestExemptions.test_invalid_attendances_are_not_affected_by_exemptions, we create several shifts, but we only care about the start_time; the name is irrelevant.
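
To give an idea of what such a factory looks like, here is a simplified sketch based on the factory_boy library that these factories typically rely on; the Shift import path and fields are assumptions, the real ShiftFactory in the shifts app is more complete:

import datetime

import factory
from factory.django import DjangoModelFactory

from tapir.shifts.models import Shift  # assumed import path

class ShiftFactory(DjangoModelFactory):
    class Meta:
        model = Shift

    # Random but "life-like" values, generated by faker:
    name = factory.Faker("catch_phrase")
    start_time = factory.Faker("date_time_this_year")

# Usage inside a test, setting only the field we care about explicitly:
shift = ShiftFactory.create(
    start_time=datetime.datetime(2024, 8, 13, 10, 0, tzinfo=datetime.timezone.utc)
)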

Create the same objects each run with the seed

We want our fake field values to be random so that we don't have to define them, but we also want them to be the same every time we run the tests. Otherwise we could have tests that sometimes pass and sometimes fail, depending on what values got generated. To prevent that, we set the randomization seed in TapirFactoryTestBase.setUp().
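
The actual implementation lives in TapirFactoryTestBase, but the general idea looks roughly like this, assuming factory_boy's reseed_random and an arbitrary seed value:

import random

from django.test import TestCase
from factory.random import reseed_random

class ExampleSeededTestCase(TestCase):
    def setUp(self):
        super().setUp()
        # Reseed factory_boy/faker and Python's random module so that every
        # test run produces the same "random" field values.
        reseed_random("tapir")
        random.seed("tapir")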

create() vs build()

Factories have a create() and a build() function. The create() one saves the object to the database, while build() only instantiates it in memory. If you know that your test won't access the database, using build() will make that explicit and make the test faster.
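
For example, with the ShiftFactory from above:

shift_in_memory = ShiftFactory.build()     # instantiated in memory only, shift_in_memory.pk is None
shift_in_database = ShiftFactory.create()  # also saved to the test database, usable in queries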

Mocks

Mocking is an important part of testing, and it is especially useful when writing unit tests. For example, in TestFrozenStatusService.test_freezeMemberAndSendMail(), we want to make sure that the freeze_member_and_send_email() function also calls the _update_attendance_mode_and_create_log_entry() function. We are already testing _update_attendance_mode_and_create_log_entry() in test_updateAttendanceModeAndCreateLogEntry, so we know that it works. We would also like that, if _update_attendance_mode_and_create_log_entry() breaks, only the test_updateAttendanceModeAndCreateLogEntry() test fails: test_freezeMemberAndSendMail should not fail, to make it clear where the error is.

That's why we mock _update_attendance_mode_and_create_log_entry in test_freezeMemberAndSendMail. This gives us a mock function for which we can check how many times and with which parameters it was called.

Mocking is a fairly deep topic; search for @patch and @patch.object to get examples.
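
To show the pattern with a small self-contained toy example (GreetingService below is made up for illustration, it is not Tapir code):

from unittest.mock import patch

from django.test import SimpleTestCase

class GreetingService:
    # A made-up stand-in for a service like FrozenStatusService.
    @classmethod
    def _log_greeting(cls, name):
        print(f"greeted {name}")

    @classmethod
    def greet_and_log(cls, name):
        cls._log_greeting(name)
        return f"Hello {name}"

class TestGreetingService(SimpleTestCase):
    @patch.object(GreetingService, "_log_greeting")
    def test_greetAndLog_default_callsLogGreeting(self, mock_log_greeting):
        GreetingService.greet_and_log("Théo")
        # The real _log_greeting was replaced by a mock, so we can check how it
        # was called without re-testing its own behaviour.
        mock_log_greeting.assert_called_once_with("Théo")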

Good practices

Naming convention

The naming convention for test functions is: test_[ELEMENT_BEING_TESTED]_[CASE_BEING_TESTED]_[EXPECTED_RESULT]. For example: test_shouldFreezeMember_memberIsNotExpectedToDoShifts_returnsFalse.

Since we use underscores to separate the three parts of the test name, and the Python naming convention for functions is snake case, we often can't use the exact function name in the test name. In that case, replace snake case with camel case: should_freeze_member becomes shouldFreezeMember.

Time-independence

Sometimes, we want to test things that depend on the current time or date, compared to some values in the database. For example, in TestWelcomeDeskMessages.test_is_paused, we create a MembershipPause with fixed start and end dates. The behaviour we want to test depends on whether today's date is inside the pause or outside of it. To make sure our test works regardless of when it is run, we use mock_timezone_now.
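
mock_timezone_now presumably builds on the standard technique of patching the clock; here is a minimal sketch of that technique, not of the actual helper:

import datetime
from unittest.mock import patch

from django.test import SimpleTestCase
from django.utils import timezone

class TestSomethingTimeDependent(SimpleTestCase):
    FAKE_NOW = datetime.datetime(2024, 6, 15, 12, 0, tzinfo=datetime.timezone.utc)

    def test_behaviour_on_a_fixed_date(self):
        with patch.object(timezone, "now", return_value=self.FAKE_NOW):
            # Any code under test that calls timezone.now() sees 2024-06-15,
            # no matter when the test actually runs.
            self.assertEqual(timezone.now(), self.FAKE_NOW)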