Skip to content

Black-box testing of stateful systems using properties

Notifications You must be signed in to change notification settings

47degrees/kotest-trace

Repository files navigation

kotest-trace

Black-box testing of stateful systems using properties

This library provides the ability to generate traces of actions to test properties of a stateful system. kotest-trace builds upon the property-based testing capabilities of Kotest. In order to express how the system evolves over time, new temporal operators have been introduced, such as always and eventually.

When we test using properties, the inputs are generated randomly, and then fed to the test body, that performs the corresponding checks. In any case, generators represent a single value. With kotest-trace you gain the ability to produce arbitrary traces of actions, which are applied in sequence. At each step, properties can be checked.

Example: testing a counter

As an example of how the library works, let's test a counter. Any implementation should simply provide a means to increment the inner counter, and to return the current value.

interface Counter {
  fun increment()
  val value: Int
}

The first step to apply kotest-trace is to define the actions which can be performed over this interface. In this case they closely correspond to the members of the interface, but if your system is more complex -- like a HTTP server -- this may not be the case.

enum class Action {
  INCREMENT, READ
}

The next step is to define a model generator for this trace. The simplest option is to generate each simple action independently of the previous ones. Since Action is an enumeration, we can use Kotest's Arb.enum.

val model: ArbModel<Unit, Action> = 
  StatelessArbModel { Arb.enum<Action>() }

In more complex scenarios the steps are not independent of each other -- for example, you may want to generate two steps relating to the same identifier --, so you need to keep a small piece of information aside. Those cases are covered by implementing the full ArbModel interface.

We still need to link our Actions with the particular implementation, by providing a function which takes a single action and the current state, and generates the new state and a response. If you are familiar with reducers (or folds) in other languages, this is the same idea.

fun apply(action: Action, counter: Counter): Step<Counter, Int> =
  when (action) {
    Action.INCREMENT -> Step(counter.increment(), 0)
    Action.READ -> Step(counter, counter.value)
  }

The fact that we have a response is important, because in our properties we're only able to check how the responses look like, inspecting the state is not possible. In other words, we approach the system as a black-box, with which we can only communicate with actions and responses.

We're ready to put the pieces together! Using checkAgainst we provide the model generator, the initial state for the system under test, and the function to evolve it at every step.

"let's test" {
  checkAgainst(model, CounterUnderTest(), ::apply) {
    // property
  }
}

But how do we express a property of this implementation?

Temporal properties

Let's begin with a simple one: we expect every READ to return a non-negative number. Since this is a property which should be satisfied at every step, we use the always operator.

always {
  should {
    if (it.action == Action.READ)
      it.response.shouldBeGreaterThanOrEqual(0)
  }
}

This example shows two important characteristics of how we express properties. The first one is the aforementioned usage of always to make the property apply to every step (if you are wondering; without it, it only applies to the first step). The second is that we need to wrap checks with should, inside that block we can access both the action which was applied and the response we obtained.

If the system under test works correctly, Kotest is happy. Otherwise, a problem is reported.

Property test failed for inputs

0) [READ, READ, READ, READ, READ, INCREMENT, READ, READ, READ, INCREMENT, INCREMENT, READ, READ, READ, READ, INCREMENT, INCREMENT, INCREMENT, INCREMENT, READ, ...and 80 more (set the 'kotest.assertions.collection.print.size' JVM property to see more / less items)]

Caused by TraceAssertionError(
  trace=[READ],
  state=Success(Info(action=READ, previousState=-2, nextState=-2, response=-1)),
  problems=[java.lang.AssertionError: -1 should be >= 0]) at
	com.fortyseven.kotest.trace.formula.ValidateKt.throwIfFailed(Validate.kt:40)
	com.fortyseven.kotest.trace.formula.ValidateKt.check(Validate.kt:29)
	com.fortyseven.kotest.trace.formula.ValidateKt.check$default(Validate.kt:16)
	com.fortyseven.kotest.trace.ArbModelKt$checkAgainst$2.invokeSuspend(ArbModel.kt:60)

Attempting to shrink arg [READ, READ, READ, READ, READ, INCREMENT, READ, READ, READ, INCREMENT, INCREMENT, READ, READ, READ, READ, INCREMENT, INCREMENT, INCREMENT, INCREMENT, READ, ...and 80 more (set the 'kotest.assertions.collection.print.size' JVM property to see more / less items)]
Shrink #1: [READ, READ, READ, READ, READ, INCREMENT, READ, READ, READ, INCREMENT, INCREMENT, READ, READ, READ, READ, INCREMENT, INCREMENT, INCREMENT, INCREMENT, READ, ...and 30 more (set the 'kotest.assertions.collection.print.size' JVM property to see more / less items)] fail
Shrink #2: [READ, READ, READ, READ, READ, INCREMENT, READ, READ, READ, INCREMENT, INCREMENT, READ, READ, READ, READ, INCREMENT, INCREMENT, INCREMENT, INCREMENT, READ, ...and 5 more (set the 'kotest.assertions.collection.print.size' JVM property to see more / less items)] fail
Shrink #3: [READ, READ, READ, READ, READ, INCREMENT, READ, READ, READ, INCREMENT, INCREMENT, READ] fail
Shrink #4: [READ, READ, READ, READ, READ, INCREMENT] fail
Shrink #5: [READ, READ, READ] fail
Shrink #6: [READ] fail
Shrink #7: [] pass
Shrink result (after 7 shrinks) => [READ]

As you can see, the first trace where this property failed is quite long, but the shrinking process found that a sequence with a single READ is already wrong. It also tells which of the formulae in the property failed -- in this case, shouldBeGreaterThanOrEqual(0). You can have more than one should block in your property, and each of them is checked independently.

Now let's move to a more complex property, which showcases how you can relate different steps in the sequence. In this case the property should state that if you read at a given moment, any later read should return a value at least as large as the one you obtained.

always {
  implies({ it.action == Action.READ }) {
    afterwards { previous ->
      implies({ it.action == Action.READ }) {
        should { it.response.shouldBeGreaterThanOrEqual(previous.response) }
      }
    }
  }
}

The outer operator is always again, since this is a property to apply at every step. We've introduced implies, which specifies a conditional property, something which should only be checked when the condition holds. Using implies instead of if is important for performance reasons, since the library is able to drop checks in the whole sequence of states. Going back to our example, it says that if the action was READ, then we should continue with further checks. The next operator is afterwards, which specifies a property which should hold in every step after the current one, and also gives access to the information of the current state -- which at the point of afterwards has become a previous step. The final element is the test of the new response being larger than the previous one.

Comparison with...

...white-box testing

There are other libraries geared towards testing of stateful systems, like Hedgehog for Scala, and quickcheck-state-machine for Haskell. Those libraries emphasize a different approach of testing, ín which the state of the implementation is tested against the state of the model. This is more similar to white-box testing, that is, in testing in which the inner workings of the model are available to checks.

kotest-trace takes another approach, emphasizing the action/response model. There are few reasons that makes this approach interesting. Those reasons stem from the fact that you don't need a full model of the system in order to test it, only a model of how actions are generated. Instead, the power lies on the temporal operators provided by the library, which allow us to test "at a distance", relating different steps in the same trace.

One of the most difficult things to practice when learning about property-based testing is to not simply replicate the desired implementation, but instead define properties which give enough room to different implementations, given that they satisfy whatever is needed for the rest of the system to behave correctly. This problem is more pressing if you use white-box testing, because you want to test step-by-step that the inners of the implementation correspond to the inner of the model. You're thus prone to overfit against the model.

The last reason to use black-box testing is that modelling some systems would require a huge investment, yet specifying properties about traces of inputs is relatively easy. Think of an HTTP server, or a set of microservices.

...non-deterministic testing

Kotest already ships with some temporal operators, like eventually or continually. Those operators are related to the ones in kotest-trace, but apply to a completely different realm. Kotest built-in one talk about actual time, so you can express things like "if we increment the counter, any read after 50 ms should return a greater number". This is useful when testing the concurrent properties of a piece of code.

In kotest-trace the temporal operators refer to logical time; each action counts as one further step, there's no relation to a certain step taking more or less time. As such, the usage is different, because it talks about the logical behavior of the code, not about its concurrency characteristics.

References

The use of temporal logic to describe program properties has a long history. Some interesting pointers are:

  • Quickstrom's Specification Language. Quickstrom is a tool for testing how a web application behaves when some events like clicking occur. It uses temporal logic to express the desired outcome.
  • Formal Software Design with Alloy 6. Alloy is a tool for formal modeling and exploration of systems. In Alloy you use temporal logic in both describing the system and specifying properties which should be verified.

About

Black-box testing of stateful systems using properties

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages