Improve the tests, like, a lot #109

Open
Datseris opened this issue Aug 19, 2022 · 0 comments
Labels
good first issue Good for newcomers help wanted Extra attention is needed tests Related with the testing suite

Datseris commented Aug 19, 2022

One of the things the Good Scientific Code Workshop teaches is writing good unit tests. I have to admit, this repository suffers tremendously from really bad tests, in particular the delay embedding tests. Practically all of them check whether the output of a function matches the output of that same function stored in some pre-existing data. I am pasting here the slide with "good advice on writing tests":

  • Actually unit: test atomic, self-contained functions. Each test must test only one thing, the unit. When a test fails, it should pinpoint the location of the problem. Testing entire processing pipelines (a.k.a. integration tests) should be done only after units are covered, and only if resources/time allow for it!
  • Known output / Deterministic: tests defined through minimal examples whose result is known analytically are the best tests you can have! If random number generation is necessary, either test for a valid output range or seed the RNG
  • Robust: Test that the expected outcome is met, not the implementation details. Test that the target functionality is met without utilizing knowledge about the internals. Also, never use internal functions in the test suite.
  • High coverage: the more functionality of the code is tested, the better
  • Clean slate: each test file should be runnable by itself, and not rely on previous test files
  • Fast: use the minimal amount of computations to test what is necessary
  • Regression: Whenever a bug is fixed, a test is added for this case
  • Input variety: attempt to cover a wide gamut of input types
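
To make a few of these points concrete (known output, seeded RNG with a range check, regression, input variety), here is a minimal sketch in plain Python. It does not use this repository's API: `delay_embed` is a hypothetical stand-in written only for illustration, and the test names are likewise made up. The same structure should carry over to any test framework.

```python
import random


def delay_embed(series, dim, tau):
    """Hypothetical stand-in for a delay embedding function (an assumption,
    not this repository's API): point i is (s[i], s[i+tau], ..., s[i+(dim-1)*tau])."""
    if dim < 1 or tau < 1:
        raise ValueError("dim and tau must be positive integers")
    n_points = len(series) - (dim - 1) * tau
    if n_points <= 0:
        raise ValueError("series too short for this dim and tau")
    return [tuple(series[i + k * tau] for k in range(dim)) for i in range(n_points)]


def test_known_output():
    # Minimal example whose result can be written down by hand.
    assert delay_embed([1, 2, 3, 4, 5], dim=2, tau=1) == [(1, 2), (2, 3), (3, 4), (4, 5)]
    assert delay_embed([1, 2, 3, 4, 5], dim=3, tau=2) == [(1, 3, 5)]


def test_seeded_random_input():
    # Randomness is fine when the RNG is seeded and only the valid range is checked.
    rng = random.Random(1234)
    series = [rng.random() for _ in range(50)]
    embedded = delay_embed(series, dim=3, tau=4)
    assert len(embedded) == 50 - (3 - 1) * 4
    assert all(0.0 <= value < 1.0 for point in embedded for value in point)


def test_too_short_series_is_rejected():
    # Regression-style test: pin down the behavior of an edge case once it is decided.
    try:
        delay_embed([1, 2], dim=3, tau=1)
    except ValueError:
        return
    raise AssertionError("expected ValueError for a too-short series")


def test_input_variety():
    # The same function should accept tuples and ranges, not only lists.
    assert delay_embed((1.0, 2.0, 3.0), dim=2, tau=1) == [(1.0, 2.0), (2.0, 3.0)]
    assert delay_embed(range(4), dim=2, tau=2) == [(0, 2), (1, 3)]
```

Each test checks one thing, needs no stored reference data, and depends only on what is defined in its own file.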

One doesn't have to worry about re-writing all tests. In fact, a PR "correcting" a single test file is already very much welcomed!

The best place to start is rewriting the delay embedding tests so that they are more flexible (and do not test whether the found delay time is, e.g., exactly 42) and analytically resolvable, i.e., they test things whose outcome we know for sure, like a dataset with cosine and sine as the timeseries (the embedding dimension here is clearly 2). Also separate the files so that they are individually runnable and do not rely on global state. Another analytic test: take the Lorenz96 system and generate timeseries with 4 and with 6 oscillators. We do not know the fractal dimension of the 4 and 6 cases analytically, but we do know that the 6 case has the larger fractal dimension. Hence, the embedding for the 6 case must be higher dimensional than for the 4 case.
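
As a rough illustration of the sine/cosine idea, the sketch below tests two things that are known analytically for a pure sine wave: delaying it by a quarter period yields a cosine, so the two-dimensional embedding lies on the unit circle, and a delay estimate should land near a quarter period, checked with a tolerance instead of an exact value. Everything here is an assumption for illustration: `sine_series`, `delay_embed` and `first_zero_autocorr_lag` are hypothetical helpers in plain Python, and the first-zero-of-autocorrelation rule is only one simple way to pick a delay, not necessarily what this repository does.

```python
import math


def sine_series(n_samples, period):
    """Samples of sin(2*pi*i/period) for i = 0 .. n_samples-1 (illustrative helper)."""
    return [math.sin(2 * math.pi * i / period) for i in range(n_samples)]


def delay_embed(series, dim, tau):
    """Hypothetical delay embedding, not this repository's API."""
    n_points = len(series) - (dim - 1) * tau
    return [tuple(series[i + k * tau] for k in range(dim)) for i in range(n_points)]


def first_zero_autocorr_lag(series):
    """Smallest lag at which the (unnormalized) autocorrelation is no longer positive.
    A simple stand-in delay estimator, chosen here only for illustration."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    for lag in range(1, n // 2):
        acf = sum(centered[i] * centered[i + lag] for i in range(n - lag))
        if acf <= 0:
            return lag
    raise ValueError("autocorrelation never reached zero")


def test_quarter_period_embedding_is_unit_circle():
    # sin(t + pi/2) == cos(t), so every embedded point satisfies x^2 + y^2 == 1.
    period = 100
    points = delay_embed(sine_series(1000, period), dim=2, tau=period // 4)
    assert all(abs(x * x + y * y - 1.0) < 1e-9 for x, y in points)


def test_estimated_delay_is_near_quarter_period():
    # Tolerance instead of an exact value: finite-length effects shift the estimate slightly.
    period = 100
    lag = first_zero_autocorr_lag(sine_series(1000, period))
    assert abs(lag - period / 4) <= 2
```

The tolerance in the second test is the point: the assertion encodes what is known analytically (a quarter period) without pinning the test to whatever number one particular implementation happens to return.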
