Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate Continuous Benchmarking #79

Closed
assignUser opened this issue Oct 27, 2020 · 10 comments · Fixed by #112
Closed

Integrate Continuous Benchmarking #79

assignUser opened this issue Oct 27, 2020 · 10 comments · Fixed by #112

Comments

@assignUser
Copy link
Collaborator

use bench::cb with github actions to assert simstudy's good performance.

@assignUser
Copy link
Collaborator Author

@kgoldfeld You told me that you manually test the speed of certain functions before submission to cran.
How are you doing that? Just generating very larg sets of data with each distribution etc.?

@kgoldfeld
Copy link
Owner

Yes, I test each function manually - by very creating large data sets. I generally haven't done it systematically for all distributions and functions, but I've made sure that any new or affected functions work well with large data sets. It seems like a systematic approach is warranted here.

@assignUser
Copy link
Collaborator Author

The solution: https://github.com/lorenzwalthert/touchstone

@assignUser
Copy link
Collaborator Author

The solution: https://github.com/lorenzwalthert/touchstone

I will test this on my fork and then ready a pr to integrate this here.

@kgoldfeld
Copy link
Owner

This looks pretty cool, but it still requires us to develop the large-scale tests - correct?

@assignUser
Copy link
Collaborator Author

Yes {touchstone} provides the infrastructure but we have to think about what we want to test for performance. (this is btw separate from the unit test using {testthat} which is testing functionality)

For now, my idea was to take the examples from the vignettes and create the test from them just with heavily increased sample sizes. Unless you have a more methodical idea?

@kgoldfeld
Copy link
Owner

That sounds like a reasonable place to start

@assignUser
Copy link
Collaborator Author

So I worked with touchstone a bit and contributed some code and now have a better understanding of how to use it.

I think it makes sense to approach it kind of like unit tests: benchmark small units (like dist generating functions) we currently work on. We can still do large "integration" kinds of tests using bench manually. Due to the way touchstone's inference works all tests are run multiple times for each branch, so run time on these jobs can get very long. But maybe the def table with ALL dists in it was just over kill xD

On the other hand I would like to have some broader tests in there as well with the large changes we are going to make as to not miss some large, unintended slow down somewhere... I will keep working on it and introduce a pr once I feel it is at a good place :)

Let me know what you think on the matter!

@kgoldfeld
Copy link
Owner

Thanks for working through this. I’m not totally sure I understand how touchstone relates to testthat, and how it relates to benchmarking for speed. If we have a set of pre-established generation data processes that should remain stable after making changes, I wouldn’t mind if it took a couple of minutes to run – that should be plenty of time. I cannot imagine what we would do that would take much longer than that – and even that seems quite long, given that data generation is pretty much instantaneous. (This is obviously not the case if we get into model estimation, but I don’t really see any need for that in these particular performance tests.)

Let me know if I am totally missing the point.

@assignUser
Copy link
Collaborator Author

Oh a def table with all dists and 1000000 rows takes a while xD I guess the main issue there is actually memory (-> #50 ) but yeah for a few thousands rows everything should be rather quick.
And in the end these tests will run on github as to not slow our machines down, so the duration is not that important anyway.

I will prepare some benchmarks on my fork to demonstrate :)

@assignUser assignUser added this to the Simstudy 1.0(?) milestone Oct 31, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants