Integrate Continuous Benchmarking #79
Comments
@kgoldfeld You told me that you manually test the speed of certain functions before submission to CRAN.
Yes, I test each function manually by creating very large data sets. I generally haven't done it systematically for all distributions and functions, but I've made sure that any new or affected functions work well with large data sets. It seems like a systematic approach is warranted here.
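A manual speed check of this kind might look like the following minimal sketch; the definition table and sample size are illustrative assumptions, not the actual pre-CRAN tests:

```r
library(simstudy)

# Illustrative definition table (not the actual pre-CRAN check).
def <- defData(varname = "age", dist = "normal", formula = 40, variance = 10)
def <- defData(def, varname = "female", dist = "binary", formula = 0.5)

# Time data generation for a deliberately large number of rows.
system.time(dd <- genData(1e6, def))
```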
The solution: https://github.com/lorenzwalthert/touchstone
I will test this on my fork and then prepare a PR to integrate it here.
This looks pretty cool, but it still requires us to develop the large-scale tests ourselves, correct?
Yes, {touchstone} provides the infrastructure, but we have to decide what we want to test for performance. (This is, by the way, separate from the unit tests using {testthat}, which test functionality.) For now, my idea was to take the examples from the vignettes and build the performance tests from them, just with heavily increased sample sizes. Unless you have a more methodical idea?
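A vignette-derived benchmark along those lines might look like the sketch below. It assumes the standard {touchstone} layout with a touchstone/script.R driven by branch_install(), benchmark_run(), and benchmarks_analyze(); the definition table, benchmark name, and sample size are made up for illustration:

```r
# touchstone/script.R -- illustrative sketch, assuming the standard {touchstone} setup
library(touchstone)

# Install the package from the PR branch and the base branch.
branch_install()

# Benchmark a vignette-style example with a heavily increased sample size.
benchmark_run(
  expr_before_benchmark = {
    library(simstudy)
    def <- defData(varname = "x", dist = "normal", formula = 0, variance = 1)
    def <- defData(def, varname = "y", dist = "poisson", formula = "0.5 + 0.1 * x")
  },
  genData_large = genData(1e5, def),  # hypothetical benchmark name
  n = 20
)

# Summarize timings across branches (feeds the PR comment and plots).
benchmarks_analyze()
```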
That sounds like a reasonable place to start.
So I worked with touchstone a bit, contributed some code, and now have a better understanding of how to use it. I think it makes sense to approach it somewhat like unit tests: benchmark the small units we currently work on (like the distribution-generating functions). We can still run large "integration"-style tests manually with {bench}. Because of the way touchstone's inference works, every test is run multiple times for each branch, so run time on these jobs can get very long; maybe the def table with ALL distributions in it was just overkill. On the other hand, given the large changes we are going to make, I would also like to have some broader tests in there, so we don't miss a large, unintended slowdown somewhere. I will keep working on it and open a PR once I feel it is in a good place. Let me know what you think!
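A manual {bench} check of that broader kind could look something like this sketch; the definition table, sample sizes, and iteration count are illustrative assumptions rather than the planned tests:

```r
library(bench)
library(simstudy)

# Illustrative multi-variable definition table.
def <- defData(varname = "age", dist = "normal", formula = 40, variance = 10)
def <- defData(def, varname = "trt", dist = "binary", formula = 0.5)
def <- defData(def, varname = "count", dist = "poisson", formula = "0.1 * age + trt")

# Compare generation time across sample sizes; check = FALSE because
# the generated data.tables differ by design.
bench::mark(
  n_1e4 = genData(1e4, def),
  n_1e6 = genData(1e6, def),
  check = FALSE,
  iterations = 5
)
```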
Thanks for working through this. I'm not totally sure I understand how touchstone relates to testthat, and how it relates to benchmarking for speed. If we have a set of pre-established data generation processes that should remain stable after making changes, I wouldn't mind if the benchmarks took a couple of minutes to run; that should be plenty of time. I can't imagine what we would do that would take much longer than that, and even that seems quite long, given that data generation is pretty much instantaneous. (This is obviously not the case if we get into model estimation, but I don't really see any need for that in these particular performance tests.) Let me know if I am totally missing the point.
Oh, a def table with all distributions and 1,000,000 rows does take a while. I guess the main issue there is actually memory (see #50), but for a few thousand rows everything should be fairly quick. I will prepare some benchmarks on my fork to demonstrate.
Use bench::cb with GitHub Actions to assert simstudy's good performance.