temporary suggestion for testing simulate for design class #802

Draft
wants to merge 10 commits into
base: main
from

Conversation

robertadamsbayer
Collaborator

@robertadamsbayer robertadamsbayer commented Mar 20, 2024

Pull Request

Fixes #798

@robertadamsbayer robertadamsbayer linked an issue Mar 20, 2024 that may be closed by this pull request
11 tasks
@robertadamsbayer
Collaborator Author

Hello @danielinteractive, please review the suggestion for testing. I tried both approaches: the scenario with fixed seeds (checking the full mySims object, thus inherently checking all subsequent scenarios) and the scenario that only checks the "metadata" (i.e. lengths, classes, etc.). Before applying this to all other methods in Design-methods, I wanted to check in and ask for best-practice guidance. Also, once we agree on the testing strategy, I would suggest doing all Design-methods testing in one branch / PR, because it will be quite straightforward to implement and review.
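To make the two strategies concrete, here is a minimal sketch in R with testthat. It does not use crmPack: `toy_simulate` is a hypothetical stand-in invented for this illustration, and its slots `stop_report` and `fit` only loosely mirror the names discussed here. The first test checks structure only (types and lengths); the second checks reproducibility under a fixed seed without hard-coding any numbers.

```r
library(testthat)

# Hypothetical stand-in for a simulation routine; NOT part of crmPack.
# It returns one logical stop flag and one numeric fit vector per run.
toy_simulate <- function(n_sim, seed) {
  set.seed(seed, kind = "Mersenne-Twister")
  list(
    stop_report = runif(n_sim) > 0.5,
    fit = lapply(seq_len(n_sim), function(i) rnorm(3))
  )
}

test_that("toy_simulate returns objects of the expected shape", {
  result <- toy_simulate(n_sim = 5, seed = 1234)

  # Structural checks: these survive changes to the numeric output.
  expect_type(result$stop_report, "logical")
  expect_length(result$stop_report, 5)
  expect_length(result$fit, 5)
  expect_true(all(vapply(result$fit, is.numeric, logical(1))))
})

test_that("toy_simulate is reproducible for a fixed seed", {
  # Reproducibility check: compares two runs instead of hard-coded numbers.
  expect_identical(
    toy_simulate(n_sim = 5, seed = 1234),
    toy_simulate(n_sim = 5, seed = 1234)
  )
})
```

The structural test needs no maintenance when the underlying numbers change, while the reproducibility test still catches accidental loss of seeding. Whether that trade-off suits the real simulate methods is a judgment call for the maintainers.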

Contributor

github-actions bot commented Mar 20, 2024

Code Coverage Summary

Filename                         Stmts    Miss  Cover    Missing
-----------------------------  -------  ------  -------  -------
R/checkmate.R                       73       1  98.63%   72
R/crmPack-package.R                  4       0  100.00%
R/CrmPackClass-methods.R             5       0  100.00%
R/Data-class.R                     156       5  96.79%   43, 565-566, 572-577
R/Data-methods.R                   267       0  100.00%
R/Data-validity.R                  144       1  99.31%   21
R/Design-class.R                   391       0  100.00%
R/Design-methods.R                2775     771  72.22%   563-567, 591-594, 602-621, 626-655, 659-660, 662, 677-685, 689, 701-720, 1122-1126, 1253, 1267-1271, 1333, 1523, 1740, 1767-1770, 1779-1790, 1794-1813, 1824-1828, 1834-1846, 2109, 2134-2137, 2144-2155, 2159-2178, 2190-2194, 2200-2212, 2496-2499, 2527-2530, 2538-2550, 2553-2564, 2568-2604, 2621-2630, 2636-2651, 2676, 2717-2718, 2981-3451, 3546-3557, 3560-3571, 3575-3611, 3628-3637, 3645-3660, 3698, 3740-3741, 4020, 4022-4023, 4082, 4118, 4155-4158, 4177-4181, 4242-4249, 4277-4281, 4289-4307, 4333-4352, 4355, 4388, 4410-4428, 4697, 4792
R/Design-validity.R                 38      10  73.68%   47-56
R/fromQuantiles.R                  176      67  61.93%   238-378
R/helpers_broom.R                   74      10  86.49%   30, 34-35, 37-38, 40, 81, 102-104
R/helpers_covr.R                    23       0  100.00%
R/helpers_data.R                    96       1  98.96%   139
R/helpers_design.R                 126      36  71.43%   22, 77-129
R/helpers_jags.R                    77       0  100.00%
R/helpers_kintr_Increments.R       156       2  98.72%   191, 278
R/helpers_knitr_CohortSize.R       100       0  100.00%
R/helpers_knitr_GeneralData.R      183      61  66.67%   18, 67-70, 83-86, 99-102, 114-117, 130-133, 155-157, 240-244, 253-298, 346, 351
R/helpers_model.R                   85       4  95.29%   38, 89-90, 139
R/helpers_rules.R                  428       0  100.00%
R/helpers_samples.R                  5       0  100.00%
R/helpers_simulations.R             27       0  100.00%
R/helpers.R                        214      61  71.50%   107-127, 162-178, 200-304, 339-353
R/logger.R                          11       0  100.00%
R/mcmc.R                           290      18  93.79%   92-97, 376-377, 387, 389-390, 393-396, 579-580, 669, 675, 733
R/McmcOptions-class.R               22       0  100.00%
R/McmcOptions-methods.R              8       1  87.50%   43
R/McmcOptions-validity.R            42       0  100.00%
R/Model-class.R                   1062     166  84.37%   145-147, 216-218, 222-224, 283-285, 357-359, 363-365, 444-446, 513-515, 577-581, 584-587, 690-693, 697-698, 813-817, 937-939, 943-951, 1096-1098, 1103-1106, 1110-1113, 1229-1233, 1235-1238, 1242-1245, 1248, 1409-1419, 1424-1430, 1585-1588, 1594-1601, 1758, 1767, 1776, 1785, 1794-1799, 1935, 1944, 1953, 1961-1963, 2807-2836, 2840-2846, 2853-2857, 2862, 2969-2982, 3008, 3104-3106, 3110, 3203-3205, 3209, 3278-3290, 3308, 3368-3370, 3372-3373, 3376-3381
R/Model-methods.R                  472      38  91.95%   78, 233-238, 809-854, 1175-1184
R/Model-validity.R                 443      16  96.39%   430-433, 442-445, 596-604
R/ModelParams-class.R               17       0  100.00%
R/ModelParams-validity.R            21       0  100.00%
R/Rules-class.R                    458       0  100.00%
R/Rules-methods.R                 1541     184  88.06%   889, 892, 895, 1010, 1013, 1016, 1136-1139, 1173, 1276-1279, 1314, 2582-2590, 2614-2621, 2784-2793, 3073-3082, 3215-3458, 3745, 3749, 3794, 3798
R/Rules-validity.R                 448      30  93.30%   684-723
R/Samples-class.R                    6       0  100.00%
R/Samples-methods.R               1188      21  98.23%   410-420, 648, 1665-1666, 1698, 1711, 1893, 2223-2228
R/Samples-validity.R                10       0  100.00%
R/Simulations-class.R              208       5  97.60%   769-772, 1028
R/Simulations-methods.R           1617    1473  8.91%    65-350, 406, 416-435, 448-453, 500-509, 674-2969
R/Simulations-validity.R            75      75  0.00%    20-168
R/utils.R                            6       0  100.00%
TOTAL                            13568    3057  77.47%

Diff against main

Filename              Stmts    Miss  Cover
------------------  -------  ------  -------
R/checkmate.R           -14      -1  +0.93%
R/fromQuantiles.R        +4     +15  -7.84%
R/helpers_design.R        0      -6  +4.76%
TOTAL                   -10      +8  -0.90%

Results for commit: f19a571

Minimum allowed coverage is 80%

♻️ This comment has been updated with latest results

Collaborator

@danielinteractive danielinteractive left a comment

Thanks @robertadamsbayer for the nice work! Please see the feedback below; I would simplify this further. Yes, once we agree, we can put many tests in the same PR. There is no need to shoot for everything right away though, because then the PR again gets too large and takes too much time. Incremental progress is still best here too.

rng_kind = "Mersenne-Twister",
rng_seed = 1234
)
time <- system.time(mySims <- simulate(design,
Collaborator

Do we need to time this simulation run? (I am not sure, because time is not used below and would differ between platforms anyway.)

Collaborator Author

Fully agree. It was a bit lazy of me to take that over from the examples.

Comment on lines 400 to 406
expect_class(mySims, "Simulations")

expect_true(all(sapply(mySims@fit[[1]], is.numeric))) # check that all elements of mySims@fit[[1]] are numeric

expect_equal(length(mySims@stop_report), 5) # check for length

expect_logical(mySims@stop_report) # check for stop_report to be logical vector
Collaborator

something like this would be sufficient

),
))[3]

expected <-
Collaborator

this is too detailed / too much maintenance when numbers change etc.

Collaborator Author

Thanks for the guidance! I will adopt it.

Collaborator Author

@robertadamsbayer robertadamsbayer Mar 20, 2024

@danielinteractive Hi Daniel, I put the new "end-to-end" test for simulate in "test-helpers_design.R". When moving it over to "test-Design-methods.R", I realized that tests for simulate (for several classes) are already implemented there using snapshots. Those use fixed RNG seeds and therefore closely mimic what you suggested not to do. I don't know what to keep and what to remove there.

[two screenshots; images not included]

Collaborator

I would just leave existing tests. More coverage does not hurt us

Contributor

github-actions bot commented Mar 20, 2024

Unit Test Performance Difference

Test Suite            Status  Time on main  ±Time   ±Tests  ±Skipped  ±Failures  ±Errors
--------------------  ------  ------------  ------  ------  --------  ---------  -------
CrmPackClass-methods  💚      59.73         -10.26  0       0         0          0
Design-methods        💔      46.33         +4.66   +46     0         0          +1
Rules-methods         💔      49.37         +4.30   0       0         0          0
Samples-methods       💔      30.21         +1.02   0       0         0          0
fromQuantiles         💀      0.03          -0.03   -8      0         0          0
helpers_knitr         💔      27.84         +3.53   +11     0         0          0

Additional test case details

Test Suite            Status  Time on main  ±Time   Test Case
--------------------  ------  ------------  ------  ---------------------------------------------------------------------------------------
CrmPackClass-methods  💚      30.08         -5.68   tidy_methods_exist_for_all_relevant_classes
CrmPackClass-methods  💚      29.65         -4.58   tidy_methods_return_non_empty_value_for_all_classes
Design-methods        👶                    +0.94   simulate_DualDesign_produces_consistent_results_with_sentinel_patients
Design-methods        👶                    +1.53   simulate_for_the_class_design_returns_correct_objects
Design-methods        👶                    +0.92   simulate_for_the_class_design_with_placebo_and_sentinel_patients_returns_correct_objects
Design-methods        👶                    +1.12   simulate_for_the_class_design_with_placebo_returns_correct_objects
Rules-methods         💔      13.45         +2.09   stopTrial_works_correctly_for_StoppingTDCIRatio_when_samples_are_not_provided
Rules-methods         💔      10.89         +1.53   stopTrial_works_correctly_for_StoppingTDCIRatio_when_samples_are_provided
Samples-methods       💔      0.25          +1.42   size_Samples_returns_correct_number_of_samples
Simulations-class     💀      0.01          -0.01   .DefaultPseudoSimulations_cannot_be_instantiated_directly
Simulations-class     💀      0.01          -0.01   PseudoSimulations_generator_function_works_as_expected
Simulations-class     💀      0.03          -0.03   PseudoSimulations_object_can_be_created_with_the_user_constructor
Simulations-class     💀      0.00          -0.00   PseudoSimulations_user_constructor_argument_names_are_as_expected
fromQuantiles         💀      0.03          -0.03   h_get_min_inf_beta_works_as_expected_with_p_q
helpers_knitr         💔      27.03         +3.03   asis_parameter_works_correctly_for_all_implemented_methods

Results for commit abf2c65

♻️ This comment has been updated with latest results.

@robertadamsbayer
Collaborator Author

robertadamsbayer commented Mar 27, 2024

Hi @danielinteractive - in DualDesign there are placebo branches that are not yet covered by tests (shown as missing lines in the coverage report). When I create a dual design that contains placebo data and pass it to simulate, I get the following error:

[screenshot of the error message; image not included]

Do we have a working example for DualDesign with placebo, or is that somehow a special case that I am not aware of?

@danielinteractive
Collaborator

Thanks @robertadamsbayer, I would check the examples folder. If there is nothing there, it might be a bug, and we would need to fix it.

@danielinteractive
Collaborator

This PR is parked for now, in order to prioritize tests for Simulation methods.

@danielinteractive danielinteractive marked this pull request as draft June 13, 2024 14:33
Development

Successfully merging this pull request may close these issues.

Code coverage for simulate for RuleDesign Code coverage for design methods
2 participants