Parse $DATA record for the purpose of filtering the input data #711

barrettk · 2024-07-02T21:13:40Z

Parse $DATA record for the purpose of filtering the input data

Extracts and formats IGNORE and ACCEPT record options, transforming them into dplyr::filter() expressions
adds filter argument to nm_data()
adjusts setup_bootstrap_run to use this method instead of nm_join(). Additional details will be documented in data argument to setup_bootstrap_run and remove nm_join requirement #707

- Extracts and formats IGNORE and ACCEPT record options, transforming them into `dplyr::filter()` expressions

R/modify-records.R

- After some internal discussion, it was suggested that the ability to create a full NM-TRAN dataset via parsing the $DATA and $INPUT records could be valuable. - This commit adds the dropping of columns via `DROP` and `SKIP` options, handles the renaming of columns, and null mapping (changing null values to a particular character string). - nm_data_filter was renamed from filter_nm_data. All the helper function names may change, but was trying to be consistent across the new helper functions at the minimum. - minor bug fix to nm_data_filter: can handle `=` IGNORE/ACCEPT options

- read_data_record and get_records() can only be used for reading the records, not overwriting. This slipped past me in a previous commit. - dont apply `#` filter to all columns - only the first

- pull out the individual expression translation into a separate function - move all related functions to this PR to a new script

- (list) type expressions must be split up first. This wasn't necessary before the refactor, but is now - add examples and documentation

R/nm-tran-data.R

- removed other NM-TRAN setup functions. They were incomplete/not fully accurate, and we can just use NM-TRAN directly to create that dataset - Hook up filter_nm_data to setup_bootstrap_run and update tests

- pulled out additional logic for parsing the IGNORE and ACCEPT options - fix: adjusted `@` IGNORE filter option to only apply to the first column

- add tests for all NONMEM operators

barrettk · 2024-07-11T15:32:53Z

Hey @kylebaron I was wondering if you could take a look at translate_nm_operator and translate_nm_expr after the @ adjustment and latest commit (added 2 more NONMEM 7.3 supported operators). Additionally, I was wondering if the prior would have to be revisited to account for the TIME column in the section below:

Part of the IGNORE=(list) documentation in the $DATA docs:

With "=", "==", "/=', ".EQ." and ".NE.", the value in the data record and
the value in the list are compared as character strings. Otherwise, they
are converted to numeric and compared numerically. (This is the case
with | .NEN. and .EQN.) This comparison is made prior to time translation.
Hence, the TIME item cannot be compared numerically if it contains
non-numeric characters such as ":".

kylebaron · 2024-07-16T12:40:40Z

Hey @barrettk - looking at this this AM. KTB

kylebaron

This looks great, @barrettk. Left some comments / suggestions. cc @seth127 .

R/nm-file.R

R/filter-nm-data.R

- adjust map call placement to improve clarity of helper functions - abort if both IGNORE and ACCEPT expressions are found - now return the `type` as part of `get_data_filter_exprs()`, which is then passed to `translate_nm_expr()` - inform user of how many records are dropped as part of `nm_data(filter = TRUE)` - adjust documentation and arguments

- doesnt work by default for internal functions. I imagine there is a workaround for this, though the inclusion of these examples are not important and only for developer purposes

- dont look for digits - only look for one character

- extra `\\` are added, which are escaped when parse() is called down the line

kylebaron

Noticed at the end of the review some inconsistency in capitalization for the first letter of your function argument descriptions; might be worth a look at all the functions here to get it consistent.

R/bootstrap-model.R

R/filter-nm-data.R

R/bootstrap-model.R

R/nm-file.R

R/filter-nm-data.R

R/bootstrap-model.R

- Some of this was a bit out of scope for this PR, but would be ok with the scope of the parent PR (updating bootstrap functionality). The out of scope portions included capitalizing parameter documentation for all bootstrap related functions - Added examples and improved regex for IGNORE/ACCEPT list parsing

R/bootstrap-model.R

R/filter-nm-data.R

R/bootstrap-model.R

R/filter-nm-data.R

If parsing the NONMEM filtering expressions fails, we now instruct users to provide a starting dataset instead of proceeeding with an unfiltered `nm_data(mod)` dataset - filter_nm_data no longer returns NULL on failure

- If the based_on model has run, check the number of records in `starting_data` to make sure the filtering went ok. This introduces a `.bbi_args` argument passed to `model_summary()`, similar to the behavior in `nm_join()`.

- This will read in the _starting_ dataset (i.e. does not take any resampling for each sub-model into account) - This is done ahead of the next commit, which will change the control stream template that's used for parsing NONMEM filter expressions

…pressions - previously used the parent model. However we want any changes made in the bootstrap control stream template to be reflected during setup_bootstrap_run - Add a warning if the parent model hasn't been submitted (and `data` is not provided, as this prevents us from being able to perform certain checks (number of expected records).

…arent model - minor bug fix

- move data path fix to above parsing the control stream. This previously wasnt necessary since we parsed the parent model

- Towards the end of PR #711 I changed which model was passed to `nm_data(filter = TRUE)`. While updating the NEWS I realized some parameter documentation and an error message were not consistent with this change. - Minor enough to include in this PR

Parse $DATA record for the purpose of filtering the input data

bfae5ff

- Extracts and formats IGNORE and ACCEPT record options, transforming them into `dplyr::filter()` expressions

kylebaron reviewed Jul 3, 2024

View reviewed changes