Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse $DATA record for the purpose of filtering the input data #711

Merged
merged 28 commits into from
Sep 19, 2024

Conversation

barrettk
Copy link
Collaborator

@barrettk barrettk commented Jul 2, 2024

Parse $DATA record for the purpose of filtering the input data

 - Extracts and formats IGNORE and ACCEPT record options, transforming them into `dplyr::filter()` expressions
R/modify-records.R Outdated Show resolved Hide resolved
R/modify-records.R Outdated Show resolved Hide resolved
R/modify-records.R Outdated Show resolved Hide resolved
R/modify-records.R Outdated Show resolved Hide resolved
R/modify-records.R Outdated Show resolved Hide resolved
 - After some internal discussion, it was suggested that the ability to create a full NM-TRAN dataset via parsing the $DATA and $INPUT records could be valuable.

 - This commit adds the dropping of columns via `DROP` and `SKIP` options, handles the renaming of columns, and null mapping (changing null values to a particular character string).

 - nm_data_filter was renamed from filter_nm_data. All the helper function names may change, but was trying to be consistent across the new helper functions at the minimum.

 - minor bug fix to nm_data_filter: can handle `=` IGNORE/ACCEPT options
 - read_data_record and get_records() can only be used for reading the records, not overwriting. This slipped past me in a previous commit.

 - dont apply `#` filter to all columns - only the first
 - pull out the individual expression translation into a separate function

 - move all related functions to this PR to a new script
 - (list) type expressions must be split up first. This wasn't necessary before the refactor, but is now

 - add examples and documentation
R/nm-tran-data.R Outdated Show resolved Hide resolved
 - removed other NM-TRAN setup functions. They were incomplete/not fully accurate, and we can just use NM-TRAN directly to create that dataset

 - Hook up filter_nm_data to setup_bootstrap_run and update tests
 - pulled out additional logic for parsing the IGNORE and ACCEPT options

 - fix: adjusted `@` IGNORE filter option to only apply to the first column
@barrettk barrettk marked this pull request as ready for review July 10, 2024 22:24
 - add tests for all NONMEM operators
@barrettk
Copy link
Collaborator Author

barrettk commented Jul 11, 2024

Hey @kylebaron I was wondering if you could take a look at translate_nm_operator and translate_nm_expr after the @ adjustment and latest commit (added 2 more NONMEM 7.3 supported operators). Additionally, I was wondering if the prior would have to be revisited to account for the TIME column in the section below:

Part of the IGNORE=(list) documentation in the $DATA docs:

With "=", "==", "/=', ".EQ." and ".NE.", the value in the data record and
the value in the list are compared as character strings. Otherwise, they
are converted to numeric and compared numerically. (This is the case
with | .NEN. and .EQN.) This comparison is made prior to time translation.
Hence, the TIME item cannot be compared numerically if it contains
non-numeric characters such as ":".

@kylebaron
Copy link
Contributor

Hey @barrettk - looking at this this AM. KTB

Copy link
Contributor

@kylebaron kylebaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great, @barrettk. Left some comments / suggestions. cc @seth127 .

R/nm-file.R Outdated Show resolved Hide resolved
R/nm-file.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
 - adjust map call placement to improve clarity of helper functions

 - abort if both IGNORE and ACCEPT expressions are found

 - now return the `type` as part of `get_data_filter_exprs()`, which is then passed to `translate_nm_expr()`

 - inform user of how many records are dropped as part of `nm_data(filter = TRUE)`

 - adjust documentation and arguments
 - doesnt work by default for internal functions. I imagine there is a workaround for this, though the inclusion of these examples are not important and only for developer purposes
@barrettk barrettk requested a review from kylebaron July 17, 2024 18:25
 - dont look for digits
 - only look for one character
 - extra `\\` are added, which are escaped when parse() is called down the line
@barrettk barrettk requested review from kylebaron and removed request for seth127 and kylebaron July 29, 2024 15:05
Copy link
Contributor

@kylebaron kylebaron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed at the end of the review some inconsistency in capitalization for the first letter of your function argument descriptions; might be worth a look at all the functions here to get it consistent.

R/bootstrap-model.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Outdated Show resolved Hide resolved
R/nm-file.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Show resolved Hide resolved
 - Some of this was a bit out of scope for this PR, but would be ok with the scope of the parent PR (updating bootstrap functionality). The out of scope portions included capitalizing parameter documentation for all bootstrap related functions

 - Added examples and improved regex for IGNORE/ACCEPT list parsing
@barrettk barrettk requested a review from kylebaron July 31, 2024 16:04
R/bootstrap-model.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Show resolved Hide resolved
R/filter-nm-data.R Show resolved Hide resolved
R/filter-nm-data.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Show resolved Hide resolved
R/bootstrap-model.R Outdated Show resolved Hide resolved
R/bootstrap-model.R Outdated Show resolved Hide resolved
R/filter-nm-data.R Show resolved Hide resolved
If parsing the NONMEM filtering expressions fails, we now instruct users to provide a starting dataset instead of proceeeding with an unfiltered `nm_data(mod)` dataset

 - filter_nm_data no longer returns NULL on failure
- If the based_on model has run, check the number of records in `starting_data` to make sure the filtering went ok. This introduces a `.bbi_args` argument passed to `model_summary()`, similar to the behavior in `nm_join()`.
 - This will read in the _starting_ dataset (i.e. does not take any resampling for each sub-model into account)

 - This is done ahead of the next commit, which will change the control stream template that's used for parsing NONMEM filter expressions
…pressions

 - previously used the parent model. However we want any changes made in the bootstrap control stream template to be reflected during setup_bootstrap_run

 - Add a warning if the parent model hasn't been submitted (and `data` is not provided, as this prevents us from being able to perform certain checks (number of expected records).
 - move data path fix to above parsing the control stream. This previously wasnt necessary since we parsed the parent model
@barrettk barrettk merged commit e2d2e15 into bootstrap/starting-data Sep 19, 2024
8 checks passed
@barrettk barrettk deleted the parse-data-record branch September 19, 2024 21:51
barrettk added a commit that referenced this pull request Sep 19, 2024
 - Towards the end of PR #711 I changed which model was passed to `nm_data(filter = TRUE)`. While updating the NEWS I realized some parameter documentation and an error message were not consistent with this change.

 - Minor enough to include in this PR
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants