Make data handling during evaluation more explicit #88

Open
alexzwanenburg opened this issue Dec 6, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

@alexzwanenburg
Owner

During evaluation, models are assessed on three categories of data: external validation, internal validation and development. Within an ensemble of models, each model may have its own development and internal validation data. This is currently handled correctly, but the mechanism is opaque: it relies on data not being loaded at prediction time, with decisions then made based on data_ids and run_ids. In other words, the behaviour is baked into the deepest code layers, yet is directed only indirectly from high-level code.

To make this behaviour more transparent and testable, it should be steerable from the code that generates and configures the evaluation task. This became possible in v2.0.0 with the switch from a purely functional system to a task-based system.
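As an illustration only, the idea of steering data selection from the task itself could be sketched as below. This is a Python sketch, not the package's code: `DataRole`, `ModelDataAssignment`, `EvaluationTask` and `samples_for` are all hypothetical names, and treating external validation as "samples not assigned to the model" is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class DataRole(Enum):
    """The three data categories named in the issue."""
    DEVELOPMENT = "development"
    INTERNAL_VALIDATION = "internal_validation"
    EXTERNAL_VALIDATION = "external_validation"


@dataclass(frozen=True)
class ModelDataAssignment:
    """Hypothetical record of which samples one ensemble member used."""
    model_id: str
    development_ids: frozenset
    internal_validation_ids: frozenset


@dataclass
class EvaluationTask:
    """Hypothetical task that states up front which data role is assessed."""
    role: DataRole

    def samples_for(self, assignment, all_ids):
        """Return the sorted sample ids this model should be evaluated on."""
        available = set(all_ids)
        if self.role is DataRole.DEVELOPMENT:
            selected = available & assignment.development_ids
        elif self.role is DataRole.INTERNAL_VALIDATION:
            selected = available & assignment.internal_validation_ids
        else:
            # Assumption: external validation means samples the model never saw.
            selected = (
                available
                - assignment.development_ids
                - assignment.internal_validation_ids
            )
        return sorted(selected)
```

Because the role is set on the task rather than inferred from whether data happen to be loaded, the selection for each model can be unit-tested in isolation.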

@alexzwanenburg alexzwanenburg added the enhancement New feature or request label Dec 6, 2024
@alexzwanenburg alexzwanenburg self-assigned this Dec 6, 2024
@alexzwanenburg
Owner Author

  • Added a delayedDataObject class to start distinguishing between a dataObject, which can be processed directly, and a delayedDataObject, which is intended to populate the dataObject using its associated attributes and the backend. This will allow the methods and attributes related to loading data, which are currently associated with dataObject, to be separated out.
  • Rework the dataObject methods that deal with loading data and move them to delayedDataObject.
  • Use delayedDataObject to contextualise data processing within evaluation steps.
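
The split described in the list above might look roughly like the following. This is a minimal Python sketch under stated assumptions, not the package's implementation: the class names mirror dataObject and delayedDataObject, while `sample_ids`, `loader` and `load` are hypothetical, with `loader` standing in for the backend.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class DataObject:
    """Holds data that are already loaded and can be processed directly."""
    rows: List[dict]


@dataclass
class DelayedDataObject:
    """Holds only what is needed to load data later (hypothetical sketch)."""
    sample_ids: List[int]
    # Assumption: the backend is modelled as a plain callable that maps
    # sample ids to rows.
    loader: Callable[[List[int]], List[dict]]

    def load(self) -> DataObject:
        """Populate a DataObject from the stored attributes and the backend."""
        return DataObject(rows=self.loader(self.sample_ids))


# Usage: a dictionary plays the role of the backend.
backend = {1: {"id": 1, "x": 0.5}, 2: {"id": 2, "x": 1.5}}
delayed = DelayedDataObject(
    sample_ids=[2],
    loader=lambda ids: [backend[i] for i in ids],
)
data = delayed.load()
```

With loading confined to the delayed class, code that receives a `DataObject` no longer has to check whether its data are present.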

alexzwanenburg added a commit that referenced this issue Dec 6, 2024