
Have ForwardModelRunner validate executable_file earlier #8735

Merged
merged 1 commit into equinor:main from fix-bug on Sep 25, 2024

Conversation

@jonathan-eq (Contributor) commented Sep 18, 2024

This commit fixes the issue where the ForwardModelRunner on the job_runner side would crash when validating the existence of a ForwardModelStep executable, causing the drivers to hang while waiting for a return code from the JobRunner that would never arrive.

Issue
Resolves #8729

Approach

[Screenshot of the new behavior in the GUI]
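
In outline, the approach is to have the runner validate the executable up front and report the failure as a normal step result instead of crashing. A minimal sketch, with illustrative names rather than ert's actual job-runner API:

    # Minimal sketch, not ert's actual API: validate the executable up front
    # and always emit a final status, so the driver never waits forever for a
    # return code. The (event, returncode, message) tuples are illustrative.
    import os


    def run_step(executable: str):
        if not (os.path.isfile(executable) and os.access(executable, os.X_OK)):
            # An early validation failure becomes an ordinary "Exited" event
            # with a non-zero code, not an unhandled exception in the runner.
            yield ("Exited", 127, f"Not an executable file: {executable!r}")
            return
        yield ("Start", None, None)
        # ... run the step and yield its real exit status here ...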

  • PR title captures the intent of the changes, and is fitting for release notes.
  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.
  • Make sure unit tests pass locally after every commit (git rebase -i main --exec 'pytest tests/unit_tests -n logical -m "not integration_test"')

When applicable

  • When there are user facing changes: Updated documentation
  • New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
  • Large PR: Prepare changes in small commits for more convenient review
  • Bug fix: Add regression test for the bug
  • Bug fix: Create Backport PR to latest release

@jonathan-eq force-pushed the fix-bug branch 2 times, most recently from 16f7255 to 7fb475f on September 18, 2024 13:09
@jonathan-eq added the release-notes:bug-fix label Sep 18, 2024
@codecov-commenter commented Sep 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.35%. Comparing base (b3a4ec7) to head (3c370b5).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8735   +/-   ##
=======================================
  Coverage   91.35%   91.35%           
=======================================
  Files         342      342           
  Lines       20975    20975           
=======================================
  Hits        19162    19162           
  Misses       1813     1813           
Flag                Coverage Δ
cli-tests           39.44% <ø> (-0.13%) ⬇️
gui-tests           73.48% <ø> (-0.03%) ⬇️
performance-tests   50.05% <ø> (ø)
unit-tests          79.81% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.


@yngve-sk (Contributor) commented

Is there some reason why this couldn't/shouldn't be done in the config parsing stage rather than on the job_runner side? It could be added in ert_config.py somewhere, in this segment, and then we could also show that executable in the GUI before going further:

        # At config-parse time, run each preinstalled step's pre-experiment validation
        for fm_step in fm_steps:
            if fm_step.name in cls.PREINSTALLED_FORWARD_MODEL_STEPS:
                try:
                    # Render the step's JSON with substitutions applied, so the
                    # step validates its fully resolved arguments
                    substituted_json = cls._create_forward_model_json(
                        run_id=None,
                        context=substitution_list,
                        forward_model_steps=[fm_step],
                        skip_pre_experiment_validation=True,
                    )
                    job_json = substituted_json["jobList"][0]
                    fm_step.validate_pre_experiment(job_json)
                except ForwardModelStepValidationError as err:
                    # Surface validation failures as config errors with context
                    errors.append(
                        ConfigValidationError.with_context(
                            f"Forward model step pre-experiment validation failed: {err!s}",
                            context=fm_step.name,
                        ),
                    )

Really we want users to add their own validation, but checking that an executable exists is pretty elementary. There is also an edge case: the executable might be created by a preceding forward model step (or maybe we want to disallow that, not sure).
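
For concreteness, a basic parse-time check of the kind discussed here could look roughly like the following (a sketch only; the helper name and how it would hook into ert_config.py are assumptions):

    # Hypothetical parse-time helper; nothing here is ert's actual API.
    import os
    import shutil


    def executable_error(executable: str) -> str | None:
        """Return an error message if `executable` cannot be used, else None."""
        if os.path.sep in executable:
            # A path was given: check it directly
            if not os.path.isfile(executable):
                return f"Executable file not found: {executable!r}"
            if not os.access(executable, os.X_OK):
                return f"File is not executable: {executable!r}"
            return None
        # A bare name was given: resolve it on PATH, as a shell would
        if shutil.which(executable) is None:
            return f"Executable {executable!r} not found on PATH"
        return None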

@jonathan-eq (Contributor, Author) commented

> Is there some reason why this couldn't/shouldn't be done in the config parsing stage rather than on the job_runner side? It could be added in ert_config.py somewhere, in this segment, and then we could also show that executable in the GUI before going further:
>
> [code snippet quoted above]
>
> Really we want users to add their own validation, but checking that an executable exists is pretty elementary. There is also an edge case: the executable might be created by a preceding forward model step (or maybe we want to disallow that, not sure).

Yes, but it is unclear if this is ert's responsibility, or if the forward model step should have already validated itself.
We had a discussion about it, but it is nonetheless a separate PR.

@yngve-sk (Contributor) commented

> …forward model step should have already validated itself.
> We had a discussion about it, but it is nonetheless a separate PR.

I see. Just some context for this: in the ForwardModelStepPlugin we did want users to do their own validation, so we omitted providing default validations for argument types, etc. But at the same time we may have just "forgotten" to add the most basic validation (that the executable exists). Non-plugin forward model steps validate this and other things, so it might just make sense to do it there.
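
To make that division of responsibility concrete, a plugin author's own check could look roughly like this, reusing the validate_pre_experiment hook and ForwardModelStepValidationError seen earlier in the thread (import paths, the constructor signature, and the "executable" key are assumptions):

    # Hedged sketch of a plugin doing its own executable validation.
    import shutil

    from ert import ForwardModelStepPlugin, ForwardModelStepValidationError


    class MyStep(ForwardModelStepPlugin):
        def __init__(self):
            super().__init__(name="MY_STEP", command=["my_executable", "<ARG>"])

        def validate_pre_experiment(self, fm_step_json) -> None:
            executable = fm_step_json["executable"]  # assumed JSON key
            if shutil.which(executable) is None:
                raise ForwardModelStepValidationError(
                    f"Executable {executable!r} was not found"
                )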

@jonathan-eq force-pushed the fix-bug branch 3 times, most recently from 445298a to 648ccd2 on September 20, 2024 05:46
@xjules (Contributor) commented Sep 23, 2024

> > …forward model step should have already validated itself.
> > We had a discussion about it, but it is nonetheless a separate PR.
>
> I see. Just some context for this: in the ForwardModelStepPlugin we did want users to do their own validation, so we omitted providing default validations for argument types, etc. But at the same time we may have just "forgotten" to add the most basic validation (that the executable exists). Non-plugin forward model steps validate this and other things, so it might just make sense to do it there.

Actually, I agree that having the basic file-exists validation makes sense, while all FM-plugin-specific validation should be up to the user.

@xjules (Contributor) left a review comment

As suggested by @yngve-sk, let's add the "basic" file-exists validation.

@jonathan-eq (Contributor, Author) commented

> As suggested by @yngve-sk, let's add the "basic" file-exists validation.

Can we have this as a separate issue?
Yngve mentioned that the executable could be created by a preceding forward model, so we would emit a warning at most.
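
A sketch of that warn-at-most behavior, using the standard-library warnings module for illustration (ert's actual warning mechanism is not shown in this thread):

    # Warn rather than fail: the executable may legitimately be created at
    # runtime by a preceding forward model step, so a parse-time check can
    # only advise.
    import shutil
    import warnings


    def warn_if_executable_missing(executable: str) -> None:
        if shutil.which(executable) is None:
            warnings.warn(
                f"Executable {executable!r} not found at config-parse time; "
                "it may be created by a preceding forward model step",
                UserWarning,
            )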

@xjules (Contributor) commented Sep 23, 2024

> > As suggested by @yngve-sk, let's add the "basic" file-exists validation.
>
> Can we have this as a separate issue? Yngve mentioned that the executable could be created by a preceding forward model, so we would emit a warning at most.

Ok, then create one (as you have more insight ;)) and also mention it in this PR.

@jonathan-eq (Contributor, Author) commented

#8779

@xjules (Contributor) left a review comment

Very nice! I think it should do 🚀

This commit fixes the issue where the ForwardModelRunner on the job_runner side would crash when validating the existence of a ForwardModelStep executable, causing the drivers to hang while waiting for a return code from the JobRunner that would never arrive.
@jonathan-eq enabled auto-merge (rebase) September 25, 2024 05:45
@jonathan-eq merged commit 535e314 into equinor:main Sep 25, 2024
50 checks passed
Labels
release-notes:bug-fix (Automatically categorise as bug fix in release notes)
Development

Successfully merging this pull request may close these issues.

Missing validation when using ForwardModelStepPlugin causes ert to hang
4 participants