
Have ForwardModelRunner validate executable_file earlier #8735

Merged
merged 1 commit into equinor:main from fix-bug on Sep 25, 2024

Conversation

@jonathan-eq (Contributor) commented Sep 18, 2024

This commit fixes the issue where the ForwardModelRunner on the job_runner side would crash when validating the existence of a ForwardModelStep executable, causing the drivers to hang while waiting for a return code from the JobRunner that would never arrive.

Issue
Resolves #8729

Approach

[Screenshot of the new behavior in the GUI]
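
In outline, the approach is to have the runner validate the executable up front and report the failure as a normal step result instead of crashing. A minimal sketch, with illustrative names rather than ert's actual job-runner API:

    # Minimal sketch, not ert's actual API: validate the executable up front
    # and always emit a final status, so the driver never waits forever for a
    # return code. The (event, returncode, message) tuples are illustrative.
    import os


    def run_step(executable: str):
        if not (os.path.isfile(executable) and os.access(executable, os.X_OK)):
            # An early validation failure becomes an ordinary "Exited" event
            # with a non-zero code, not an unhandled exception in the runner.
            yield ("Exited", 127, f"Not an executable file: {executable!r}")
            return
        yield ("Start", None, None)
        # ... run the step and yield its real exit status here ...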

  • PR title captures the intent of the changes, and is fitting for release notes.
  • Added appropriate release note label
  • Commit history is consistent and clean, in line with the contribution guidelines.
  • Make sure unit tests pass locally after every commit (git rebase -i main --exec 'pytest tests/unit_tests -n logical -m "not integration_test"')

When applicable

  • When there are user facing changes: Updated documentation
  • New behavior or changes to existing untested code: Ensured that unit tests are added (See Ground Rules).
  • Large PR: Prepare changes in small commits for more convenient review
  • Bug fix: Add regression test for the bug
  • Bug fix: Create Backport PR to latest release

@jonathan-eq force-pushed the fix-bug branch 2 times, most recently from 16f7255 to 7fb475f on September 18, 2024 13:09
@jonathan-eq added the release-notes:bug-fix label Sep 18, 2024
@codecov-commenter commented Sep 19, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.35%. Comparing base (b3a4ec7) to head (3c370b5).
Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8735   +/-   ##
=======================================
  Coverage   91.35%   91.35%           
=======================================
  Files         342      342           
  Lines       20975    20975           
=======================================
  Hits        19162    19162           
  Misses       1813     1813           
Flag                Coverage Δ
cli-tests           39.44% <ø> (-0.13%) ⬇️
gui-tests           73.48% <ø> (-0.03%) ⬇️
performance-tests   50.05% <ø> (ø)
unit-tests          79.81% <ø> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown.


@yngve-sk (Contributor) commented

Is there some reason why this couldn't/shouldn't be done in the config parsing stage rather than on the job_runner side? It could be added in ert_config.py somewhere, in this segment, and then we could also show that executable in the GUI before going further:

        # At config-parse time, run each preinstalled step's pre-experiment validation
        for fm_step in fm_steps:
            if fm_step.name in cls.PREINSTALLED_FORWARD_MODEL_STEPS:
                try:
                    # Render the step's JSON with substitutions applied, so the
                    # step validates its fully resolved arguments
                    substituted_json = cls._create_forward_model_json(
                        run_id=None,
                        context=substitution_list,
                        forward_model_steps=[fm_step],
                        skip_pre_experiment_validation=True,
                    )
                    job_json = substituted_json["jobList"][0]
                    fm_step.validate_pre_experiment(job_json)
                except ForwardModelStepValidationError as err:
                    # Surface validation failures as config errors with context
                    errors.append(
                        ConfigValidationError.with_context(
                            f"Forward model step pre-experiment validation failed: {err!s}",
                            context=fm_step.name,
                        ),
                    )

Really we want users to add their own validation, but checking that an executable exists is pretty elementary. There is also an edge case: the executable might be created by a preceding forward model step (or maybe we want to disallow that, not sure).
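
For concreteness, a basic parse-time check of the kind discussed here could look roughly like the following (a sketch only; the helper name and how it would hook into ert_config.py are assumptions):

    # Hypothetical parse-time helper; nothing here is ert's actual API.
    import os
    import shutil


    def executable_error(executable: str) -> str | None:
        """Return an error message if `executable` cannot be used, else None."""
        if os.path.sep in executable:
            # A path was given: check it directly
            if not os.path.isfile(executable):
                return f"Executable file not found: {executable!r}"
            if not os.access(executable, os.X_OK):
                return f"File is not executable: {executable!r}"
            return None
        # A bare name was given: resolve it on PATH, as a shell would
        if shutil.which(executable) is None:
            return f"Executable {executable!r} not found on PATH"
        return None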

@jonathan-eq (Contributor, Author) commented

> Is there some reason why this couldn't/shouldn't be done in the config parsing stage rather than on the job_runner side? It could be added in ert_config.py somewhere, in this segment, and then we could also show that executable in the GUI before going further:
>
> [code snippet quoted above]
>
> Really we want users to add their own validation, but checking that an executable exists is pretty elementary. There is also an edge case: the executable might be created by a preceding forward model step (or maybe we want to disallow that, not sure).

Yes, but it is unclear if this is ert's responsibility, or if the forward model step should have already validated itself.
We had a discussion about it, but it is nonetheless a separate PR.

@yngve-sk (Contributor) commented

> …forward model step should have already validated itself.
> We had a discussion about it, but it is nonetheless a separate PR.

I see. Just some context for this: in the ForwardModelStepPlugin we did want users to do their own validation, so we omitted providing default validations for argument types, etc. But at the same time we may have just "forgotten" to add the most basic validation (that the executable exists). Non-plugin forward model steps validate this and other things, so it might just make sense to do it there.
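
To make that division of responsibility concrete, a plugin author's own check could look roughly like this, reusing the validate_pre_experiment hook and ForwardModelStepValidationError seen earlier in the thread (import paths, the constructor signature, and the "executable" key are assumptions):

    # Hedged sketch of a plugin doing its own executable validation.
    import shutil

    from ert import ForwardModelStepPlugin, ForwardModelStepValidationError


    class MyStep(ForwardModelStepPlugin):
        def __init__(self):
            super().__init__(name="MY_STEP", command=["my_executable", "<ARG>"])

        def validate_pre_experiment(self, fm_step_json) -> None:
            executable = fm_step_json["executable"]  # assumed JSON key
            if shutil.which(executable) is None:
                raise ForwardModelStepValidationError(
                    f"Executable {executable!r} was not found"
                )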

@jonathan-eq force-pushed the fix-bug branch 3 times, most recently from 445298a to 648ccd2 on September 20, 2024 05:46
@xjules (Contributor) commented Sep 23, 2024

> > …forward model step should have already validated itself.
> > We had a discussion about it, but it is nonetheless a separate PR.
>
> I see. Just some context for this: in the ForwardModelStepPlugin we did want users to do their own validation, so we omitted providing default validations for argument types, etc. But at the same time we may have just "forgotten" to add the most basic validation (that the executable exists). Non-plugin forward model steps validate this and other things, so it might just make sense to do it there.

Actually, I agree that having the basic file-exists validation makes sense, while all FM-plugin-specific validation should be up to the user.

@xjules (Contributor) left a review comment

As suggested by @yngve-sk, let's add the "basic" file-exists validation.

@jonathan-eq (Contributor, Author) commented

> As suggested by @yngve-sk, let's add the "basic" file-exists validation.

Can we have this as a separate issue?
Yngve mentioned that the executable could be created by a preceding forward model, so we would emit a warning at most.
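
A sketch of that warn-at-most behavior, using the standard-library warnings module for illustration (ert's actual warning mechanism is not shown in this thread):

    # Warn rather than fail: the executable may legitimately be created at
    # runtime by a preceding forward model step, so a parse-time check can
    # only advise.
    import shutil
    import warnings


    def warn_if_executable_missing(executable: str) -> None:
        if shutil.which(executable) is None:
            warnings.warn(
                f"Executable {executable!r} not found at config-parse time; "
                "it may be created by a preceding forward model step",
                UserWarning,
            )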

@xjules (Contributor) commented Sep 23, 2024

> > As suggested by @yngve-sk, let's add the "basic" file-exists validation.
>
> Can we have this as a separate issue? Yngve mentioned that the executable could be created by a preceding forward model, so we would emit a warning at most.

Ok, then create one (as you have more insight ;)) and also mention it in this PR.

@jonathan-eq (Contributor, Author) commented

#8779

@xjules (Contributor) left a review comment

Very nice! I think it should do 🚀

This commit fixes the issue where the ForwardModelRunner on the job_runner side would crash when validating the existence of a ForwardModelStep executable, causing the drivers to hang while waiting for a return code from the JobRunner that would never arrive.
@jonathan-eq enabled auto-merge (rebase) September 25, 2024 05:45
@jonathan-eq merged commit 535e314 into equinor:main Sep 25, 2024
50 checks passed
Labels
release-notes:bug-fix (Automatically categorise as bug fix in release notes)
Development

Successfully merging this pull request may close these issues.

Missing validation when using ForwardModelStepPlugin causes ert to hang
4 participants