Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Orion Extension concept [OC-343] #673

Open
wants to merge 15 commits into
base: develop
Choose a base branch
from

Conversation

Delaunay
Copy link
Collaborator

Hi there! Thank you for contributing. Feel free to use this pull request template; It helps us reviewing your work at its true value.

Please remove the instructions in italics before posting the pull request :).

Description

Describe what is the purpose of this pull request and why it should be integrated to the repository.
When your changes modify the current behavior, explain why your solution is better.

If it solves a GitHub issue, be sure to link it.

Changes

Give an overview of the suggested solution.

Checklist

This is simply a reminder of what we are going to look for before merging your code.

Add an x in the boxes that apply.
If you're unsure about any of them, don't hesitate to ask. We're here to help!
You can also fill these out after creating the PR if it's a work in progress (be sure to publish the PR as a draft then)

Tests

  • I added corresponding tests for bug fixes and new features. If possible, the tests fail without the changes
  • All new and existing tests are passing ($ tox -e py38; replace 38 by your Python version if necessary)

Documentation

  • I have updated the relevant documentation related to my changes

Quality

  • I have read the CONTRIBUTING doc
  • My commits messages follow this format
  • My code follows the style guidelines ($ tox -e lint)

Further comments

Please include any additional information or comment that you feel will be helpful to the review of this pull request.

@Delaunay Delaunay force-pushed the orion_ext branch 5 times, most recently from f3634c6 to 0936ac4 Compare October 5, 2021 16:46
@Delaunay Delaunay force-pushed the orion_ext branch 4 times, most recently from 7b909d1 to 9e3068f Compare October 5, 2021 17:06
@Delaunay Delaunay force-pushed the orion_ext branch 7 times, most recently from c4e510d to c366d02 Compare October 13, 2021 14:24
results = self.executor.wait(
[self.executor.submit(fct, **unflatten(kwargs))]
)[0]
self.observe(trial, results=results)
except (KeyboardInterrupt, InvalidResult):
raise
except BaseException as e:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line below executing on_error should be replaced by the Extension, with an event on_trial_fail or something like that.

)


class OrionExtensionManager:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think extension is a bit too general. It could be an extension of anything. Here we are specifically talking of callbacks for specific events. What would you think of a name like OrionCallbackManager instead? Or something related to events.


# -- Trials
self._get_event("new_trial")
self._get_event("on_trial_error")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be coherent I think there should be no on_ or it should be for all.

self.deferred_calls = []
self.name = name
self.deferred = deferred
self.bad_handlers = []
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this for?

fun(*args, **kwargs)
except Exception as err:
if self.manager:
self.manager.on_extension_error.broadcast(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't this silence any extension error by default? Perhaps the default should be to raise if there are no callbacks registered for on_extension_error.

def on_trial_error(
self, trial, exception_type, exception_value, exception_traceback
):
"""Called when a error occur during the optimization process"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is expected of the trial status? Will it be changed to broken already? It should be specified in the docstring.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also what will happen afterwards, it this callback supposed to raise an error if we want the worker to stop?

"""Base orion extension interface you need to implement"""

def on_extension_error(self, name, fun, exception, args):
"""Called when an extension callbakc raise an exception
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should clarify that if the callback does not raise the error, then the execution will continue when the callback is done.


def new_trial(self, trial):
"""Called when the trial starts with a new configuration"""
return
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is when the trial execution start then the name should be start_trial instead. Because it may not be a new trial, it can be a resumed one, so the current name is confusing.

return

def end_trial(self, trial):
"""Called when the trial finished"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it called anytime, even if the trial was interrupted or crashed? Is it called before a status change occurred?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should also be an event interrupt_trial.

def on_experiment_error(
self, experiment, exception_type, exception_value, exception_traceback
):
"""Called when a error occur during the optimization process"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment as for on_trial_error

return

def end_experiment(self, experiment):
"""Called at the end of the optimization process after the worker exits"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar comment as for end_trial.

ext.calls["on_trial_error"] == n_broken
), "failed trial should be reported "

assert ext.calls["start_experiment"] == 1, "experiment should have started"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if we use joblib with loky and n_workers > 1?

self._execute(args, kwargs)
return

self.deferred_calls.append((args, kwargs))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be best to leave the implementation of deferred calls for later unless we test it now.

**kwargs
)

def broadcast(self, name, *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is indeed not tested nor used yet.

Delaunay and others added 4 commits October 22, 2021 15:15
Co-authored-by: Xavier Bouthillier <[email protected]>
Co-authored-by: Xavier Bouthillier <[email protected]>
Co-authored-by: Xavier Bouthillier <[email protected]>
Co-authored-by: Xavier Bouthillier <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants