Search Methods Enhancements to Avoid Duplicate Evaluated Pipelines 🥈 #211

simonprovost · 2023-12-04T19:54:18Z

After spending some time examining the concern raised in thread #189, I would like to propose an initial resolution: the narrowed approach I suggested earlier in that thread.

The narrowed approach can be summarised as follows:

Because each search algorithm has its own logic, I am convinced that deciding how to handle previously evaluated pipelines should be left to the algorithms themselves. I cannot offer a concrete example, but a newly designed algorithm might deliberately rely on re-evaluating a pipeline; who knows? That is not a compelling reason on its own, yet approaching the problem narrowly lets each algorithm deal with duplicates in whatever way suits it. For example, an algorithm could give more prominence to a candidate that has been proposed several times, using its duplication count to dynamically shift the focus of the search. If I am not mistaken, performing the check within each algorithm keeps all of these options open.

Lastly, to avoid confusing designers of new search algorithms, we could log a warning from the pipeline evaluation module of GAMA whenever a duplicate pipeline is re-evaluated, noting that the search algorithm may need to be refactored. Handling duplication within each search method would then make it easier to decide what to do about it.
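To illustrate the warning idea, here is a minimal, self-contained sketch. It is not GAMA's evaluation code: `evaluate_with_duplicate_warning`, `pipeline_key` and the `seen` dictionary are hypothetical names, and pipelines are represented by plain strings. The per-key count it keeps is also the kind of "duplication number" a search method could use to shift its focus.

```python
import logging
from typing import Callable, Dict

log = logging.getLogger(__name__)


def evaluate_with_duplicate_warning(
    pipeline_key: str,
    evaluate: Callable[[str], float],
    seen: Dict[str, int],
) -> float:
    """Evaluate a pipeline and warn if the same key was evaluated before.

    All names here are hypothetical; `seen` maps a pipeline key to the
    number of times it has been submitted for evaluation.
    """
    seen[pipeline_key] = seen.get(pipeline_key, 0) + 1
    if seen[pipeline_key] > 1:
        log.warning(
            "Pipeline %r has been evaluated %d times; the search method "
            "may need to filter duplicates.",
            pipeline_key,
            seen[pipeline_key],
        )
    # The pipeline is still evaluated: the warning only informs the designer.
    return evaluate(pipeline_key)
```

The sketch only warns and never blocks the evaluation, in line with the idea that each search method decides for itself how to treat duplicates.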

In the interim, this PR makes random search avoid re-evaluating pipelines it has already evaluated. Others, such as @leightonvg, or I, if I have the time, could look into the other algorithms, such as EA, once this PR is accepted.

@PGijsbers how do you feel about all this?

🔔 EDIT (Update on Progress – 11 / 12 / 2023):

Following @PGijsbers's comment, available here, here is an update on progress.

I couldn't resist diving into the EvaluationLibrary, so I've added an out-of-the-box function that tells us immediately whether a candidate is already known (evaluated). This gives search methods flexibility: they can use the information as they see fit. Right now we take a simple approach: if a candidate is known, we try another, until a maximum attempt count is reached. Designers of custom search methods can either reuse this straightforward approach or build more complex strategies on top of it.
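The simple retry strategy can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `EvaluatedSet`, `is_known`, `sample_unique` and `max_attempts` are hypothetical names, and candidates are plain strings rather than GAMA individuals.

```python
from typing import Callable, Optional, Set


class EvaluatedSet:
    """Hypothetical stand-in for a library that records evaluated candidates."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    def add(self, candidate: str) -> None:
        self._seen.add(candidate)

    def is_known(self, candidate: str) -> bool:
        return candidate in self._seen


def sample_unique(
    sample: Callable[[], str],
    library: EvaluatedSet,
    max_attempts: int = 100,
) -> Optional[str]:
    """Draw candidates until one is unknown, or give up after max_attempts."""
    for _ in range(max_attempts):
        candidate = sample()
        if not library.is_known(candidate):
            return candidate
    # Every attempt produced a known candidate; the caller decides what to do.
    return None
```

Returning `None` when the attempt budget is exhausted leaves the fallback (stop, accept a duplicate, widen the search) to the search method, which matches the intent of letting each algorithm choose its own policy.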

I've implemented this for Random Search, Async EA, and ASHA. I'm confident about Random Search and Async EA. However, I'm slightly less certain about ASHA, so I'd appreciate it if you could give it a once-over to ensure nothing's amiss. As agreed, this PR will stay under the #waiting tag and will follow once #210 is merged.

To make things easier, here's a link focusing only on the recent commits for this update: Link to PR changes

Thanks, and looking forward to your feedback!

What contributions have been made

✅ Extended GAMA's out-of-the-box search method functions so that a search algorithm can avoid duplicate pipelines if its designer wants it to
✅ Refactored Random Search, ASHA and Async EA to prevent pipeline duplication: when a duplicate is identified, a new individual is generated, up to a maximum number of attempts
✅ Testing passes smoothly
✅ Informative commits

❌ DISCLAIMER ❌

Kindly refrain from merging this PR prior to #210. I will need to perform a rebase from #210 before this PR can be merged!

Cheers,

PGijsbers · 2023-12-05T09:28:32Z

Seems fine, though I would probably write it slightly differently. I do think it makes sense to have a global way to easily check what individuals have already been evaluated, so that search algorithms may make use of it out of the box (search algorithms may decide not to use it). That said, having an explicit check for this in random search while we do not yet have a general plug-and-play mechanism is a step in the right direction I am okay with. (I do think that perhaps the EvaluationLibrary would also be usable for this? But it's been too long to remember how everything ties together :( )

I'll revisit this after it's rebased on a main where #210 is merged.

simonprovost · 2023-12-11T16:59:58Z

> Seems fine, though I would probably write it slightly differently. I do think it makes sense to have a global way to easily check what individuals have already been evaluated, so that search algorithms may make use of it out of the box (search algorithms may decide not to use it). That said, having an explicit check for this in random search while we do not yet have a general plug-and-play mechanism is a step in the right direction I am okay with. (I do think that perhaps the EvaluationLibrary would also be usable for this? But it's been too long to remember how everything ties together :( )
>
> I'll revisit this after it's rebased on a main where #210 is merged.

Implemented! 🎉 See the 🔔 EDIT (Update on Progress – 11 / 12 / 2023) section of the PR description.

Cheers.

simonprovost · 2024-06-14T09:26:24Z

BRIEF UPDATE:

I intend to resume work on all this from the end of this month onwards. I have an important deadline on the 20th and have had no time to look into these PRs, but I definitely will afterwards.

Cheers for your patience!

Simon

simonprovost changed the title ~~Refactor/improve random search~~ 💡 Improve random search uniqueness 💅 Dec 4, 2023

PGijsbers added the waiting label (PR or Issue depends on some other event or input as explained in one of the comments) Dec 5, 2023

simonprovost added 3 commits December 7, 2023 02:31

refactor(configuration): add ConfigSpace

3333b7a

+ Custom config space is now utilising ConfigSpace
+ Classification.py is now divided into distinct Classifiers and Preprocessors for better understanding and management
+ Thorough documentation for users to add/modify anything

refactor(gama): update internal system to be ConfigSpace compliant

abf174b

refactor(configuration): update regressors to be ConfigSpace compliant

f1bb413

simonprovost force-pushed the refactor/improve_random_search branch 3 times, most recently from a46ec4f to bdbd54a Compare December 9, 2023 01:38

simonprovost changed the title ~~💡 Improve random search uniqueness 💅~~ 💡 Improve Search Methods uniqueness (No Candidates Duplicate) 💅 Dec 9, 2023

simonprovost added 6 commits December 11, 2023 16:30

refactor(tests): update tests to be ConfigSpace compliant

ee3f6e7

refactor(tests): update tests to be ConfigSpace compliant

9dec917

feat(evaluation_library): add is_evaluated candidate

a840863

refactor(search_methods): update random search uniqueness

14cb0dc

refactor(search_methods): update EA uniqueness

125051c

refactor(search_methods): update ASHA uniqueness

d7310cd

simonprovost force-pushed the refactor/improve_random_search branch from bdbd54a to d7310cd Compare December 11, 2023 16:32

simonprovost changed the title ~~💡 Improve Search Methods uniqueness (No Candidates Duplicate) 💅~~ Improve Search Methods uniqueness (No Candidates Duplicate) 🥈 Dec 11, 2023

simonprovost changed the title ~~Improve Search Methods uniqueness (No Candidates Duplicate) 🥈~~ Improve Search Methods uniqueness (No Evaluated Pipelines Duplicate) 🥈 Dec 11, 2023

simonprovost changed the title ~~Improve Search Methods uniqueness (No Evaluated Pipelines Duplicate) 🥈~~ Search Methods Enhancements to Avoid Duplicate Evaluated Pipelines 🥈 Dec 11, 2023

simonprovost mentioned this pull request Dec 11, 2023

SMAC3 Bayesian Optimisation Integration [🆕 Search Method] 🥉 #212

Open
