Bug: small batch size with categorical variables #454

Open
rajeeja opened this issue Mar 19, 2019 · 6 comments

Comments

rajeeja commented Mar 19, 2019

The link below is a standalone script that reproduces the error. I am filing it here as a bug report against mlrMBO.

https://github.com/rajeeja/mlrmbo-bug/blob/master/mlrMBOMixedIntegerTest11a.R

Please let me know if you need more details.

jakob-r (Member) commented Mar 25, 2019

Hi,
you are using the initial design in a weird way. It is simply too small for your big search space.

Why do you generate the design with max.budget points and then take only the first 5 (propose.points)?

Your initial design has to contain each discrete value at least once so that the surrogate can make predictions.

For me it works with design = generateDesign(n = 30, par.set = getParamSet(obj.fun))
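
The requirement stated above (every discrete value appears at least once in the initial design) can be checked before calling mbo. What follows is a minimal sketch in base R, assuming the design is a data.frame like the one generateDesign returns. The helper missing_levels, the allowed list, and the parameter names are hypothetical and not part of mlrMBO; the full factorial grid only stands in for a large design.

```r
# Report, for each discrete column, which allowed values never occur in a design.
# `design` is a data.frame of design points; `allowed` is a named list mapping
# column names to the full set of permitted values (both hypothetical here).
missing_levels <- function(design, allowed) {
  out <- lapply(names(allowed), function(col) {
    setdiff(allowed[[col]], as.character(design[[col]]))
  })
  names(out) <- names(allowed)
  # Keep only the columns that are missing at least one value.
  out[lengths(out) > 0]
}

# Toy search space: two discrete parameters (names made up for illustration).
allowed <- list(
  optimizer  = c("adam", "sgd", "rmsprop"),
  activation = c("relu", "tanh")
)

# A full factorial grid stands in for a large initial design.
full_design <- expand.grid(allowed, stringsAsFactors = TRUE)

# Mimic "generate many points, then keep only the first few".
small_design <- head(full_design, 2)

missing_full  <- missing_levels(full_design, allowed)   # nothing missing
missing_small <- missing_levels(small_design, allowed)  # some values missing
print(missing_small)
```

With a real problem, the design from generateDesign(n = 30, par.set = getParamSet(obj.fun)) would take the place of full_design, and the list of allowed values would have to be assembled from the parameter definitions.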

rajeeja (Author) commented Apr 12, 2019

@jakob-r Thanks!
But "Your initial design has to contain each discrete value at least once so that the surrogate can make predictions." is not sufficient if I use the learner below:

surr.rf = makeLearner("regr.randomForest",
                      predict.type = "se",
                      fix.factors.prediction = TRUE,
                      se.method = "bootstrap",
                      se.boot = 2)

res = mbo(obj.fun, design = design, learner = surr.rf, control = ctrl, show.info = TRUE)

A complete, isolated example is here:
https://github.com/rajeeja/mlrmbo-bug/blob/master/learner-discrete-param-bug.R

jakob-r (Member) commented Apr 15, 2019

True, my answer is kind of restricted to the surrogate. However, I have doubts that the surrogate will work so well, especially the uncertainty estimation for unknown factors. I am curious to see results of any optimization benchmark using this approach 🙂

rajeeja (Author) commented Apr 15, 2019

Even if I increase propose.points to 1000, I still get the error:
Error in predict.randomForest(getLearnerModel(x), newdata = .newdata, :
New factor levels not present in the training data

for this example: https://github.com/rajeeja/mlrmbo-bug/blob/master/learner-discrete-param-bug.R

What would be a fix to get something like this to work?

rajeeja (Author) commented Apr 15, 2019

Changing

surr.rf = makeLearner("regr.randomForest",
                      predict.type = "se",
                      fix.factors.prediction = TRUE,
                      se.method = "bootstrap",
                      se.boot = 8)

to

surr.rf = makeLearner("regr.randomForest",
                      predict.type = "se",
                      fix.factors.prediction = TRUE)

makes it work. I'll update you on the results from this approach. Also, an older version works even with se->

rajeeja (Author) commented Apr 16, 2019

I just found that changing se.method = "bootstrap" to se.method = "jackknife" works.
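
One possible reading of why the bootstrap setting behaves differently, offered as an assumption and not confirmed anywhere in this thread: if se.method = "bootstrap" fits additional forests on resampled copies of the design, each resample has to satisfy the "every discrete value at least once" requirement on its own, and a small design makes that unlikely. The base R sketch below only illustrates that probability. The level names, the design of 5 points, and the helpers p_missing and p_any are hypothetical, and whether this is what actually triggers the error inside mlr is not established here.

```r
# Chance that a bootstrap resample of size n omits a level that occurs
# k times among the n design points.
p_missing <- function(n, k) (1 - k / n)^n

# Chance that at least one of B independent resamples omits that level.
p_any <- function(n, k, B) 1 - (1 - p_missing(n, k))^B

# Empirical check: resample row indices with replacement and record whether
# the (hypothetical) level "c" disappears from the resample.
set.seed(1)
design_levels <- c("a", "a", "b", "b", "c")  # "c" occurs once in 5 points
n <- length(design_levels)
reps <- 10000
missed <- replicate(reps, {
  idx <- sample.int(n, n, replace = TRUE)
  !("c" %in% design_levels[idx])
})

analytic  <- p_missing(n, k = 1)  # (4/5)^5
simulated <- mean(missed)

# The two se.boot values mentioned in this thread, used purely as examples.
risk_boot2 <- p_any(n, k = 1, B = 2)
risk_boot8 <- p_any(n, k = 1, B = 8)
print(c(analytic = analytic, simulated = simulated,
        boot2 = risk_boot2, boot8 = risk_boot8))
```

Under this reading, a larger initial design with more points per level lowers the risk, which would be consistent with the n = 30 suggestion earlier in the thread.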

jakob-r reopened this Apr 16, 2019