Commit

Summary: Pull Request resolved: #2024
X-link: facebook/Ax#1871
Reviewed By: saitcakmak
Differential Revision: D49604752
fbshipit-source-id: f0180e7b6bf9bf2c446589422e043283b7e7152c
{
  "cells": [
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "## Optimize acquisition functions using CMA-ES\n",
        "\n",
        "In this tutorial, we show how to use an external optimizer (in this case [CMA-ES](https://en.wikipedia.org/wiki/CMA-ES)) for optimizing BoTorch acquisition functions. CMA-ES is a zeroth-order optimizer, meaning that it uses only function evaluations and does not require gradient information. This is of course very useful if gradient information about the function to be optimized is unavailable.\n",
        "\n",
        "In BoTorch, we typically do have gradient information available (thanks, autograd!), and one is generally better off using this information rather than ignoring it. However, for certain custom models or acquisition functions, we may not be able to backprop through the acquisition function and/or model. In such instances, using a zeroth-order optimizer is appropriate.\n",
        "\n",
        "For this example we use the [PyCMA](https://github.com/CMA-ES/pycma) implementation of CMA-ES. PyCMA is easily installed via pip by running `pip install cma`."
      ]
    },
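    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the \"zeroth-order\" point concrete, here is a minimal sketch that runs `cma`'s ask/tell loop on an arbitrary toy quadratic (unrelated to the Bayesian optimization problem below), using nothing but function values:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import cma\n",
        "import numpy as np\n",
        "\n",
        "# minimal sketch: minimize the toy quadratic f(x) = ||x||^2 with CMA-ES,\n",
        "# using only function evaluations (no gradients anywhere)\n",
        "toy_es = cma.CMAEvolutionStrategy(x0=[0.5, 0.5], sigma0=0.3, inopts={\"verbose\": -9})\n",
        "while not toy_es.stop():\n",
        "    candidates = toy_es.ask()  # sample a population of candidate points\n",
        "    # report the function value of each candidate back to the optimizer\n",
        "    toy_es.tell(candidates, [float(np.sum(np.square(c))) for c in candidates])\n",
        "\n",
        "toy_es.best.x  # should be close to the minimizer [0, 0]"
      ]
    },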
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Setting up the acquisition function\n",
        "\n",
        "For the purpose of this tutorial, we'll use a basic `UpperConfidenceBound` acquisition function on a basic model fit on synthetic data. Please see the documentation for [Models](../docs/models) and [Acquisition Functions](../docs/acquisition) for more information."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "collapsed": true
      },
      "outputs": [],
      "source": [
        "import math\n",
        "import torch\n",
        "\n",
        "from botorch.fit import fit_gpytorch_mll\n",
        "from botorch.models import SingleTaskGP\n",
        "from gpytorch.mlls import ExactMarginalLogLikelihood\n",
        "\n",
        "X = torch.rand(20, 2) - 0.5\n",
        "Y = (torch.sin(2 * math.pi * X[:, 0]) + torch.cos(2 * math.pi * X[:, 1])).unsqueeze(-1)\n",
        "Y += 0.1 * torch.randn_like(Y)\n",
        "\n",
        "gp = SingleTaskGP(X, Y)\n",
        "mll = ExactMarginalLogLikelihood(gp.likelihood, gp)\n",
        "fit_gpytorch_mll(mll);"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 5,
      "metadata": {
        "collapsed": true
      },
      "outputs": [],
      "source": [
        "from botorch.acquisition import UpperConfidenceBound\n",
        "\n",
        "UCB = UpperConfidenceBound(gp, beta=0.1)"
      ]
    },
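    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "BoTorch acquisition functions evaluate whole batches of candidate points in a single call: for the analytic `UpperConfidenceBound`, an input of shape `(b, 1, d)` (`b` t-batches of a single `q=1` candidate in `d` dimensions) yields `b` acquisition values at once. A quick sketch of this shape convention, which the CMA-ES loop below relies on:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "# sketch: batch-evaluate the acquisition function at 50 random points\n",
        "X_batch = torch.rand(50, 1, 2)  # 50 t-batches, q=1 candidate each, d=2\n",
        "with torch.no_grad():\n",
        "    vals = UCB(X_batch)\n",
        "\n",
        "vals.shape  # torch.Size([50]) -- one acquisition value per batch element"
      ]
    },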
    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "### Optimizing the acquisition function using CMA-ES\n",
        "\n",
        "**Note:** Relative to sequential evaluations, parallel evaluations of the acquisition function are extremely fast in BoTorch (due to automatic parallelization across batch dimensions). To exploit this, we use the \"ask/tell\" interface to `cma` - this way we can batch-evaluate the whole CMA-ES population in parallel.\n",
        "\n",
        "In this example we use an initial standard deviation $\\sigma_0 = 0.2$ and a population size $\\lambda = 50$.\n",
        "We also constrain the input `X` to the unit cube $[0, 1]^d$.\n",
        "See `cma`'s [API Reference](http://cma.gforge.inria.fr/apidocs-pycma/cma.evolution_strategy.CMAEvolutionStrategy.html) for more information on these options.\n",
        "\n",
        "With this, we can optimize the acquisition function as follows:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": 6,
      "metadata": {},
      "outputs": [
        {
          "name": "stdout",
          "output_type": "stream",
          "text": [
            "(25_w,50)-aCMA-ES (mu_w=14.0,w_1=14%) in dimension 2 (seed=374178, Thu Aug 8 09:33:08 2019)\n"
          ]
        },
        {
          "data": {
            "text/plain": [
              "tensor([0.2642, 0.0255])"
            ]
          },
          "execution_count": 6,
          "metadata": {
            "bento_obj_id": "140190506026760"
          },
          "output_type": "execute_result"
        }
      ],
      "source": [
        "import cma\n",
        "import numpy as np\n",
        "\n",
        "# get the initial condition for CMA-ES in numpy form\n",
        "# note that CMA-ES expects a different shape (no explicit q-batch dimension)\n",
        "x0 = np.random.rand(2)\n",
        "\n",
        "# create the CMA-ES optimizer\n",
        "es = cma.CMAEvolutionStrategy(\n",
        "    x0=x0,\n",
        "    sigma0=0.2,\n",
        "    inopts={\"bounds\": [0, 1], \"popsize\": 50},\n",
        ")\n",
        "\n",
        "# speed things up by telling pytorch not to build a compute graph in the background\n",
        "with torch.no_grad():\n",
        "\n",
        "    # Run the optimization loop using the ask/tell interface -- this uses\n",
        "    # PyCMA's default settings, see the PyCMA documentation for how to modify these\n",
        "    while not es.stop():\n",
        "        xs = es.ask()  # ask for new points to evaluate\n",
        "        # convert to Tensor for evaluating the acquisition function\n",
        "        X = torch.tensor(xs, device=X.device, dtype=X.dtype)\n",
        "        # evaluate the acquisition function (optimizer assumes we're minimizing)\n",
        "        Y = -UCB(\n",
        "            X.unsqueeze(-2)\n",
        "        )  # acquisition functions require an explicit q-batch dimension\n",
        "        y = Y.view(-1).double().numpy()  # convert result to numpy array\n",
        "        es.tell(xs, y)  # return the result to the optimizer\n",
        "\n",
        "# convert the result back to a torch tensor\n",
        "best_x = torch.from_numpy(es.best.x).to(X)\n",
        "\n",
        "best_x"
      ]
    },
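    {
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "As a quick sanity check, we can compare the CMA-ES solution against one found with BoTorch's gradient-based `optimize_acqf` utility (a sketch; the restart and sample settings below are arbitrary illustrative choices). Since this acquisition surface is smooth and low-dimensional, the two approaches should land on similar points and acquisition values:"
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from botorch.optim import optimize_acqf\n",
        "\n",
        "# sketch: optimize the same acquisition function using gradients instead\n",
        "bounds = torch.stack([torch.zeros(2), torch.ones(2)])\n",
        "candidate, acq_value = optimize_acqf(\n",
        "    UCB, bounds=bounds, q=1, num_restarts=5, raw_samples=20\n",
        ")\n",
        "\n",
        "# evaluate the CMA-ES solution for comparison (unsqueeze adds batch and q dims)\n",
        "with torch.no_grad():\n",
        "    cma_value = UCB(best_x.unsqueeze(0).unsqueeze(0))\n",
        "\n",
        "print(\"CMA-ES:       \", best_x, cma_value.item())\n",
        "print(\"optimize_acqf:\", candidate.squeeze(0), acq_value.item())"
      ]
    }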
  ],
  "metadata": {
    "kernelspec": {
      "display_name": "python3",
      "language": "python",
      "name": "python3"
    }
  },
  "nbformat": 4,
  "nbformat_minor": 2
}