This repository contains code and instructions for reproducing the figures and numbers of the accompanying paper “Implicit kernel meta-learning using kernel integral forms”, presented as an oral at the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI 2022). It also contains the functions needed to generate the datasets (synthetic and real) used in the paper for benchmarking meta-learning algorithms, which may be of interest to the meta-learning community independently of the algorithm.
There is a supplied conda environment file. Running
conda create --name ikml --file spec-file.txt
will create a conda environment named ikml with the necessary dependencies.
All sections below assume that you are in the ikml conda environment:
conda activate ikml
You also need to install the current package locally:
pip install .
The paper introduces two new meta-learning regression datasets:
Air Quality
- Meta-learning tasks derived from Beijing air-quality time-series data spanning several years
Gas Sensor
- Meta-learning tasks derived from experimental gas-sensor time-series data
There is a supplied Makefile that automates most of the pulling and creation of the datasets; please make sure you have a version of Make installed. All commands below assume that your current working directory is the top level of this git repository.
For running and evaluating experiments I use Guild AI. While the experiments
can be run using Python directly, they have been engineered with Guild in mind.
The configurable parameters for each experiment can be found in the ./guild.yml
file; leaving them as-is reproduces the experiments of the paper. You can see
below how to change the parameters of interest.
In a terminal where your current working directory is this git repo, run
make create_datasets
This will pull and build all of the datasets used for benchmarking.
Generate the data by choosing a list of input dimensions
L='[d_1, d_2, ..., d_L]'
Note that this has to be a list even if it contains only one element. The plotting has been hardcoded to work with d being in {1, 2, 5, 10, 20, 30}, so if you want to generate the plots, only pick d from this set. From the command line run
guild run signal_recovery:bochner d=${L}
to generate the data for plotting.
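For intuition about the bochner operation: by Bochner’s theorem, a shift-invariant kernel is the Fourier transform of a probability measure over frequencies, so sampling frequencies yields a random-feature approximation of the kernel. The following NumPy sketch illustrates that idea only; it is not the repository’s implementation, and all names in it are made up for illustration.

```python
# Illustrative sketch of a Bochner/random-Fourier-feature kernel approximation
# (not the repository's implementation).
import numpy as np

rng = np.random.default_rng(0)

def rff_features(X, omega, b):
    """Random Fourier features: phi(x) = sqrt(2/D) * cos(x @ omega.T + b)."""
    D = omega.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ omega.T + b)

d, D = 2, 5000                    # input dimension, number of random features
omega = rng.normal(size=(D, d))   # frequencies ~ N(0, I) <=> Gaussian kernel
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

x = rng.normal(size=(1, d))
y = rng.normal(size=(1, d))
approx = (rff_features(x, omega, b) @ rff_features(y, omega, b).T).item()
exact = np.exp(-0.5 * np.sum((x - y) ** 2)).item()  # exact Gaussian kernel value
# approx converges to exact at rate O(1/sqrt(D))
```

Learning the distribution of the frequencies omega (rather than fixing it to a Gaussian) is the kernel-integral-form idea the paper builds on.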
After this has finished running you can look up the guild ID of the batch run
guild runs
which will output something like
[1:d9cd5613]  signal_recovery:bochner   2021-03-07 13:26:52  completed  X_bases_sigma=0.2 X_marginal_sigma=0.2 alpha_sigma=1.0 boch_hidden_d
[2:365f3857]  signal_recovery:bochner+  2021-03-07 13:26:52  completed
and you want the ID of the batch run (you can spot it by the + appended to the operation name), here 365f3857.
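If you prefer to grab the batch ID programmatically, a small sketch assuming the guild runs output format shown above (the helper name is hypothetical, not part of this repository):

```python
import re

def batch_run_id(guild_runs_output: str) -> str:
    """Return the ID of the first batch run, identified by the trailing '+'
    on the operation name (e.g. 'signal_recovery:bochner+')."""
    for line in guild_runs_output.splitlines():
        m = re.match(r"\[\d+:([0-9a-f]+)\]\s+(\S+)\s", line)
        if m and m.group(2).endswith("+"):
            return m.group(1)
    raise ValueError("no batch run found")

output = (
    "[1:d9cd5613] signal_recovery:bochner 2021-03-07 13:26:52 completed\n"
    "[2:365f3857] signal_recovery:bochner+ 2021-03-07 13:26:52 completed\n"
)
print(batch_run_id(output))  # -> 365f3857
```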
You can generate the plots by running
python scripts/toy_regression/signal_recovery/get_learning_curves.py --guild_id 365f3857
and they can be found in plots/toy_regression/signal_recovery/bochner.
Make sure that you have generated the datasets by following Creating the datasets.
Let DATASET be either air_quality or gas_sensor, and
SEED='[seed_1, ..., seed_L]'
To reproduce the results of the paper, set
SEED='[1, 2, 3, 4, 5]'
Get results of IKML by running
guild run ${DATASET}:bochner_ikml seed=${SEED}
which will run IKML with the Bochner kernel over 5 independent runs. You can
also get the results of the other benchmarked algorithms; run
guild operations
to see all of the available options.
After retrieving the Guild ID for the batch run, denoted by ID
(see Synthetic Data if you don’t know
what this means), you can get the mean and 1 standard deviation of the
meta-{val, test} RMSEs by running
python scripts/get_risk.py --guild_id ${ID}
which will print the results.
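Conceptually, the aggregation amounts to taking the mean and 1 sample standard deviation of the per-seed RMSEs. A minimal sketch with made-up numbers, not the actual code of scripts/get_risk.py:

```python
import statistics

# Hypothetical meta-test RMSEs from 5 seeds (illustrative values only).
rmses = [21.3, 22.1, 20.8, 21.9, 21.5]

mean = statistics.mean(rmses)
std = statistics.stdev(rmses)  # 1 sample standard deviation
print(f"meta-test RMSE: {mean:.2f} +/- {std:.2f}")  # prints: meta-test RMSE: 21.52 +/- 0.51
```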
To generate the plots run the algorithms on your dataset of choice. Consult
guild operations
to see how to run each algorithm on the dataset you want. The
plots can then be generated by running
python scripts/plot_learning_curves.py --mkl_id ${MKL_ID} \
--lsq_bias_id ${LSQ_BIAS_ID} \
--maml_id ${MAML_ID} \
--r2d2_id ${R2D2_ID} \
--gauss_id ${GAUSS_ID} \
--gauss_oracle_id ${GAUSS_ORACLE_ID} \
--bochner_id ${BOCHNER_ID} \
--y_upper_lim ${Y_UPPER_LIM} \
--y_lower_lim ${Y_LOWER_LIM} \
--output_dir ${OUTPUT_DIR}
where the IDs are the batch IDs generated from running Guild on the dataset over
a list of seeds. Note that leaving out an ID argument just leaves that
algorithm out of the plot, so it’s possible to plot a subset of the learning
curves. The --output_dir
argument is the name of the directory in plots
that the plots will be saved to; it will be created if it doesn’t exist. The
y-limit arguments allow you to recreate the plots of the paper: for Air Quality
the lower and upper limits are 10 and 60, while for Gas Sensor
they are 0 and 40.
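The subsetting and y-limit behaviour described above can be sketched as follows; the function and dictionary names here are illustrative, not the actual API of scripts/plot_learning_curves.py:

```python
# Paper y-axis limits per dataset (from the text above).
Y_LIMS = {"air_quality": (10, 60), "gas_sensor": (0, 40)}

def plot_spec(dataset: str, run_ids: dict) -> dict:
    """Return which curves to draw (omitted IDs are dropped) and the
    y-axis limits for the given dataset."""
    curves = [name for name, rid in run_ids.items() if rid is not None]
    lower, upper = Y_LIMS[dataset]
    return {"curves": curves, "y_lower_lim": lower, "y_upper_lim": upper}

spec = plot_spec("air_quality", {"mkl": "a1b2", "maml": None, "bochner": "c3d4"})
print(spec)  # -> {'curves': ['mkl', 'bochner'], 'y_lower_lim': 10, 'y_upper_lim': 60}
```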
If you want to ask a question or reach out to me, feel free to use my academic
email address [email protected]!
If you want to reference this work (please do!) use the following bibentry:
@inproceedings{falk2022implicit,
  title     = {Implicit kernel meta-learning using kernel integral forms},
  author    = {John Isak Texas Falk and Carlo Ciliberto and Massimiliano Pontil},
  booktitle = {The 38th Conference on Uncertainty in Artificial Intelligence},
  year      = {2022},
  url       = {https://openreview.net/forum?id=rNgqwPUsqgq}
}