Development (#138)
* fix: typo

* build: update scripts

* ignore python

* docs: update install instructions

* docs: update install instructions

* build: fix makefile typo

* docs: fix typos

* build: updates for hpobench installation

* build: updates for hpobench installation

* Add logo to README.md + docs; adapt docs theme colors to logo

* Actually add the logo file

* fix: changelog entry for initial version

* add running data to readme (#139)

* add running data to readme

* Update README.md

---------

Co-authored-by: benjamc <[email protected]>

* Update plotting

---------

Co-authored-by: benjamc <[email protected]>
Co-authored-by: Helena Graf <[email protected]>
Co-authored-by: Difan Deng <[email protected]>
4 people authored Jun 11, 2024
1 parent 652abf4 commit 9f163c2
Showing 18 changed files with 1,572 additions and 141 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -1 +1,3 @@
# Version 0.0.1
# Version 0.1.0

- Initial version of CARP-S.
6 changes: 3 additions & 3 deletions Makefile
@@ -42,8 +42,8 @@ clean-build:

# Build a distribution in ./dist
build:
$(PYTHON) -m pip install build
$(PYTHON) -m build --sdist
$(PYTHON) -m pip install build
$(PYTHON) -m build --sdist

# Publish to testpypi
# Will echo the commands to actually publish to be run to publish to actual PyPi
@@ -55,7 +55,7 @@ publish: clean-build build
$(PYTHON) -m twine upload --repository testpypi ${DIST}/*
@echo
@echo "Test with the following:"
@echo "* Create a new virtual environment to install the uplaoded distribution into"
@echo "* Create a new virtual environment to install the uploaded distribution into"
@echo "* Run the following:"
@echo "--- pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple/ ${PACKAGE_NAME}==${VERSION}"
@echo
42 changes: 41 additions & 1 deletion README.md
@@ -1,3 +1,5 @@
<img src="docs/images/carps_Logo_wide.png" alt="Logo"/>

# CARP-S
Welcome to CARP-S!
This repository contains a benchmarking framework for optimizers.
@@ -14,6 +16,9 @@ For more details on CARP-S, please have a look at the
[documentation](https://AutoML.github.io/CARP-S/latest/).

## Installation

### Installation from PyPI

To install CARP-S, you can simply use `pip`:

```bash
@@ -32,7 +37,7 @@ pip install carps[smac,bbob]

All possible install options for benchmarks are:
```bash
dummy,bhob,hpob,hpobench,mfpbench,pymoo,yahpoo
dummy,bhob,hpob,mfpbench,pymoo,yahpo
```

All possible install options for optimizers are:
@@ -43,6 +48,8 @@ dummy,dehb,hebo,nevergrad,optuna,skopt,smac,smac14,synetune
Please note that installing all requirements for all benchmarks and optimizers in a single
environment will not be possible due to conflicting dependencies.
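
A common workaround (a sketch only; the environment name and Python version below are assumptions, not taken from this diff) is to keep one environment per optimizer-benchmark pair:
```bash
# One fresh environment per optimizer-benchmark combination (names are illustrative).
conda create -n carps_smac_bbob python=3.10 -y
conda activate carps_smac_bbob
pip install "carps[smac,bbob]"
```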

### Installation from Source

If you want to install from source, you can clone the repository and install CARP-S via:

```bash
@@ -65,6 +72,31 @@ If you want to install CARP-S for development, you can use the following command
make install-dev
```

### Additional Steps for Benchmarks

For HPOBench, it is necessary to install the requirements via:
```bash
bash container_recipes/benchmarks/HPOBench/install_HPOBench.sh
```
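
The script also accepts an optional conda environment name as its first argument (see the script added later in this commit). A minimal usage sketch, assuming you simply run it inside an already activated environment (the environment name here is illustrative):
```bash
# Sketch: with the target environment already active, the optional
# conda-environment argument can be omitted and everything installs into it.
conda activate carps_hpobench   # illustrative environment name
bash container_recipes/benchmarks/HPOBench/install_HPOBench.sh
```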

For some benchmarks, it is necessary to download data,
such as surrogate models, in order to run the benchmark:

- For HPOB, you can download the surrogate benchmarks with
```bash
bash container_recipes/benchmarks/HPOB/download_data.sh
```

- For MFPBench, you can download the surrogate benchmarks with
```bash
bash container_recipes/benchmarks/MFPBench/download_data.sh
```

- For YAHPO, you can download the required surrogate benchmarks and meta-data with
```bash
bash container_recipes/benchmarks/YAHPO/prepare_yahpo.sh
```

## Minimal Example
Once the requirements for both an optimizer and a benchmark, e.g. `SMAC2.0` and `BBOB`, are installed, you can run
one of the following minimal examples to benchmark `SMAC2.0` on `BBOB` directly with Hydra:
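
The concrete example commands are collapsed in this view; purely as an illustrative sketch (the `carps.run` entry point and the Hydra config names are assumptions, not read from this diff), such a call could look like:
```bash
# Illustrative only: pick an optimizer and a benchmark problem via Hydra overrides.
python -m carps.run +optimizer/smac20=blackbox +problem/BBOB=cfg_4_1_4_0 seed=1
```
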
@@ -125,3 +157,11 @@ guidelines for
[benchmarks](https://automl.github.io/CARP-S/latest/contributing/contributing-a-benchmark/)
and
[optimizers](https://automl.github.io/CARP-S/latest/contributing/contributing-an-optimizer/).
## Evaluation Results
For each scenario (blackbox, multi-fidelity, multi-objective and multi-fidelity-multi-objective) and set (dev and test), we run selected optimizers and provide the data.
Here we provide links to the [meta data](https://drive.google.com/file/d/17pn48ragmWsyRC39sInsh2fEPUHP3BRT/view?usp=sharing),
which contains the detailed optimization settings for each run,
and to the [running results](https://drive.google.com/file/d/1yzJRbwRvdLbpZ9SdQN2Vk3yQSdDP_vck/view?usp=drive_link),
which record the outcome of each optimizer-benchmark combination.
18 changes: 17 additions & 1 deletion carps/analysis/performance_over_time.py
@@ -5,12 +5,21 @@
import matplotlib
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

from carps.analysis.utils import get_color_palette, savefig, setup_seaborn
from carps.analysis.utils import filter_only_final_performance


if TYPE_CHECKING:
import pandas as pd

def get_order_by_mean(df: pd.DataFrame) -> list[str]:
final_df = filter_only_final_performance(df)
reduced = final_df.groupby(by="optimizer_id")["trial_value__cost_inc_norm"].apply(np.nanmean)
reduced = reduced.sort_values()
return reduced.index.tolist()


def plot_performance_over_time(
df: pd.DataFrame,
@@ -20,9 +29,12 @@ def plot_performance_over_time(
figure_filename: str = "figures/performance_over_time.pdf",
figsize: tuple[int, int] = (6, 4),
show_legend: bool = True,
title: str | None = None,
**lineplot_kwargs,
) -> tuple[plt.Figure, matplotlib.axes.Axes]:
setup_seaborn(font_scale=1.5)
sorter = get_order_by_mean(df=df)
df = df.sort_values(by="optimizer_id", key=lambda column: column.map(lambda e: sorter.index(e)))
palette = get_color_palette(df)
fig = plt.figure(figsize=figsize)
ax = fig.add_subplot(111)
@@ -31,6 +43,8 @@
ax.legend(loc="center left", bbox_to_anchor=(1.05, 0.5))
else:
ax.get_legend().remove()
if title is not None:
ax.set_title(title)
savefig(fig, figure_filename)
return fig, ax

@@ -40,13 +54,15 @@ def plot_rank_over_time(
x="n_trials_norm",
y="cost_inc_norm",
hue="optimizer_id",
figure_filename: str = "figures/performance_over_time.pdf",
figure_filename: str = "figures/performance_over_time",
figsize: tuple[int, int] = (6, 4),
show_legend: bool = True,
**lineplot_kwargs,
) -> tuple[plt.Figure, matplotlib.axes.Axes]:
# TODO
setup_seaborn(font_scale=1.5)
sorter = get_order_by_mean(df=df)
df = df.sort_values(by="optimizer_id", key=lambda column: column.map(lambda e: sorter.index(e)))
palette = get_color_palette(df)
fig = plt.figure(figsize=figsize)
ax = fig.add_subplot(111)
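For readers skimming this file's diff: the ordering logic introduced by `get_order_by_mean` (sort optimizers by their mean final normalized incumbent cost, ascending) can be reproduced as a standalone sketch; the column names follow the function above, and the data values are invented for illustration:

```python
import numpy as np
import pandas as pd

# Invented final-performance rows: one normalized incumbent cost per run.
final_df = pd.DataFrame({
    "optimizer_id": ["SMAC", "SMAC", "RandomSearch", "RandomSearch"],
    "trial_value__cost_inc_norm": [0.20, np.nan, 0.60, 0.50],
})

# Same ordering as get_order_by_mean: NaN-ignoring mean per optimizer, sorted ascending.
order = (
    final_df.groupby("optimizer_id")["trial_value__cost_inc_norm"]
    .apply(np.nanmean)
    .sort_values()
    .index.tolist()
)
print(order)  # ['SMAC', 'RandomSearch'] -> best (lowest mean cost) first
```
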
114 changes: 60 additions & 54 deletions carps/analysis/plot_ranking.py
@@ -27,56 +27,56 @@ def plot_ranking(
identifier = f"{scenario}_{set_id}"
label = f"tab:stat_results_{identifier}"
result = calc_critical_difference(gdf, identifier=identifier, figsize=(8, 3), perf_col=perf_col)
print(result)
try:
create_report(result)
except Exception as e:
print(e)
table_str = custom_latex_table(result, label=label)
fn = Path("figures/critd/" + label[len("tab:") :] + ".tex")
fn.write_text(table_str)
print(table_str)
plt.show()
# print(result)
# try:
# create_report(result)
# except Exception as e:
# print(e)
# table_str = custom_latex_table(result, label=label)
# fn = Path("figures/critd/" + label[len("tab:") :] + ".tex")
# fn.write_text(table_str)
# print(table_str)
# plt.show()

sorted_ranks, names, groups = get_sorted_rank_groups(result, reverse=False)
print(sorted_ranks, names, groups)

# DF on normalized perf values
df_crit = get_df_crit(gdf, nan_handling="keep", perf_col=perf_col)
df_crit = df_crit.reindex(columns=names)
df_crit.index = [i.replace(problem_prefix + "/dev/", "") for i in df_crit.index]
df_crit.index = [i.replace(problem_prefix + "/test/", "") for i in df_crit.index]
plt.figure(figsize=(12, 12))
sns.heatmap(df_crit, annot=False, fmt="g", cmap="viridis_r")
plt.title("Performance of Optimizers per Problem (Normalized)")
plt.ylabel("Problem ID")
plt.xlabel("Optimizer")
savefig(plt.gcf(), fpath / f"perf_opt_per_problem_{identifier}")
plt.show()

# Df on raw values
# Optionally, plot the ranked data as a heatmap
df_crit = get_df_crit(gdf, nan_handling="keep", perf_col=perf_col)
df_crit = df_crit.reindex(columns=names)
df_crit.index = [i.replace(problem_prefix + "/dev/", "") for i in df_crit.index]
df_crit.index = [i.replace(problem_prefix + "/test/", "") for i in df_crit.index]
ranked_df = df_crit.rank(axis=1, method="min", ascending=True)

plt.figure(figsize=(12, 12))
sns.heatmap(ranked_df, annot=True, fmt="g", cmap="viridis_r")
plt.title("Ranking of Optimizers per Problem")
plt.ylabel("Problem ID")
plt.xlabel("Optimizer")
savefig(plt.gcf(), fpath / f"rank_opt_per_problem_{identifier}")
plt.show()

# Plotting the heatmap of the rank correlation matrix
correlation_matrix = ranked_df.corr(method="spearman")
plt.figure(figsize=(8, 6))
sns.heatmap(correlation_matrix, annot=True, cmap="coolwarm", cbar=True, square=True, fmt=".2f")
plt.title("Spearman Rank Correlation Matrix Between Optimizers")
savefig(plt.gcf(), fpath / f"spearman_rank_corr_matrix_opt_{identifier}")
plt.show()
# print(sorted_ranks, names, groups)

# # DF on normalized perf values
# df_crit = get_df_crit(gdf, nan_handling="keep", perf_col=perf_col)
# df_crit = df_crit.reindex(columns=names)
# df_crit.index = [i.replace(problem_prefix + "/dev/", "") for i in df_crit.index]
# df_crit.index = [i.replace(problem_prefix + "/test/", "") for i in df_crit.index]
# plt.figure(figsize=(12, 12))
# sns.heatmap(df_crit, annot=False, fmt="g", cmap="viridis_r")
# plt.title("Performance of Optimizers per Problem (Normalized)")
# plt.ylabel("Problem ID")
# plt.xlabel("Optimizer")
# savefig(plt.gcf(), fpath / f"perf_opt_per_problem_{identifier}")
# plt.show()

# # Df on raw values
# # Optionally, plot the ranked data as a heatmap
# df_crit = get_df_crit(gdf, nan_handling="keep", perf_col=perf_col)
# df_crit = df_crit.reindex(columns=names)
# df_crit.index = [i.replace(problem_prefix + "/dev/", "") for i in df_crit.index]
# df_crit.index = [i.replace(problem_prefix + "/test/", "") for i in df_crit.index]
# ranked_df = df_crit.rank(axis=1, method="min", ascending=True)

# plt.figure(figsize=(12, 12))
# sns.heatmap(ranked_df, annot=True, fmt="g", cmap="viridis_r")
# plt.title("Ranking of Optimizers per Problem")
# plt.ylabel("Problem ID")
# plt.xlabel("Optimizer")
# savefig(plt.gcf(), fpath / f"rank_opt_per_problem_{identifier}")
# plt.show()

# # Plotting the heatmap of the rank correlation matrix
# correlation_matrix = ranked_df.corr(method="spearman")
# plt.figure(figsize=(8, 6))
# sns.heatmap(correlation_matrix, annot=True, cmap="coolwarm", cbar=True, square=True, fmt=".2f")
# plt.title("Spearman Rank Correlation Matrix Between Optimizers")
# savefig(plt.gcf(), fpath / f"spearman_rank_corr_matrix_opt_{identifier}")
# plt.show()


# combined
@@ -85,11 +85,13 @@
right = 0.6
h = 0.9
w = 1
factor = 15
factor = 30
hspace = 0.3
wspace = 0.7

fig = plt.figure(layout=None, facecolor="white", figsize=(w * factor, h * factor))
gs = fig.add_gridspec(nrows=nrows, ncols=ncols, left=0.05, right=right,
hspace=0.3, wspace=0.3
hspace=hspace, wspace=wspace
)

# Perf per problem (normalized)
@@ -98,13 +100,15 @@
df_crit = df_crit.reindex(columns=names)
df_crit.index = [i.replace(problem_prefix + "/dev/", "") for i in df_crit.index]
df_crit.index = [i.replace(problem_prefix + "/test/", "") for i in df_crit.index]
ax0 = sns.heatmap(df_crit, annot=False, fmt="g", cmap="viridis_r", ax=ax0)
ax0.set_title("Performance of Optimizers per Problem (Normalized)")
ax0 = sns.heatmap(df_crit, annot=False, fmt="g", cmap="viridis_r", ax=ax0, cbar_kws={"shrink": 0.8, "aspect": 30})
ax0.set_title("Final Performance per Problem (Normalized)")
ax0.set_ylabel("Problem ID")
ax0.set_xlabel("Optimizer")


df_finalperf = filter_only_final_performance(df=gdf)
sorter = names
df_finalperf = df_finalperf.sort_values(by="optimizer_id", key=lambda column: column.map(lambda e: sorter.index(e)))
palette = get_color_palette(df=gdf)
ax1 = fig.add_subplot(gs[0, -1])
x="n_trials_norm"
@@ -115,20 +119,22 @@
ax1 = sns.boxplot(
data=df_finalperf, y=y, x=x, hue=hue, palette=palette, ax=ax1
)
ax1.set_title("Final Performance (Normalized)")

ax2 = fig.add_subplot(gs[1, -1])
ax2 = sns.violinplot(
data=df_finalperf, y=y, x=x, hue=hue, palette=palette, ax=ax2, cut=0
)
ax2.set_title("Final Performance (Normalized)")

# Spearman rank correlation
ax3 = fig.add_subplot(gs[2, -1])
ranked_df = df_crit.rank(axis=1, method="min", ascending=True)
correlation_matrix = ranked_df.corr(method="spearman")
ax3 = sns.heatmap(correlation_matrix, annot=True, cmap="coolwarm", cbar=True, square=True, fmt=".2f", ax=ax3)
ax3.set_title("Spearman Rank Correlation Matrix Between Optimizers")
ax3.set_title("Spearman Rank Correlation Matrix\nBetween Optimizers")

fig.set_tight_layout(True)
# fig.set_tight_layout(True)

savefig(fig, fpath / f"final_per_combined_{identifier}")

17 changes: 17 additions & 0 deletions container_recipes/benchmarks/HPOBench/install_HPOBench.sh
@@ -0,0 +1,17 @@
#!/bin/bash

CONDA_ENV_NAME=$1
if [ -z "$CONDA_ENV_NAME" ]
then
CONDA_RUN_COMMAND=
else
CONDA_RUN_COMMAND="${CONDA_COMMAND} run ${CONDA_ENV_NAME}"
fi
$CONDA_RUN_COMMAND pip install git+https://github.com/automl/HPOBench.git --ignore-requires-python
$CONDA_RUN_COMMAND pip install tqdm
$CONDA_RUN_COMMAND pip install pandas==1.2.4
$CONDA_RUN_COMMAND pip install Cython==0.29.36
$CONDA_RUN_COMMAND pip install scikit-learn==0.24.2 --no-build-isolation # <- no build isolation is important
$CONDA_RUN_COMMAND pip install openml==0.12.2
$CONDA_RUN_COMMAND pip install xgboost==1.3.1
$CONDA_RUN_COMMAND pip install ConfigSpace #==0.6.1

This file was deleted.

14 changes: 12 additions & 2 deletions container_recipes/benchmarks/MFPBench/download_data.sh
@@ -1,2 +1,12 @@
python -m mfpbench download --status --data-dir data
python -m mfpbench download --benchmark pd1
#!/bin/bash

CONDA_ENV_NAME=$1
if [ -z "$CONDA_ENV_NAME" ]
then
CONDA_RUN_COMMAND=
else
CONDA_RUN_COMMAND="${CONDA_COMMAND} run ${CONDA_ENV_NAME}"
fi

$CONDA_RUN_COMMAND python -m mfpbench download --status --data-dir data
$CONDA_RUN_COMMAND python -m mfpbench download --benchmark pd1