# HUB: Holistic Unlearning Benchmark

This repository contains the original code for *Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning*.

## News

- [2024.10.29] We released HUB: Holistic Unlearning Benchmark 🔥


## Environment setup

### Installation

To set up the environment, follow these steps:

1. Clone the repository:

   ```bash
   git clone https://github.com/ml-postech/HUB.git
   cd HUB
   ```

2. Create and activate the conda environment:

   ```bash
   conda env create -f environment.yaml
   conda activate HUB
   ```

### Update `envs.py`

Before running the code, make sure to update `envs.py` with the correct file paths, GPT API configurations, and any other parameters specific to your environment.
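The exact variable names are defined in `envs.py` itself; as a rough illustration only (every name and value below is a placeholder, not the repository's actual setting), the kinds of entries to fill in might look like this:

```python
# Placeholder values for illustration only; use the actual variable names defined in envs.py.
PROJECT_DIR = "/path/to/HUB"              # root of this repository
MODEL_DIR = f"{PROJECT_DIR}/models/sd"    # where unlearned model checkpoints are stored
RESULT_DIR = f"{PROJECT_DIR}/results"     # where generated images and logs are written

OPENAI_API_KEY = "sk-..."                 # API key used for the GPT-based evaluation
GPT_MODEL = "gpt-4"                       # GPT model used to judge generated images
```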

### Download pre-trained models and datasets


## Evaluation

### Run the evaluation

To run the evaluation, execute the following command:

```bash
python main.py --config YOUR_CONFIG.yaml
```

### Batch to log

We use GPT-4 to evaluate whether the generated images contain specific concepts. For more efficient evaluation, we use the batch API. After sending the queries and the evaluations are complete, run the following command to organize the logs and results. Replace `NUM_BATCHES` with the number of batches you want to evaluate.

```bash
python batch2log.py --num_batches NUM_BATCHES
```
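For readers unfamiliar with the batch workflow, the following is a rough, hedged sketch of what retrieving finished batches from the OpenAI Batch API involves. This is not the repository's code: the `batch_ids.txt` file and the output file names are assumptions made purely for illustration.

```python
# Illustrative sketch of polling OpenAI batches and saving their JSONL results.
# Assumes batch IDs were recorded in batch_ids.txt when the queries were submitted.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

with open("batch_ids.txt") as f:
    batch_ids = [line.strip() for line in f if line.strip()]

for batch_id in batch_ids:
    batch = client.batches.retrieve(batch_id)
    if batch.status != "completed":
        print(f"{batch_id}: status={batch.status}, not ready yet")
        continue
    # Download the JSONL output produced by the batch job.
    content = client.files.content(batch.output_file_id)
    with open(f"{batch_id}_results.jsonl", "w") as out:
        out.write(content.text)
```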

### How to evaluate your own model?

For now, we support the following six unlearning methods: AC, SA, SalUn, UCE, ESD, and Receler. To evaluate your own model, modify `model.__init__.py` to include the loading of your custom model. We recommend that you place your model in `models/sd/YOUR_METHOD/` (a rough sketch follows below).
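As a sketch of what such a loading branch could look like, assuming a `diffusers` Stable Diffusion pipeline and a UNet checkpoint stored under `models/sd/YOUR_METHOD/`. The function name, base model, and file layout are illustrative assumptions, not the repository's actual interface.

```python
# Hypothetical loading sketch; adapt it to the actual structure of the package init file.
import torch
from diffusers import StableDiffusionPipeline

def load_custom_model(target: str, device: str = "cuda"):
    # Start from the base Stable Diffusion weights.
    pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
    # Overwrite the UNet with your unlearned checkpoint for the target concept.
    state_dict = torch.load(f"models/sd/YOUR_METHOD/{target}.pt", map_location="cpu")
    pipe.unet.load_state_dict(state_dict)
    return pipe.to(device)
```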


### How to evaluate each task individually?

To evaluate each task separately, follow these commands, replacing the variables according to the settings you want to evaluate. Make sure to execute the command below before evaluating each task (replace `PROJECT_DIR` with the path to this repository):

```bash
export PYTHONPATH=$PYTHONPATH:PROJECT_DIR
```

#### Effectiveness, Faithfulness, Compliance, and Over-erasing effect

For the effectiveness, faithfulness, and compliance tasks, we first have to generate images with the command below. `TASK` should be one of `simple_prompt`, `diverse_prompt`, `MS-COCO`, `selective_alignment`, or `over_erasing`.

```bash
python source/image_generation.py --task TASK --method METHOD --target TARGET
```

After generating images, execute the command below:

```bash
python source/utils/evaluation_batch.py --task TASK --method METHOD --target TARGET --seed SEED
```

#### Side effects: Model bias

```bash
python source/bias/bias.py \
    --method METHOD \
    --target TARGET \
    --batch_size BATCH_SIZE \
    --device DEVICE \
    --seed SEED
```

#### Downstream application

##### Sketch-to-image

```bash
python source/image_translation/image_translation.py \
    --method METHOD \
    --target TARGET \
    --task sketch2image \
    --device DEVICE \
    --seed SEED
```

##### Image-to-image

```bash
python source/image_translation/image_translation.py \
    --method METHOD \
    --target TARGET \
    --task image2image \
    --device DEVICE \
    --seed SEED
```

#### Concept restoration

```bash
# For START_T_IDX, check the description in the code.
python source/concept_restoration/concept_restoration.py \
    --method METHOD \
    --target TARGET \
    --start_t_idx START_T_IDX \
    --device DEVICE \
    --seed SEED
```

After generating images, execute the command below. `TASK` should be one of `sketch2image`, `image2image`, or `concept_restoration`.

```bash
python source/utils/evaluation_batch.py --task TASK --method METHOD --target TARGET --seed SEED
```

## Citation

```bibtex
@article{moon2024holistic,
    title={Holistic Unlearning Benchmark: A Multi-Faceted Evaluation for Text-to-Image Diffusion Model Unlearning},
    author={Moon, Saemi and Lee, Minjong and Park, Sangdon and Kim, Dongwoo},
    journal={arXiv preprint arXiv:2410.05664},
    year={2024}
}
```