Fix Docu of move_plans_between_dataset #2235

Merged · 3 commits · Jun 4, 2024
22 changes: 11 additions & 11 deletions documentation/pretraining_and_finetuning.md
@@ -2,7 +2,7 @@

## Intro

-So far nnU-Net only supports supervised pre-training, meaning that you train a regular nnU-Net on some source dataset
+So far nnU-Net only supports supervised pre-training, meaning that you train a regular nnU-Net on some pretraining dataset
and then use the final network weights as initialization for your target dataset.

As a reminder, many training hyperparameters such as patch size and network topology differ between datasets as a
@@ -16,11 +16,11 @@ how the resulting weights can then be used for initialization.

Throughout this README we use the following terminology:

-- `source dataset` is the dataset you intend to run the pretraining on
+- `pretraining dataset` is the dataset you intend to run the pretraining on (former: source dataset)
- `target dataset` is the dataset you are interested in; the one you wish to fine tune on


-## Pretraining on the source dataset
+## Training on the pretraining dataset

In order to obtain matching network topologies we need to transfer the plans from one dataset to another. Since we are
only interested in the target dataset, we first need to run experiment planning (and preprocessing) for it:
@@ -29,19 +29,19 @@ only interested in the target dataset, we first need to run experiment planning
nnUNetv2_plan_and_preprocess -d TARGET_DATASET
```

-Then we need to extract the dataset fingerprint of the source dataset, if not yet available:
+Then we need to extract the dataset fingerprint of the pretraining dataset, if not yet available:

```bash
-nnUNetv2_extract_fingerprint -d SOURCE_DATASET
+nnUNetv2_extract_fingerprint -d PRETRAINING_DATASET
```
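
The fingerprint is a small JSON summary of the dataset (shapes, spacings, intensity statistics) that experiment planning consumes. If you want to confirm it was written before moving any plans, a check along these lines should work, assuming the standard `nnUNet_preprocessed` layout; the dataset folder name is a placeholder:

```bash
# Placeholder folder name -- substitute your actual pretraining dataset folder.
ls "$nnUNet_preprocessed/DatasetXXX_Pretraining/dataset_fingerprint.json"
```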

-Now we can take the plans from the target dataset and transfer it to the source:
+Now we can take the plans from the target dataset and transfer them to the pretraining dataset:

```bash
-nnUNetv2_move_plans_between_datasets -s TARGET_DATASET -t SOURCE_DATASET -sp TARGET_PLANS_IDENTIFIER -tp SOURCE_PLANS_IDENTIFIER
+nnUNetv2_move_plans_between_datasets -s TARGET_DATASET -t PRETRAINING_DATASET -sp TARGET_PLANS_IDENTIFIER -tp PRETRAINING_PLANS_IDENTIFIER
```
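
Under the hood, this writes a plans file for the pretraining dataset derived from the target dataset's plans, stored under the identifier passed as `-tp`. A hedged spot check, assuming the standard `nnUNet_preprocessed` layout, placeholder folder names, and that your plans contain a `3d_fullres` configuration, is to compare the patch sizes of both files:

```bash
# Placeholder folder names; the two printed patch sizes should match.
python -c "
import json
moved = json.load(open('$nnUNet_preprocessed/DatasetXXX_Pretraining/PRETRAINING_PLANS_IDENTIFIER.json'))
target = json.load(open('$nnUNet_preprocessed/DatasetYYY_Target/nnUNetPlans.json'))
print(moved['configurations']['3d_fullres']['patch_size'])
print(target['configurations']['3d_fullres']['patch_size'])
"
```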

-`SOURCE_PLANS_IDENTIFIER` is hereby probably nnUNetPlans unless you changed the experiment planner in
-nnUNetv2_plan_and_preprocess. For `TARGET_PLANS_IDENTIFIER` we recommend you set something custom in order to not
+`TARGET_PLANS_IDENTIFIER` is most likely nnUNetPlans, unless you changed the experiment planner in
+nnUNetv2_plan_and_preprocess. For `PRETRAINING_PLANS_IDENTIFIER` we recommend you set something custom in order to not
overwrite default plans.

@@ -51,16 +51,16 @@ work well (but it could, depending on the schemes!).

Note on CT normalization: Yes, also the clip values, mean and std are transferred!
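
If you want to see exactly which normalization statistics were carried over, they can be read straight out of the transferred plans file. A minimal sketch, assuming the plans JSON uses the `foreground_intensity_properties_per_channel` key found in current nnU-Net v2 plans files (adjust the placeholder path, and the key if your version differs):

```bash
# Prints the per-channel statistics (mean, std, clip percentiles) used for CT normalization.
python -c "
import json
plans = json.load(open('$nnUNet_preprocessed/DatasetXXX_Pretraining/PRETRAINING_PLANS_IDENTIFIER.json'))
print(json.dumps(plans['foreground_intensity_properties_per_channel'], indent=2))
"
```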

-Now you can run the preprocessing on the source task:
+Now you can run the preprocessing on the pretraining dataset:

```bash
-nnUNetv2_preprocess -d SOURCE_DATSET -plans_name TARGET_PLANS_IDENTIFIER
+nnUNetv2_preprocess -d PRETRAINING_DATASET -plans_name PRETRAINING_PLANS_IDENTIFIER
```

And run the training as usual:

```bash
-nnUNetv2_train SOURCE_DATSET CONFIG all -p TARGET_PLANS_IDENTIFIER
+nnUNetv2_train PRETRAINING_DATASET CONFIG all -p PRETRAINING_PLANS_IDENTIFIER
```

Note how we use the 'all' fold to train on all available data. For pretraining it does not make sense to split the data.
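
Once this training has finished, its final checkpoint serves as the initialization for the target dataset, as described in the remainder of this document. As a sketch of that next step, assuming the usual `nnUNet_results` folder layout and placeholder dataset and trainer names, the fine-tuning run points `nnUNetv2_train` at the pretrained weights:

```bash
# Placeholder ids and paths -- adapt dataset, CONFIG, fold and trainer name to your setup.
nnUNetv2_train TARGET_DATASET CONFIG FOLD \
  -pretrained_weights "$nnUNet_results/DatasetXXX_Pretraining/nnUNetTrainer__PRETRAINING_PLANS_IDENTIFIER__CONFIG/fold_all/checkpoint_final.pth"
```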