Merge branch 'MIC-DKFZ:master' into master
TaWald authored Oct 11, 2023
2 parents 4482ac1 + de48541 commit 7e537d5
Showing 52 changed files with 391 additions and 156 deletions.
22 changes: 22 additions & 0 deletions .github/workflows/codespell.yml
@@ -0,0 +1,22 @@
---
name: Codespell

on:
  push:
    branches: [master]
  pull_request:
    branches: [master]

permissions:
  contents: read

jobs:
  codespell:
    name: Check for spelling errors
    runs-on: ubuntu-latest

    steps:
      - name: Checkout
        uses: actions/checkout@v3
      - name: Codespell
        uses: codespell-project/actions-codespell@v2
129 changes: 129 additions & 0 deletions documentation/competitions/AutoPETII.md
@@ -0,0 +1,129 @@
# Look Ma, no code: fine tuning nnU-Net for the AutoPET II challenge by only adjusting its JSON plans

Please cite our paper :-*

```text
COMING SOON
```

## Intro

See the [Challenge Website](https://autopet-ii.grand-challenge.org/) for details on the challenge.

Our solution to this challenge requires no code changes at all. All we do is optimize nnU-Net's hyperparameters
(architecture, batch size, patch size) by modifying the nnUNetPlans.json file.

## Prerequisites
Use the latest PyTorch version!

We recommend you use the latest nnU-Net version as well! We ran our trainings with commit 913705f which you can try in case something doesn't work as expected:
`pip install git+https://github.com/MIC-DKFZ/nnUNet.git@913705f`

## How to reproduce our trainings

### Download and convert the data
1. Download and extract the AutoPET II dataset
2. Convert it to nnU-Net format by running `python nnunetv2/dataset_conversion/Dataset221_AutoPETII_2023.py FOLDER`, where FOLDER is the extracted AutoPET II dataset.

### Experiment planning and preprocessing
We deviate a little from the standard nnU-Net procedure because all our experiments are based on just the 3d_fullres configuration.

Run the following commands:
- `nnUNetv2_extract_fingerprint -d 221` extracts the dataset fingerprint
- `nnUNetv2_plan_experiment -d 221` does the planning for the plain unet
- `nnUNetv2_plan_experiment -d 221 -pl ResEncUNetPlanner` does the planning for the residual encoder unet
- `nnUNetv2_preprocess -d 221 -c 3d_fullres` runs all the preprocessing we need

### Modification of plans files
Please read the [information on how to modify plans files](../explanation_plans_files.md) first!!!


It is easier to have everything in one plans file, so the first thing we do is transfer the ResEnc UNet to the
default plans file. We use the configuration inheritance feature of nnU-Net to make it use the same data as the
3d_fullres configuration.
Add the following to the 'configurations' dict in 'nnUNetPlans.json':

```json
"3d_fullres_resenc": {
"inherits_from": "3d_fullres",
"UNet_class_name": "ResidualEncoderUNet",
"n_conv_per_stage_encoder": [
1,
3,
4,
6,
6,
6
],
"n_conv_per_stage_decoder": [
1,
1,
1,
1,
1
]
},
```

(These values are basically just copied from the 'nnUNetResEncUNetPlans.json' file, with everything redundant omitted thanks to inheritance from 3d_fullres.)
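
To illustrate what inheritance means here, the following Python sketch resolves an `inherits_from` chain by layering each child's keys over its parent's. It is not nnU-Net's actual implementation: `resolve_configuration` is a hypothetical helper, and the values in the toy `3d_fullres` entry are placeholders rather than real planned values (the third entry anticipates the batch-size variant added in the next step).

```python
import copy


def resolve_configuration(configurations: dict, name: str) -> dict:
    """Return configuration `name` with all inherited keys filled in (illustrative only)."""
    config = copy.deepcopy(configurations[name])
    parent_name = config.pop("inherits_from", None)
    if parent_name is None:
        return config
    # Resolve the parent first (it may itself inherit), then let the child override it.
    resolved = resolve_configuration(configurations, parent_name)
    resolved.update(config)
    return resolved


# Toy plans: the "3d_fullres" values are placeholders, not real planned values.
configurations = {
    "3d_fullres": {
        "batch_size": 2,
        "patch_size": [128, 128, 128],
        "UNet_class_name": "PlainConvUNet",
    },
    "3d_fullres_resenc": {
        "inherits_from": "3d_fullres",
        "UNet_class_name": "ResidualEncoderUNet",
    },
    "3d_fullres_resenc_bs80": {
        "inherits_from": "3d_fullres_resenc",
        "batch_size": 80,
    },
}

resolved = resolve_configuration(configurations, "3d_fullres_resenc_bs80")
print(resolved)
```

The child only states what differs, so the resolved configuration keeps the parent's patch size while picking up the new architecture and batch size.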

Now we crank up the patch and batch sizes. Add the following configurations:
```json
"3d_fullres_resenc_bs80": {
"inherits_from": "3d_fullres_resenc",
"batch_size": 80
},
"3d_fullres_resenc_192x192x192_b24": {
"inherits_from": "3d_fullres_resenc",
"patch_size": [
192,
192,
192
],
"batch_size": 24
}
```

Save the file (and check for potential syntax errors!)
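
If you would rather not edit the JSON by hand, the same change can be scripted. The sketch below is one possible way to do it using only Python's standard `json` module. `add_configurations` is a hypothetical helper, not part of nnU-Net, and the demo writes to a temporary stand-in file rather than your real `nnUNetPlans.json`. Loading the file with `json.loads` doubles as the syntax check, since it raises `json.JSONDecodeError` on malformed input.

```python
import json
import tempfile
from pathlib import Path


def add_configurations(plans_file: Path, new_configs: dict) -> dict:
    """Add entries to the 'configurations' dict of a plans file (hypothetical helper)."""
    # json.loads raises json.JSONDecodeError if the file has a syntax error.
    plans = json.loads(plans_file.read_text())
    configurations = plans["configurations"]
    for name, config in new_configs.items():
        parent = config.get("inherits_from")
        # Catch typos in 'inherits_from' before anything is written.
        if parent is not None and parent not in configurations and parent not in new_configs:
            raise KeyError(f"{name} inherits from unknown configuration {parent}")
        configurations[name] = config
    plans_file.write_text(json.dumps(plans, indent=4))
    return plans


new_configs = {
    "3d_fullres_resenc_bs80": {"inherits_from": "3d_fullres_resenc", "batch_size": 80},
    "3d_fullres_resenc_192x192x192_b24": {
        "inherits_from": "3d_fullres_resenc",
        "patch_size": [192, 192, 192],
        "batch_size": 24,
    },
}

with tempfile.TemporaryDirectory() as tmp:
    # Stand-in plans file with placeholder content, for demonstration only.
    plans_file = Path(tmp) / "nnUNetPlans.json"
    plans_file.write_text(json.dumps({
        "configurations": {
            "3d_fullres": {},
            "3d_fullres_resenc": {"inherits_from": "3d_fullres"},
        }
    }))
    updated = add_configurations(plans_file, new_configs)
    reloaded = json.loads(plans_file.read_text())

print(sorted(reloaded["configurations"]))
```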

### Run trainings
Training each model requires 8 Nvidia A100 40GB GPUs. Expect training to run for 5-7 days. You'll need a really good
CPU to handle the data augmentation! 128C/256T are a must! If you have fewer threads available, scale down nnUNet_n_proc_DA accordingly.

```bash
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 0 -num_gpus 8
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 1 -num_gpus 8
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 2 -num_gpus 8
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 3 -num_gpus 8
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 4 -num_gpus 8

nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 0 -num_gpus 8
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 1 -num_gpus 8
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 2 -num_gpus 8
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 3 -num_gpus 8
nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 4 -num_gpus 8
```

Done!

(We also provide pretrained weights in case you don't want to invest the GPU resources, see below)
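
The ten training commands in this section follow a single pattern: two configurations times five folds. As a convenience, here is a small Python sketch that generates them. `build_train_commands` is a hypothetical helper, not an nnU-Net utility. It only builds the command strings, using the environment variables and arguments exactly as they appear in the commands of this section, and leaves running them to you.

```python
from itertools import product


def build_train_commands(dataset_id, configurations, folds, num_gpus=8, n_proc_da=28):
    """Build one nnUNetv2_train command string per (configuration, fold) pair."""
    commands = []
    for configuration, fold in product(configurations, folds):
        commands.append(
            f"nnUNet_compile=T nnUNet_n_proc_DA={n_proc_da} "
            f"nnUNetv2_train {dataset_id} {configuration} {fold} -num_gpus {num_gpus}"
        )
    return commands


commands = build_train_commands(
    221,
    ["3d_fullres_resenc_bs80", "3d_fullres_resenc_192x192x192_b24"],
    range(5),
)
for command in commands:
    print(command)
```

Lowering `n_proc_da` here is the place to scale down data augmentation workers on machines with fewer CPU threads.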

## How to make predictions with pretrained weights
Our final model is an ensemble of two configurations:
- ResEnc UNet with batch size 80
- ResEnc UNet with patch size 192x192x192 and batch size 24

To run inference with these models, do the following:

1. Download the pretrained model weights from [Zenodo](https://zenodo.org/record/8362371)
2. Install both .zip files using `nnUNetv2_install_pretrained_model_from_zip`
3. Make sure
4. Now you can run inference on new cases with `nnUNetv2_predict`:
- `nnUNetv2_predict -i INPUT -o OUTPUT1 -d 221 -c 3d_fullres_resenc_bs80 -f 0 1 2 3 4 -step_size 0.6 --save_probabilities`
- `nnUNetv2_predict -i INPUT -o OUTPUT2 -d 221 -c 3d_fullres_resenc_192x192x192_b24 -f 0 1 2 3 4 --save_probabilities`
- `nnUNetv2_ensemble -i OUTPUT1 OUTPUT2 -o OUTPUT_ENSEMBLE`
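
To make the role of `--save_probabilities` concrete, here is a toy Python sketch of probability-level ensembling. It assumes the ensemble averages the per-class probabilities of both models and then takes the most likely class per voxel. This is an illustration of the idea on hand-made numbers, not the code behind `nnUNetv2_ensemble`, and `ensemble_probabilities` is a hypothetical name.

```python
def ensemble_probabilities(model_outputs):
    """Average per-class probabilities across models, then take the argmax per voxel.

    model_outputs: one entry per model, each a list (per class) of per-voxel probabilities.
    """
    num_models = len(model_outputs)
    num_classes = len(model_outputs[0])
    num_voxels = len(model_outputs[0][0])
    mean_probs = [
        [sum(m[c][v] for m in model_outputs) / num_models for v in range(num_voxels)]
        for c in range(num_classes)
    ]
    segmentation = [
        max(range(num_classes), key=lambda c: mean_probs[c][v])
        for v in range(num_voxels)
    ]
    return mean_probs, segmentation


# Two classes (0 = background, 1 = tumor), three voxels, made-up probabilities.
model_a = [[0.9, 0.4, 0.2], [0.1, 0.6, 0.8]]
model_b = [[0.8, 0.7, 0.1], [0.2, 0.3, 0.9]]

mean_probs, segmentation = ensemble_probabilities([model_a, model_b])
print(segmentation)
```

For the middle voxel the two toy models disagree, and the averaged probabilities decide the outcome. That is only possible when the probabilities, and not just the final label maps, are stored.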

Note that our inference Docker omitted TTA via mirroring along the axial direction during prediction (only sagittal +
coronal mirroring). This was done to keep the inference time below 10 minutes per image on a T4 GPU (we actually never
tested whether we could have left this enabled). Just leave it on! You can also leave the step_size at default for the
3d_fullres_resenc_bs80.
2 changes: 1 addition & 1 deletion documentation/dataset_format.md
@@ -21,7 +21,7 @@ images). So these images could for example be a T1 and a T2 MRI (or whatever els
channels MUST have the same geometry (same shape, spacing (if applicable) etc.) and
must be co-registered (if applicable). Input channels are identified by nnU-Net by their FILE_ENDING: a four-digit integer at the end
of the filename. Image files must therefore follow the following naming convention: {CASE_IDENTIFIER}_{XXXX}.{FILE_ENDING}.
-Hereby, XXXX is the 4-digit modality/channel identifier (should be unique for each modality/chanel, e.g., “0000” for T1, “0001” for
+Hereby, XXXX is the 4-digit modality/channel identifier (should be unique for each modality/channel, e.g., “0000” for T1, “0001” for
T2 MRI, …) and FILE_ENDING is the file extension used by your image format (.png, .nii.gz, ...). See below for concrete examples.
The dataset.json file connects channel names with the channel identifiers in the 'channel_names' key (see below for details).

2 changes: 1 addition & 1 deletion documentation/how_to_use_nnunet.md
@@ -189,7 +189,7 @@ wait
**Important: The first time a training is run nnU-Net will extract the preprocessed data into uncompressed numpy
arrays for speed reasons! This operation must be completed before starting more than one training of the same
configuration! Wait with starting subsequent folds until the first training is using the GPU! Depending on the
-dataset size and your System this should oly take a couple of minutes at most.**
+dataset size and your System this should only take a couple of minutes at most.**

If you insist on running DDP multi-GPU training, we got you covered:

2 changes: 1 addition & 1 deletion documentation/set_environment_variables.md
@@ -3,7 +3,7 @@
nnU-Net requires some environment variables so that it always knows where the raw data, preprocessed data and trained
models are. Depending on the operating system, these environment variables need to be set in different ways.

-Variables can either be set permanently (recommended!) or you can decide to set them everytime you call nnU-Net.
+Variables can either be set permanently (recommended!) or you can decide to set them every time you call nnU-Net.

# Linux & MacOS

12 changes: 6 additions & 6 deletions nnunetv2/batch_running/collect_results_custom_Decathlon.py
@@ -23,7 +23,7 @@ def collect_results(trainers: dict, datasets: List, output_file: str,
expected_output_folder = get_output_folder(d, module, plans, c)
if isdir(expected_output_folder):
results_folds = []
-f.write("%s,%s,%s,%s,%s" % (d, c, module, plans, r))
+f.write(f"{d},{c},{module},{plans},{r}")
for fl in folds:
expected_output_folder_fold = get_output_folder(d, module, plans, c, fl)
expected_summary_file = join(expected_output_folder_fold, "validation",
@@ -36,8 +36,8 @@ def collect_results(trainers: dict, datasets: List, output_file: str,
foreground_mean = load_summary_json(expected_summary_file)['foreground_mean'][
'Dice']
results_folds.append(foreground_mean)
-f.write(",%02.4f" % foreground_mean)
-f.write(",%02.4f\n" % np.nanmean(results_folds))
+f.write(f",{foreground_mean:02.4f}")
+f.write(f",{np.nanmean(results_folds):02.4f}\n")


def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[str, ...], datasets, trainers):
@@ -61,7 +61,7 @@ def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[st
for t in trainers.keys():
trainer_locs = valid_entries & (txt[:, 2] == t)
for pl in trainers[t]:
-f.write("%s__%s" % (t, pl))
+f.write(f"{t}__{pl}")
trainer_plan_locs = trainer_locs & (txt[:, 3] == pl)
r = []
for d in valid_configs.keys():
@@ -83,13 +83,13 @@ def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[st
r.append(np.nan)
else:
mean_dice = np.mean([float(i) for i in fold_results])
-f.write(",%02.4f" % mean_dice)
+f.write(f",{mean_dice:02.4f}")
r.append(mean_dice)
else:
print('missing:', t, pl, d, v)
f.write(",nan")
r.append(np.nan)
-f.write(",%02.4f\n" % np.mean(r))
+f.write(f",{np.mean(r):02.4f}\n")


if __name__ == '__main__':
Original file line number Diff line number Diff line change
@@ -23,7 +23,7 @@ def collect_results(trainers: dict, datasets: List, output_file: str,
expected_output_folder = get_output_folder(d, module, plans, c)
if isdir(expected_output_folder):
results_folds = []
-f.write("%s,%s,%s,%s,%s" % (d, c, module, plans, r))
+f.write(f"{d},{c},{module},{plans},{r}")
for fl in folds:
expected_output_folder_fold = get_output_folder(d, module, plans, c, fl)
expected_summary_file = join(expected_output_folder_fold, "validation",
@@ -36,8 +36,8 @@ def collect_results(trainers: dict, datasets: List, output_file: str,
foreground_mean = load_summary_json(expected_summary_file)['foreground_mean'][
'Dice']
results_folds.append(foreground_mean)
-f.write(",%02.4f" % foreground_mean)
-f.write(",%02.4f\n" % np.nanmean(results_folds))
+f.write(f",{foreground_mean:02.4f}")
+f.write(f",{np.nanmean(results_folds):02.4f}\n")


def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[str, ...], datasets, trainers):
@@ -61,7 +61,7 @@ def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[st
for t in trainers.keys():
trainer_locs = valid_entries & (txt[:, 2] == t)
for pl in trainers[t]:
-f.write("%s__%s" % (t, pl))
+f.write(f"{t}__{pl}")
trainer_plan_locs = trainer_locs & (txt[:, 3] == pl)
r = []
for d in valid_configs.keys():
@@ -83,13 +83,13 @@ def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[st
r.append(np.nan)
else:
mean_dice = np.mean([float(i) for i in fold_results])
-f.write(",%02.4f" % mean_dice)
+f.write(f",{mean_dice:02.4f}")
r.append(mean_dice)
else:
print('missing:', t, pl, d, v)
f.write(",nan")
r.append(np.nan)
-f.write(",%02.4f\n" % np.mean(r))
+f.write(f",{np.mean(r):02.4f}\n")


if __name__ == '__main__':
70 changes: 70 additions & 0 deletions nnunetv2/dataset_conversion/Dataset221_AutoPETII_2023.py
@@ -0,0 +1,70 @@
from batchgenerators.utilities.file_and_folder_operations import *
import shutil
from nnunetv2.dataset_conversion.generate_dataset_json import generate_dataset_json
from nnunetv2.paths import nnUNet_raw, nnUNet_preprocessed


def convert_autopet(autopet_base_dir: str = '/media/isensee/My Book1/AutoPET/nifti/FDG-PET-CT-Lesions',
                    nnunet_dataset_id: int = 221):
    task_name = "AutoPETII_2023"

    foldername = "Dataset%03.0d_%s" % (nnunet_dataset_id, task_name)

    # setting up nnU-Net folders
    out_base = join(nnUNet_raw, foldername)
    imagestr = join(out_base, "imagesTr")
    labelstr = join(out_base, "labelsTr")
    maybe_mkdir_p(imagestr)
    maybe_mkdir_p(labelstr)

    patients = subdirs(autopet_base_dir, prefix='PETCT', join=False)
    n = 0
    identifiers = []
    for pat in patients:
        patient_acquisitions = subdirs(join(autopet_base_dir, pat), join=False)
        for pa in patient_acquisitions:
            n += 1
            identifier = f"{pat}_{pa}"
            identifiers.append(identifier)
            if not isfile(join(imagestr, f'{identifier}_0000.nii.gz')):
                shutil.copy(join(autopet_base_dir, pat, pa, 'CTres.nii.gz'), join(imagestr, f'{identifier}_0000.nii.gz'))
            if not isfile(join(imagestr, f'{identifier}_0001.nii.gz')):
                shutil.copy(join(autopet_base_dir, pat, pa, 'SUV.nii.gz'), join(imagestr, f'{identifier}_0001.nii.gz'))
            # check the label folder (not the image folder) so existing labels are not copied again
            if not isfile(join(labelstr, f'{identifier}.nii.gz')):
                shutil.copy(join(autopet_base_dir, pat, pa, 'SEG.nii.gz'), join(labelstr, f'{identifier}.nii.gz'))

    generate_dataset_json(out_base, {0: "CT", 1:"CT"},
                          labels={
                              "background": 0,
                              "tumor": 1
                          },
                          num_training_cases=n, file_ending='.nii.gz',
                          dataset_name=task_name, reference='https://autopet-ii.grand-challenge.org/',
                          release='release',
                          # overwrite_image_reader_writer='NibabelIOWithReorient',
                          description=task_name)

    # manual split: all acquisitions of a patient end up in the same fold
    splits = []
    for fold in range(5):
        val_patients = patients[fold :: 5]
        splits.append(
            {
                'train': [i for i in identifiers if not any([i.startswith(v) for v in val_patients])],
                'val': [i for i in identifiers if any([i.startswith(v) for v in val_patients])],
            }
        )
    pp_out_dir = join(nnUNet_preprocessed, foldername)
    maybe_mkdir_p(pp_out_dir)
    save_json(splits, join(pp_out_dir, 'splits_final.json'), sort_keys=False)


if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('input_folder', type=str,
                        help="The downloaded and extracted autopet dataset (must have PETCT_XXX subfolders)")
    parser.add_argument('-d', required=False, type=int, default=221, help='nnU-Net Dataset ID, default: 221')
    args = parser.parse_args()
    amos_base = args.input_folder
    convert_autopet(amos_base, args.d)
2 changes: 1 addition & 1 deletion nnunetv2/dataset_conversion/generate_dataset_json.py
@@ -76,7 +76,7 @@ def generate_dataset_json(output_folder: str,
labels[l] = int(labels[l])

dataset_json = {
-'channel_names': channel_names, # previously this was called 'modality'. I didnt like this so this is
+'channel_names': channel_names, # previously this was called 'modality'. I didn't like this so this is
# channel_names now. Live with it.
'labels': labels,
'numTraining': num_training_cases,
2 changes: 1 addition & 1 deletion nnunetv2/ensembling/ensemble.py
@@ -144,7 +144,7 @@ def ensemble_crossvalidations(list_of_trained_model_folders: List[str],
for f in folds:
if not isdir(join(tr, f'fold_{f}', 'validation')):
raise RuntimeError(f'Expected model output directory does not exist. You must train all requested '
-f'folds of the speficied model.\nModel: {tr}\nFold: {f}')
+f'folds of the specified model.\nModel: {tr}\nFold: {f}')
files_here = subfiles(join(tr, f'fold_{f}', 'validation'), suffix='.npz', join=False)
if len(files_here) == 0:
raise RuntimeError(f"No .npz files found in folder {join(tr, f'fold_{f}', 'validation')}. Rerun your "
8 changes: 4 additions & 4 deletions nnunetv2/evaluation/evaluate_predictions.py
@@ -27,8 +27,8 @@ def key_to_label_or_region(key: str):
except ValueError:
key = key.replace('(', '')
key = key.replace(')', '')
-splitted = key.split(',')
-return tuple([int(i) for i in splitted])
+split = key.split(',')
+return tuple([int(i) for i in split if len(i) > 0])


def save_summary_json(results: dict, output_file: str):
@@ -227,7 +227,7 @@ def evaluate_folder_entry_point():
help='Output file. Optional. Default: pred_folder/summary.json')
parser.add_argument('-np', type=int, required=False, default=default_num_processes,
help=f'number of processes used. Optional. Default: {default_num_processes}')
-parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred doesnt have all files that are present in folder_gt')
+parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred does not have all files that are present in folder_gt')
args = parser.parse_args()
compute_metrics_on_folder2(args.gt_folder, args.pred_folder, args.djfile, args.pfile, args.o, args.np, chill=args.chill)

@@ -245,7 +245,7 @@ def evaluate_simple_entry_point():
help='Output file. Optional. Default: pred_folder/summary.json')
parser.add_argument('-np', type=int, required=False, default=default_num_processes,
help=f'number of processes used. Optional. Default: {default_num_processes}')
-parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred doesnt have all files that are present in folder_gt')
+parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred does not have all files that are present in folder_gt')

args = parser.parse_args()
compute_metrics_on_folder_simple(args.gt_folder, args.pred_folder, args.l, args.o, args.np, args.il, chill=args.chill)