diff --git a/.github/workflows/codespell.yml b/.github/workflows/codespell.yml
new file mode 100644
index 000000000..7373affc3
--- /dev/null
+++ b/.github/workflows/codespell.yml
@@ -0,0 +1,22 @@
+---
+name: Codespell
+
+on:
+  push:
+    branches: [master]
+  pull_request:
+    branches: [master]
+
+permissions:
+  contents: read
+
+jobs:
+  codespell:
+    name: Check for spelling errors
+    runs-on: ubuntu-latest
+
+    steps:
+      - name: Checkout
+        uses: actions/checkout@v3
+      - name: Codespell
+        uses: codespell-project/actions-codespell@v2
diff --git a/documentation/competitions/AutoPETII.md b/documentation/competitions/AutoPETII.md
new file mode 100644
index 000000000..075256a03
--- /dev/null
+++ b/documentation/competitions/AutoPETII.md
@@ -0,0 +1,129 @@
+# Look Ma, no code: fine tuning nnU-Net for the AutoPET II challenge by only adjusting its JSON plans
+
+Please cite our paper :-*
+
+```text
+COMING SOON
+```
+
+## Intro
+
+See the [Challenge Website](https://autopet-ii.grand-challenge.org/) for details on the challenge.
+
+Our solution to this challenge requires no code changes at all. All we do is optimize nnU-Net's hyperparameters
+(architecture, batch size, patch size) by modifying the nnUNetPlans.json file.
+
+## Prerequisites
+Use the latest pytorch version!
+
+We recommend you use the latest nnU-Net version as well! We ran our trainings with commit 913705f which you can try in case something doesn't work as expected:
+`pip install git+https://github.com/MIC-DKFZ/nnUNet.git@913705f`
+
+## How to reproduce our trainings
+
+### Download and convert the data
+1. Download and extract the AutoPET II dataset
+2. Convert it to nnU-Net format by running `python nnunetv2/dataset_conversion/Dataset221_AutoPETII_2023.py FOLDER` where FOLDER is the extracted AutoPET II dataset.
+
+### Experiment planning and preprocessing
+We deviate a little from the standard nnU-Net procedure because all our experiments are based on just the 3d_fullres configuration.
+
+Run the following commands:
+   - `nnUNetv2_extract_fingerprint -d 221` extracts the dataset fingerprint 
+   - `nnUNetv2_plan_experiment -d 221` does the planning for the plain unet
+   - `nnUNetv2_plan_experiment -d 221 -pl ResEncUNetPlanner` does the planning for the residual encoder unet
+   - `nnUNetv2_preprocess -d 221 -c 3d_fullres` runs all the preprocessing we need
+
+### Modification of plans files
+Please read the [information on how to modify plans files](../explanation_plans_files.md) first!!!
+
+
+It is easier to have everything in one plans file, so the first thing we do is transfer the ResEnc UNet to the 
+default plans file. We use the configuration inheritance feature of nnU-Net to make it use the same data as the 
+3d_fullres configuration.
+Add the following to the 'configurations' dict in 'nnUNetPlans.json':
+
+```json
+        "3d_fullres_resenc": {
+            "inherits_from": "3d_fullres",
+            "UNet_class_name": "ResidualEncoderUNet",
+            "n_conv_per_stage_encoder": [
+                1,
+                3,
+                4,
+                6,
+                6,
+                6
+            ],
+            "n_conv_per_stage_decoder": [
+                1,
+                1,
+                1,
+                1,
+                1
+            ]
+        },
+```
+
+(These values are copied from the 'nnUNetResEncUNetPlans.json' file, with everything redundant omitted thanks to inheritance from 3d_fullres.)
+
+Now we crank up the patch and batch sizes. Add the following configurations:
+```json
+        "3d_fullres_resenc_bs80": {
+            "inherits_from": "3d_fullres_resenc",
+            "batch_size": 80
+            },
+        "3d_fullres_resenc_192x192x192_b24": {
+            "inherits_from": "3d_fullres_resenc",
+            "patch_size": [
+                192,
+                192,
+                192
+            ],
+            "batch_size": 24
+        }
+```
+
+Save the file (and check for potential syntax errors!)
+
+### Run trainings
+Training each model requires 8 Nvidia A100 40GB GPUs. Expect training to run for 5-7 days. You'll need a really good 
+CPU to handle the data augmentation! 128 cores/256 threads are a must! If you have fewer threads available, scale down nnUNet_n_proc_DA accordingly.
+
+```bash
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 0 -num_gpus 8
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 1 -num_gpus 8
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 2 -num_gpus 8
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 3 -num_gpus 8
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_bs80 4 -num_gpus 8
+
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 0 -num_gpus 8
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 1 -num_gpus 8
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 2 -num_gpus 8
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 3 -num_gpus 8
+nnUNet_compile=T nnUNet_n_proc_DA=28 nnUNetv2_train 221 3d_fullres_resenc_192x192x192_b24 4 -num_gpus 8
+```
+
+Done!
+
+(We also provide pretrained weights in case you don't want to invest the GPU resources, see below)
+
+## How to make predictions with pretrained weights
+Our final model is an ensemble of two configurations:
+- ResEnc UNet with batch size 80
+- ResEnc UNet with patch size 192x192x192 and batch size 24
+
+To run inference with these models, do the following:
+
+1. Download the pretrained model weights from [Zenodo](https://zenodo.org/record/8362371)
+2. Install both .zip files using `nnUNetv2_install_pretrained_model_from_zip`
+3. Make sure 
+4. Now you can run inference on new cases with `nnUNetv2_predict`:
+   - `nnUNetv2_predict -i INPUT -o OUTPUT1 -d 221 -c 3d_fullres_resenc_bs80 -f 0 1 2 3 4 -step_size 0.6 --save_probabilities`   
+   - `nnUNetv2_predict -i INPUT -o OUTPUT2 -d 221 -c 3d_fullres_resenc_192x192x192_b24 -f 0 1 2 3 4 --save_probabilities`
+   - `nnUNetv2_ensemble -i OUTPUT1 OUTPUT2 -o OUTPUT_ENSEMBLE`
+
+Note that our inference Docker omitted test-time augmentation (TTA) via mirroring along the axial direction during
+prediction (only sagittal + coronal mirroring). This was done to keep the inference time below 10 minutes per image
+on a T4 GPU (we never actually tested whether we could have left this enabled). Just leave it enabled! You can also
+leave the step_size at default for the 3d_fullres_resenc_bs80.
\ No newline at end of file
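The plans edits described in documentation/competitions/AutoPETII.md above can also be scripted instead of made by hand. The following is a minimal sketch, assuming the plans file is plain JSON with a top-level 'configurations' dict as that document describes. The helper name `add_autopet_configurations` and the toy plans dict are hypothetical and not part of nnU-Net. Round-tripping through the `json` module also catches the syntax errors the document warns about.

```python
import json

# Configurations copied from the AutoPETII.md instructions in this diff.
NEW_CONFIGURATIONS = {
    "3d_fullres_resenc": {
        "inherits_from": "3d_fullres",
        "UNet_class_name": "ResidualEncoderUNet",
        "n_conv_per_stage_encoder": [1, 3, 4, 6, 6, 6],
        "n_conv_per_stage_decoder": [1, 1, 1, 1, 1],
    },
    "3d_fullres_resenc_bs80": {
        "inherits_from": "3d_fullres_resenc",
        "batch_size": 80,
    },
    "3d_fullres_resenc_192x192x192_b24": {
        "inherits_from": "3d_fullres_resenc",
        "patch_size": [192, 192, 192],
        "batch_size": 24,
    },
}


def add_autopet_configurations(plans: dict) -> dict:
    """Return a copy of `plans` with the three configurations added (hypothetical helper)."""
    # Deep copy via JSON, which also proves the plans are JSON-serialisable.
    updated = json.loads(json.dumps(plans))
    configurations = updated.setdefault("configurations", {})
    for name, config in NEW_CONFIGURATIONS.items():
        parent = config["inherits_from"]
        if parent not in configurations:
            raise KeyError(f"{name} inherits from {parent}, which is missing from the plans")
        configurations[name] = config
    return updated


# Toy stand-in for nnUNetPlans.json; a real plans file has many more keys.
toy_plans = {"configurations": {"3d_fullres": {"batch_size": 2, "patch_size": [128, 128, 128]}}}
updated_plans = add_autopet_configurations(toy_plans)
print(sorted(updated_plans["configurations"]))
```

For a real file, one would load it with `json.load`, pass the dict through the helper, and write it back with `json.dump`.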
diff --git a/documentation/dataset_format.md b/documentation/dataset_format.md
index e11d8b21b..de6c9936b 100644
--- a/documentation/dataset_format.md
+++ b/documentation/dataset_format.md
@@ -21,7 +21,7 @@ images). So these images could for example be a T1 and a T2 MRI (or whatever els
 channels MUST have the same geometry (same shape, spacing (if applicable) etc.) and
 must be co-registered (if applicable). Input channels are identified by nnU-Net by their FILE_ENDING: a four-digit integer at the end 
 of the filename. Image files must therefore follow the following naming convention: {CASE_IDENTIFIER}_{XXXX}.{FILE_ENDING}. 
-Hereby, XXXX is the 4-digit modality/channel identifier (should be unique for each modality/chanel, e.g., “0000” for T1, “0001” for 
+Hereby, XXXX is the 4-digit modality/channel identifier (should be unique for each modality/channel, e.g., “0000” for T1, “0001” for 
 T2 MRI, …) and FILE_ENDING is the file extension used by your image format (.png, .nii.gz, ...). See below for concrete examples.
 The dataset.json file connects channel names with the channel identifiers in the 'channel_names' key (see below for details).
 
diff --git a/documentation/how_to_use_nnunet.md b/documentation/how_to_use_nnunet.md
index d9627681d..e0ffe9f73 100644
--- a/documentation/how_to_use_nnunet.md
+++ b/documentation/how_to_use_nnunet.md
@@ -189,7 +189,7 @@ wait
 **Important: The first time a training is run nnU-Net will extract the preprocessed data into uncompressed numpy 
 arrays for speed reasons! This operation must be completed before starting more than one training of the same 
 configuration! Wait with starting subsequent folds until the first training is using the GPU! Depending on the 
-dataset size and your System this should oly take a couple of minutes at most.**
+dataset size and your System this should only take a couple of minutes at most.**
 
 If you insist on running DDP multi-GPU training, we got you covered:
 
diff --git a/documentation/set_environment_variables.md b/documentation/set_environment_variables.md
index 0b587ec24..71c6d26ad 100644
--- a/documentation/set_environment_variables.md
+++ b/documentation/set_environment_variables.md
@@ -3,7 +3,7 @@
 nnU-Net requires some environment variables so that it always knows where the raw data, preprocessed data and trained 
 models are. Depending on the operating system, these environment variables need to be set in different ways.
 
-Variables can either be set permanently (recommended!) or you can decide to set them everytime you call nnU-Net. 
+Variables can either be set permanently (recommended!) or you can decide to set them every time you call nnU-Net. 
 
 # Linux & MacOS
 
diff --git a/nnunetv2/batch_running/collect_results_custom_Decathlon.py b/nnunetv2/batch_running/collect_results_custom_Decathlon.py
index e5079bd5b..b670661c5 100644
--- a/nnunetv2/batch_running/collect_results_custom_Decathlon.py
+++ b/nnunetv2/batch_running/collect_results_custom_Decathlon.py
@@ -23,7 +23,7 @@ def collect_results(trainers: dict, datasets: List, output_file: str,
                             expected_output_folder = get_output_folder(d, module, plans, c)
                             if isdir(expected_output_folder):
                                 results_folds = []
-                                f.write("%s,%s,%s,%s,%s" % (d, c, module, plans, r))
+                                f.write(f"{d},{c},{module},{plans},{r}")
                                 for fl in folds:
                                     expected_output_folder_fold = get_output_folder(d, module, plans, c, fl)
                                     expected_summary_file = join(expected_output_folder_fold, "validation",
@@ -36,8 +36,8 @@ def collect_results(trainers: dict, datasets: List, output_file: str,
                                         foreground_mean = load_summary_json(expected_summary_file)['foreground_mean'][
                                             'Dice']
                                         results_folds.append(foreground_mean)
-                                        f.write(",%02.4f" % foreground_mean)
-                                f.write(",%02.4f\n" % np.nanmean(results_folds))
+                                        f.write(f",{foreground_mean:02.4f}")
+                                f.write(f",{np.nanmean(results_folds):02.4f}\n")
 
 
 def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[str, ...], datasets, trainers):
@@ -61,7 +61,7 @@ def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[st
         for t in trainers.keys():
             trainer_locs = valid_entries & (txt[:, 2] == t)
             for pl in trainers[t]:
-                f.write("%s__%s" % (t, pl))
+                f.write(f"{t}__{pl}")
                 trainer_plan_locs = trainer_locs & (txt[:, 3] == pl)
                 r = []
                 for d in valid_configs.keys():
@@ -83,13 +83,13 @@ def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[st
                                 r.append(np.nan)
                             else:
                                 mean_dice = np.mean([float(i) for i in fold_results])
-                                f.write(",%02.4f" % mean_dice)
+                                f.write(f",{mean_dice:02.4f}")
                                 r.append(mean_dice)
                         else:
                             print('missing:', t, pl, d, v)
                             f.write(",nan")
                             r.append(np.nan)
-                f.write(",%02.4f\n" % np.mean(r))
+                f.write(f",{np.mean(r):02.4f}\n")
 
 
 if __name__ == '__main__':
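The f-string rewrites in this file are meant to preserve behavior. A quick check, assuming plain Python floats (NumPy scalar types are not exercised here), suggests that the printf-style spec `%02.4f` and the format spec `02.4f` render identically:

```python
# Sample values, including NaN, which the results tables can contain.
values = [0.5, 0.123456, 12.0, float("nan")]

rendered = []
for value in values:
    old_style = "%02.4f" % value   # formatting used before this diff
    new_style = f"{value:02.4f}"   # formatting used after this diff
    assert old_style == new_style
    rendered.append(new_style)

print(rendered)
```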
diff --git a/nnunetv2/batch_running/release_trainings/nnunetv2_v1/collect_results.py b/nnunetv2/batch_running/release_trainings/nnunetv2_v1/collect_results.py
index f934186c3..828c39640 100644
--- a/nnunetv2/batch_running/release_trainings/nnunetv2_v1/collect_results.py
+++ b/nnunetv2/batch_running/release_trainings/nnunetv2_v1/collect_results.py
@@ -23,7 +23,7 @@ def collect_results(trainers: dict, datasets: List, output_file: str,
                             expected_output_folder = get_output_folder(d, module, plans, c)
                             if isdir(expected_output_folder):
                                 results_folds = []
-                                f.write("%s,%s,%s,%s,%s" % (d, c, module, plans, r))
+                                f.write(f"{d},{c},{module},{plans},{r}")
                                 for fl in folds:
                                     expected_output_folder_fold = get_output_folder(d, module, plans, c, fl)
                                     expected_summary_file = join(expected_output_folder_fold, "validation",
@@ -36,8 +36,8 @@ def collect_results(trainers: dict, datasets: List, output_file: str,
                                         foreground_mean = load_summary_json(expected_summary_file)['foreground_mean'][
                                             'Dice']
                                         results_folds.append(foreground_mean)
-                                        f.write(",%02.4f" % foreground_mean)
-                                f.write(",%02.4f\n" % np.nanmean(results_folds))
+                                        f.write(f",{foreground_mean:02.4f}")
+                                f.write(f",{np.nanmean(results_folds):02.4f}\n")
 
 
 def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[str, ...], datasets, trainers):
@@ -61,7 +61,7 @@ def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[st
         for t in trainers.keys():
             trainer_locs = valid_entries & (txt[:, 2] == t)
             for pl in trainers[t]:
-                f.write("%s__%s" % (t, pl))
+                f.write(f"{t}__{pl}")
                 trainer_plan_locs = trainer_locs & (txt[:, 3] == pl)
                 r = []
                 for d in valid_configs.keys():
@@ -83,13 +83,13 @@ def summarize(input_file, output_file, folds: Tuple[int, ...], configs: Tuple[st
                                 r.append(np.nan)
                             else:
                                 mean_dice = np.mean([float(i) for i in fold_results])
-                                f.write(",%02.4f" % mean_dice)
+                                f.write(f",{mean_dice:02.4f}")
                                 r.append(mean_dice)
                         else:
                             print('missing:', t, pl, d, v)
                             f.write(",nan")
                             r.append(np.nan)
-                f.write(",%02.4f\n" % np.mean(r))
+                f.write(f",{np.mean(r):02.4f}\n")
 
 
 if __name__ == '__main__':
diff --git a/nnunetv2/dataset_conversion/Dataset221_AutoPETII_2023.py b/nnunetv2/dataset_conversion/Dataset221_AutoPETII_2023.py
new file mode 100644
index 000000000..56ef16e59
--- /dev/null
+++ b/nnunetv2/dataset_conversion/Dataset221_AutoPETII_2023.py
@@ -0,0 +1,70 @@
+from batchgenerators.utilities.file_and_folder_operations import *
+import shutil
+from nnunetv2.dataset_conversion.generate_dataset_json import generate_dataset_json
+from nnunetv2.paths import nnUNet_raw, nnUNet_preprocessed
+
+
+def convert_autopet(autopet_base_dir:str = '/media/isensee/My Book1/AutoPET/nifti/FDG-PET-CT-Lesions',
+                     nnunet_dataset_id: int = 221):
+    task_name = "AutoPETII_2023"
+
+    foldername = "Dataset%03.0d_%s" % (nnunet_dataset_id, task_name)
+
+    # setting up nnU-Net folders
+    out_base = join(nnUNet_raw, foldername)
+    imagestr = join(out_base, "imagesTr")
+    labelstr = join(out_base, "labelsTr")
+    maybe_mkdir_p(imagestr)
+    maybe_mkdir_p(labelstr)
+
+    patients = subdirs(autopet_base_dir, prefix='PETCT', join=False)
+    n = 0
+    identifiers = []
+    for pat in patients:
+        patient_acquisitions = subdirs(join(autopet_base_dir, pat), join=False)
+        for pa in patient_acquisitions:
+            n += 1
+            identifier = f"{pat}_{pa}"
+            identifiers.append(identifier)
+            if not isfile(join(imagestr, f'{identifier}_0000.nii.gz')):
+                shutil.copy(join(autopet_base_dir, pat, pa, 'CTres.nii.gz'), join(imagestr, f'{identifier}_0000.nii.gz'))
+            if not isfile(join(imagestr, f'{identifier}_0001.nii.gz')):
+                shutil.copy(join(autopet_base_dir, pat, pa, 'SUV.nii.gz'), join(imagestr, f'{identifier}_0001.nii.gz'))
+            if not isfile(join(labelstr, f'{identifier}.nii.gz')):
+                shutil.copy(join(autopet_base_dir, pat, pa, 'SEG.nii.gz'), join(labelstr, f'{identifier}.nii.gz'))
+
+    generate_dataset_json(out_base, {0: "CT", 1:"CT"},
+                          labels={
+                              "background": 0,
+                              "tumor": 1
+                          },
+                          num_training_cases=n, file_ending='.nii.gz',
+                          dataset_name=task_name, reference='https://autopet-ii.grand-challenge.org/',
+                          release='release',
+                          # overwrite_image_reader_writer='NibabelIOWithReorient',
+                          description=task_name)
+
+    # manual split
+    splits = []
+    for fold in range(5):
+        val_patients = patients[fold :: 5]
+        splits.append(
+            {
+                'train': [i for i in identifiers if not any([i.startswith(v) for v in val_patients])],
+                'val': [i for i in identifiers if any([i.startswith(v) for v in val_patients])],
+            }
+        )
+    pp_out_dir = join(nnUNet_preprocessed, foldername)
+    maybe_mkdir_p(pp_out_dir)
+    save_json(splits, join(pp_out_dir, 'splits_final.json'), sort_keys=False)
+
+
+if __name__ == '__main__':
+    import argparse
+    parser = argparse.ArgumentParser()
+    parser.add_argument('input_folder', type=str,
+                        help="The downloaded and extracted autopet dataset (must have PETCT_XXX subfolders)")
+    parser.add_argument('-d', required=False, type=int, default=221, help='nnU-Net Dataset ID, default: 221')
+    args = parser.parse_args()
+    autopet_base = args.input_folder
+    convert_autopet(autopet_base, args.d)
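The manual split in the conversion script above assigns whole patients, not single acquisitions, to validation folds, so scans from one patient never land on both sides of a split. A minimal sketch of the same logic on made-up identifiers follows. The patient and acquisition names are invented for illustration, and prefix matching assumes that no patient name is a prefix of another.

```python
# Toy patient folders and acquisitions; real names come from the AutoPET II download.
patients = [f"PETCT_{i:03d}" for i in range(10)]
identifiers = [f"{pat}_{acq}" for pat in patients for acq in ("scanA", "scanB")]

splits = []
for fold in range(5):
    # Every fifth patient, starting at index `fold`, goes to validation.
    val_patients = tuple(patients[fold::5])
    splits.append(
        {
            "train": [i for i in identifiers if not i.startswith(val_patients)],
            "val": [i for i in identifiers if i.startswith(val_patients)],
        }
    )

for fold, split in enumerate(splits):
    # Recover the patient part of each identifier and check for leakage.
    train_patients = {i.rsplit("_", 1)[0] for i in split["train"]}
    held_out_patients = {i.rsplit("_", 1)[0] for i in split["val"]}
    assert train_patients.isdisjoint(held_out_patients)
    print(fold, len(split["train"]), len(split["val"]))
```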
diff --git a/nnunetv2/dataset_conversion/generate_dataset_json.py b/nnunetv2/dataset_conversion/generate_dataset_json.py
index 2f2e11503..429fa05a8 100644
--- a/nnunetv2/dataset_conversion/generate_dataset_json.py
+++ b/nnunetv2/dataset_conversion/generate_dataset_json.py
@@ -76,7 +76,7 @@ def generate_dataset_json(output_folder: str,
             labels[l] = int(labels[l])
 
     dataset_json = {
-        'channel_names': channel_names,  # previously this was called 'modality'. I didnt like this so this is
+        'channel_names': channel_names,  # previously this was called 'modality'. I didn't like this so this is
         # channel_names now. Live with it.
         'labels': labels,
         'numTraining': num_training_cases,
diff --git a/nnunetv2/ensembling/ensemble.py b/nnunetv2/ensembling/ensemble.py
index 68b378b25..d4a9be407 100644
--- a/nnunetv2/ensembling/ensemble.py
+++ b/nnunetv2/ensembling/ensemble.py
@@ -144,7 +144,7 @@ def ensemble_crossvalidations(list_of_trained_model_folders: List[str],
         for f in folds:
             if not isdir(join(tr, f'fold_{f}', 'validation')):
                 raise RuntimeError(f'Expected model output directory does not exist. You must train all requested '
-                                   f'folds of the speficied model.\nModel: {tr}\nFold: {f}')
+                                   f'folds of the specified model.\nModel: {tr}\nFold: {f}')
             files_here = subfiles(join(tr, f'fold_{f}', 'validation'), suffix='.npz', join=False)
             if len(files_here) == 0:
                 raise RuntimeError(f"No .npz files found in folder {join(tr, f'fold_{f}', 'validation')}. Rerun your "
diff --git a/nnunetv2/evaluation/evaluate_predictions.py b/nnunetv2/evaluation/evaluate_predictions.py
index e692f781e..80e4d242f 100644
--- a/nnunetv2/evaluation/evaluate_predictions.py
+++ b/nnunetv2/evaluation/evaluate_predictions.py
@@ -27,8 +27,8 @@ def key_to_label_or_region(key: str):
     except ValueError:
         key = key.replace('(', '')
         key = key.replace(')', '')
-        splitted = key.split(',')
-        return tuple([int(i) for i in splitted])
+        split = key.split(',')
+        return tuple([int(i) for i in split if len(i) > 0])
 
 
 def save_summary_json(results: dict, output_file: str):
@@ -227,7 +227,7 @@ def evaluate_folder_entry_point():
                         help='Output file. Optional. Default: pred_folder/summary.json')
     parser.add_argument('-np', type=int, required=False, default=default_num_processes,
                         help=f'number of processes used. Optional. Default: {default_num_processes}')
-    parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred doesnt have all files that are present in folder_gt')
+    parser.add_argument('--chill', action='store_true', help='do not crash if folder_pred does not have all files that are present in folder_gt')
     args = parser.parse_args()
     compute_metrics_on_folder2(args.gt_folder, args.pred_folder, args.djfile, args.pfile, args.o, args.np, chill=args.chill)
 
@@ -245,7 +245,7 @@ def evaluate_simple_entry_point():
                         help='Output file. Optional. Default: pred_folder/summary.json')
     parser.add_argument('-np', type=int, required=False, default=default_num_processes,
                         help=f'number of processes used. Optional. Default: {default_num_processes}')
-    parser.add_argument('--chill', action='store_true', help='dont crash if folder_pred doesnt have all files that are present in folder_gt')
+    parser.add_argument('--chill', action='store_true', help='do not crash if folder_pred does not have all files that are present in folder_gt')
 
     args = parser.parse_args()
     compute_metrics_on_folder_simple(args.gt_folder, args.pred_folder, args.l, args.o, args.np, args.il, chill=args.chill)
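The added `if len(i) > 0` filter matters for single-element regions: the string form of a one-element tuple is `(1,)`, which leaves an empty trailing field after splitting on commas. Below is a standalone sketch of the parsing logic, re-implemented from the hunk above for illustration. The `try` branch is assumed to be a plain `int(key)` conversion, since it is not shown in the diff.

```python
def key_to_label_or_region(key: str):
    """Parse '3' into 3, and '(1, 2)' or '(1,)' into a tuple of ints."""
    try:
        return int(key)  # assumed: simple labels are stored as plain integers
    except ValueError:
        key = key.replace('(', '')
        key = key.replace(')', '')
        fields = key.split(',')
        # str((1,)) == '(1,)' splits into ['1', ''], so empty fields are skipped
        return tuple(int(i) for i in fields if len(i) > 0)


for key in ("3", "(1, 2)", "(1,)"):
    print(repr(key), "->", key_to_label_or_region(key))
```

Without the filter, the last case would raise `ValueError` on `int('')`.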
diff --git a/nnunetv2/evaluation/find_best_configuration.py b/nnunetv2/evaluation/find_best_configuration.py
index c36008b47..7e9f77420 100644
--- a/nnunetv2/evaluation/find_best_configuration.py
+++ b/nnunetv2/evaluation/find_best_configuration.py
@@ -285,7 +285,7 @@ def find_best_configuration_entry_point():
                         help='Set this flag to disable ensembling')
     parser.add_argument('--no_overwrite', action='store_true',
                         help='If set we will not overwrite already ensembled files etc. May speed up concecutive '
-                             'runs of this command (why would oyu want to do that?) at the risk of not updating '
+                             'runs of this command (why would you want to do that?) at the risk of not updating '
                              'outdated results.')
     args = parser.parse_args()
 
diff --git a/nnunetv2/experiment_planning/dataset_fingerprint/fingerprint_extractor.py b/nnunetv2/experiment_planning/dataset_fingerprint/fingerprint_extractor.py
index bffc77dea..a4bec96f9 100644
--- a/nnunetv2/experiment_planning/dataset_fingerprint/fingerprint_extractor.py
+++ b/nnunetv2/experiment_planning/dataset_fingerprint/fingerprint_extractor.py
@@ -44,8 +44,8 @@ def collect_foreground_intensities(segmentation: np.ndarray, images: np.ndarray,
         """
         images=image with multiple channels = shape (c, x, y(, z))
         """
-        assert len(images.shape) == 4
-        assert len(segmentation.shape) == 4
+        assert images.ndim == 4
+        assert segmentation.ndim == 4
 
         assert not np.any(np.isnan(segmentation)), "Segmentation contains NaN values. grrrr.... :-("
         assert not np.any(np.isnan(images)), "Images contains NaN values. grrrr.... :-("
diff --git a/nnunetv2/experiment_planning/experiment_planners/default_experiment_planner.py b/nnunetv2/experiment_planning/experiment_planners/default_experiment_planner.py
index 55d841e1f..2b1c41247 100644
--- a/nnunetv2/experiment_planning/experiment_planners/default_experiment_planner.py
+++ b/nnunetv2/experiment_planning/experiment_planners/default_experiment_planner.py
@@ -516,12 +516,12 @@ def save_plans(self, plans):
 
         maybe_mkdir_p(join(nnUNet_preprocessed, self.dataset_name))
         save_json(plans, plans_file, sort_keys=False)
-        print('Plans were saved to %s' % join(nnUNet_preprocessed, self.dataset_name, self.plans_identifier + '.json'))
+        print(f"Plans were saved to {join(nnUNet_preprocessed, self.dataset_name, self.plans_identifier + '.json')}")
 
     def generate_data_identifier(self, configuration_name: str) -> str:
         """
-        configurations are unique within each plans file but differnet plans file can have configurations with the
-        same name. In order to distinguish the assiciated data we need a data identifier that reflects not just the
+        configurations are unique within each plans file but different plans files can have configurations with the
+        same name. In order to distinguish the associated data we need a data identifier that reflects not just the
         config but also the plans it originates from
         """
         return self.plans_identifier + '_' + configuration_name
diff --git a/nnunetv2/experiment_planning/plan_and_preprocess_entrypoints.py b/nnunetv2/experiment_planning/plan_and_preprocess_entrypoints.py
index bb3be1338..556f04a4f 100644
--- a/nnunetv2/experiment_planning/plan_and_preprocess_entrypoints.py
+++ b/nnunetv2/experiment_planning/plan_and_preprocess_entrypoints.py
@@ -21,7 +21,7 @@ def extract_fingerprint_entry():
                         help='[OPTIONAL] Set this flag to overwrite existing fingerprints. If this flag is not set and a '
                              'fingerprint already exists, the fingerprint extractor will not run.')
     parser.add_argument('--verbose', required=False, action='store_true',
-                        help='Set this to print a lot of stuff. Useful for debugging. Will disable progrewss bar! '
+                        help='Set this to print a lot of stuff. Useful for debugging. Will disable progress bar! '
                              'Recommended for cluster environments')
     args, unrecognized_args = parser.parse_known_args()
     extract_fingerprints(args.d, args.fpe, args.np, args.verify_dataset_integrity, args.clean, args.verbose)
@@ -91,7 +91,7 @@ def preprocess_entry():
                              "DECREASE -np IF YOUR RAM FILLS UP TOO MUCH!. Default: 8 processes for 2d, 4 "
                              "for 3d_fullres, 8 for 3d_lowres and 4 for everything else")
     parser.add_argument('--verbose', required=False, action='store_true',
-                        help='Set this to print a lot of stuff. Useful for debugging. Will disable progrewss bar! '
+                        help='Set this to print a lot of stuff. Useful for debugging. Will disable progress bar! '
                              'Recommended for cluster environments')
     args, unrecognized_args = parser.parse_known_args()
     if args.np is None:
@@ -173,7 +173,7 @@ def plan_and_preprocess_entry():
                              "DECREASE -np IF YOUR RAM FILLS UP TOO MUCH!. Default: 8 processes for 2d, 4 "
                              "for 3d_fullres, 8 for 3d_lowres and 4 for everything else")
     parser.add_argument('--verbose', required=False, action='store_true',
-                        help='Set this to print a lot of stuff. Useful for debugging. Will disable progrewss bar! '
+                        help='Set this to print a lot of stuff. Useful for debugging. Will disable progress bar! '
                              'Recommended for cluster environments')
     args = parser.parse_args()
 
diff --git a/nnunetv2/experiment_planning/plans_for_pretraining/move_plans_between_datasets.py b/nnunetv2/experiment_planning/plans_for_pretraining/move_plans_between_datasets.py
index d78b68963..7bda3cb1e 100644
--- a/nnunetv2/experiment_planning/plans_for_pretraining/move_plans_between_datasets.py
+++ b/nnunetv2/experiment_planning/plans_for_pretraining/move_plans_between_datasets.py
@@ -6,6 +6,7 @@
 from nnunetv2.imageio.reader_writer_registry import determine_reader_writer_from_dataset_json
 from nnunetv2.paths import nnUNet_preprocessed, nnUNet_raw
 from nnunetv2.utilities.file_path_utilities import maybe_convert_to_dataset_name
+from nnunetv2.utilities.plans_handling.plans_handler import PlansManager
 from nnunetv2.utilities.utils import get_filenames_of_train_images_and_targets
 
 
@@ -34,12 +35,13 @@ def move_plans_between_datasets(
     # we need to change data_identifier to use target_plans_identifier
     if target_plans_identifier != source_plans_identifier:
         for c in source_plans['configurations'].keys():
-            old_identifier = source_plans['configurations'][c]["data_identifier"]
-            if old_identifier.startswith(source_plans_identifier):
-                new_identifier = target_plans_identifier + old_identifier[len(source_plans_identifier):]
-            else:
-                new_identifier = target_plans_identifier + '_' + old_identifier
-            source_plans['configurations'][c]["data_identifier"] = new_identifier
+            if 'data_identifier' in source_plans['configurations'][c].keys():
+                old_identifier = source_plans['configurations'][c]["data_identifier"]
+                if old_identifier.startswith(source_plans_identifier):
+                    new_identifier = target_plans_identifier + old_identifier[len(source_plans_identifier):]
+                else:
+                    new_identifier = target_plans_identifier + '_' + old_identifier
+                source_plans['configurations'][c]["data_identifier"] = new_identifier
 
     # we need to change the reader writer class!
     target_raw_data_dir = join(nnUNet_raw, target_dataset_name)
@@ -53,6 +55,8 @@ def move_plans_between_datasets(
                                                    verbose=False)
 
     source_plans["image_reader_writer"] = rw.__name__
+    if target_plans_identifier is not None:
+        source_plans["plans_name"] = target_plans_identifier
 
     save_json(source_plans, join(nnUNet_preprocessed, target_dataset_name, target_plans_identifier + '.json'),
               sort_keys=False)
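The new guard in the hunks above is needed because configurations that use `inherits_from` may not carry their own `data_identifier`. A minimal sketch of the renaming loop on a toy plans dict follows. The helper name `rename_data_identifiers` and the plans contents are invented for illustration; the branching mirrors the diff.

```python
def rename_data_identifiers(plans: dict, source_id: str, target_id: str) -> dict:
    """Rewrite data_identifier entries in place so they carry the target plans identifier."""
    for config in plans['configurations'].values():
        if 'data_identifier' not in config:
            continue  # e.g. a configuration that only inherits from another one
        old_identifier = config['data_identifier']
        if old_identifier.startswith(source_id):
            # swap the source prefix for the target prefix
            new_identifier = target_id + old_identifier[len(source_id):]
        else:
            new_identifier = target_id + '_' + old_identifier
        config['data_identifier'] = new_identifier
    return plans


toy_plans = {
    'configurations': {
        '3d_fullres': {'data_identifier': 'nnUNetPlans_3d_fullres'},
        '3d_fullres_resenc': {'inherits_from': '3d_fullres'},
    }
}
rename_data_identifiers(toy_plans, 'nnUNetPlans', 'nnUNetPlans_pretrain')
print(toy_plans['configurations'])
```

Before this change, the second configuration would have triggered a `KeyError` on the unconditional `["data_identifier"]` lookup.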
diff --git a/nnunetv2/experiment_planning/verify_dataset_integrity.py b/nnunetv2/experiment_planning/verify_dataset_integrity.py
index 502611cc2..61175d069 100644
--- a/nnunetv2/experiment_planning/verify_dataset_integrity.py
+++ b/nnunetv2/experiment_planning/verify_dataset_integrity.py
@@ -64,7 +64,7 @@ def check_cases(image_files: List[str], label_file: str, expected_num_channels:
     # check shapes
     shape_image = images.shape[1:]
     shape_seg = segmentation.shape[1:]
-    if not all([i == j for i, j in zip(shape_image, shape_seg)]):
+    if shape_image != shape_seg:
         print('Error: Shape mismatch between segmentation and corresponding images. \nShape images: %s. '
               '\nShape seg: %s. \nImage files: %s. \nSeg file: %s\n' %
               (shape_image, shape_seg, image_files, label_file))
@@ -125,12 +125,12 @@ def verify_dataset_integrity(folder: str, num_processes: int = 8) -> None:
     :param folder:
     :return:
     """
-    assert isfile(join(folder, "dataset.json")), "There needs to be a dataset.json file in folder, folder=%s" % folder
+    assert isfile(join(folder, "dataset.json")), f"There needs to be a dataset.json file in folder, folder={folder}"
     dataset_json = load_json(join(folder, "dataset.json"))
 
     if not 'dataset' in dataset_json.keys():
-        assert isdir(join(folder, "imagesTr")), "There needs to be a imagesTr subfolder in folder, folder=%s" % folder
-        assert isdir(join(folder, "labelsTr")), "There needs to be a labelsTr subfolder in folder, folder=%s" % folder
+        assert isdir(join(folder, "imagesTr")), f"There needs to be an imagesTr subfolder in folder, folder={folder}"
+        assert isdir(join(folder, "labelsTr")), f"There needs to be a labelsTr subfolder in folder, folder={folder}"
 
     # make sure all required keys are there
     dataset_keys = list(dataset_json.keys())
@@ -172,7 +172,7 @@ def verify_dataset_integrity(folder: str, num_processes: int = 8) -> None:
                 missing_labels.append(dataset[k]['label'])
                 ok = False
         if not ok:
-            raise FileNotFoundError(f"Some expeted files were missing. Make sure you are properly referencing them "
+            raise FileNotFoundError(f"Some expected files were missing. Make sure you are properly referencing them "
                                     f"in the dataset.json. Or use imagesTr & labelsTr folders!\nMissing images:"
                                     f"\n{missing_images}\n\nMissing labels:\n{missing_labels}")
     else:
@@ -181,7 +181,7 @@ def verify_dataset_integrity(folder: str, num_processes: int = 8) -> None:
         label_identifiers = [i[:-len(file_ending)] for i in labelfiles]
         labels_present = [i in label_identifiers for i in dataset.keys()]
         missing = [i for j, i in enumerate(dataset.keys()) if not labels_present[j]]
-        assert all(labels_present), 'not all training cases have a label file in labelsTr. Fix that. Missing: %s' % missing
+        assert all(labels_present), f'not all training cases have a label file in labelsTr. Fix that. Missing: {missing}'
 
     labelfiles = [v['label'] for v in dataset.values()]
     image_files = [v['images'] for v in dataset.values()]
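
Review note: replacing the `zip`-based comparison with `shape_image != shape_seg` in `check_cases` is slightly stricter, not only shorter. `zip` stops at the shorter tuple, so two shapes of different rank could previously pass as matching. A small sketch with made-up shapes; the two function names are hypothetical and exist only for this comparison:

```python
def shapes_mismatch_old(shape_image: tuple, shape_seg: tuple) -> bool:
    # previous check: element-wise over the zipped (truncated) pairs
    return not all([i == j for i, j in zip(shape_image, shape_seg)])


def shapes_mismatch_new(shape_image: tuple, shape_seg: tuple) -> bool:
    # new check: tuple inequality also accounts for differing lengths
    return shape_image != shape_seg


same_rank = ((10, 20, 30), (10, 20, 31))
different_rank = ((10, 20), (10, 20, 30))

print(shapes_mismatch_old(*same_rank), shapes_mismatch_new(*same_rank))
print(shapes_mismatch_old(*different_rank), shapes_mismatch_new(*different_rank))
```
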
diff --git a/nnunetv2/imageio/base_reader_writer.py b/nnunetv2/imageio/base_reader_writer.py
index d71226fa7..2847478ae 100644
--- a/nnunetv2/imageio/base_reader_writer.py
+++ b/nnunetv2/imageio/base_reader_writer.py
@@ -23,10 +23,7 @@ class BaseReaderWriter(ABC):
     def _check_all_same(input_list):
         # compare all entries to the first
         for i in input_list[1:]:
-            if not len(i) == len(input_list[0]):
-                return False
-            all_same = all(i[j] == input_list[0][j] for j in range(len(i)))
-            if not all_same:
+            if i != input_list[0]:
                 return False
         return True
 
@@ -34,10 +31,7 @@ def _check_all_same(input_list):
     def _check_all_same_array(input_list):
         # compare all entries to the first
         for i in input_list[1:]:
-            if not all([a == b for a, b in zip(i.shape, input_list[0].shape)]):
-                return False
-            all_same = np.allclose(i, input_list[0])
-            if not all_same:
+            if i.shape != input_list[0].shape or not np.allclose(i, input_list[0]):
                 return False
         return True
 
@@ -67,7 +61,7 @@ def read_images(self, image_fnames: Union[List[str], Tuple[str, ...]]) -> Tuple[
         :return:
             1) a np.ndarray of shape (c, x, y, z) where c is the number of image channels (can be 1) and x, y, z are
             the spatial dimensions (set x=1 for 2D! Example: (3, 1, 224, 224) for RGB image).
-            2) a dictionary with metadata. This can be anything. BUT it HAS to inclue a {'spacing': (a, b, c)} where a
+            2) a dictionary with metadata. This can be anything. BUT it HAS to include a {'spacing': (a, b, c)} where a
             is the spacing of x, b of y and c of z! If an image doesn't have spacing, just set this to 1. For 2D, set
             a=999 (largest spacing value! Make it larger than b and c)
 
@@ -85,7 +79,7 @@ def read_seg(self, seg_fname: str) -> Tuple[np.ndarray, dict]:
         :return:
             1) a np.ndarray of shape (1, x, y, z) where x, y, z are
             the spatial dimensions (set x=1 for 2D! Example: (1, 1, 224, 224) for 2D segmentation).
-            2) a dictionary with metadata. This can be anything. BUT it HAS to inclue a {'spacing': (a, b, c)} where a
+            2) a dictionary with metadata. This can be anything. BUT it HAS to include a {'spacing': (a, b, c)} where a
             is the spacing of x, b of y and c of z! If an image doesn't have spacing, just set this to 1. For 2D, set
             a=999 (largest spacing value! Make it larger than b and c)
         """
diff --git a/nnunetv2/imageio/natural_image_reager_writer.py b/nnunetv2/imageio/natural_image_reager_writer.py
index 6dd7718a8..11946c3ca 100644
--- a/nnunetv2/imageio/natural_image_reager_writer.py
+++ b/nnunetv2/imageio/natural_image_reager_writer.py
@@ -37,7 +37,7 @@ def read_images(self, image_fnames: Union[List[str], Tuple[str, ...]]) -> Tuple[
         images = []
         for f in image_fnames:
             npy_img = io.imread(f)
-            if len(npy_img.shape) == 3:
+            if npy_img.ndim == 3:
                 # rgb image, last dimension should be the color channel and the size of that channel should be 3
                 # (or 4 if we have alpha)
                 assert npy_img.shape[-1] == 3 or npy_img.shape[-1] == 4, "If image has three dimensions then the last " \
@@ -45,7 +45,7 @@ def read_images(self, image_fnames: Union[List[str], Tuple[str, ...]]) -> Tuple[
                                                                          f"(RGB or RGBA). Image shape here is {npy_img.shape}"
                 # move RGB(A) to front, add additional dim so that we have shape (1, c, X, Y), where c is either 3 or 4
                 images.append(npy_img.transpose((2, 0, 1))[:, None])
-            elif len(npy_img.shape) == 2:
+            elif npy_img.ndim == 2:
                 # grayscale image
                 images.append(npy_img[None, None])
 
diff --git a/nnunetv2/imageio/nibabel_reader_writer.py b/nnunetv2/imageio/nibabel_reader_writer.py
index e4fa3f50d..8faafb709 100644
--- a/nnunetv2/imageio/nibabel_reader_writer.py
+++ b/nnunetv2/imageio/nibabel_reader_writer.py
@@ -41,7 +41,7 @@ def read_images(self, image_fnames: Union[List[str], Tuple[str, ...]]) -> Tuple[
         spacings_for_nnunet = []
         for f in image_fnames:
             nib_image = nibabel.load(f)
-            assert len(nib_image.shape) == 3, 'only 3d images are supported by NibabelIO'
+            assert nib_image.ndim == 3, 'only 3d images are supported by NibabelIO'
             original_affine = nib_image.affine
 
             original_affines.append(original_affine)
@@ -120,7 +120,7 @@ def read_images(self, image_fnames: Union[List[str], Tuple[str, ...]]) -> Tuple[
         spacings_for_nnunet = []
         for f in image_fnames:
             nib_image = nibabel.load(f)
-            assert len(nib_image.shape) == 3, 'only 3d images are supported by NibabelIO'
+            assert nib_image.ndim == 3, 'only 3d images are supported by NibabelIO'
             original_affine = nib_image.affine
             reoriented_image = nib_image.as_reoriented(io_orientation(original_affine))
             reoriented_affine = reoriented_image.affine
diff --git a/nnunetv2/imageio/reader_writer_registry.py b/nnunetv2/imageio/reader_writer_registry.py
index bdbee5dfc..e2921e688 100644
--- a/nnunetv2/imageio/reader_writer_registry.py
+++ b/nnunetv2/imageio/reader_writer_registry.py
@@ -29,10 +29,10 @@ def determine_reader_writer_from_dataset_json(dataset_json_content: dict, exampl
         # trying to find that class in the nnunetv2.imageio module
         try:
             ret = recursive_find_reader_writer_by_name(ioclass_name)
-            if verbose: print('Using %s reader/writer' % ret)
+            if verbose: print(f'Using {ret} reader/writer')
             return ret
         except RuntimeError:
-            if verbose: print('Warning: Unable to find ioclass specified in dataset.json: %s' % ioclass_name)
+            if verbose: print(f'Warning: Unable to find ioclass specified in dataset.json: {ioclass_name}')
             if verbose: print('Trying to automatically determine desired class')
     return determine_reader_writer_from_file_ending(dataset_json_content['file_ending'], example_file,
                                                     allow_nonmatching_filename, verbose)
@@ -47,27 +47,27 @@ def determine_reader_writer_from_file_ending(file_ending: str, example_file: str
                 try:
                     tmp = rw()
                     _ = tmp.read_images((example_file,))
-                    if verbose: print('Using %s as reader/writer' % rw)
+                    if verbose: print(f'Using {rw} as reader/writer')
                     return rw
                 except:
                     if verbose: print(f'Failed to open file {example_file} with reader {rw}:')
                     traceback.print_exc()
                     pass
             else:
-                if verbose: print('Using %s as reader/writer' % rw)
+                if verbose: print(f'Using {rw} as reader/writer')
                 return rw
         else:
             if allow_nonmatching_filename and example_file is not None:
                 try:
                     tmp = rw()
                     _ = tmp.read_images((example_file,))
-                    if verbose: print('Using %s as reader/writer' % rw)
+                    if verbose: print(f'Using {rw} as reader/writer')
                     return rw
                 except:
                     if verbose: print(f'Failed to open file {example_file} with reader {rw}:')
                     if verbose: traceback.print_exc()
                     pass
-    raise RuntimeError("Unable to determine a reader for file ending %s and file %s (file None means no file provided)." % (file_ending, example_file))
+    raise RuntimeError(f"Unable to determine a reader for file ending {file_ending} and file {example_file} (file None means no file provided).")
 
 
 def recursive_find_reader_writer_by_name(rw_class_name: str) -> Type[BaseReaderWriter]:
diff --git a/nnunetv2/imageio/simpleitk_reader_writer.py b/nnunetv2/imageio/simpleitk_reader_writer.py
index 2b9b168c3..6a9afc24f 100644
--- a/nnunetv2/imageio/simpleitk_reader_writer.py
+++ b/nnunetv2/imageio/simpleitk_reader_writer.py
@@ -39,21 +39,21 @@ def read_images(self, image_fnames: Union[List[str], Tuple[str, ...]]) -> Tuple[
             origins.append(itk_image.GetOrigin())
             directions.append(itk_image.GetDirection())
             npy_image = sitk.GetArrayFromImage(itk_image)
-            if len(npy_image.shape) == 2:
+            if npy_image.ndim == 2:
                 # 2d
                 npy_image = npy_image[None, None]
                 max_spacing = max(spacings[-1])
                 spacings_for_nnunet.append((max_spacing * 999, *list(spacings[-1])[::-1]))
-            elif len(npy_image.shape) == 3:
+            elif npy_image.ndim == 3:
                 # 3d, as in original nnunet
                 npy_image = npy_image[None]
                 spacings_for_nnunet.append(list(spacings[-1])[::-1])
-            elif len(npy_image.shape) == 4:
+            elif npy_image.ndim == 4:
                 # 4d, multiple modalities in one file
                 spacings_for_nnunet.append(list(spacings[-1])[::-1][1:])
                 pass
             else:
-                raise RuntimeError("Unexpected number of dimensions: %d in file %s" % (len(npy_image.shape), f))
+                raise RuntimeError(f"Unexpected number of dimensions: {npy_image.ndim} in file {f}")
 
             images.append(npy_image)
             spacings_for_nnunet[-1] = list(np.abs(spacings_for_nnunet[-1]))
@@ -115,7 +115,7 @@ def read_seg(self, seg_fname: str) -> Tuple[np.ndarray, dict]:
         return self.read_images((seg_fname, ))
 
     def write_seg(self, seg: np.ndarray, output_fname: str, properties: dict) -> None:
-        assert len(seg.shape) == 3, 'segmentation must be 3d. If you are exporting a 2d segmentation, please provide it as shape 1,x,y'
+        assert seg.ndim == 3, 'segmentation must be 3d. If you are exporting a 2d segmentation, please provide it as shape 1,x,y'
         output_dimension = len(properties['sitk_stuff']['spacing'])
         assert 1 < output_dimension < 4
         if output_dimension == 2:
@@ -126,4 +126,4 @@ def write_seg(self, seg: np.ndarray, output_fname: str, properties: dict) -> Non
         itk_image.SetOrigin(properties['sitk_stuff']['origin'])
         itk_image.SetDirection(properties['sitk_stuff']['direction'])
 
-        sitk.WriteImage(itk_image, output_fname)
+        sitk.WriteImage(itk_image, output_fname, True)  # third positional argument enables compression
diff --git a/nnunetv2/imageio/tif_reader_writer.py b/nnunetv2/imageio/tif_reader_writer.py
index 0aa5ff3d0..19ad882a3 100644
--- a/nnunetv2/imageio/tif_reader_writer.py
+++ b/nnunetv2/imageio/tif_reader_writer.py
@@ -45,15 +45,15 @@ def read_images(self, image_fnames: Union[List[str], Tuple[str, ...]]) -> Tuple[
         images = []
         for f in image_fnames:
             image = tifffile.imread(f)
-            if len(image.shape) != 3:
-                raise RuntimeError("Only 3D images are supported! File: %s" % f)
+            if image.ndim != 3:
+                raise RuntimeError(f"Only 3D images are supported! File: {f}")
             images.append(image[None])
 
         # see if aux file can be found
         expected_aux_file = image_fnames[0][:-truncate_length] + '.json'
         if isfile(expected_aux_file):
             spacing = load_json(expected_aux_file)['spacing']
-            assert len(spacing) == 3, 'spacing must have 3 entries, one for each dimension of the image. File: %s' % expected_aux_file
+            assert len(spacing) == 3, f'spacing must have 3 entries, one for each dimension of the image. File: {expected_aux_file}'
         else:
             print(f'WARNING no spacing file found for images {image_fnames}\nAssuming spacing (1, 1, 1).')
             spacing = (1, 1, 1)
@@ -83,7 +83,7 @@ def read_seg(self, seg_fname: str) -> Tuple[np.ndarray, dict]:
         ending_length = len(ending)
 
         seg = tifffile.imread(seg_fname)
-        if len(seg.shape) != 3:
+        if seg.ndim != 3:
             raise RuntimeError(f"Only 3D images are supported! File: {seg_fname}")
         seg = seg[None]
 
@@ -91,7 +91,7 @@ def read_seg(self, seg_fname: str) -> Tuple[np.ndarray, dict]:
         expected_aux_file = seg_fname[:-ending_length] + '.json'
         if isfile(expected_aux_file):
             spacing = load_json(expected_aux_file)['spacing']
-            assert len(spacing) == 3, 'spacing must have 3 entries, one for each dimension of the image. File: %s' % expected_aux_file
+            assert len(spacing) == 3, f'spacing must have 3 entries, one for each dimension of the image. File: {expected_aux_file}'
             assert all([i > 0 for i in spacing]), f"Spacing must be > 0, spacing: {spacing}"
         else:
             print(f'WARNING no spacing file found for segmentation {seg_fname}\nAssuming spacing (1, 1, 1).')
diff --git a/nnunetv2/inference/data_iterators.py b/nnunetv2/inference/data_iterators.py
index 3b287a171..9dfee4e26 100644
--- a/nnunetv2/inference/data_iterators.py
+++ b/nnunetv2/inference/data_iterators.py
@@ -28,7 +28,7 @@ def preprocess_fromfiles_save_to_queue(list_of_lists: List[List[str]],
         label_manager = plans_manager.get_label_manager(dataset_json)
         preprocessor = configuration_manager.preprocessor_class(verbose=verbose)
         for idx in range(len(list_of_lists)):
-            data, seg, data_properites = preprocessor.run_case(list_of_lists[idx],
+            data, seg, data_properties = preprocessor.run_case(list_of_lists[idx],
                                                                list_of_segs_from_prev_stage_files[
                                                                    idx] if list_of_segs_from_prev_stage_files is not None else None,
                                                                plans_manager,
@@ -40,7 +40,7 @@ def preprocess_fromfiles_save_to_queue(list_of_lists: List[List[str]],
 
             data = torch.from_numpy(data).contiguous().float()
 
-            item = {'data': data, 'data_properites': data_properites,
+            item = {'data': data, 'data_properties': data_properties,
                     'ofile': output_filenames_truncated[idx] if output_filenames_truncated is not None else None}
             success = False
             while not success:
@@ -150,7 +150,7 @@ def generate_train_batch(self):
         # if we have a segmentation from the previous stage we have to process it together with the images so that we
         # can crop it appropriately (if needed). Otherwise it would just be resized to the shape of the data after
         # preprocessing and then there might be misalignments
-        data, seg, data_properites = self.preprocessor.run_case(files, seg_prev_stage, self.plans_manager,
+        data, seg, data_properties = self.preprocessor.run_case(files, seg_prev_stage, self.plans_manager,
                                                                 self.configuration_manager,
                                                                 self.dataset_json)
         if seg_prev_stage is not None:
@@ -159,7 +159,7 @@ def generate_train_batch(self):
 
         data = torch.from_numpy(data)
 
-        return {'data': data, 'data_properites': data_properites, 'ofile': ofile}
+        return {'data': data, 'data_properties': data_properties, 'ofile': ofile}
 
 
 class PreprocessAdapterFromNpy(DataLoader):
@@ -207,7 +207,7 @@ def generate_train_batch(self):
 
         data = torch.from_numpy(data)
 
-        return {'data': data, 'data_properites': props, 'ofile': ofname}
+        return {'data': data, 'data_properties': props, 'ofile': ofname}
 
 
 def preprocess_fromnpy_save_to_queue(list_of_images: List[np.ndarray],
@@ -238,7 +238,7 @@ def preprocess_fromnpy_save_to_queue(list_of_images: List[np.ndarray],
 
             data = torch.from_numpy(data).contiguous().float()
 
-            item = {'data': data, 'data_properites': list_of_image_properties[idx],
+            item = {'data': data, 'data_properties': list_of_image_properties[idx],
                     'ofile': truncated_ofnames[idx] if truncated_ofnames is not None else None}
             success = False
             while not success:
diff --git a/nnunetv2/inference/examples.py b/nnunetv2/inference/examples.py
index 8e8f264b9..b57a39831 100644
--- a/nnunetv2/inference/examples.py
+++ b/nnunetv2/inference/examples.py
@@ -80,7 +80,7 @@
     img4, props4 = SimpleITKIO().read_images([join(nnUNet_raw, 'Dataset003_Liver/imagesTs/liver_144_0000.nii.gz')])
 
 
-    # each element returned by data_iterator must be a dict with 'data', 'ofile' and 'data_properites' keys!
+    # each element returned by data_iterator must be a dict with 'data', 'ofile' and 'data_properties' keys!
     # If 'ofile' is None, the result will be returned instead of written to a file
     # the iterator is responsible for performing the correct preprocessing!
     # note how the iterator here does not use multiprocessing -> preprocessing will be done in the main thread!
@@ -95,7 +95,7 @@ def my_iterator(list_of_input_arrs, list_of_input_props):
                                                   predictor.plans_manager,
                                                   predictor.configuration_manager,
                                                   predictor.dataset_json)
-            yield {'data': torch.from_numpy(data).contiguous().pin_memory(), 'data_properites': p, 'ofile': None}
+            yield {'data': torch.from_numpy(data).contiguous().pin_memory(), 'data_properties': p, 'ofile': None}
 
 
     ret = predictor.predict_from_data_iterator(my_iterator([img, img2, img3, img4], [props, props2, props3, props4]),
diff --git a/nnunetv2/inference/export_prediction.py b/nnunetv2/inference/export_prediction.py
index 6578be061..33035676b 100644
--- a/nnunetv2/inference/export_prediction.py
+++ b/nnunetv2/inference/export_prediction.py
@@ -31,8 +31,8 @@ def convert_predicted_logits_to_segmentation_with_correct_shape(predicted_logits
                                             properties_dict['shape_after_cropping_and_before_resampling'],
                                             current_spacing,
                                             properties_dict['spacing'])
-    # return value of resampling_fn_probabilities can be ndarray or Tensor but that doesnt matter because
-    # apply_inference_nonlin will covnert to torch
+    # return value of resampling_fn_probabilities can be ndarray or Tensor but that does not matter because
+    # apply_inference_nonlin will convert to torch
     predicted_probabilities = label_manager.apply_inference_nonlin(predicted_logits)
     del predicted_logits
     segmentation = label_manager.convert_probabilities_to_segmentation(predicted_probabilities)
diff --git a/nnunetv2/inference/predict_from_raw_data.py b/nnunetv2/inference/predict_from_raw_data.py
index 63b966c0f..3e3e3dd56 100644
--- a/nnunetv2/inference/predict_from_raw_data.py
+++ b/nnunetv2/inference/predict_from_raw_data.py
@@ -333,7 +333,7 @@ def predict_from_data_iterator(self,
                                    save_probabilities: bool = False,
                                    num_processes_segmentation_export: int = default_num_processes):
         """
-        each element returned by data_iterator must be a dict with 'data', 'ofile' and 'data_properites' keys!
+        each element returned by data_iterator must be a dict with 'data', 'ofile' and 'data_properties' keys!
         If 'ofile' is None, the result will be returned instead of written to a file
         """
         with multiprocessing.get_context("spawn").Pool(num_processes_segmentation_export) as export_pool:
@@ -354,7 +354,7 @@ def predict_from_data_iterator(self,
 
                 print(f'perform_everything_on_gpu: {self.perform_everything_on_gpu}')
 
-                properties = preprocessed['data_properites']
+                properties = preprocessed['data_properties']
 
                 # let's not get into a runaway situation where the GPU predicts so fast that the disk has to b swamped with
                 # npy files
@@ -430,14 +430,14 @@ def predict_single_npy_array(self, input_image: np.ndarray, image_properties: di
         if self.verbose:
             print('resampling to original shape')
         if output_file_truncated is not None:
-            export_prediction_from_logits(predicted_logits, dct['data_properites'], self.configuration_manager,
+            export_prediction_from_logits(predicted_logits, dct['data_properties'], self.configuration_manager,
                                           self.plans_manager, self.dataset_json, output_file_truncated,
                                           save_or_return_probabilities)
         else:
             ret = convert_predicted_logits_to_segmentation_with_correct_shape(predicted_logits, self.plans_manager,
                                                                               self.configuration_manager,
                                                                               self.label_manager,
-                                                                              dct['data_properites'],
+                                                                              dct['data_properties'],
                                                                               return_probabilities=
                                                                               save_or_return_probabilities)
             if save_or_return_probabilities:
@@ -546,7 +546,7 @@ def _internal_maybe_mirror_and_predict(self, x: torch.Tensor) -> torch.Tensor:
         if mirror_axes is not None:
             # check for invalid numbers in mirror_axes
             # x should be 5d for 3d images and 4d for 2d. so the max value of mirror_axes cannot exceed len(x.shape) - 3
-            assert max(mirror_axes) <= len(x.shape) - 3, 'mirror_axes does not match the dimension of the input!'
+            assert max(mirror_axes) <= x.ndim - 3, 'mirror_axes does not match the dimension of the input!'
 
             num_predictons = 2 ** len(mirror_axes)
             if 0 in mirror_axes:
@@ -582,7 +582,7 @@ def predict_sliding_window_return_logits(self, input_image: torch.Tensor) \
         # So autocast will only be active if we have a cuda device.
         with torch.no_grad():
             with torch.autocast(self.device.type, enabled=True) if self.device.type == 'cuda' else dummy_context():
-                assert len(input_image.shape) == 4, 'input_image must be a 4D np.ndarray or torch.Tensor (c, x, y, z)'
+                assert input_image.ndim == 4, 'input_image must be a 4D np.ndarray or torch.Tensor (c, x, y, z)'
 
                 if self.verbose: print(f'Input shape: {input_image.shape}')
                 if self.verbose: print("step_size:", self.tile_step_size)
@@ -805,7 +805,7 @@ def predict_entry_point():
     if not isdir(args.o):
         maybe_mkdir_p(args.o)
 
-    # slightly passive agressive haha
+    # slightly passive aggressive haha
     assert args.part_id < args.num_parts, 'Do you even read the documentation? See nnUNetv2_predict -h.'
 
     assert args.device in ['cpu', 'cuda',
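
Review note: renaming the key from `'data_properites'` to `'data_properties'` changes what `predict_from_data_iterator` reads from each item, so custom iterators written against the old spelling would presumably fail with a `KeyError` after this patch. A hypothetical shim that external code could apply while migrating; `normalize_iterator_item` is an invented name and is not part of nnU-Net:

```python
def normalize_iterator_item(item: dict) -> dict:
    """Return a copy of `item` that uses the corrected 'data_properties' key."""
    item = dict(item)  # shallow copy, leave the caller's dict untouched
    if 'data_properites' in item and 'data_properties' not in item:
        item['data_properties'] = item.pop('data_properites')
    return item


def normalized(iterator):
    # wrap an existing iterator so every yielded item uses the new key
    for item in iterator:
        yield normalize_iterator_item(item)


legacy_items = [{'data': None, 'data_properites': {'spacing': (1, 1, 1)}, 'ofile': None}]
print(list(normalized(legacy_items)))
```
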
diff --git a/nnunetv2/inference/readme.md b/nnunetv2/inference/readme.md
index 984b4a98f..721952888 100644
--- a/nnunetv2/inference/readme.md
+++ b/nnunetv2/inference/readme.md
@@ -82,7 +82,7 @@ need for the _0000 suffix anymore! This can be useful in situations where you ha
 Remember that the files must be given as 'list of lists' where each entry in the outer list is a case to be predicted 
 and the inner list contains all the files belonging to that case. There is just one file for datasets with just one 
 input modality (such as CT) but may be more files for others (such as MRI where there is sometimes T1, T2, Flair etc). 
-IMPORTANT: the order in wich the files for each case are given must match the order of the channels as defined in the 
+IMPORTANT: the order in which the files for each case are given must match the order of the channels as defined in the 
 dataset.json!
 
 If you give files as input, you need to give individual output files as output!
@@ -184,7 +184,7 @@ cons:
     img2, props2 = SimpleITKIO().read_images([join(nnUNet_raw, 'Dataset003_Liver/imagesTs/liver_146_0000.nii.gz')])
     img3, props3 = SimpleITKIO().read_images([join(nnUNet_raw, 'Dataset003_Liver/imagesTs/liver_145_0000.nii.gz')])
     img4, props4 = SimpleITKIO().read_images([join(nnUNet_raw, 'Dataset003_Liver/imagesTs/liver_144_0000.nii.gz')])
-    # each element returned by data_iterator must be a dict with 'data', 'ofile' and 'data_properites' keys!
+    # each element returned by data_iterator must be a dict with 'data', 'ofile' and 'data_properties' keys!
     # If 'ofile' is None, the result will be returned instead of written to a file
     # the iterator is responsible for performing the correct preprocessing!
     # note how the iterator here does not use multiprocessing -> preprocessing will be done in the main thread!
@@ -199,7 +199,7 @@ cons:
                                                   predictor.plans_manager,
                                                   predictor.configuration_manager,
                                                   predictor.dataset_json)
-            yield {'data': torch.from_numpy(data).contiguous().pin_memory(), 'data_properites': p, 'ofile': None}
+            yield {'data': torch.from_numpy(data).contiguous().pin_memory(), 'data_properties': p, 'ofile': None}
     ret = predictor.predict_from_data_iterator(my_iterator([img, img2, img3, img4], [props, props2, props3, props4]),
                                                save_probabilities=False, num_processes_segmentation_export=3)
 ```
\ No newline at end of file
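
Review note: the "list of lists" input described in the readme hunk above can be assembled as sketched below. This assumes the usual four-digit channel suffix (`_0000`, `_0001`, ...) mentioned in the readme; the helper name, folder, and case names are hypothetical:

```python
import posixpath
from typing import List


def build_list_of_lists(folder: str, case_ids: List[str], num_channels: int,
                        file_ending: str = '.nii.gz') -> List[List[str]]:
    """One inner list per case; inner entries ordered by channel index."""
    return [
        [posixpath.join(folder, f'{case}_{channel:04d}{file_ending}')
         for channel in range(num_channels)]
        for case in case_ids
    ]


# two cases with two input channels each (illustrative names)
for files in build_list_of_lists('/data/imagesTs', ['case_001', 'case_002'], 2):
    print(files)
```

The channel order inside each inner list has to match the channel order defined in dataset.json, as the readme stresses.
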
diff --git a/nnunetv2/model_sharing/model_download.py b/nnunetv2/model_sharing/model_download.py
index d845ab5c3..02dac5f40 100644
--- a/nnunetv2/model_sharing/model_download.py
+++ b/nnunetv2/model_sharing/model_download.py
@@ -20,7 +20,7 @@ def download_and_install_from_url(url):
     import os
     home = os.path.expanduser('~')
     random_number = int(time() * 1e7)
-    tempfile = join(home, '.nnunetdownload_%s' % str(random_number))
+    tempfile = join(home, f'.nnunetdownload_{random_number}')
 
     try:
         download_file(url=url, local_filename=tempfile, chunk_size=8192 * 16)
diff --git a/nnunetv2/model_sharing/model_export.py b/nnunetv2/model_sharing/model_export.py
index 2db8e24f9..51eb455f2 100644
--- a/nnunetv2/model_sharing/model_export.py
+++ b/nnunetv2/model_sharing/model_export.py
@@ -23,7 +23,7 @@ def export_pretrained_model(dataset_name_or_id: Union[int, str], output_file: st
                 else:
                     continue
 
-            expected_fold_folder = ["fold_%s" % i if i != 'all' else 'fold_all' for i in folds]
+            expected_fold_folder = [f"fold_{i}" for i in folds]
             assert all([isdir(join(trainer_output_dir, i)) for i in expected_fold_folder]), \
                 f"not all requested folds are present; {dataset_name} {c}; requested folds: {folds}"
 
diff --git a/nnunetv2/postprocessing/remove_connected_components.py b/nnunetv2/postprocessing/remove_connected_components.py
index 94724fc05..df299326b 100644
--- a/nnunetv2/postprocessing/remove_connected_components.py
+++ b/nnunetv2/postprocessing/remove_connected_components.py
@@ -71,7 +71,7 @@ def determine_postprocessing(folder_predictions: str,
     if plans_file_or_dict is None:
         expected_plans_file = join(folder_predictions, 'plans.json')
         if not isfile(expected_plans_file):
-            raise RuntimeError(f"Expected plans file missing: {expected_plans_file}. The plans fils should have been "
+            raise RuntimeError(f"Expected plans file missing: {expected_plans_file}. The plans file should have been "
                                f"created while running nnUNetv2_predict. Sadge.")
         plans_file_or_dict = load_json(expected_plans_file)
     plans_manager = PlansManager(plans_file_or_dict)
@@ -80,7 +80,7 @@ def determine_postprocessing(folder_predictions: str,
         expected_dataset_json_file = join(folder_predictions, 'dataset.json')
         if not isfile(expected_dataset_json_file):
             raise RuntimeError(
-                f"Expected plans file missing: {expected_dataset_json_file}. The plans fils should have been "
+                f"Expected dataset.json file missing: {expected_dataset_json_file}. The dataset.json file should have been "
                 f"created while running nnUNetv2_predict. Sadge.")
         dataset_json_file_or_dict = load_json(expected_dataset_json_file)
 
diff --git a/nnunetv2/preprocessing/cropping/cropping.py b/nnunetv2/preprocessing/cropping/cropping.py
index cb6052c7a..96fe7b7db 100644
--- a/nnunetv2/preprocessing/cropping/cropping.py
+++ b/nnunetv2/preprocessing/cropping/cropping.py
@@ -12,7 +12,7 @@ def create_nonzero_mask(data):
     :return: the mask is True where the data is nonzero
     """
     from scipy.ndimage import binary_fill_holes
-    assert len(data.shape) == 4 or len(data.shape) == 3, "data must have shape (C, X, Y, Z) or shape (C, X, Y)"
+    assert data.ndim in (3, 4), "data must have shape (C, X, Y, Z) or shape (C, X, Y)"
     nonzero_mask = np.zeros(data.shape[1:], dtype=bool)
     for c in range(data.shape[0]):
         this_mask = data[c] != 0
diff --git a/nnunetv2/preprocessing/preprocessors/default_preprocessor.py b/nnunetv2/preprocessing/preprocessors/default_preprocessor.py
index d863b1937..1cf6e489f 100644
--- a/nnunetv2/preprocessing/preprocessors/default_preprocessor.py
+++ b/nnunetv2/preprocessing/preprocessors/default_preprocessor.py
@@ -128,7 +128,7 @@ def run_case(self, image_files: List[str], seg_file: Union[str, None], plans_man
         rw = plans_manager.image_reader_writer_class()
 
         # load image(s)
-        data, data_properites = rw.read_images(image_files)
+        data, data_properties = rw.read_images(image_files)
 
         # if possible, load seg
         if seg_file is not None:
@@ -136,9 +136,9 @@ def run_case(self, image_files: List[str], seg_file: Union[str, None], plans_man
         else:
             seg = None
 
-        data, seg = self.run_case_npy(data, seg, data_properites, plans_manager, configuration_manager,
+        data, seg = self.run_case_npy(data, seg, data_properties, plans_manager, configuration_manager,
                                       dataset_json)
-        return data, seg, data_properites
+        return data, seg, data_properties
 
     def run_case_save(self, output_filename_truncated: str, image_files: List[str], seg_file: str,
                       plans_manager: PlansManager, configuration_manager: ConfigurationManager,
@@ -185,7 +185,7 @@ def _normalize(self, data: np.ndarray, seg: np.ndarray, configuration_manager: C
                                                            scheme,
                                                            'nnunetv2.preprocessing.normalization')
             if normalizer_class is None:
-                raise RuntimeError('Unable to locate class \'%s\' for normalization' % scheme)
+                raise RuntimeError(f"Unable to locate class '{scheme}' for normalization")
             normalizer = normalizer_class(use_mask_for_norm=configuration_manager.use_mask_for_norm[c],
                                           intensityproperties=foreground_intensity_properties_per_channel[str(c)])
             data[c] = normalizer.run(data[c], seg[0])
diff --git a/nnunetv2/preprocessing/resampling/default_resampling.py b/nnunetv2/preprocessing/resampling/default_resampling.py
index ecb0435e9..e83f61463 100644
--- a/nnunetv2/preprocessing/resampling/default_resampling.py
+++ b/nnunetv2/preprocessing/resampling/default_resampling.py
@@ -65,7 +65,7 @@ def resample_data_or_seg_to_spacing(data: np.ndarray,
             pass
 
     if data is not None:
-        assert len(data.shape) == 4, "data must be c x y z"
+        assert data.ndim == 4, "data must be c x y z"
 
     shape = np.array(data[0].shape)
     new_shape = compute_new_shape(shape[1:], current_spacing, new_spacing)
@@ -116,7 +116,7 @@ def resample_data_or_seg_to_shape(data: Union[torch.Tensor, np.ndarray],
             pass
 
     if data is not None:
-        assert len(data.shape) == 4, "data must be c x y z"
+        assert data.ndim == 4, "data must be c x y z"
 
     data_reshaped = resample_data_or_seg(data, new_shape, is_seg, axis, order, do_separate_z, order_z=order_z)
     return data_reshaped
@@ -136,8 +136,8 @@ def resample_data_or_seg(data: np.ndarray, new_shape: Union[Tuple[float, ...], L
     :param order_z: only applies if do_separate_z is True
     :return:
     """
-    assert len(data.shape) == 4, "data must be (c, x, y, z)"
-    assert len(new_shape) == len(data.shape) - 1
+    assert data.ndim == 4, "data must be (c, x, y, z)"
+    assert len(new_shape) == data.ndim - 1
 
     if is_seg:
         resize_fn = resize_segmentation
diff --git a/nnunetv2/run/load_pretrained_weights.py b/nnunetv2/run/load_pretrained_weights.py
index b5c51bf48..bb26e41e1 100644
--- a/nnunetv2/run/load_pretrained_weights.py
+++ b/nnunetv2/run/load_pretrained_weights.py
@@ -9,7 +9,7 @@ def load_pretrained_weights(network, fname, verbose=False):
     shape is also the same. Segmentation layers (the 1x1(x1) layers that produce the segmentation maps)
     identified by keys ending with '.seg_layers') are not transferred!
 
-    If the pretrained weights were optained with a training outside nnU-Net and DDP or torch.optimize was used,
+    If the pretrained weights were obtained with a training outside nnU-Net and DDP or torch.compile was used,
     you need to change the keys of the pretrained state_dict. DDP adds a 'module.' prefix and torch.optim adds
     '_orig_mod'. You DO NOT need to worry about this if pretraining was done with nnU-Net as
     nnUNetTrainer.save_checkpoint takes care of that!
diff --git a/nnunetv2/training/data_augmentation/custom_transforms/deep_supervision_donwsampling.py b/nnunetv2/training/data_augmentation/custom_transforms/deep_supervision_donwsampling.py
index 6469ee23e..d31881fb3 100644
--- a/nnunetv2/training/data_augmentation/custom_transforms/deep_supervision_donwsampling.py
+++ b/nnunetv2/training/data_augmentation/custom_transforms/deep_supervision_donwsampling.py
@@ -26,7 +26,7 @@ def __init__(self, ds_scales: Union[List, Tuple],
 
     def __call__(self, **data_dict):
         if self.axes is None:
-            axes = list(range(2, len(data_dict[self.input_key].shape)))
+            axes = list(range(2, data_dict[self.input_key].ndim))
         else:
             axes = self.axes
 
diff --git a/nnunetv2/training/dataloading/data_loader_2d.py b/nnunetv2/training/dataloading/data_loader_2d.py
index b44004f64..aab84384d 100644
--- a/nnunetv2/training/dataloading/data_loader_2d.py
+++ b/nnunetv2/training/dataloading/data_loader_2d.py
@@ -16,6 +16,7 @@ def generate_train_batch(self):
             # (Lung for example)
             force_fg = self.get_do_oversample(j)
             data, seg, properties = self._data.load_case(current_key)
+            case_properties.append(properties)
 
             # select a class/region first, then a slice where this class is present, then crop to that area
             if not force_fg:
diff --git a/nnunetv2/training/dataloading/data_loader_3d.py b/nnunetv2/training/dataloading/data_loader_3d.py
index ab755e3ec..e8345f8d6 100644
--- a/nnunetv2/training/dataloading/data_loader_3d.py
+++ b/nnunetv2/training/dataloading/data_loader_3d.py
@@ -17,6 +17,7 @@ def generate_train_batch(self):
             force_fg = self.get_do_oversample(j)
 
             data, seg, properties = self._data.load_case(i)
+            case_properties.append(properties)
 
             # If we are doing the cascade then the segmentation from the previous stage will already have been loaded by
             # self._data.load_case(i) (see nnUNetDataset.load_case)
diff --git a/nnunetv2/training/dataloading/nnunet_dataset.py b/nnunetv2/training/dataloading/nnunet_dataset.py
index ae27fc30f..153a00531 100644
--- a/nnunetv2/training/dataloading/nnunet_dataset.py
+++ b/nnunetv2/training/dataloading/nnunet_dataset.py
@@ -43,10 +43,10 @@ def __init__(self, folder: str, case_identifiers: List[str] = None,
         self.dataset = {}
         for c in case_identifiers:
             self.dataset[c] = {}
-            self.dataset[c]['data_file'] = join(folder, "%s.npz" % c)
-            self.dataset[c]['properties_file'] = join(folder, "%s.pkl" % c)
+            self.dataset[c]['data_file'] = join(folder, f"{c}.npz")
+            self.dataset[c]['properties_file'] = join(folder, f"{c}.pkl")
             if folder_with_segs_from_previous_stage is not None:
-                self.dataset[c]['seg_from_prev_stage_file'] = join(folder_with_segs_from_previous_stage, "%s.npz" % c)
+                self.dataset[c]['seg_from_prev_stage_file'] = join(folder_with_segs_from_previous_stage, f"{c}.npz")
 
         if len(case_identifiers) <= num_images_properties_loading_threshold:
             for i in self.dataset.keys():
@@ -123,7 +123,7 @@ def load_case(self, key):
 
     # this should have the properties
     ds = nnUNetDataset(folder, num_images_properties_loading_threshold=1000)
-    # now rename the properties file so that it doesnt exist anymore
+    # now rename the properties file so that it does not exist anymore
     shutil.move(join(folder, 'liver_0.pkl'), join(folder, 'liver_XXX.pkl'))
     # now we should still be able to access the properties because they have already been loaded
     ks = ds['liver_0'].keys()
@@ -133,7 +133,7 @@ def load_case(self, key):
 
     # this should not have the properties
     ds = nnUNetDataset(folder, num_images_properties_loading_threshold=0)
-    # now rename the properties file so that it doesnt exist anymore
+    # now rename the properties file so that it does not exist anymore
     shutil.move(join(folder, 'liver_0.pkl'), join(folder, 'liver_XXX.pkl'))
     # now this should crash
     try:
diff --git a/nnunetv2/training/loss/deep_supervision.py b/nnunetv2/training/loss/deep_supervision.py
index db71e8088..03141e809 100644
--- a/nnunetv2/training/loss/deep_supervision.py
+++ b/nnunetv2/training/loss/deep_supervision.py
@@ -16,7 +16,7 @@ def __init__(self, loss, weight_factors=None):
 
     def forward(self, *args):
         for i in args:
-            assert isinstance(i, (tuple, list)), "all args must be either tuple or list, got %s" % type(i)
+            assert isinstance(i, (tuple, list)), f"all args must be either tuple or list, got {type(i)}"
             # we could check for equal lengths here as well but we really shouldn't overdo it with checks because
             # this code is executed a lot of times!
 
diff --git a/nnunetv2/training/loss/dice.py b/nnunetv2/training/loss/dice.py
index d5f0c5bd1..af554908b 100644
--- a/nnunetv2/training/loss/dice.py
+++ b/nnunetv2/training/loss/dice.py
@@ -70,33 +70,32 @@ def __init__(self, apply_nonlin: Callable = None, batch_dice: bool = False, do_b
         self.ddp = ddp
 
     def forward(self, x, y, loss_mask=None):
-        shp_x, shp_y = x.shape, y.shape
-
         if self.apply_nonlin is not None:
             x = self.apply_nonlin(x)
 
-        if not self.do_bg:
-            x = x[:, 1:]
-
         # make everything shape (b, c)
-        axes = list(range(2, len(shp_x)))
-
+        axes = list(range(2, x.ndim))
         with torch.no_grad():
-            if len(shp_x) != len(shp_y):
-                y = y.view((shp_y[0], 1, *shp_y[1:]))
+            if x.ndim != y.ndim:
+                y = y.view((y.shape[0], 1, *y.shape[1:]))
 
-            if all([i == j for i, j in zip(shp_x, shp_y)]):
+            if x.shape == y.shape:
                 # if this is the case then gt is probably already a one hot encoding
                 y_onehot = y
             else:
                 gt = y.long()
-                y_onehot = torch.zeros(shp_x, device=x.device, dtype=torch.bool)
+                y_onehot = torch.zeros(x.shape, device=x.device, dtype=torch.bool)
                 y_onehot.scatter_(1, gt, 1)
 
             if not self.do_bg:
                 y_onehot = y_onehot[:, 1:]
+
             sum_gt = y_onehot.sum(axes) if loss_mask is None else (y_onehot * loss_mask).sum(axes)
 
+        # this one MUST be outside the with torch.no_grad(): context. Otherwise no gradients for you
+        if not self.do_bg:
+            x = x[:, 1:]
+
         intersect = (x * y_onehot).sum(axes) if loss_mask is None else (x * y_onehot * loss_mask).sum(axes)
         sum_pred = x.sum(axes) if loss_mask is None else (x * loss_mask).sum(axes)
 
@@ -138,7 +137,7 @@ def get_tp_fp_fn_tn(net_output, gt, axes=None, mask=None, square=False):
         if len(shp_x) != len(shp_y):
             gt = gt.view((shp_y[0], 1, *shp_y[1:]))
 
-        if all([i == j for i, j in zip(net_output.shape, gt.shape)]):
+        if net_output.shape == gt.shape:
             # if this is the case then gt is probably already a one hot encoding
             y_onehot = gt
         else:
diff --git a/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py b/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py
index 28c875ab2..57aa904b2 100644
--- a/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py
+++ b/nnunetv2/training/nnUNetTrainer/nnUNetTrainer.py
@@ -80,7 +80,9 @@ def __init__(self, plans: dict, configuration: str, fold: int, dataset_json: dic
         # one day code base understandable and grug can get work done, everything good!
         # next day impossible: complexity demon spirit has entered code and very dangerous situation!
 
-        # OK OK I am guilty. But I tried. http://tiny.cc/gzgwuz
+        # OK OK I am guilty. But I tried.
+        # https://www.osnews.com/images/comics/wtfm.jpg
+        # https://i.pinimg.com/originals/26/b2/50/26b250a738ea4abc7a5af4d42ad93af0.jpg
 
         self.is_ddp = dist.is_available() and dist.is_initialized()
         self.local_rank = 0 if not self.is_ddp else dist.get_rank()
@@ -426,7 +428,7 @@ def print_to_log_file(self, *args, also_print_to_console=True, add_timestamp=Tru
             dt_object = datetime.fromtimestamp(timestamp)
 
             if add_timestamp:
-                args = ("%s:" % dt_object, *args)
+                args = (f"{dt_object}:", *args)
 
             successful = False
             max_attempts = 5
@@ -440,7 +442,7 @@ def print_to_log_file(self, *args, also_print_to_console=True, add_timestamp=Tru
                         f.write("\n")
                     successful = True
                 except IOError:
-                    print("%s: failed to log: " % datetime.fromtimestamp(timestamp), sys.exc_info())
+                    print(f"{datetime.fromtimestamp(timestamp)}: failed to log: ", sys.exc_info())
                     sleep(0.5)
                     ctr += 1
             if also_print_to_console:
@@ -539,7 +541,7 @@ def do_split(self):
             else:
                 self.print_to_log_file("Using splits from existing split file:", splits_file)
                 splits = load_json(splits_file)
-                self.print_to_log_file("The split file contains %d splits." % len(splits))
+                self.print_to_log_file(f"The split file contains {len(splits)} splits.")
 
             self.print_to_log_file("Desired fold for training: %d" % self.fold)
             if self.fold < len(splits):
@@ -829,7 +831,12 @@ def on_train_start(self):
         # print(f"oversample: {self.oversample_foreground_percent}")
 
     def on_train_end(self):
+        # dirty hack: on_epoch_end increments the epoch counter before on_train_end runs, so without
+        # this temporary decrement the final checkpoint would store the wrong current epoch
+        self.current_epoch -= 1
         self.save_checkpoint(join(self.output_folder, "checkpoint_final.pth"))
+        self.current_epoch += 1
+
         # now we can delete latest
         if self.local_rank == 0 and isfile(join(self.output_folder, "checkpoint_latest.pth")):
             os.remove(join(self.output_folder, "checkpoint_latest.pth"))
@@ -867,7 +874,7 @@ def train_step(self, batch: dict) -> dict:
         else:
             target = target.to(self.device, non_blocking=True)
 
-        self.optimizer.zero_grad()
+        self.optimizer.zero_grad(set_to_none=True)
         # Autocast is a little bitch.
         # If the device_type is 'cpu' then it's slow as heck and needs to be disabled.
         # If the device_type is 'mps' then it will complain that mps is not implemented, even if enabled=False is set. Whyyyyyyy. (this is why we don't make use of enabled=False)
@@ -928,7 +935,7 @@ def validation_step(self, batch: dict) -> dict:
         target = target[0]
 
         # the following is needed for online evaluation. Fake dice (green line)
-        axes = [0] + list(range(2, len(output.shape)))
+        axes = [0] + list(range(2, output.ndim))
 
         if self.label_manager.has_regions:
             predicted_segmentation_onehot = (torch.sigmoid(output) > 0.5).long()
diff --git a/nnunetv2/training/nnUNetTrainer/variants/network_architecture/nnUNetTrainerBN.py b/nnunetv2/training/nnUNetTrainer/variants/network_architecture/nnUNetTrainerBN.py
index b2f26e2d6..5f6190c1b 100644
--- a/nnunetv2/training/nnUNetTrainer/variants/network_architecture/nnUNetTrainerBN.py
+++ b/nnunetv2/training/nnUNetTrainer/variants/network_architecture/nnUNetTrainerBN.py
@@ -45,7 +45,7 @@ def build_network_architecture(plans_manager: PlansManager,
                                                                   'is non-standard (maybe your own?). Yo\'ll have to dive ' \
                                                                   'into either this ' \
                                                                   'function (get_network_from_plans) or ' \
-                                                                  'the init of your nnUNetModule to accomodate that.'
+                                                                  'the init of your nnUNetModule to accommodate that.'
         network_class = mapping[segmentation_network_class_name]
 
         conv_or_blocks_per_stage = {
diff --git a/nnunetv2/training/nnUNetTrainer/variants/network_architecture/nnUNetTrainerNoDeepSupervision.py b/nnunetv2/training/nnUNetTrainer/variants/network_architecture/nnUNetTrainerNoDeepSupervision.py
index a07ff8ab1..34f9b554f 100644
--- a/nnunetv2/training/nnUNetTrainer/variants/network_architecture/nnUNetTrainerNoDeepSupervision.py
+++ b/nnunetv2/training/nnUNetTrainer/variants/network_architecture/nnUNetTrainerNoDeepSupervision.py
@@ -62,7 +62,7 @@ def validation_step(self, batch: dict) -> dict:
         else:
             target = target.to(self.device, non_blocking=True)
 
-        self.optimizer.zero_grad()
+        self.optimizer.zero_grad(set_to_none=True)
 
         # Autocast is a little bitch.
         # If the device_type is 'cpu' then it's slow as heck and needs to be disabled.
@@ -74,7 +74,7 @@ def validation_step(self, batch: dict) -> dict:
             l = self.loss(output, target)
 
         # the following is needed for online evaluation. Fake dice (green line)
-        axes = [0] + list(range(2, len(output.shape)))
+        axes = [0] + list(range(2, output.ndim))
 
         if self.label_manager.has_regions:
             predicted_segmentation_onehot = (torch.sigmoid(output) > 0.5).long()
diff --git a/nnunetv2/utilities/dataset_name_id_conversion.py b/nnunetv2/utilities/dataset_name_id_conversion.py
index 1f2c35078..29ea58ab2 100644
--- a/nnunetv2/utilities/dataset_name_id_conversion.py
+++ b/nnunetv2/utilities/dataset_name_id_conversion.py
@@ -70,5 +70,5 @@ def maybe_convert_to_dataset_name(dataset_name_or_id: Union[int, str]) -> str:
         except ValueError:
             raise ValueError("dataset_name_or_id was a string and did not start with 'Dataset' so we tried to "
                              "convert it to a dataset ID (int). That failed, however. Please give an integer number "
-                             "('1', '2', etc) or a correct tast name. Your input: %s" % dataset_name_or_id)
-    return convert_id_to_dataset_name(dataset_name_or_id)
\ No newline at end of file
+                             f"('1', '2', etc) or a correct dataset name. Your input: {dataset_name_or_id}")
+    return convert_id_to_dataset_name(dataset_name_or_id)
diff --git a/nnunetv2/utilities/file_path_utilities.py b/nnunetv2/utilities/file_path_utilities.py
index 611f6e24d..a1c962265 100644
--- a/nnunetv2/utilities/file_path_utilities.py
+++ b/nnunetv2/utilities/file_path_utilities.py
@@ -39,10 +39,10 @@ def parse_dataset_trainer_plans_configuration_from_path(path: str):
         assert len(folders[:idx]) >= 2, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
                                         'DatasetXXX/MODULE__PLANS__CONFIGURATION for this to work'
         if folders[idx - 2].startswith('Dataset'):
-            splitted = folders[idx - 1].split('__')
-            assert len(splitted) == 3, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
+            split = folders[idx - 1].split('__')
+            assert len(split) == 3, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
                                         'DatasetXXX/MODULE__PLANS__CONFIGURATION for this to work'
-            return folders[idx - 2], *splitted
+            return folders[idx - 2], *split
     else:
         # we can only check for dataset followed by a string that is separable into three strings by splitting with '__'
         # look for DatasetXXX
@@ -51,10 +51,10 @@ def parse_dataset_trainer_plans_configuration_from_path(path: str):
             idx = dataset_folder.index(True)
             assert len(folders) >= (idx + 1), 'Bad path, cannot extract what I need. Your path needs to be at least ' \
                                         'DatasetXXX/MODULE__PLANS__CONFIGURATION for this to work'
-            splitted = folders[idx + 1].split('__')
-            assert len(splitted) == 3, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
+            split = folders[idx + 1].split('__')
+            assert len(split) == 3, 'Bad path, cannot extract what I need. Your path needs to be at least ' \
                                        'DatasetXXX/MODULE__PLANS__CONFIGURATION for this to work'
-            return folders[idx], *splitted
+            return folders[idx], *split
 
 
 def get_ensemble_name(model1_folder, model2_folder, folds: Tuple[int, ...]):
diff --git a/nnunetv2/utilities/get_network_from_plans.py b/nnunetv2/utilities/get_network_from_plans.py
index 447d1d5e9..1dd1dd2ec 100644
--- a/nnunetv2/utilities/get_network_from_plans.py
+++ b/nnunetv2/utilities/get_network_from_plans.py
@@ -49,7 +49,7 @@ def get_network_from_plans(plans_manager: PlansManager,
                                                               'is non-standard (maybe your own?). Yo\'ll have to dive ' \
                                                               'into either this ' \
                                                               'function (get_network_from_plans) or ' \
-                                                              'the init of your nnUNetModule to accomodate that.'
+                                                              'the init of your nnUNetModule to accommodate that.'
     network_class = mapping[segmentation_network_class_name]
 
     conv_or_blocks_per_stage = {
diff --git a/nnunetv2/utilities/json_export.py b/nnunetv2/utilities/json_export.py
index faed954f4..5ea463c27 100644
--- a/nnunetv2/utilities/json_export.py
+++ b/nnunetv2/utilities/json_export.py
@@ -18,7 +18,7 @@ def recursive_fix_for_json_export(my_dict: dict):
         if isinstance(my_dict[k], dict):
             recursive_fix_for_json_export(my_dict[k])
         elif isinstance(my_dict[k], np.ndarray):
-            assert len(my_dict[k].shape) == 1, 'only 1d arrays are supported'
+            assert my_dict[k].ndim == 1, 'only 1d arrays are supported'
             my_dict[k] = fix_types_iterable(my_dict[k], output_type=list)
         elif isinstance(my_dict[k], (np.bool_,)):
             my_dict[k] = bool(my_dict[k])
diff --git a/nnunetv2/utilities/label_handling/label_handling.py b/nnunetv2/utilities/label_handling/label_handling.py
index 333296d18..58b2513ce 100644
--- a/nnunetv2/utilities/label_handling/label_handling.py
+++ b/nnunetv2/utilities/label_handling/label_handling.py
@@ -50,7 +50,7 @@ def __init__(self, label_dict: dict, regions_class_order: Union[List[int], None]
 
     def _sanity_check(self, label_dict: dict):
         if not 'background' in label_dict.keys():
-            raise RuntimeError('Background label not declared (remeber that this should be label 0!)')
+            raise RuntimeError('Background label not declared (remember that this should be label 0!)')
         bg_label = label_dict['background']
         if isinstance(bg_label, (tuple, list)):
             raise RuntimeError(f"Background label must be 0. Not a list. Not a tuple. Your background label: {bg_label}")
@@ -157,7 +157,7 @@ def convert_probabilities_to_segmentation(self, predicted_probabilities: Union[n
             # check correct number of outputs
         assert predicted_probabilities.shape[0] == self.num_segmentation_heads, \
             f'unexpected number of channels in predicted_probabilities. Expected {self.num_segmentation_heads}, ' \
-            f'got {predicted_probabilities.shape[0]}. Remeber that predicted_probabilities should have shape ' \
+            f'got {predicted_probabilities.shape[0]}. Remember that predicted_probabilities should have shape ' \
             f'(c, x, y(, z)).'
 
         if self.has_regions:
diff --git a/nnunetv2/utilities/overlay_plots.py b/nnunetv2/utilities/overlay_plots.py
index 8b0b9d1ac..6f90a5ac9 100644
--- a/nnunetv2/utilities/overlay_plots.py
+++ b/nnunetv2/utilities/overlay_plots.py
@@ -65,9 +65,9 @@ def generate_overlay(input_image: np.ndarray, segmentation: np.ndarray, mapping:
     # create a copy of image
     image = np.copy(input_image)
 
-    if len(image.shape) == 2:
+    if image.ndim == 2:
         image = np.tile(image[:, :, None], (1, 1, 3))
-    elif len(image.shape) == 3:
+    elif image.ndim == 3:
         if image.shape[2] == 1:
             image = np.tile(image, (1, 1, 3))
         else:
@@ -136,10 +136,10 @@ def plot_overlay(image_file: str, segmentation_file: str, image_reader_writer: B
     seg, props_seg = image_reader_writer.read_seg(segmentation_file)
     seg = seg[0]
 
-    assert all([i == j for i, j in zip(image.shape, seg.shape)]), "image and seg do not have the same shape: %s, %s" % (
+    assert image.shape == seg.shape, "image and seg do not have the same shape: %s, %s" % (
         image_file, segmentation_file)
 
-    assert len(image.shape) == 3, 'only 3D images/segs are supported'
+    assert image.ndim == 3, 'only 3D images/segs are supported'
 
     selected_slice = select_slice_to_plot2(image, seg)
     # print(image.shape, selected_slice)
diff --git a/pyproject.toml b/pyproject.toml
index 0669ccf3d..8d1369414 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -1,6 +1,6 @@
 [project]
 name = "nnunetv2"
-version = "2.1.2"
+version = "2.2"
 requires-python = ">=3.9"
 description = "nnU-Net is a framework for out-of-the box image segmentation."
 readme = "readme.md"
@@ -72,8 +72,8 @@ nnUNetv2_ensemble = "nnunetv2.ensembling.ensemble:entry_point_ensemble_folders"
 nnUNetv2_accumulate_crossval_results = "nnunetv2.evaluation.find_best_configuration:accumulate_crossval_results_entry_point"
 nnUNetv2_plot_overlay_pngs = "nnunetv2.utilities.overlay_plots:entry_point_generate_overlay"
 nnUNetv2_download_pretrained_model_by_url = "nnunetv2.model_sharing.entry_points:download_by_url"
-nnUNetv2_install_pretrained_model_from_zip = "nnunetv2.model_sharing.entry_points:install_from_zip_entry_poin"
-nnUNetv2_export_model_to_zip = "nnunetv2.model_sharing.entry_points:export_pretrained_model_entr"
+nnUNetv2_install_pretrained_model_from_zip = "nnunetv2.model_sharing.entry_points:install_from_zip_entry_point"
+nnUNetv2_export_model_to_zip = "nnunetv2.model_sharing.entry_points:export_pretrained_model_entry"
 nnUNetv2_move_plans_between_datasets = "nnunetv2.experiment_planning.plans_for_pretraining.move_plans_between_datasets:entry_point_move_plans_between_datasets"
 nnUNetv2_evaluate_folder = "nnunetv2.evaluation.evaluate_predictions:evaluate_folder_entry_point"
 nnUNetv2_evaluate_simple = "nnunetv2.evaluation.evaluate_predictions:evaluate_simple_entry_point"
@@ -84,4 +84,9 @@ dev = [
     "black",
     "ruff",
     "pre-commit"
-]
\ No newline at end of file
+]
+
+[tool.codespell]
+skip = '.git,*.pdf,*.svg'
+#
+# ignore-words-list = ''
diff --git a/readme.md b/readme.md
index 1b6ba5f1f..d79c58c4b 100644
--- a/readme.md
+++ b/readme.md
@@ -96,9 +96,12 @@ Additional information:
 - [Extending nnU-Net](documentation/extending_nnunet.md)
 - [What is different in V2?](documentation/changelog.md)
 
+Competitions:
+- [AutoPET II](documentation/competitions/AutoPETII.md)
+
 [//]: # (- [Ignore label]&#40;documentation/ignore_label.md&#41;)
 
-## Where does nnU-net perform well and where does it not perform?
+## Where does nnU-Net perform well and where does it not perform?
 nnU-Net excels in segmentation problems that need to be solved by training from scratch, 
 for example: research applications that feature non-standard image modalities and input channels,
 challenge datasets from the biomedical domain, majority of 3D segmentation problems, etc . We have yet to find a