In-silico image generation through SDM for Nuclei Segmentation

This work extends "Semantic Image Synthesis via Diffusion Models" (SDM) to synthetic histopathological image generation.


Abstract

This repository provides a PyTorch implementation for generating synthetic H&E-stained images. It modifies the architecture of the semantic diffusion model (SDM) for style transfer, enabling the generation of synthetic images with diverse texture and structural information from the MoNuSeg dataset. Experimental results show that training segmentation models on this synthetic dataset improves performance, yielding a 1.8% increase in F1 score for full-sized datasets and a 7.5% improvement for smaller datasets.

Example Results

Prerequisites

  • Linux
  • Python 3
  • CPU or NVIDIA GPU + CUDA CuDNN

Setup

Data preparation

Download the data following the instructions in the MoNuSeg Dataset repository.

Steps

Data generation has four stages:

  1. Feature extraction & Clustering
  2. SDM Training
  3. SDM inference
  4. OOD filtering

Only the SDM training and inference stages are described in this section. The other stages are explained in their respective repositories.

Installation

Build the Docker image, then start a container from it as follows.

docker build -t semantic-diffusion-model -f Dockerfile .

docker run --gpus all --rm -it \
  -v "$PATH_TO_DATASET":/mnt/dataset \
  --name semantic-diffusion-model semantic-diffusion-model bash
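
With the `-v` bind mount, Docker creates a missing host directory, so a mistyped `$PATH_TO_DATASET` can go unnoticed and leave `/mnt/dataset` empty inside the container. The following is a minimal sketch of a check to run first; `check_dataset_dir` is a hypothetical helper and not part of this repository.

```shell
# Hypothetical helper (not part of this repository): fail early when the
# dataset directory is unset or missing, before the container is started.
check_dataset_dir() {
  dir="$1"
  if [ -z "$dir" ]; then
    echo "PATH_TO_DATASET is not set" >&2
    return 1
  fi
  if [ ! -d "$dir" ]; then
    echo "Dataset directory not found: $dir" >&2
    return 1
  fi
  echo "Using dataset directory: $dir"
}
```

Calling `check_dataset_dir "$PATH_TO_DATASET"` before the `docker run` command above stops with a non-zero status if the path is wrong.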

Alternatively, install the required packages manually, without Docker, using the requirements file.

Training

  • The model is first pretrained on the entire dataset:
OPENAI_LOGDIR='OUTPUT/SDM-MoNuSeg' \
OPENAI_LOG_FORMAT='stdout,log,csv,tensorboard' \
mpiexec -n 8 python image_train.py \
 --data_dir $training_data --dataset_mode monuseg --image_size 128 \
 --lr 1e-4 --batch_size 8  --attention_resolutions 32,16,8 --diffusion_steps 1000 \
 --learn_sigma True --noise_schedule cosine --num_channels 128 --num_head_channels 32 --num_res_blocks 2 \
 --resblock_updown True --use_fp16 True --use_scale_shift_norm True --use_checkpoint True \
 --num_classes 6 --class_cond False --use_hv_map True --use_col_map True --no_instance False \
 --save_interval 005000
  • The model is then fine-tuned on a single cluster:
OPENAI_LOGDIR='OUTPUT/SDM-MoNuSeg-clus6' \
OPENAI_LOG_FORMAT='stdout,log,csv,tensorboard' \
mpiexec -n 8 python image_train.py \
 --data_dir $training_data_clus6 --dataset_mode monuseg --image_size 128 \
 --lr 1e-4 --batch_size 8  --attention_resolutions 32,16,8 --diffusion_steps 1000 \
 --learn_sigma True --noise_schedule cosine --num_channels 128 --num_head_channels 32 --num_res_blocks 2 \
 --resblock_updown True --use_fp16 True --use_scale_shift_norm True --use_checkpoint True \
 --num_classes 6 --class_cond False --use_hv_map True --use_col_map True --no_instance False \
 --save_interval 005000 --drop_rate .2

Refer to the train.sh script for further details.
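
Both training commands launch 8 processes with `--batch_size 8`. Assuming the batch size applies per process, as in the guided-diffusion code base that SDM builds on (an assumption worth checking against `image_train.py`), the effective batch size is 8 × 8 = 64. The sketch below shows how one might keep that effective batch size on a machine with fewer GPUs; `per_process_batch` is a hypothetical helper.

```shell
# Assumption: --batch_size is per MPI process, so
# effective batch = number of processes * --batch_size.
per_process_batch() {
  global_batch="$1"
  num_procs="$2"
  if [ "$num_procs" -le 0 ] || [ $((global_batch % num_procs)) -ne 0 ]; then
    echo "batch of $global_batch cannot be split evenly over $num_procs processes" >&2
    return 1
  fi
  echo $((global_batch / num_procs))
}

# Effective batch of the commands above: mpiexec -n 8 with --batch_size 8.
global_batch=$((8 * 8))

# Value to pass as --batch_size when running with mpiexec -n 4 instead.
per_process_batch "$global_batch" 4
```

A larger per-process batch needs more GPU memory, so this trade is only possible when the device can hold it.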

Inference

The trained model is then used to generate synthetic images:

python image_sample.py \
--data_dir $training_data_clus6 --dataset_mode monuseg --use_train True --image_size 128  \
--shuffle_masks False --match_struct True --match_app False \
--attention_resolutions 32,16,8 --diffusion_steps 1000 \
--learn_sigma True --noise_schedule cosine --num_channels 128 --num_head_channels 32 --num_res_blocks 2 \
--resblock_updown True --use_fp16 True --use_scale_shift_norm True \
--num_classes 6 --class_cond False --use_hv_map True --use_col_map True --no_instance False \
--batch_size 16 --num_samples 1000 \
--model_path $model_path \
--results_path $out_dir --s 1.5

Refer to the sample_single.sh script for further details.
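
The architecture and diffusion flags passed to `image_sample.py` repeat those used for training, and a mismatch would likely prevent the checkpoint from loading. One way to keep them in sync is to define them once and reuse them. This is a sketch only: the variable names and `print_sample_cmd` are hypothetical, the flag values are copied from the commands above, and the command is printed rather than executed.

```shell
# Flags shared by training and sampling, copied from the commands above.
MODEL_FLAGS="--image_size 128 --attention_resolutions 32,16,8 --learn_sigma True \
--num_channels 128 --num_head_channels 32 --num_res_blocks 2 --resblock_updown True \
--use_fp16 True --use_scale_shift_norm True --num_classes 6 --class_cond False \
--use_hv_map True --use_col_map True --no_instance False"
DIFFUSION_FLAGS="--diffusion_steps 1000 --noise_schedule cosine"

# Flags that only apply to sampling.
SAMPLE_FLAGS="--use_train True --shuffle_masks False --match_struct True \
--match_app False --batch_size 16 --num_samples 1000 --s 1.5"

# Print (do not run) the sampling command so it can be inspected first.
# Arguments: data directory, checkpoint path, output directory.
print_sample_cmd() {
  printf '%s %s %s %s %s\n' \
    "python image_sample.py --data_dir $1 --dataset_mode monuseg" \
    "$MODEL_FLAGS" "$DIFFUSION_FLAGS" "$SAMPLE_FLAGS" \
    "--model_path $2 --results_path $3"
}
```

For example, `print_sample_cmd "$training_data_clus6" "$model_path" "$out_dir"` prints a single command line equivalent to the one above.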

Synthetic set generation

The final synthetic set is generated using multiple combinations of guidance scale (the s value) and training checkpoints saved at different iterations. Refer to the sample.sh script for further details.
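
The sweep over guidance scales and checkpoints can be sketched as a nested loop. This is not the content of `sample.sh`: `sweep_cmds` and `COMMON_FLAGS` are hypothetical names, the output directory naming is an assumption, and the commands are printed rather than run.

```shell
# Hypothetical helper: print one sampling command per (checkpoint, s value) pair.
# Arguments: output root, space-separated s values, then one or more checkpoint paths.
# All other image_sample.py flags are read from $COMMON_FLAGS (empty if unset).
sweep_cmds() {
  out_root="$1"
  s_values="$2"
  shift 2
  for ckpt in "$@"; do
    name="$(basename "$ckpt")"
    for s in $s_values; do
      # Assumed layout: one results directory per checkpoint and s value.
      printf 'python image_sample.py %s --model_path %s --results_path %s --s %s\n' \
        "${COMMON_FLAGS:-}" "$ckpt" "$out_root/${name%.*}_s$s" "$s"
    done
  done
}
```

Calling `sweep_cmds "$out_dir" "1.0 1.5 2.0" ckpt_a.pt ckpt_b.pt` would print six commands. The s values and checkpoint names in this call are placeholders, not the settings used for the published results.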

Citation

Please cite the original SDM paper if you find this work useful.