This project is a framework for benchmarking several state-of-the-art synthetic image detection models.
✨ 12/3/2024: Integrated the DeFake model
Create a new environment named `pytorch_env`:

```
conda create --name pytorch_env python=3.11
conda activate pytorch_env
```

and then install the PyTorch dependencies:

```
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
```
Note ✨: This framework has been tested with Python `3.11.7` and PyTorch `2.1.2`. However, it should work with other versions as well. Please ensure that your PyTorch version is greater than 2.0. ⚠️
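To confirm your environment meets these requirements, you can run a quick check inside the activated environment (a minimal sketch, not part of the framework):

```python
# Quick sanity check for the Python/PyTorch requirements above.
import sys
import torch

print(f"Python  : {sys.version.split()[0]}")
print(f"PyTorch : {torch.__version__}")
print(f"CUDA available: {torch.cuda.is_available()}")

# The framework expects a PyTorch version greater than 2.0
assert int(torch.__version__.split(".")[0]) >= 2, "PyTorch should be > 2.0"
```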
Install additional dependencies:

```
pip install -r requirements.txt
```
To run the Dire model, mpi4py is required. You can install it using Conda with the following command:

```
conda install -c conda-forge mpi4py mpich
```
The following models have been integrated. Note that for some of these models there are multiple pretrained instances, trained on images generated by different generative models, such as ProGAN, StyleGAN, and Latent Diffusion.
| Model Name | Paper Title | Original Code |
|---|---|---|
| CNNDetect | CNN-generated images are surprisingly easy to spot...for now | 🔗 |
| DIMD | On the detection of synthetic images generated by diffusion models | 🔗 |
| FreqDetect | Leveraging Frequency Analysis for Deep Fake Image Recognition | 🔗 |
| Fusing | Fusing global and local features for generalized AI-synthesized image detection | 🔗 |
| GramNet | Global Texture Enhancement for Fake Face Detection in the Wild | 🔗 |
| LGrad | Learning on Gradients: Generalized Artifacts Representation for GAN-Generated Images Detection | 🔗 |
| Dire | DIRE for Diffusion-Generated Image Detection | 🔗 |
| UnivFD | Towards Universal Fake Image Detectors that Generalize Across Generative Models | 🔗 |
| NPR | Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection | 🔗 |
| PatchCraft | PatchCraft: Exploring Texture Patch for Efficient AI-generated Image Detection | 🔗 |
| DeFake | DE-FAKE: Detection and Attribution of Fake Images Generated by Text-to-Image Generation Models | 🔗 |
| Rine | Leveraging Representations from Intermediate Encoder-blocks for Synthetic Image Detection | 🔗 |
To run the model on a directory containing images, use the following command:

```
python test.py --dataPath <root_path_to_images>
```
This command executes the model using the default selection, which is UnivFD. If you wish to use a different model, you can specify it using the `--modelName` flag. For example, to use the CNNDetect model, the command would be:

```
python test.py --dataPath <root_path_to_images> --modelName=CNNDetect
```
The models supported by this framework are listed in the table above. When selecting a model using the `--modelName` flag, ensure you use one of the valid names as specified below. These names correspond to the models' implementations and must be used exactly as shown to ensure proper function invocation:

```python
VALID_MODELS = ['CNNDetect', 'FreqDetect', 'Fusing', 'GramNet', 'LGrad', 'UnivFD', 'PatchCraft', 'Rine', 'DIMD', 'NPR', 'Dire']
```
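If you script around `test.py`, a small guard can catch typos in the model name before invocation (an illustrative helper, not part of the framework's CLI):

```python
# Hypothetical helper that validates a --modelName value against the list above.
VALID_MODELS = ['CNNDetect', 'FreqDetect', 'Fusing', 'GramNet', 'LGrad',
                'UnivFD', 'PatchCraft', 'Rine', 'DIMD', 'NPR', 'Dire']

def check_model_name(name: str) -> str:
    if name not in VALID_MODELS:
        raise ValueError(f"Unknown model '{name}'. Valid options: {VALID_MODELS}")
    return name
```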
You also need to define the path to the pretrained weights with the `--cptk` flag. Ensure you replace `<path_to_pretrained_weights>` with the actual file path to your pretrained model weights.

```
python test.py --dataPath <root_path_to_images> --modelName=CNNDetect --cptk <path_to_pretrained_weights>
```
Replace `<root_path_to_images>` with the actual path to your directory of images, and `<path_to_pretrained_weights>` with the path to the pretrained weights file.
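To benchmark several detectors over the same directory, a small driver script can loop over model/checkpoint pairs (a minimal sketch; the checkpoint paths are examples taken from the weights table below, so adjust them to the files you actually downloaded):

```python
# Illustrative driver that invokes test.py once per model/checkpoint pair.
import subprocess

RUNS = {
    "UnivFD":    "./weights/univfd/fc_weights.pth",
    "CNNDetect": "./weights/cnndetect/blur_jpg_prob0.1.pth",
    "DIMD":      "./weights/dimd/corvi22_progan_model.pth",
}

DATA_PATH = "/path/to/images"  # replace with your image directory

for model_name, ckpt in RUNS.items():
    cmd = ["python", "test.py",
           "--dataPath", DATA_PATH,
           f"--modelName={model_name}",
           "--cptk", ckpt]
    print("Running:", " ".join(cmd))
    subprocess.run(cmd, check=True)
```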
You can download the pretrained weights here: Google Drive
| Model Name | Pretrained Weights File Name | Trained On |
|---|---|---|
| CNNDetect | weights/cnndetect/blur_jpg_prob0.1.pth | ProGAN, augmented (recompressed) with 10% probability |
| | weights/cnndetect/blur_jpg_prob0.5.pth | ProGAN, augmented (recompressed) with 50% probability |
| DIMD | weights/dimd/corvi22_latent_model.pth | Latent Diffusion |
| | weights/dimd/corvi22_progan_model.pth | ProGAN |
| | weights/dimd/gandetection_resnet50nodown_stylegan2.pth | StyleGAN2 images |
| Dire | weights/dire/lsun_adm.pth | ADM (diffusion) |
| | weights/dire/lsun_iddpm.pth | iDDPM (Improved Denoising Diffusion Probabilistic Models) |
| | weights/dire/lsun_pndm.pth | PNDM (Pseudo Numerical Methods for Diffusion Models) |
| | weights/dire/lsun_stylegan.pth | StyleGAN |
| FreqDetect | weights/freqdetect/DCTAnalysis.pth | |
| UnivFD | weights/univfd/fc_weights.pth | ProGAN |
| Fusing | weights/fusing/PSM.pth | |
| GramNet | weights/gramnet/Gram.pth | |
| LGrad | weights/lgrad/LGrad-1class-Trainon-Progan_horse.pth | ProGAN with one-class images |
| | weights/lgrad/LGrad-2class-Trainon-Progan_chair_horse.pth | ProGAN with two-class images |
| | weights/lgrad/LGrad-4class-Trainon-Progan_car_cat_chair_horse.pth | ProGAN with four-class images |
| NPR | weights/npr/NPR.pth | |
| DeFake | weights/defake/clip_linear.pth | Hybrid detector (image + text) trained on diffusion images |
| Rine | weights/rine/model_1class_trainable.pth | ProGAN with one-class images |
| | weights/rine/model_2class_trainable.pth | ProGAN with two-class images |
| | weights/rine/model_4class_trainable.pth | ProGAN with four-class images |
| | weights/rine/model_ldm_trainable.pth | Latent Diffusion with one-class images |
| PatchCraft | weights/rptc/RPTC.pth | ProGAN |
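If you place the downloaded `weights/` directory in the root of the framework, a short script can confirm that the files you plan to use are in place (an illustrative sketch; list only the checkpoints you downloaded):

```python
# Illustrative check that the pretrained weight files listed above exist.
from pathlib import Path

EXPECTED = [
    "weights/univfd/fc_weights.pth",
    "weights/cnndetect/blur_jpg_prob0.1.pth",
    "weights/dimd/corvi22_progan_model.pth",
]

missing = [p for p in EXPECTED if not Path(p).is_file()]
print("All weight files found." if not missing else f"Missing: {missing}")
```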
Some models require additional parameters to be defined.
FreqDetect requires two additional files, which are specified by the flags `--dctMean` and `--dctVar`. By default, these are set to `./weights/freqdetect/dct_mean` and `./weights/freqdetect/dct_var`, respectively. If you have downloaded the weights directory and placed it in the root directory of the framework, these parameters can remain unchanged.
LGrad requires the initialization of a StyleGAN discriminator, which is used to extract image gradients that serve as image features. To specify the path to the pretrained discriminator, use the flag `--LGradGenerativeModelPath`. The default pretrained weights are provided in the file `karras2019stylegan-bedrooms-256x256_discriminator.pth`, located within the `./weights/preprocessing` directory.
Dire requires the initialization of a diffusion generative model that extracts image features. To specify the path to the pretrained diffusion model, use the flag `--DireGenerativeModelPath`. The default pretrained weights are provided in the file `lsun_bedroom.pt`, located within the `./weights/preprocessing` directory.
DeFake requires the initialization of a fine-tuned CLIP encoder (`--defakeClipEncoderPath=./weights/defake/finetune_clip.pt`) and a BLIP decoder for the generation of image captions (`--defakeBlipPath=./weights/defake/model_base_capfilt_large.pth`).
To save the predictions, specify an output file using the `--predictionsFile` flag. For example:

```
python test.py --dataPath <root_path_to_images> --predictionsFile <path_to_output_file>
```
To resize images, use the `--resizeSize` flag followed by the desired dimension. If no resize size is specified, the default behavior is to apply no resizing. For example, to resize images to 256x256 pixels, you would use `--resizeSize=256`.
Important Note on Resizing: The impact of resizing on the results cannot be overstated. For certain models, resizing can significantly improve outcomes, while for others, it may detract from performance. This effect is closely tied to the original resolution of the input images. Specifically, in the case of high-resolution images, resizing becomes a crucial factor to consider. For a deeper insight into how resizing affects model performance, please refer to the Evaluation Section.
To crop images, use the `--cropSize` flag similarly. The default crop size is 256 pixels, meaning that if no crop size is specified, images will be cropped to 256x256 pixels by default. For example, to crop images to 256x256 pixels, use `--cropSize=256`.
⚠️ Important Note: For models such as `UnivFD` and `Rine`, which are based on CLIP, the input size must be set to 224x224 pixels due to CLIP's specific input size requirements. Therefore, use `--resizeSize=224` for these models.
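For intuition, the two flags correspond to standard torchvision-style preprocessing steps; the sketch below approximates their effect (an illustration under that assumption, not the framework's exact pipeline, and `example.jpg` is a hypothetical input):

```python
# Approximate effect of --resizeSize / --cropSize using torchvision transforms.
from PIL import Image
from torchvision import transforms

img = Image.open("example.jpg").convert("RGB")  # hypothetical input image

resize_256 = transforms.Resize((256, 256))  # --resizeSize=256: rescales the pixels
crop_256 = transforms.CenterCrop(256)       # --cropSize=256: cuts out a 256x256 window
resize_224 = transforms.Resize((224, 224))  # --resizeSize=224: for CLIP-based models

print(resize_256(img).size, crop_256(img).size, resize_224(img).size)
```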
You can run evaluations and save the results by executing the following command:

```
python validate.py --modelName=UnivFD --ckpt=./weights/univfd/fc_weights.pth --resultFolder=results/
```
This command uses the `UnivFD` model with its corresponding pretrained weights located at `./weights/univfd/fc_weights.pth`, and saves the results in the `results/` folder.
You can utilize the same list of models and pretrained weights, adjusting the `--modelName` and `--ckpt` flags as needed. Similarly, you can modify the cropping and resizing behavior according to your requirements.
In addition, you can use the following flags to further process the images:

- `--jpegQuality`: Applies JPEG recompression with the defined quality level, e.g. 95, 90, 50.
- `--gaussianSigma`: Applies Gaussian blurring with the defined sigma value, e.g. 1, 2, 4.

These flags allow for additional image transformations during evaluation, potentially influencing the model's performance based on your specific experimental setup.
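These two perturbations correspond roughly to the following image operations (a sketch using PIL for illustration, not the framework's exact implementation; `example.jpg` is a hypothetical input):

```python
# Rough equivalents of --jpegQuality and --gaussianSigma, for intuition only.
import io
from PIL import Image, ImageFilter

img = Image.open("example.jpg").convert("RGB")  # hypothetical input image

# --jpegQuality=90: re-encode as JPEG at the given quality and decode again
buf = io.BytesIO()
img.save(buf, format="JPEG", quality=90)
recompressed = Image.open(io.BytesIO(buf.getvalue()))

# --gaussianSigma=2: blur with a Gaussian kernel (PIL's radius acts as sigma here)
blurred = img.filter(ImageFilter.GaussianBlur(radius=2))
```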
To specify the input paths for your evaluation, use the following options:

- `--realPath`: Defines the path to the directory containing real images.
- `--fakePath`: Defines the path to the directory containing fake (generated) images.
For enhanced post-processing convenience, consider including the following flags when generating metrics files:

- `--generativeModel`: Specifies the generative model used to create the fake images, e.g. ProGAN.
- `--family`: Denotes the family or category of the generative model, e.g. GAN.
By incorporating these flags, you can enrich your metrics files with valuable context, making subsequent analysis and interpretation of results more straightforward and insightful.
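For example, if the metrics files are tabular (CSV is assumed here, and the file and column names below are hypothetical), the added context makes per-family aggregation a one-liner:

```python
# Illustrative post-processing of a metrics file enriched with the
# --generativeModel / --family context (file and column names are assumptions).
import pandas as pd

df = pd.read_csv("results/metrics.csv")
print(df.groupby("family")["accuracy"].mean())            # e.g. mean accuracy per family
print(df.groupby("generative_model")["accuracy"].mean())  # e.g. per generative model
```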
Alternatively, if you prefer a single input path, you can use the `--dataPath` flag. However, ensure that this directory contains subdirectories named `0_real` and `1_fake` for organizing real and fake images, respectively. The data path can contain multiple `0_real` and `1_fake` subdirectories.
⚠️ Note: If your subdirectories are generated using different generative models and you wish to analyze the results separately for each model, be aware that the evaluation process aggregates all input from the directories under `--dataPath`. This means you won't be able to automatically distinguish results by generative algorithm based solely on the subdirectory structure.
To conduct evaluations across various generative algorithms and accurately monitor the performance of each, it is recommended to bypass the `--realPath`, `--fakePath`, and `--dataPath` flags. Instead, make the modifications directly in the dataset configuration file: edit `dataset/dataset_paths.py` to specify the paths to your datasets, along with information about the generative model and its family.
For example:

```python
DATASET_PATHS = [
    dict(
        real_path='/path/to/real/images/for/this/dataset',
        fake_path='/path/to/images/generated/by/biggan',
        source='wang2020',
        family='gan',
        generative_model='biggan'
    ),
    dict(
        real_path='/path/to/real/images/for/this/dataset',
        fake_path='/path/to/images/generated/by/cyclegan',
        source='wang2020',
        family='gan',
        generative_model='cyclegan'
    ),
    ...
]
```
`source` is a reference to the source of the dataset. In the above example, `wang2020` refers to the dataset used for training and testing the CNNDetect method.
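Before launching an evaluation, it can help to verify that every entry points to existing directories (an illustrative sketch; run it from the root of the framework):

```python
# Illustrative sanity check for the DATASET_PATHS entries defined above.
import os
from dataset.dataset_paths import DATASET_PATHS

for entry in DATASET_PATHS:
    for key in ("real_path", "fake_path"):
        if not os.path.isdir(entry[key]):
            print(f"Missing directory for {entry['generative_model']}: {entry[key]}")
```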
If you use our framework in your research, please cite our paper:
Schinas, M., & Papadopoulos, S. (2024). SIDBench: A Python Framework for Reliably Assessing Synthetic Image Detection Methods. MAD '24, June 10, 2024, Phuket, Thailand. arXiv preprint arXiv:2404.18552.
Manos Schinas ([email protected])
Symeon (Akis) Papadopoulos ([email protected])