Aerial Image Labeling addresses a core topic in remote sensing: the automatic pixel-wise labelling of aerial imagery. UNet has inspired more advanced architectures for aerial image segmentation, and future updates will gradually bring those methods into this repository.
This repo uses only one sample (kitsap11.tif) from the public dataset (Inria Aerial Image Labelling) to demonstrate the power of deep learning.
Data processing steps and code can be found in this file. The original sample has been preprocessed to 1000x1000 pixels at 1.5 meter resolution.
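The resampling arithmetic behind that preprocessing step can be sketched as follows. This is a minimal sketch, assuming the source tile is 5000x5000 pixels at 0.3 meter resolution (the figures published for the Inria dataset). The function names and the Pillow-based helper are illustrative and are not taken from this repository's preprocessing code.

```python
def resampled_size(width, height, src_res_m, dst_res_m):
    """Return the (width, height) in pixels after changing the ground resolution."""
    scale = src_res_m / dst_res_m
    return round(width * scale), round(height * scale)


def resample_pair(rgb_path, gt_path, size):
    """Resize an RGB tile and its mask (hypothetical helper, requires Pillow)."""
    from PIL import Image  # imported lazily so the arithmetic above has no dependencies

    rgb = Image.open(rgb_path).resize(size, Image.BILINEAR)
    # Nearest-neighbour keeps the mask strictly binary (no interpolated grey values).
    gt = Image.open(gt_path).resize(size, Image.NEAREST)
    return rgb, gt


if __name__ == "__main__":
    print(resampled_size(5000, 5000, 0.3, 1.5))  # (1000, 1000)
```

Using nearest-neighbour resampling for the ground truth is a design choice of this sketch: it avoids introducing intermediate grey values into a binary building mask.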
- Anaconda: your data science toolkit.
- Miniconda: a free minimal installer for conda. It is a small, bootstrap version of Anaconda that includes only conda, Python, the packages they depend on, and a small number of other useful packages.
Create a new virtual environment with 'conda' or 'pyenv' and install the following Python packages:
- numpy==1.19.2
- matplotlib==3.3.2
- pillow==8.1.0
- pytorch==1.7.1
- torchsummary==1.5.1
- torchvision==0.8.2
- tqdm==4.51.0
- opencv-python==4.5.1
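Once the environment is set up, a quick check that the installed versions match the pins can save debugging time later. The sketch below is an illustration only and is not part of this repository. It covers a subset of the list above and assumes Python 3.8 or newer; the name `check_versions` is hypothetical.

```python
from importlib import metadata  # standard library, Python 3.8+

# Pins copied from the list above. "torch" is the pip distribution name for
# PyTorch (an assumption: the list above uses the conda package name "pytorch").
PINNED = {
    "numpy": "1.19.2",
    "matplotlib": "3.3.2",
    "torch": "1.7.1",
    "torchsummary": "1.5.1",
    "torchvision": "0.8.2",
    "tqdm": "4.51.0",
}


def check_versions(pinned):
    """Map each package name to 'ok', 'missing', or 'found <version>'."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        # Drop local build suffixes such as "+cu110" before comparing.
        base = found.split("+")[0]
        report[name] = "ok" if base == wanted else f"found {found}"
    return report


if __name__ == "__main__":
    for package, status in check_versions(PINNED).items():
        print(f"{package}: {status}")
```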
You can start the app with its GUI using the following command:
python app_gui.py
# or pythonw GUI_DL_AIM.py # on MacOS
The GUI is built with the Gooey package on top of the original Python argparse settings.
- Data IO: input and output file and paths.
- Model Options: Time, Date, GPU, Pretrained model, Epochs, Batch Size, Learning Rate, Regularization.
- Cancel/Start buttons to execute the program.
Note: When the number of epochs is 0, the program loads the pre-trained model and only predicts the masks, without training.
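The rule in the note above can be sketched with plain argparse, which is the layer Gooey wraps. This is a minimal illustration, not the repository's code: `build_parser` and `select_mode` are hypothetical names, only `--epochs` and `--batch_size` are reproduced, and the batch size default is an assumption.

```python
import argparse


def build_parser():
    """Build a parser with a small subset of the training options."""
    parser = argparse.ArgumentParser(
        description="Aerial Image Labelling with Deep Learning."
    )
    parser.add_argument("--epochs", type=int, default=1, help="epoch number")
    # The default below is hypothetical; the real default is not stated in this README.
    parser.add_argument("--batch_size", type=int, default=4, help="batch size")
    return parser


def select_mode(epochs):
    """Return 'predict' when no training is requested, otherwise 'train'."""
    if epochs < 0:
        raise ValueError("epochs must be non-negative")
    return "predict" if epochs == 0 else "train"


# Parse an explicit argument list so the example does not depend on sys.argv.
args = build_parser().parse_args(["--epochs", "0"])
print(select_mode(args.epochs))  # predict
```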
I tested this repo on my XPS 15 (Intel i7 CPU, 1050 Ti Max-Q GPU with 4 GB RAM, CUDA 11). **It is fair to say the GPU is at least 10 times faster than the CPU.**
Here are the experimental training records on the sample image pair:
| Metric | 5 Epochs | 100 Epochs | 200 Epochs |
|---|---|---|---|
| Accuracy | 0.801270 | 0.937253 | 0.963028 |
| Sensitivity | 0.811900 | 0.860760 | 0.686358 |
| Precision | 0.260489 | 0.570543 | 0.820489 |
| Specificity | 0.800349 | 0.943879 | 0.986993 |
| Fmeasure | 0.394429 | 0.686229 | 0.747454 |
| MCC | 0.383756 | 0.670022 | 0.731054 |
| Dice | 0.394429 | 0.686229 | 0.747454 |
| IoU (Jaccard) | 0.245663 | 0.522335 | 0.596748 |
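For reference, the metrics in this table can all be derived from the four confusion-matrix counts of a binary mask. The sketch below uses the standard textbook definitions in plain Python. It is not the repository's metric code, and the function names are illustrative.

```python
import math


def confusion_counts(pred, truth):
    """Count TP, FP, FN, TN over two equal-length sequences of 0/1 pixel labels."""
    tp = fp = fn = tn = 0
    for p, t in zip(pred, truth):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif t:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Standard binary-segmentation metrics; assumes every denominator is non-zero."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "Accuracy": (tp + tn) / (tp + fp + fn + tn),
        "Sensitivity": sensitivity,
        "Precision": precision,
        "Specificity": tn / (tn + fp),
        "Fmeasure": 2 * precision * sensitivity / (precision + sensitivity),
        "MCC": (tp * tn - fp * fn) / mcc_den,
        "Dice": 2 * tp / (2 * tp + fp + fn),
        "IoU": tp / (tp + fp + fn),
    }
```

For a binary mask, Dice and F-measure are algebraically the same quantity, which is why those two rows are identical. IoU also follows from Dice as IoU = Dice / (2 - Dice); for example, 0.747454 / (2 - 0.747454) is approximately 0.5967, in line with the 200-epoch column.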
$ python train.py --epochs 1 # 1 (default) or 200
Or add arguments for different model configurations.
$ python train.py -h
usage: AIL: Aerial Image Labelling [options]
Aerial Image Labelling with Deep Learning.
optional arguments:
-h, --help show this help message and exit
--input_RGB Input RGB Images:
string of RGB image file path
--input_GT INPUT_GT string of Ground Truce (GT image file path
--output_model_path OUTPUT_MODEL_PATH
--output_loss_plot OUTPUT_LOSS_PLOT
--output_images OUTPUT_IMAGES
string of output image file path
--version, -v show program's version number and exit
--use_gpu USE_GPU Use GPU in Traning the Model (default: False)
--use_pretrain USE_PRETRAIN
Use Pre-Trained ConvNets in Traning the Model (default: True)
--tile_size TILE_SIZE TILE_SIZE
input tile size
--epochs EPOCHS epoch number
--batch_size BATCH_SIZE
batch size (?, channel, width, height)
--learning_rate LEARNING_RATE
model training learning rate
--weight_decay WEIGHT_DECAY
model weight decay / l2 regularization
And that's how AI can help
Output
Epoch 1/200: 100%|██████████████████████| 223/223 [01:53<00:00, 1.96it/s, loss=0.653383]
Epoch 2/200: 100%|██████████████████████| 223/223 [01:47<00:00, 2.07it/s, loss=0.461838]
Epoch 3/200: 100%|██████████████████████| 223/223 [01:53<00:00, 1.97it/s, loss=0.445231]
... ...
(i) Model saved at ./weights/model.pt
(i) Loss plot saved at ./images/output/loss_plot.png
Use the training data as input to test the model.
$ python predict.py
# python predict.py -h
Output
(base) ➜ aerial-image-segmentation git:(master) ✗ python predict.py
GPU is not available and using CPU instead.
use pre-trained model!
Image Labelling Complete in 0m 6s
(i) Prediction and Mask image saved at output//prediction.png
(ii) Mask image saved at output//mask.png
================================================================
[Metric Computation]
Accuracy => 0.963028
Sensitivity => 0.686358
Precision => 0.820489
Specificity => 0.986993
Fmeasure => 0.747454
MCC => 0.731054
Dice => 0.747454
IoU (Jacard) => 0.596748
----------------------------------------------------------------
model start: 22:16:56 end: 22:17:03.
RGB, GT, Mask, Prediction:
Binary with mask:
This is a binary mask. As you can see, extra work is needed to improve the results. The improvements can come from the deep learning model or the image post-processing method.
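As one illustration of what image post-processing could look like, the sketch below removes small isolated foreground regions from a binary mask. This technique is an assumption chosen for illustration and is not what this repository currently does. The function name `remove_small_regions` and the `min_size` parameter are hypothetical, and the mask is assumed to be a non-empty list of rows of 0/1 values.

```python
from collections import deque


def remove_small_regions(mask, min_size):
    """Return a copy of a 0/1 mask without 4-connected foreground regions
    that contain fewer than min_size pixels."""
    height, width = len(mask), len(mask[0])
    cleaned = [row[:] for row in mask]
    seen = [[False] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if not mask[r][c] or seen[r][c]:
                continue
            # Breadth-first search collects one connected region.
            region, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < height and 0 <= nx < width
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            # Erase the region if it is too small to be a plausible building.
            if len(region) < min_size:
                for y, x in region:
                    cleaned[y][x] = 0
    return cleaned
```

On a full 1000x1000 prediction, a NumPy or OpenCV implementation would be considerably faster; the pure-Python version is only meant to show the idea.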
After adding the UNet model:
(rs) ➜ aerial-image-segmentation git:(master) ✗ python predict.py
GPU is not available and using CPU instead.
use pretrained model!
Image Labelling Complete in 0m 15s
(i) Prediction and Mask image saved at output//prediction.png
(ii) Mask image saved at output//mask.png
================================================================
[Metric Computation]
Accuracy => 0.976881
Sensitivity => 0.816324
Precision => 0.884739
Specificity => 0.990788
Fmeasure => 0.849156
MCC => 0.837453
Dice => 0.849156
IoU (Jacard) => 0.737855
----------------------------------------------------------------
model start: 18:13:42 end: 18:13:57.
- Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, Liang-Chieh Chen, Yukun Zhu, George Papandreou, Florian Schroff, and Hartwig Adam, arXiv: 1802.02611, 2018.
- Xception: Deep Learning with Depthwise Separable Convolutions, François Chollet, Proc. of CVPR, 2017.
- Deformable Convolutional Networks — COCO Detection and Segmentation Challenge 2017 Entry, Haozhi Qi, Zheng Zhang, Bin Xiao, Han Hu, Bowen Cheng, Yichen Wei, and Jifeng Dai, ICCV COCO Challenge Workshop, 2017.
- Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs, Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille, Proc. of ICLR, 2015.
- Deeplab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs, Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille, TPAMI, 2017.
- Rethinking Atrous Convolution for Semantic Image Segmentation, Liang-Chieh Chen, George Papandreou, Florian Schroff, and Hartwig Adam, arXiv:1706.05587, 2017.
This repository is based on the original repository by Roman Roibu. Improvements and changes have been made to keep up with the rapid development of the deep learning community.