
The classification of remote sensing scene images using differentiable parameterised transformer network (i.e., spatial transformer network), second-order statistics of local CNN features (i.e., Bilinear CNN, covariance matrix, SPD matrix, Riemannian manifold, Stiefel manifold, etc.), and the irreducible invariant tensor representation of the fi…


a14en9/Remote-Sensing-Image-Classification


RTN, MG-CAP and IDCCP

This repository re-implements the main scripts of three papers. Their common core is to summarise the second-order statistics of local CNN features in order to progressively improve the classification of remote sensing scene images. The repository includes the following frameworks:

"Recurrent Transformer Networks for Remote Sensing Scene Categorisation"-PDF:

"Multi-Granularity Canonical Appearance Pooling for Remote Sensing Scene Classification"-PDF:

"Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization"-PDF:
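As a rough illustration of what "second-order statistics of local CNN features" means, the sketch below computes a covariance matrix from a handful of local feature vectors. It is a plain-Python toy under stated assumptions, not the pooling code of any of the three papers: the function name `covariance_pooling` is hypothetical, and dividing by N rather than N - 1 is an arbitrary choice.

```python
def covariance_pooling(features):
    """Return the D x D covariance matrix of N local feature vectors.

    `features` is a list of N vectors of length D, e.g. the H*W spatial
    positions of a CNN feature map, each with D channels.
    """
    n = len(features)
    d = len(features[0])
    # First-order statistic: the per-channel mean.
    mean = [sum(f[k] for f in features) / n for k in range(d)]
    # Centre every local feature around the mean.
    centred = [[f[k] - mean[k] for k in range(d)] for f in features]
    # Second-order statistic: average outer product of the centred features.
    # Dividing by n (rather than n - 1) is an assumption of this sketch.
    return [
        [sum(c[i] * c[j] for c in centred) / n for j in range(d)]
        for i in range(d)
    ]


# Three local features with two channels each.
cov = covariance_pooling([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

The result is a symmetric D x D matrix whose size depends only on the number of channels, not on the number of spatial positions.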

Dependencies

  • Python == 3.6
  • TensorFlow == 1.10.0

Install the dependencies with:

pip install -r requirements.txt

Data Preparation

Download Datasets

The experimental datasets can be downloaded from:

  • AID: A Benchmark Dataset for Performance Evaluation of Aerial Scene Classification
  • UC Merced Land Use Dataset
  • NWPU-RESISC45
  • OPTIMAL-31

Please cite their original papers if you are going to use these datasets.

Data Preprocessing

Place the downloaded datasets in their corresponding directories under ../datasets/ and unzip them. Then use create_tfrecord.py in the utils folder to split and process the data; the script includes an example for AID. Please note that validation_size is set to 0.5001 because railwaystation_7.jpg fails to encode on my side.
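The idea of splitting by a validation fraction can be sketched as follows. This is a minimal standard-library illustration, not the logic of create_tfrecord.py: the function `split_filenames`, its `seed` argument and the file names are all hypothetical, and only the `validation_size` parameter name comes from the text above.

```python
import random


def split_filenames(filenames, validation_size, seed=0):
    """Shuffle `filenames` and split them into (training, validation) lists.

    `validation_size` is the fraction of files held out for validation.
    """
    if not 0.0 <= validation_size <= 1.0:
        raise ValueError("validation_size must be between 0 and 1")
    shuffled = sorted(filenames)            # fixed starting order
    random.Random(seed).shuffle(shuffled)   # reproducible shuffle
    num_validation = int(validation_size * len(shuffled))
    return shuffled[num_validation:], shuffled[:num_validation]


# Hypothetical file names, for illustration only.
files = ["scene_{}.jpg".format(i) for i in range(8)]
train, val = split_filenames(files, validation_size=0.25)
```

Because the validation count is truncated to an integer, a fraction slightly above or below one half can shift the boundary by one image.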

Please download the pretrained VGG 16 and ResNet V1 50 checkpoints from TensorFlow-Slim and place them in the net folder. You then need to either generate three sets of VGG16 weights, one per scale (named vgg_16_1, vgg_16_2 and vgg_16_3 in the example scripts), or call it three times during training.
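One possible reading of the step above is that the same pretrained VGG16 weights are reused under three scope names. The sketch below only shows the name bookkeeping this would need, assuming the checkpoint stores its variables under a `vgg_16/` prefix; the helpers `checkpoint_name` and `restore_map` are hypothetical and do not come from the repository.

```python
def checkpoint_name(variable_name, scope, checkpoint_scope="vgg_16"):
    """Map a variable name under `scope` to its name in the original checkpoint."""
    prefix = scope + "/"
    if not variable_name.startswith(prefix):
        raise ValueError("{!r} is not under scope {!r}".format(variable_name, scope))
    return checkpoint_scope + "/" + variable_name[len(prefix):]


def restore_map(variable_names, scope):
    """Build {checkpoint name: variable name} for every variable in one scope."""
    prefix = scope + "/"
    return {
        checkpoint_name(name, scope): name
        for name in variable_names
        if name.startswith(prefix)
    }


# Hypothetical variable names for the three scale-specific networks.
names = [
    "vgg_16_1/conv1/conv1_1/weights",
    "vgg_16_2/conv1/conv1_1/weights",
    "vgg_16_3/conv1/conv1_1/weights",
]
for scope in ("vgg_16_1", "vgg_16_2", "vgg_16_3"):
    print(scope, restore_map(names, scope))
```

In TensorFlow 1.x, a dictionary of this shape with the variable objects as values can be passed as var_list to tf.train.Saver, so one saver per scope could restore the same checkpoint into each of the three networks.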

Training and Test

Example training and testing code can be found in models/. For simplicity, each model shows only one dataset partitioning scenario; the remaining experiments are expected to be similar. Before training the MG-CAP model, locate the _SelfAdjointEigV2Grad function in TensorFlow's linalg_grad.py file and insert the following code just before grad_a is calculated (only in the if statements). This avoids NaNs when computing the gradient of a degenerate covariance matrix during backpropagation:

# Replacing NaNs and inf with 0s
  f = tf.where(tf.is_nan(f), tf.zeros_like(f), f)
  f = tf.where(tf.is_inf(f), tf.zeros_like(f), f)
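To see why such a patch can help, assume that f holds the reciprocals of the pairwise eigenvalue gaps, as in the usual gradient of a symmetric eigendecomposition. A repeated eigenvalue then gives a zero gap, whose reciprocal is infinite, and multiplying that by zero later yields NaN. The plain-Python sketch below illustrates this under that assumption; it is not TensorFlow's code, and both function names are hypothetical.

```python
import math


def reciprocal_gaps(eigenvalues):
    """F[i][j] = 1 / (e[j] - e[i]) off the diagonal, 0 on the diagonal.

    A zero gap is mapped to inf, as IEEE floating-point division would do.
    """
    n = len(eigenvalues)
    f = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            gap = eigenvalues[j] - eigenvalues[i]
            f[i][j] = math.inf if gap == 0 else 1.0 / gap
    return f


def zero_non_finite(matrix):
    """Replace NaN and +/-inf entries with 0, like the two tf.where calls."""
    return [[x if math.isfinite(x) else 0.0 for x in row] for row in matrix]


eigenvalues = [1.0, 1.0, 3.0]   # the first two eigenvalues are repeated
f = reciprocal_gaps(eigenvalues)
safe_f = zero_non_finite(f)
print(f[0][1], safe_f[0][1])    # the infinite entry is replaced by zero
```

Zeroing these entries discards the gradient contribution between degenerate eigenvalue pairs instead of letting it corrupt the whole update.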

To Do

Experimental hyperparameters such as cropping percentages and more details are coming soon!

Acknowledgements

Some related repos include:

Thanks to the original authors for their work!

Citation

If you find the work useful, please kindly cite the following papers:

@inproceedings{chen2018recurrent,
        title={Recurrent Transformer Networks for Remote Sensing Scene Categorisation},
        author={Chen, Zan and Wang, Shidong and Hou, Xingsong and Shao, Ling},
        booktitle={BMVC},
        pages={266},
        year={2018}
      }
@article{wang2020multi,
        title={Multi-granularity canonical appearance pooling for remote sensing scene classification},
        author={Wang, Shidong and Guan, Yu and Shao, Ling},
        journal={IEEE Transactions on Image Processing},
        volume={29},
        pages={5396--5407},
        year={2020},
        publisher={IEEE}
      }
@article{wang2020invariant,
        title={Invariant Deep Compressible Covariance Pooling for Aerial Scene Categorization},
        author={Wang, Shidong and Ren, Yi and Parr, Gerard and Guan, Yu and Shao, Ling},
        journal={IEEE Transactions on Geoscience and Remote Sensing},
        year={2020},
        publisher={IEEE}
      }
