Scene Classification of 365 Scenes by fine-tuning Vision Transformer on Places365 Standard dataset

kishanmurthy/scene-recognition

Scene Recognition

Classification of 365 Scenes by fine-tuning Vision Transformer on Places365 Standard dataset.

Example Scenes

The Places365 dataset covers 365 scene categories. Below are a few example scenes.

| Image | Label |
|---|---|
| baseball_field_00000517 | Baseball Field |
| balcony-interior_00003593 | Balcony Interior |
| embassy_00002301 | Embassy |
| fire_escape_00000768 | Fire Escape |
| kitchen_00003271 | Kitchen |
| lake-natural_00004559 | Lake Natural |
| skyscraper_00004143 | Skyscraper |
| office_cubicles_00000716 | Office Cubicles |
| reception_00002637 | Reception |

Training

The Vision Transformer model was fine-tuned on the Places365 dataset with the following hyperparameters:

| Hyperparameter | Value |
|---|---|
| Batch Size | 32 |
| Learning Rate | 2e-4 |
| Optimizer | AdamW |
| No. of Epochs | 5 |
| Evaluate Validation after | 5000 batches |
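
To get a feel for what this schedule implies, the sketch below works out how many batches and validation passes one epoch would involve. It is only an illustration: the `TrainConfig` class and helper names are hypothetical, the training-set size of 1,803,460 images is an assumption taken from the published Places365-Standard statistics rather than from this repository, and it assumes validation runs each time another 5000 batches have been processed, with the counter restarting every epoch.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters copied from the table above (the class itself is hypothetical)."""
    batch_size: int = 32
    learning_rate: float = 2e-4
    optimizer: str = "AdamW"
    epochs: int = 5
    eval_every_n_batches: int = 5000


def batches_per_epoch(num_images: int, cfg: TrainConfig) -> int:
    """Number of batches in one epoch, counting a final partial batch."""
    return math.ceil(num_images / cfg.batch_size)


def evals_per_epoch(num_images: int, cfg: TrainConfig) -> int:
    """Validation passes per epoch, assuming the batch counter restarts each epoch."""
    return batches_per_epoch(num_images, cfg) // cfg.eval_every_n_batches


# ASSUMPTION: size of the Places365-Standard training split; not stated in this README.
ASSUMED_TRAIN_IMAGES = 1_803_460

cfg = TrainConfig()
steps = batches_per_epoch(ASSUMED_TRAIN_IMAGES, cfg)
print("batches per epoch:", steps)
print("validation passes per epoch:", evals_per_epoch(ASSUMED_TRAIN_IMAGES, cfg))
print("total batches over all epochs:", steps * cfg.epochs)
```

Under these assumptions one epoch is about 56 thousand batches, so validation would run roughly 11 times per epoch.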

The metrics were logged with Weights & Biases. This specific run can be found here.

Evaluation on the Test Dataset

The trained Vision Transformer model was evaluated on the Places365 test dataset, with the following results:

| Metric | Value (%) |
|---|---|
| AUROC | 98.90 |
| Accuracy Top 5 | 83.52 |
| Accuracy Top 1 | 52.47 |
| F1-Score | 51.71 |
| Precision | 52.70 |
| Recall | 52.47 |
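
For readers unfamiliar with these metrics, the following is a minimal pure-Python sketch of how top-k accuracy and class-averaged precision, recall, and F1 can be computed from per-class scores. It is not the repository's evaluation code: the function names and the toy data are hypothetical, macro averaging is an assumption, and AUROC is left out. Note that on a class-balanced evaluation set macro recall equals top-1 accuracy, which is consistent with the identical 52.47 values reported above.

```python
from collections import Counter


def top_k_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, true_label in zip(scores, labels):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += true_label in ranked[:k]
    return hits / len(labels)


def macro_precision_recall_f1(preds, labels, num_classes):
    """Unweighted mean of the per-class precision, recall, and F1."""
    true_pos = Counter(p for p, t in zip(preds, labels) if p == t)
    pred_count = Counter(preds)
    true_count = Counter(labels)
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        p = true_pos[c] / pred_count[c] if pred_count[c] else 0.0
        r = true_pos[c] / true_count[c] if true_count[c] else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    n = num_classes
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


# Toy data (hypothetical): 3 classes, 2 samples per class, one score row per sample.
labels = [0, 0, 1, 1, 2, 2]
scores = [
    [0.70, 0.20, 0.10],
    [0.20, 0.50, 0.30],
    [0.15, 0.80, 0.05],
    [0.50, 0.40, 0.10],
    [0.10, 0.30, 0.60],
    [0.20, 0.10, 0.70],
]
# Top-1 prediction is the index of the highest score in each row.
preds = [max(range(len(row)), key=lambda c: row[c]) for row in scores]

top1 = top_k_accuracy(scores, labels, k=1)
top2 = top_k_accuracy(scores, labels, k=2)
precision, recall, f1 = macro_precision_recall_f1(preds, labels, num_classes=3)
print(f"top-1={top1:.4f} top-2={top2:.4f}")
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

Averaging F1 per class, as done here, can yield a value below both the averaged precision and the averaged recall, a pattern that also appears in the table above.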

How to Run

  1. Download the Places365 Standard dataset.
  2. Install the requirements from requirements.txt.
  3. Update the following paths:
    • DATASET_TRAIN_PATH : Path of Places365 Standard training dataset.
    • DATASET_TEST_PATH : Path of Places365 Standard validation dataset.
    • DATASET_MAPPINGS_PATH : Path to store the dataset mappings for train and test datasets.
    • WANDB_PATH : Path to initialize Weights & Biases runs.
  4. Run preprocess_dataset.py to create a mapping of images.
  5. Train the model by running the train.py script.
  6. Evaluate the model on the test dataset by running the evaluate_test.py script.
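
The mapping step (step 4) is not described further in this README. As a rough illustration only, a step of this kind might walk a dataset directory in which each sub-folder holds one scene class and write an image-to-label table under DATASET_MAPPINGS_PATH. The folder layout, the CSV columns, and the function name `build_mapping` are all assumptions; the repository's preprocess_dataset.py may work differently.

```python
import csv
from pathlib import Path

# ASSUMPTION: images are identified by these file extensions.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_mapping(dataset_root: Path, output_csv: Path) -> int:
    """Write one CSV row per image (relative path, label, label index); return the row count.

    Hypothetical helper: assumes every direct sub-folder of dataset_root is one class.
    """
    class_dirs = sorted(p for p in dataset_root.iterdir() if p.is_dir())
    rows = []
    for label_index, class_dir in enumerate(class_dirs):
        for image_path in sorted(class_dir.rglob("*")):
            if image_path.suffix.lower() in IMAGE_SUFFIXES:
                relative = image_path.relative_to(dataset_root).as_posix()
                rows.append((relative, class_dir.name, label_index))

    # Create the output folder if needed, then write a header plus all rows.
    output_csv.parent.mkdir(parents=True, exist_ok=True)
    with output_csv.open("w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["image_path", "label", "label_index"])
        writer.writerows(rows)
    return len(rows)
```

It could be called once per split, for example `build_mapping(Path(DATASET_TRAIN_PATH), Path(DATASET_MAPPINGS_PATH) / "train.csv")`, where the output file name is again hypothetical.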
