Classification of 365 scene categories by fine-tuning a Vision Transformer on the Places365-Standard dataset.
The Places365 dataset contains 365 scene categories. Below are a few example scenes.
Labels | Images |
---|---|
Baseball Field | |
Balcony Interior | |
Embassy | |
Fire Escape | |
Kitchen | |
Lake Natural | |
Skyscraper | |
Office Cubicles | |
Reception | |
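The labels above are prettified versions of the raw Places365 category strings, which in the official `categories_places365.txt` file look like `/b/baseball_field` or `/b/balcony/interior`. A minimal sketch of the conversion (the function name is illustrative, not part of this repo):

```python
def pretty_label(raw: str) -> str:
    """Convert a raw Places365 category string such as '/b/baseball_field'
    into a human-readable label such as 'Baseball Field'."""
    name = raw[3:]  # drop the leading '/x/' letter prefix
    # Nested categories use '/' (e.g. '/l/lake/natural'); words use '_'.
    return name.replace("/", " ").replace("_", " ").title()

print(pretty_label("/b/baseball_field"))   # Baseball Field
print(pretty_label("/l/lake/natural"))     # Lake Natural
```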
The Vision Transformer model was fine-tuned on the Places365 dataset with the following hyperparameters:
Hyperparameter | Value |
---|---|
Batch Size | 32 |
Learning Rate | 2e-4 |
Optimizer | AdamW |
No. of Epochs | 5 |
Evaluate Validation After | 5000 batches |
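The hyperparameters above map onto a standard PyTorch training loop. A minimal sketch, with a dummy linear layer standing in for the actual ViT (whose classification head would be resized to 365 classes); the function and loader names are illustrative:

```python
import torch
from torch import nn

# Stand-in for the Vision Transformer; the real run fine-tunes a ViT
# with a 365-way classification head.
model = nn.Linear(768, 365)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
criterion = nn.CrossEntropyLoss()

NUM_EPOCHS = 5
EVAL_EVERY = 5000  # run validation every 5000 training batches

def train(loader, validate):
    step = 0
    for epoch in range(NUM_EPOCHS):
        for images, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
            step += 1
            if step % EVAL_EVERY == 0:
                validate()
```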
The metrics were logged using Weights & Biases. This specific run can be found here.
The trained Vision Transformer model was evaluated on the Places365 test dataset and obtained the following results:
Metric | Value (%) |
---|---|
AUROC | 98.90 |
Accuracy Top 5 | 83.52 |
Accuracy Top 1 | 52.47 |
F1-Score | 51.71 |
Precision | 52.70 |
Recall | 52.47 |
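Top-1 and Top-5 accuracy differ only in how many of the highest-scoring classes are checked against the ground-truth label, which is why Top-5 (83.52%) is much higher than Top-1 (52.47%). A minimal sketch in plain Python (names illustrative):

```python
def topk_accuracy(logits, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores.

    logits: list of per-class score lists, one per sample
    labels: list of true class indices
    """
    correct = 0
    for scores, label in zip(logits, labels):
        # Indices of the k largest scores for this sample.
        topk = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
        correct += label in topk
    return correct / len(labels)

logits = [[0.1, 0.7, 0.2], [0.5, 0.3, 0.2]]
labels = [2, 0]
print(topk_accuracy(logits, labels, k=1))  # 0.5
print(topk_accuracy(logits, labels, k=2))  # 1.0
```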
- Download the Places365 Standard dataset.
- Install the requirements from requirements.txt.
- Update the following paths:
  - DATASET_TRAIN_PATH : Path of the Places365-Standard training dataset.
  - DATASET_TEST_PATH : Path of the Places365-Standard validation dataset.
  - DATASET_MAPPINGS_PATH : Path to store the dataset mappings for the train and test datasets.
  - WANDB_PATH : Path to initialize Weights & Biases runs.
- Run preprocess_dataset.py to create a mapping of images.
- Train the model by running the train.py script.
- Evaluate the model on the test dataset by running the evaluate_test.py script.
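The path constants in step 3 are module-level settings; a sketch with placeholder values (the actual locations depend on where you downloaded the dataset):

```python
# Placeholder locations; replace with the paths on your machine.
DATASET_TRAIN_PATH = "/data/places365_standard/train"
DATASET_TEST_PATH = "/data/places365_standard/val"
DATASET_MAPPINGS_PATH = "/data/places365_standard/mappings"
WANDB_PATH = "/data/wandb_runs"
```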