This repository contains the code for the paper "MAFER: a Multi-resolution Approach to Facial Expression Recognition" by Fabio Valerio Massoli (ISTI - CNR), Donato Cafarelli (ISTI - CNR), Claudio Gennaro (ISTI - CNR), Giuseppe Amato (ISTI - CNR), and Fabrizio Falchi (ISTI - CNR).
We propose a multi-resolution two-step training procedure for Deep Learning models tasked with the Facial Expression Recognition (FER) objective. We demonstrate the benefits of this approach by extensively testing our models on several publicly available datasets.
Please note: We are researchers, not a software company, and have no personnel devoted to documenting and maintaining this research code. Therefore this code is offered "AS IS". Exact reproduction of the numbers in the paper depends on exact reproduction of many factors, including the version of all software dependencies and the choice of underlying hardware (GPU model, etc.). You should therefore expect to re-tune your hyperparameters slightly for your own setup.
The image below shows the multi-resolution training phase that represents the first step of our learning procedure for FER.
The confusion matrices below report the performance of our models on the Oulu-CASIA dataset. The quoted numbers are the accuracies, 10-fold averaged, for each expression class.
Below we report the t-SNE embedding of deep representations produced by the base model (leftmost) and by the models trained with a multi-resolution training.
For more details, look at our paper: "MAFER: a Multi-resolution Approach to Facial Expression Recognition"
Before running the code, make sure that your system has the required packages installed. They are listed in the requirements.txt file.
python main_fer_rafdb.py -tr -bp <path_to_base_model> -dn <dataset_name> -df <path_to_dataset>
The base_model in this case refers to the original Se-ResNet-50 (model here) from Cao et al. (paper, github repo).
To test the models we use two different scripts. The first tests models on a single test set (fer2013 and rafdb datasets); the second tests models over k folds (oulucasia dataset).
To test models on the fer2013 or the rafdb datasets:
python test_model_fer_raf.py -bp <base_model_ckp> -ck <model_ckp> -dn <dataset_name> -df <dataset_folder> -bs <batch_size>
To test models on the oulucasia dataset:
python test_model_oulu.py -bp <base_model_ckp> -ck <models_main_folder> -dn <dataset_name> -df <dataset_folder> -bs <batch_size>
All models' checkpoints are available here.
FER2013
Model | Accuracy (%) |
---|---|
base | 60.82 |
CR | 73.06 |
CR-Simplified | 72.33 |
CR+AffWild2 | 73.45 |
RAF-DB
Model | Overall Acc. (%) | Average Acc. (%) |
---|---|---|
base | 77.09 | 65.39 ± .10 |
CR | 88.43 | 81.90 ± .04 |
CR-Simplified | 88.14 | 83.16 ± .03 |
CR+AffWild2 | 88.07 | 82.40 ± .04 |
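
The RAF-DB table reports two metrics. As a minimal sketch, assuming the common convention that overall accuracy is the fraction of correctly classified samples and average accuracy is the mean of the per-class accuracies, the two can be computed as follows. The function name and the toy labels are hypothetical and are not taken from this repository, and the sketch does not reproduce the ± values in the table.

```python
from collections import defaultdict


def overall_and_average_accuracy(y_true, y_pred):
    """Return (overall accuracy, mean per-class accuracy) as fractions in [0, 1].

    Assumption: "average accuracy" means the unweighted mean of the
    per-class accuracies, which is a common convention but is not
    defined in this README.
    """
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correct predictions per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    overall = sum(correct.values()) / len(y_true)
    per_class = [correct[c] / total[c] for c in total]
    average = sum(per_class) / len(per_class)
    return overall, average


# Toy, imbalanced example (hypothetical labels): class 0 has four samples,
# class 1 has two, and one sample of class 1 is misclassified.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 1, 0]
overall, average = overall_and_average_accuracy(y_true, y_pred)
print(f"overall: {overall:.4f}, average: {average:.4f}")
```

On imbalanced data the two metrics diverge, which is why both are usually quoted: a single error in a small class lowers the average accuracy more than the overall one.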
Oulu-CASIA
The linked folders contain the checkpoints of all 10 models (one per fold).
Model | Acc. 10-fold avg. (%) |
---|---|
base | 59.84 ± 1.29 |
CR | 98.40 ± .11 |
CR-Simplified | 96.72 ± .24 |
CR+AffWild2 | 98.95 ± .15 |
The table below reports the arrays for the mean values that we use in our study to center the input data.
Dataset | Mean |
---|---|
FER2013 | [133.05986, 133.05986, 133.05986] |
RAF-DB | [102.15835, 114.51117, 146.58075] |
Oulu-CASIA | [131.07828, 131.07828, 131.07828] |
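
As an illustration of how these mean arrays can be used, here is a minimal NumPy sketch of per-channel centering. It is not the repository's preprocessing code: the function name `center_image`, the dictionary keys, and the dummy image are assumptions made for the example. The channel order of the image must match the order of the mean array; that order is not stated here, so check it against the data-loading code.

```python
import numpy as np

# Mean arrays copied from the table above. The keys are hypothetical
# labels chosen for this example.
DATASET_MEANS = {
    "fer2013": np.array([133.05986, 133.05986, 133.05986], dtype=np.float32),
    "rafdb": np.array([102.15835, 114.51117, 146.58075], dtype=np.float32),
    "oulucasia": np.array([131.07828, 131.07828, 131.07828], dtype=np.float32),
}


def center_image(image, dataset_name):
    """Subtract the per-channel dataset mean from an (H, W, 3) image.

    The image is assumed to hold raw pixel values in [0, 255] and to use
    the same channel order as the mean array.
    """
    image = np.asarray(image, dtype=np.float32)
    if image.ndim != 3 or image.shape[-1] != 3:
        raise ValueError("expected an image of shape (H, W, 3)")
    # Broadcasting subtracts one mean value per channel.
    return image - DATASET_MEANS[dataset_name]


# Dummy 4x4 image with every pixel set to 128 (hypothetical input).
dummy = np.full((4, 4, 3), 128, dtype=np.uint8)
centered = center_image(dummy, "rafdb")
print(centered[0, 0])
```

Converting to `float32` before subtracting avoids the wrap-around that would occur with unsigned 8-bit pixel values.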
For all the details about the training procedure and the experimental results, please have a look at the paper.
To cite our work, please use the following form:
@article{massoli2021mafer,
title={MAFER: a Multi-resolution Approach to Facial Expression Recognition},
author={Massoli, Fabio Valerio and Cafarelli, Donato and Gennaro, Claudio and Amato, Giuseppe and Falchi, Fabrizio},
journal={arXiv preprint arXiv:2105.02481},
year={2021}
}
If you have any questions about our work, please contact Dr. Fabio Valerio Massoli.
Have fun! :-D