This is a C++ implementation of the training module from the article "Boosting Sex Identification Performance": http://www.cs.cmu.edu/~har/iaai2005.pdf
The datasets consist of male and female face images in 3 different sizes:
- 16x16 - about 1136 images
- 20x20 - about 4505 images
- 24x24 - about 1139 images
File: xxsamples00.vec, where xx is 16, 20 or 24
- 1 byte - image width (w)
- 1 byte - image height (h)
Followed by image records until the end of the file, each consisting of:
- w*h bytes - image (rows stored from first to last)
- 1 byte - gender (1 for man, 255 for woman)
All samples are normalized to mean = 128 and variance = 64.
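The *.vec layout described above can be read with a few lines of C++. The following is only a minimal sketch, not the code used by this project: the names Sample, Dataset and load_vec are hypothetical, and it assumes that the file contains nothing besides the 2-byte header and complete records of w*h + 1 bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical in-memory representation of one record.
struct Sample {
    std::vector<std::uint8_t> pixels;  // w*h bytes, first row to last row
    bool is_man = false;               // label byte: 1 = man, 255 = woman
};

struct Dataset {
    int width = 0;
    int height = 0;
    std::vector<Sample> samples;
};

// Reads a *.vec file: 1 byte width, 1 byte height, then records until EOF.
Dataset load_vec(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);

    char header[2];
    if (!in.read(header, 2)) throw std::runtime_error("missing header");

    Dataset ds;
    ds.width = static_cast<unsigned char>(header[0]);
    ds.height = static_cast<unsigned char>(header[1]);
    const std::size_t n = static_cast<std::size_t>(ds.width) * ds.height;
    if (n == 0) throw std::runtime_error("zero-sized images");

    // One record = n pixel bytes followed by 1 label byte.
    std::vector<char> record(n + 1);
    while (in.read(record.data(), static_cast<std::streamsize>(record.size()))) {
        Sample s;
        s.pixels.assign(record.begin(),
                        record.begin() + static_cast<std::ptrdiff_t>(n));
        const unsigned char label = static_cast<unsigned char>(record[n]);
        if (label == 1) {
            s.is_man = true;
        } else if (label == 255) {
            s.is_man = false;
        } else {
            throw std::runtime_error("unexpected gender label");
        }
        ds.samples.push_back(std::move(s));
    }
    return ds;
}
```

A trailing partial record is silently ignored by this sketch; a stricter reader could treat it as an error.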
To train a model: *.exe --train 1 --data_filename "16samples00.vec" --percent_trainset 0.8 --num_classifiers 20 --save_period 10
To evaluate the accuracy of a trained model: *.exe --data_filename "16samples00.vec" --model_filename "model.bin" --percent_trainset 0.8
--train <1 or 0> 1 - train a model, 0 - evaluate the accuracy of a trained model (default)
--data_filename File with images in *.vec format
--model_filename Filename of the trained model to evaluate
--percent_trainset Fraction of the data used as the train set. Should be in [0.5; 1] for training and in [0; 1] for evaluation (default - 0.0 (0%))
--percent_classifiers Fraction of classifiers to examine in every iteration. Should be from 0.01 to 1 (default - 0.8 (80%))
--num_classifiers Number of classifiers to train. Should be from 1 to 1000.
--save_period Save trained classifiers every 'period' iterations (default - 100)
--balanced <1 or 0> Set to 1 to use balanced train data (default - 0)
Model file name: model_HxW_NC_PTS_B_PC.bin
- H, W - height and width of training samples
- NC - number of weak classifiers
- PTS - percent of the dataset used for training
- B - whether the train set is balanced or unbalanced
- PC - percent of weak classifiers chosen at random from all classifiers on every step
Format of model_HxW_NC_PTS_B_PC.bin:
- 4 bytes - Int - number of weak classifiers
Then, repeated once per weak classifier:
- 4 bytes - Int - first pixel row
- 4 bytes - Int - first pixel column
- 4 bytes - Int - second pixel row
- 4 bytes - Int - second pixel column
- 8 bytes - Double - log(1/beta)
- 4 bytes - Int - feature type of weak classifier
- 4 bytes - Int - feature parity of weak classifier
- 8 bytes - Double - accuracy on train set after adding this weak classifier
- 8 bytes - Double - accuracy on test set after adding this weak classifier
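The model layout listed above can be parsed field by field. This is a hedged sketch rather than the project's own loader: WeakClassifier, read_raw and load_model are hypothetical names, and it assumes the file was written on a machine with the same byte order as the reader, with 32-bit integers and 64-bit IEEE doubles (48 bytes per weak classifier).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical struct mirroring one weak classifier record.
struct WeakClassifier {
    std::int32_t row1 = 0, col1 = 0;  // first pixel
    std::int32_t row2 = 0, col2 = 0;  // second pixel
    double alpha = 0.0;               // log(1/beta)
    std::int32_t feature_type = 0;
    std::int32_t parity = 0;
    double train_accuracy = 0.0;      // after adding this weak classifier
    double test_accuracy = 0.0;       // after adding this weak classifier
};

// Reads sizeof(T) raw bytes in host byte order (an assumption about the file).
template <typename T>
T read_raw(std::istream& in) {
    char buf[sizeof(T)];
    if (!in.read(buf, sizeof(T))) {
        throw std::runtime_error("unexpected end of model file");
    }
    T value;
    std::memcpy(&value, buf, sizeof(T));
    return value;
}

std::vector<WeakClassifier> load_model(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    if (!in) throw std::runtime_error("cannot open " + path);

    const std::int32_t count = read_raw<std::int32_t>(in);
    if (count < 0) throw std::runtime_error("negative classifier count");

    std::vector<WeakClassifier> model;
    model.reserve(static_cast<std::size_t>(count));
    for (std::int32_t i = 0; i < count; ++i) {
        WeakClassifier wc;
        // Fields are read in the order given in the format description.
        wc.row1 = read_raw<std::int32_t>(in);
        wc.col1 = read_raw<std::int32_t>(in);
        wc.row2 = read_raw<std::int32_t>(in);
        wc.col2 = read_raw<std::int32_t>(in);
        wc.alpha = read_raw<double>(in);
        wc.feature_type = read_raw<std::int32_t>(in);
        wc.parity = read_raw<std::int32_t>(in);
        wc.train_accuracy = read_raw<double>(in);
        wc.test_accuracy = read_raw<double>(in);
        model.push_back(wc);
    }
    return model;
}
```

Reading each field separately avoids depending on compiler struct padding, at the cost of a few more read calls.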