This example demonstrates how to use the default user-facing APIs to quantize a model.
```shell
pip install -r requirements.txt
```
The TensorFlow models repo provides scripts and instructions to download, process, and convert the ImageNet dataset to the TFRecord format. We also provide related scripts in the TF image_recognition example.
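To verify the conversion, you can iterate over a couple of records with plain TensorFlow. This is a minimal sketch, assuming the records live under `/path/to/imagenet/` with the usual `validation-*` file naming; adjust both to your setup:

```python
import tensorflow as tf

# Assumed location and file pattern; change to match your converted dataset.
files = tf.io.gfile.glob('/path/to/imagenet/validation-*')

dataset = tf.data.TFRecordDataset(files)
for raw_record in dataset.take(2):
    example = tf.train.Example()
    example.ParseFromString(raw_record.numpy())
    # Print a few feature keys to confirm the records parse correctly.
    print(sorted(example.features.feature.keys())[:5])
```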
```shell
wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_6/mobilenet_v1_1.0_224_frozen.pb
```
The configuration below creates a TopK metric function for evaluation and configures the batch size, instance number, and core number for performance measurement; a short sketch of the top-1 computation follows the config.
```yaml
evaluation:                            # optional. required if user doesn't provide eval_func in Quantization.
  accuracy:                            # optional. required if user doesn't provide eval_func in Quantization.
    metric:
      topk: 1                          # built-in metrics are topk, map, f1; users can register new metrics.
    dataloader:
      batch_size: 32
      dataset:
        ImageRecord:
          root: /path/to/imagenet/     # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet:
          height: 224
          width: 224

  performance:                         # optional. used to benchmark performance of the passing model.
    configs:
      cores_per_instance: 4
      num_of_instance: 7
    dataloader:
      batch_size: 1
      last_batch: discard
      dataset:
        ImageRecord:
          root: /path/to/imagenet/     # NOTE: modify to evaluation dataset location if needed
      transform:
        ResizeCropImagenet:
          height: 224
          width: 224
          mean_value: [123.68, 116.78, 103.94]
```
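For reference, `topk: 1` in the metric section is plain top-1 accuracy. Here is a minimal NumPy sketch of that computation, illustrative only and not the library's implementation:

```python
import numpy as np

def top1_accuracy(logits, labels):
    """Fraction of samples whose argmax prediction matches the label."""
    preds = np.argmax(logits, axis=1)
    return float(np.mean(preds == labels))

# Toy usage: 3 samples, 4 classes.
logits = np.array([[0.1, 0.7, 0.1, 0.1],
                   [0.5, 0.2, 0.2, 0.1],
                   [0.1, 0.1, 0.1, 0.7]])
labels = np.array([1, 0, 2])
print(top1_accuracy(logits, labels))  # 2 of 3 correct -> ~0.667
```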
- Run quantization:

```shell
python test.py --tune
```
- Run benchmark. Make sure to benchmark the model only after tuning has produced the int8 model:

```shell
python test.py --benchmark
```
- Only the following lines need to be added to quantize the model and create an int8 model.
```python
import tensorflow as tf
from neural_compressor import Quantization

quantizer = Quantization('./conf.yaml')
quantized_model = quantizer('./mobilenet_v1_1.0_224_frozen.pb')

# Save the quantized graph for later benchmarking.
tf.io.write_graph(graph_or_graph_def=quantized_model,
                  logdir='./',
                  name='int8.pb',
                  as_text=False)
```
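If you prefer to drive evaluation from code rather than the `evaluation` section of conf.yaml, the config comments indicate an `eval_func` can be supplied to `Quantization`. A hedged sketch, assuming the quantizer call accepts an `eval_func` keyword that takes the candidate model and returns a scalar accuracy:

```python
# Sketch only: `my_eval` is a hypothetical user-defined function, and the
# eval_func keyword is assumed from the conf.yaml comments above.
def my_eval(model):
    # Run your own validation loop here and return a scalar accuracy.
    return 0.0

quantized_model = quantizer('./mobilenet_v1_1.0_224_frozen.pb',
                            eval_func=my_eval)
```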
- Run the benchmark according to the config:
```python
# Optional: run the benchmark on the saved int8 model.
from neural_compressor import Benchmark

evaluator = Benchmark('./conf.yaml')
results = evaluator('./int8.pb')
```
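To double-check the saved graph outside the benchmark harness, you can load it with vanilla TensorFlow and run a single inference. This is a minimal sketch; the tensor names `input:0` and `MobilenetV1/Predictions/Reshape_1:0` are the usual ones for this frozen MobileNet V1 graph but should be verified against your model:

```python
import numpy as np
import tensorflow as tf

# Load the serialized quantized graph.
graph_def = tf.compat.v1.GraphDef()
with tf.io.gfile.GFile('./int8.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())

graph = tf.Graph()
with graph.as_default():
    tf.import_graph_def(graph_def, name='')

# Assumed tensor names for the MobileNet V1 frozen graph; check
# [n.name for n in graph_def.node] if they differ.
with tf.compat.v1.Session(graph=graph) as sess:
    dummy = np.random.rand(1, 224, 224, 3).astype(np.float32)
    out = sess.run('MobilenetV1/Predictions/Reshape_1:0',
                   feed_dict={'input:0': dummy})
    print(out.shape)  # expect (1, 1001): ImageNet classes + background
```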