tf_example6 example

This example demonstrates how to quantize a model with the default user-facing APIs.

1. Installation

pip install -r requirements.txt

2. Prepare Dataset

The TensorFlow models repo provides scripts and instructions to download, process, and convert the ImageNet dataset to the TFRecord format. Related scripts are also available in the TF image_recognition example.

3. Download the FP32 model

wget https://storage.googleapis.com/intel-optimized-tensorflow/models/v1_6/mobilenet_v1_1.0_224_frozen.pb

4. Update the root of dataset in conf.yaml

The configuration creates a TopK metric function for evaluation and sets the batch size, instance number, and core number for performance measurement.

evaluation:                                          # optional. required if user doesn't provide eval_func in Quantization.
 accuracy:                                           # optional. required if user doesn't provide eval_func in Quantization.
    metric:
      topk: 1                                        # built-in metrics are topk, map, f1, allow user to register new metric.
    dataloader:
      batch_size: 32 
      dataset:
        ImageRecord:
          root: /path/to/imagenet/                   # NOTE: modify to evaluation dataset location if needed
      transform:
        BilinearImagenet: 
          height: 224
          width: 224

 performance:                                        # optional. used to benchmark the performance of the passed model.
    configs:
      cores_per_instance: 4
      num_of_instance: 7
    dataloader:
      batch_size: 1 
      last_batch: discard 
      dataset:
        ImageRecord:
          root: /path/to/imagenet/                   # NOTE: modify to evaluation dataset location if needed
      transform:
        ResizeCropImagenet: 
          height: 224
          width: 224
          mean_value: [123.68, 116.78, 103.94]
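To make the `topk: 1` setting concrete, here is a minimal pure-Python sketch of how a top-k accuracy could be computed. It is an illustration only, not the built-in metric that the configuration creates; the function name `topk_accuracy` and the sample scores are hypothetical.

```python
def topk_accuracy(scores, labels, k=1):
    """Fraction of samples whose true label is among the k highest scores."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not labels:
        return 0.0
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores in this row.
        topk = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in topk
    return hits / len(labels)


# Hypothetical per-class scores for three samples and their true labels.
scores = [
    [0.1, 0.7, 0.2],
    [0.5, 0.3, 0.2],
    [0.2, 0.3, 0.5],
]
labels = [1, 1, 2]

top1 = topk_accuracy(scores, labels, k=1)
top2 = topk_accuracy(scores, labels, k=2)
```

With `k=1` only the single highest-scoring class counts as a hit, which is the stricter measure selected by `topk: 1` in the configuration.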

5. Run Command

  • Run quantization
python test.py --tune
  • Run the benchmark. Make sure to benchmark the model only after tuning has finished.
python test.py --benchmark
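The two flags above could be handled by a small command-line entry point. The sketch below is an assumption about how such a script might be organized, not the actual contents of `test.py`; the helper names `parse_args` and `select_steps` are hypothetical. It only shows the flag parsing and the ordering constraint that tuning comes before benchmarking.

```python
import argparse


def parse_args(argv=None):
    """Parse the two flags used in the commands above."""
    parser = argparse.ArgumentParser(description="Quantize or benchmark a model.")
    parser.add_argument("--tune", action="store_true", help="run quantization")
    parser.add_argument("--benchmark", action="store_true", help="run the benchmark")
    return parser.parse_args(argv)


def select_steps(args):
    """Return the steps to run, always placing tuning before benchmarking."""
    steps = []
    if args.tune:
        steps.append("tune")
    if args.benchmark:
        steps.append("benchmark")
    return steps
```

For example, `select_steps(parse_args(["--benchmark", "--tune"]))` yields the steps in tuning-first order regardless of how the flags were written on the command line.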

6. Introduction

  • We only need to add the following lines for quantization to create an int8 model.
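    import tensorflow as tf
    from neural_compressor import Quantization
    # Build the quantizer from the YAML configuration.
    quantizer = Quantization('./conf.yaml')
    # Quantize the FP32 frozen graph.
    quantized_model = quantizer('./mobilenet_v1_1.0_224_frozen.pb')
    # Save the quantized graph as a binary protobuf named int8.pb.
    tf.io.write_graph(graph_or_graph_def=quantized_model,
                      logdir='./',
                      name='int8.pb',
                      as_text=False)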
  • Run the benchmark according to the config.
    # Optional: run the benchmark on the quantized model.
    from neural_compressor import Benchmark
    evaluator = Benchmark('./conf.yaml')
    results = evaluator('./int8.pb')