Feature/azaytsev/changes from baychub colin q2 (openvinotoolkit#6437)
* Q2 changes

* Changed Convert_RNNT.md

Co-authored-by: baychub <[email protected]>
andrew-zaytsev and baychub authored Jun 29, 2021
1 parent cca5778 commit af2fec9
Showing 7 changed files with 105 additions and 97 deletions.
18 changes: 9 additions & 9 deletions docs/IE_DG/Int8Inference.md
@@ -17,25 +17,25 @@ Low-precision 8-bit inference is optimized for:

## Introduction

A great deal of research in deep learning has explored the use of low-precision computation during inference to speed up deep learning pipelines and achieve higher performance. For example, one popular approach is to shrink the precision of activation and weight values from `fp32` to smaller formats such as `fp11` or `int8`. For more information about this approach, refer to the
**Brief History of Lower Precision in Deep Learning** section in [this whitepaper](https://software.intel.com/en-us/articles/lower-numerical-precision-deep-learning-inference-and-training).

8-bit computation (referred to as `int8`) offers better performance than inference in higher precision (for example, `fp32`) because it allows loading more data into a single processor instruction. The usual cost of this significant boost is reduced accuracy. However, it has been proven that the accuracy drop can be negligible and depends on the task requirements, so an application engineer can configure the maximum accuracy drop that is acceptable.


Let's explore the quantized [TensorFlow* implementation of ResNet-50](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/resnet-50-tf) model. Use the [Model Downloader](@ref omz_tools_downloader) tool to download the `fp16` model from the [OpenVINO™ Toolkit - Open Model Zoo repository](https://github.com/openvinotoolkit/open_model_zoo):
```sh
cd $INTEL_OPENVINO_DIR/deployment_tools/tools/model_downloader
./downloader.py --name resnet-50-tf --precisions FP16-INT8 --output_dir <your_model_directory>
```
After that, quantize the model with the [Model Quantizer](@ref omz_tools_downloader) tool. For the dataset, you can download ImageNet from [the ImageNet site](https://www.image-net.org/download.php).
```sh
./quantizer.py --name resnet-50-tf --model_dir <your_model_directory> --dataset_dir <DATASET_DIR> --precisions=FP16-INT8
```
The simplest way to infer the model and collect performance counters is the [C++ Benchmark Application](../../inference-engine/samples/benchmark_app/README.md).
```sh
./benchmark_app -m resnet-50-tf.xml -d CPU -niter 1 -api sync -report_type average_counters -report_folder pc_report_dir
```
If you infer the model with the OpenVINO™ CPU plugin and collect performance counters, all operations (except the last, non-quantized SoftMax) are executed in INT8 precision.
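If you prefer to inspect the counters programmatically instead of reading the report files, the sketch below uses the Inference Engine Python API (`IECore`); it assumes the IR file names from the example above, and the exact `exec_type` strings it prints are device- and version-dependent:
```python
import numpy as np
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="resnet-50-tf.xml", weights="resnet-50-tf.bin")
# Enable performance counters for this network on the CPU device.
exec_net = ie.load_network(network=net, device_name="CPU", config={"PERF_COUNT": "YES"})

input_name = next(iter(net.input_info))
input_shape = net.input_info[input_name].input_data.shape
exec_net.infer({input_name: np.zeros(input_shape, dtype=np.float32)})

# Each entry reports the layer type, execution status, and the kernel ("exec_type")
# that was actually used; INT8 kernels on CPU typically contain "I8" in their names.
for layer, stats in exec_net.requests[0].get_perf_counts().items():
    print(layer, stats["layer_type"], stats["exec_type"], stats["status"])
```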

## Low-Precision 8-bit Integer Inference Workflow

@@ -2,15 +2,20 @@

[F3Net](https://github.com/weijun88/F3Net): Fusion, Feedback and Focus for Salient Object Detection

## Clone the F3Net Repository

To clone the repository, run the following command:

```sh
git clone http://github.com/weijun88/F3Net.git
```
## Download and Convert the Model to ONNX*
To download the pre-trained model or train the model yourself, refer to the
[instructions](https://github.com/weijun88/F3Net/blob/master/README.md) in the F3Net model repository. First, convert the model to ONNX\* format. Create and run the following Python script in the `src` directory of the model repository:
```python
import torch

from dataset import Config
from net import F3Net
# ... (intermediate lines are elided in this diff view)
net = F3Net(cfg)
image = torch.zeros([1, 3, 352, 352])
torch.onnx.export(net, image, 'f3net.onnx', export_params=True, do_constant_folding=True, opset_version=11)
```
The script generates the ONNX\* model file `f3net.onnx`. The model conversion was tested with the repository hash commit `eecace3adf1e8946b571a4f4397681252f9dc1b8`.
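Before converting to IR, you can optionally sanity-check the exported file with the `onnx` Python package; this is a minimal sketch and assumes the package is installed (`pip install onnx`):
```python
import onnx

# Load the exported model and run the built-in structural checks.
model = onnx.load("f3net.onnx")
onnx.checker.check_model(model)
print("f3net.onnx passed the ONNX checker")
```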
## Convert ONNX* F3Net Model to IR
@@ -20,15 +20,15 @@ mkdir rnnt_for_openvino

```bash
cd rnnt_for_openvino
```

**Step 3**. Download the pretrained weights for the PyTorch implementation from [https://zenodo.org/record/3662521#.YG21DugzZaQ](https://zenodo.org/record/3662521#.YG21DugzZaQ).
For UNIX*-like systems, you can use `wget`:
```bash
wget https://zenodo.org/record/3662521/files/DistributedDataParallel_1576581068.9962234-epoch-100.pt
```
The link was taken from `setup.sh` in the `speech_recognition/rnnt` subfolder. You will get exactly the same weights as if you were following the steps from [https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt](https://github.com/mlcommons/inference/tree/master/speech_recognition/rnnt).

**Step 4**. Install the required Python packages:
```bash
pip3 install torch toml
```
@@ -37,7 +37,7 @@ pip3 install torch toml
`export_rnnt_to_onnx.py` and run it in the current directory `rnnt_for_openvino`:

> **NOTE**: If you already have a full clone of the MLCommons inference repository, you need to
> specify the `mlcommons_inference_path` variable.
```python
import toml
# ... (the rest of the export script is elided in this diff view)
```

@@ -92,8 +92,7 @@ torch.onnx.export(model.joint, (f, g), "rnnt_joint.onnx", opset_version=12,

```bash
python3 export_rnnt_to_onnx.py
```

After completing this step, the files `rnnt_encoder.onnx`, `rnnt_prediction.onnx`, and `rnnt_joint.onnx` will be saved in the current directory.

**Step 6**. Run the conversion command:

@@ -102,6 +101,6 @@ python3 {path_to_openvino}/mo.py --input_model rnnt_encoder.onnx --input "input.

```bash
python3 {path_to_openvino}/mo.py --input_model rnnt_prediction.onnx --input "input.1[1 1],1[2 1 320],2[2 1 320]"
python3 {path_to_openvino}/mo.py --input_model rnnt_joint.onnx --input "0[1 1 1024],1[1 1 320]"
```
Note that the hardcoded sequence length of 157 was taken from MLCommons, but the conversion to IR preserves network [reshapeability](../../../../IE_DG/ShapeInference.md); this means you can manually change the input shapes to any value, either during conversion or at inference time.
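As an illustration of that reshapeability, the sketch below uses the Inference Engine Python API; it assumes the generated IR keeps the encoder input name `input.1` from the conversion command above, that the layout is [sequence, batch, features], and that a partial reshape map (only the input being changed) is accepted:
```python
from openvino.inference_engine import IECore

ie = IECore()
net = ie.read_network(model="rnnt_encoder.xml", weights="rnnt_encoder.bin")

# Replace the hardcoded sequence length of 157 with, for example, 100.
net.reshape({"input.1": [100, 1, 240]})
exec_net = ie.load_network(network=net, device_name="CPU")
```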
6 changes: 5 additions & 1 deletion docs/install_guides/installing-openvino-conda.md
@@ -31,6 +31,10 @@ This guide provides installation steps for Intel® Distribution of OpenVINO™ t
```sh
conda update --all
```
3. Install the Intel® Distribution of OpenVINO™ Toolkit:
- Ubuntu* 20.04
```sh
conda install openvino-ie4py-ubuntu20 -c intel
```
- Ubuntu* 18.04
```sh
conda install openvino-ie4py-ubuntu18 -c intel
```
@@ -47,7 +51,7 @@ This guide provides installation steps for Intel® Distribution of OpenVINO™ t
```sh
python -c "import openvino"
```
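Optionally, you can also print the installed Inference Engine version to confirm which build is active; this is a small sketch that assumes the standard `openvino.inference_engine` module layout:
```sh
python -c "from openvino.inference_engine import get_version; print(get_version())"
```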

Now you can start to develop and run your application.


14 changes: 7 additions & 7 deletions docs/install_guides/installing-openvino-pip.md
@@ -1,15 +1,15 @@
# Install Intel® Distribution of OpenVINO™ Toolkit from PyPI Repository {#openvino_docs_install_guides_installing_openvino_pip}

OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that solve a variety of tasks, including emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, and many others. Based on the latest generations of artificial neural networks, including Convolutional Neural Networks (CNNs) and recurrent and attention-based networks, the toolkit extends computer vision and non-vision workloads across Intel® hardware, maximizing performance. It accelerates applications with high-performance AI and deep learning inference deployed from edge to cloud.

Intel® Distribution of OpenVINO™ Toolkit provides the following packages available for installation through the PyPI repository:

* Runtime package with the Inference Engine inside: [https://pypi.org/project/openvino/](https://pypi.org/project/openvino/)
* Developer package that includes the runtime package as a dependency, plus Model Optimizer, Accuracy Checker, and the Post-Training Optimization Tool: [https://pypi.org/project/openvino-dev](https://pypi.org/project/openvino-dev)
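Both packages can be installed with `pip`; the commands below are a minimal sketch and assume a supported Python version and an up-to-date `pip`:
```sh
# Runtime package only (Inference Engine)
pip install openvino

# Developer package (installs the runtime package as a dependency)
pip install openvino-dev
```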

## Additional Resources

- [Intel® Distribution of OpenVINO™ toolkit](https://software.intel.com/en-us/openvino-toolkit)
- [Model Optimizer Developer Guide](../MO_DG/Deep_Learning_Model_Optimizer_DevGuide.md)
- [Inference Engine Developer Guide](../IE_DG/Deep_Learning_Inference_Engine_DevGuide.md)
- [Inference Engine Samples Overview](../IE_DG/Samples_Overview.md)
9 changes: 5 additions & 4 deletions docs/install_guides/pypi-openvino-dev.md
@@ -11,7 +11,7 @@ license terms for third party or open source software included in or with the So

OpenVINO™ toolkit is a comprehensive toolkit for quickly developing applications and solutions that solve a variety of tasks, including emulation of human vision, automatic speech recognition, natural language processing, recommendation systems, and many others. Based on the latest generations of artificial neural networks, including Convolutional Neural Networks (CNNs) and recurrent and attention-based networks, the toolkit extends computer vision and non-vision workloads across Intel® hardware, maximizing performance. It accelerates applications with high-performance AI and deep learning inference deployed from edge to cloud.

**The developer package includes the following components installed by default:**

| Component | Console Script | Description |
|------------------|---------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
@@ -21,8 +21,9 @@ The **developer package** includes the following components installed by default
| [Post-Training Optimization Tool](https://docs.openvinotoolkit.org/latest/pot_README.html)| `pot` |**Post-Training Optimization Tool** allows you to optimize trained models with advanced capabilities, such as quantization and low-precision optimizations, without the need to retrain or fine-tune models. Optimizations are also available through the [API](https://docs.openvinotoolkit.org/latest/pot_compression_api_README.html). |
| [Model Downloader and other Open Model Zoo tools](https://docs.openvinotoolkit.org/latest/omz_tools_downloader.html)| `omz_downloader` <br> `omz_converter` <br> `omz_quantizer` <br> `omz_info_dumper`| **Model Downloader** is a tool for getting access to the collection of high-quality and extremely fast pre-trained deep learning [public](https://docs.openvinotoolkit.org/latest/omz_models_group_public.html) and [intel](https://docs.openvinotoolkit.org/latest/omz_models_group_intel.html)-trained models. Use these free pre-trained models instead of training your own models to speed up the development and production deployment process. The principle of the tool is as follows: it downloads model files from online sources and, if necessary, patches them with Model Optimizer to make them more usable. A number of additional tools are also provided to automate the process of working with downloaded models:<br> **Model Converter** is a tool for converting the models stored in a format other than the Intermediate Representation (IR) into that format using Model Optimizer. <br> **Model Quantizer** is a tool for automatic quantization of full-precision IR models into low-precision versions using Post-Training Optimization Tool. <br> **Model Information Dumper** is a helper utility for dumping information about the models in a stable machine-readable format.|
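After the developer package is installed, the Open Model Zoo tools from the table above are available as console scripts; the commands below are a minimal sketch, and the model name is only an illustration:
```sh
omz_downloader --name resnet-50-tf --precisions FP16
omz_info_dumper --name resnet-50-tf
```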

> **NOTE**: The developer package also installs the OpenVINO™ runtime package as a dependency.

**The runtime package installs the following components:**

| Component | Description |
|-----------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
@@ -87,10 +88,10 @@ python -m pip install --upgrade pip

To install and configure the components of the development package for working with specific frameworks, use the `pip install openvino-dev[extras]` command, where `extras` is a list of extras from the table below (see the example command after the table):

| DL Framework | Extra |
| :------------------------------------------------------------------------------- | :-------------------------------|
| [Caffe*](https://caffe.berkeleyvision.org/) | caffe |
| [Caffe2*](https://caffe2.ai/) | caffe2 |
| [Kaldi*](https://kaldi-asr.org/) | kaldi |
| [MXNet*](https://mxnet.apache.org/) | mxnet |
| [ONNX*](https://github.com/microsoft/onnxruntime/) | onnx |
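For example, to add ONNX\* support on top of the base developer package, the command below is a minimal sketch using the `onnx` extra from the table above; several extras can be combined in one comma-separated list:
```sh
pip install openvino-dev[onnx]
# or several frameworks at once:
pip install openvino-dev[caffe,onnx]
```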