Quantize Speech Recognition Models using NNCF PTQ API

This tutorial demonstrates how to apply INT8 post-training quantization (PTQ) to speech recognition models using NNCF (Neural Network Compression Framework).
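
To give a rough intuition for what INT8 quantization does to a tensor, here is a minimal pure-Python sketch of symmetric per-tensor quantization. It is an illustration only, not NNCF's actual algorithm: the function names `quantize_int8` and `dequantize_int8` and the sample weights are hypothetical, and NNCF chooses its own quantization scheme and calibrates on real data.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale (illustrative only)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Avoid division by zero for an empty or all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate float values from the INT8 representation."""
    return [q * scale for q in quantized]


# Hypothetical weights, chosen only for illustration.
weights = [0.5, -1.27, 0.0, 1.0]
q_weights, scale = quantize_int8(weights)
restored = dequantize_int8(q_weights, scale)

for original, approx in zip(weights, restored):
    print(f"{original:+.4f} -> {approx:+.4f}")
```

Each restored value differs from the original by at most half of the scale, which is the rounding error that the accuracy comparison in the notebooks is meant to measure at the model level.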

Supported models:

107-speech-recognition-wav2vec2.ipynb demonstrates how to apply post-training INT8 quantization on a fine-tuned Wav2Vec2-Base-960h PyTorch model, trained on the LibriSpeech ASR corpus.
107-speech-recognition-data2vec.ipynb demonstrates how to apply post-training INT8 quantization on a fine-tuned Data2Vec-Audio-Base-960h PyTorch model, trained on the LibriSpeech ASR corpus.

The code of the tutorials is designed to be extendable to custom models and datasets.

Notebook Contents

The tutorial consists of the following steps:

Downloading and preparing the model and dataset.
Defining data loading and accuracy validation functionality.
Preparing the model for quantization.
Running quantization.
Comparing performance of the original and quantized models.
Comparing accuracy of the original and quantized models.
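
As a sketch of the accuracy comparison step, the snippet below computes word error rate (WER), a common metric for speech recognition. This README does not state which metric the notebooks use, so WER is an assumption here, and the function name `word_error_rate` and the sample transcripts are hypothetical.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev_row = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr_row = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev_row[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev_row[j] + 1
            insertion = curr_row[j - 1] + 1
            curr_row.append(min(substitution, deletion, insertion))
        prev_row = curr_row
    return prev_row[-1] / len(ref_words)


# Hypothetical transcripts, for illustration only.
reference = "the quick brown fox"
original_output = "the quick brown fox"
quantized_output = "the quick brown box"

print("original WER: ", word_error_rate(reference, original_output))
print("quantized WER:", word_error_rate(reference, quantized_output))
```

Running the same metric over the same validation samples for both models makes any accuracy drop introduced by quantization directly comparable.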

Installation Instructions

This is a self-contained example that relies solely on its own code.
We recommend running the notebook in a virtual environment. You only need a Jupyter server to start. For details, please refer to the Installation Guide.
