Skip to content

Support weight-only quantization with quantized operators in intel-extension-for-transformers. #511

Support weight-only quantization with quantized operators in intel-extension-for-transformers.

Support weight-only quantization with quantized operators in intel-extension-for-transformers. #511

name: OpenVINO - Basic Test
on:
workflow_dispatch:
schedule:
- cron: '41 1 * * *' # run every day at 1:41
push:
paths:
- 'tests/openvino/test_modeling_basic.py'
- '.github/workflows/test_openvino_basic.yml'
pull_request:
paths:
- 'tests/openvino/test_modeling_basic.py'
- '.github/workflows/test_openvino_basic.yml'
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
build:
strategy:
fail-fast: false
matrix:
# Testing lower and upper bound of supported Python versions
# This also ensures that the test fails if dependencies break for Python 3.7
python-version: ["3.8", "3.11"]
transformers: ['transformers', 'git+https://github.com/huggingface/transformers.git']
optimum: ['optimum', 'git+https://github.com/huggingface/optimum.git']
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v2
- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies
run: |
# Install openvino manually to prevent dependency conflicts when .[openvino] pins
# optimum or transformers to a specific version
# Install PyTorch CPU to prevent unnecessary downloading/installing of CUDA packages
pip install torch --extra-index-url https://download.pytorch.org/whl/cpu
pip install .[tests] openvino onnx onnxruntime ${{ matrix.optimum}} ${{ matrix.transformers }}
- name: Pip freeze
run: pip freeze
- name: Test with Pytest
run: |
pytest tests/openvino/test_modeling_basic.py