
update docs
calad0i committed Apr 26, 2024
1 parent a5f3a3f commit c5b7d39
Showing 9 changed files with 54 additions and 28 deletions.
26 changes: 20 additions & 6 deletions README.md
@@ -8,14 +8,28 @@
[![PyPI version](https://badge.fury.io/py/hgq.svg)](https://badge.fury.io/py/hgq)


HGQ is a framework for quantization aware training of neural networks to be deployed on FPGAs, which allows for per-weight and per-activation bitwidth optimization.
HGQ is a gradient-based automatic bitwidth optimization and quantization-aware training algorithm for neural networks to be deployed on FPGAs. By leveraging gradients, it allows for bitwidth optimization at arbitrary granularity, up to the per-weight and per-activation level.

Depending on the specific [application](https://arxiv.org/abs/2006.10159), HGQ could achieve up to 10x resource reduction compared to the traditional `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging [tasks](https://arxiv.org/abs/2202.04976), where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming not too soon).
<img src="docs/_static/overview.svg" alt="HGQ-overview" width="600"/>

This repository implements HGQ for `tensorflow.keras` models. It is independent of the [QKeras project](https://github.com/google/qkeras).
Compared to other heterogeneous quantization approaches, such as the QKeras counterpart, HGQ provides the following advantages:

## Warning:
- **High Granularity**: HGQ supports per-weight and per-activation bitwidth optimization, or any coarser granularity.
- **Automatic Quantization**: By setting a resource regularization term, HGQ can automatically optimize the bitwidths of all parameters during training. Pruning happens naturally when a bitwidth is reduced to 0.
- **Bit-accurate conversion** to `hls4ml`: what you get from the `Keras` model is exactly what you get from the `hls4ml` model. HGQ provides a bit-accurate conversion interface, the proxy model, for converting to `hls4ml` (still subject to machine float precision limitations).
- **Accurate Resource Estimation**: The BOPs metric estimated by HGQ corresponds roughly to #LUTs + 55 × #DSPs of the actual (post place & route) FPGA resource consumption. This metric is available during training, so the resource consumption of the final model can be estimated at a very early stage, as illustrated below.
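
As a hypothetical illustration of this rule of thumb (the numbers are invented for the example, not taken from the paper): a design that ends up using 15,000 LUTs and 90 DSPs after place & route corresponds to roughly 15,000 + 55 × 90 ≈ 20,000 BOPs, so a training-time BOPs estimate of that order already indicates the final on-board footprint.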

This framework requires an **unmerged** [PR](https://github.com/fastmachinelearning/hls4ml/pull/914) of hls4ml. Please install it by running `pip install "git+https://github.com/calad0i/hls4ml@HGQ-integration"`. Otherwise, conversion will fail with an unsupported layer error.
Depending on the specific [application](https://arxiv.org/abs/2006.10159), HGQ could achieve up to 20x resource reduction compared to the `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging [tasks](https://arxiv.org/abs/2202.04976), where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming soon).

## This package is still under development. Any API might change without notice at any time!
## Installation

You will need `python>=3.10` and `tensorflow>=2.13` to run this framework. You can install it via pip:

```bash
pip install hgq
```

## Usage

Please refer to the [documentation](https://calad0i.github.io/HGQ/) for more details.
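
A minimal end-to-end sketch of the intended workflow is shown below. It assumes the layer, callback and conversion names described in the HGQ documentation (`HQuantize`, `HDense`, `ResetMinMax`, `FreeBOPs`, `trace_minmax`, `to_proxy_model`); the dataset and the `beta` regularization value are placeholders, not recommendations.

```python
import numpy as np
from tensorflow import keras

from HGQ.layers import HQuantize, HDense
from HGQ import ResetMinMax, FreeBOPs, trace_minmax, to_proxy_model

# Placeholder data: 16 input features, 10 regression targets.
x_train = np.random.rand(1024, 16).astype('float32')
y_train = np.random.rand(1024, 10).astype('float32')

# beta weighs the BOPs (resource) regularization against the task loss;
# a larger beta pushes bitwidths (and hence resources) down more aggressively.
beta = 3e-5

model = keras.models.Sequential([
    HQuantize(beta=beta, input_shape=(16,)),   # quantize the model input
    HDense(64, activation='relu', beta=beta),  # heterogeneously quantized dense layer
    HDense(10, beta=beta),
])
model.compile(optimizer='adam', loss='mse')

# ResetMinMax resets the tracked activation ranges each epoch;
# FreeBOPs logs the estimated BOPs so resources can be monitored during training.
model.fit(x_train, y_train, epochs=5, callbacks=[ResetMinMax(), FreeBOPs()])

# Fix the integer ranges on representative data, then build the bit-accurate
# proxy model, which is the object to hand to hls4ml for conversion.
trace_minmax(model, x_train)
proxy = to_proxy_model(model)
```

See the documentation for the exact `hls4ml` conversion call and configuration options.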
3 changes: 3 additions & 0 deletions docs/_static/custom.css
@@ -0,0 +1,3 @@
img.light {
    color-scheme: light;
}
1 change: 1 addition & 0 deletions docs/_static/overview.svg
(SVG image; diff not rendered)
4 changes: 4 additions & 0 deletions docs/conf.py
@@ -66,3 +66,7 @@
html_theme = "sphinx_rtd_theme"
html_static_path = ['_static']
html_favicon = '_static/icon.svg'

html_css_files = [
    'custom.css',
]
2 changes: 1 addition & 1 deletion docs/faq.md
@@ -6,7 +6,7 @@ HGQ is a method for quantization aware training of neural networks to be deployed o

## Why is it useful?

Depending on the specific [application](https://arxiv.org/abs/2006.10159), HGQ could achieve up to 10x resource reduction compared to the traditional `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging [tasks](https://arxiv.org/abs/2202.04976), where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming not too soon).
Depending on the specific [application](https://arxiv.org/abs/2006.10159), HGQ could achieve up to 20x resource reduction compared to the traditional `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging [tasks](https://arxiv.org/abs/2202.04976), where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming not too soon).

## Can I use it?

4 changes: 2 additions & 2 deletions docs/getting_started.md
@@ -1,7 +1,7 @@
# Quick Start

```{warning}
This guide is only for models with fully heterogeneous quantized weights. For models with partially-heterogeneous quantized weights, please refer to the [Full Usage](#Full Usage) guide.
```{note}
This guide is only for models with fully heterogeneous quantized weights (per-weight bitwidth).
```

## Model definition & training
30 changes: 21 additions & 9 deletions docs/index.rst
@@ -3,21 +3,33 @@
You can adapt this file completely to your liking, but it should at least
contain the root `toctree` directive.
===============================
High Granularity Quantization
===============================================================
===============================

HGQ is a framework for quantization aware training of neural networks to be deployed on FPGAs, which allows for per-weight and per-activation bitwidth optimization.
.. image:: https://img.shields.io/badge/license-Apache%202.0-green.svg
   :target: LICENSE
.. image:: https://github.com/calad0i/HGQ/actions/workflows/sphinx-build.yml/badge.svg
   :target: https://calad0i.github.io/HGQ/
.. image:: https://badge.fury.io/py/hgq.svg
   :target: https://badge.fury.io/py/hgq

Depending on the specific application_, HGQ could achieve up to 10x resource reduction compared to the traditional AutoQkeras_ approach, while maintaining the same accuracy. For some more `challenging tasks`_, where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming not too soon).
HGQ is a gradient-based automatic bitwidth optimization and quantization-aware training algorithm for neural networks to be deployed on FPGAs. By leveraging gradients, it allows for bitwidth optimization at arbitrary granularity, up to the per-weight and per-activation level.

This repository implements HGQ for `tensorflow.keras` models. It is independent of the `QKeras project`_.
.. rst-class:: light
.. image:: _static/overview.svg
   :alt: HGQ-overview
   :width: 600

Notice: this repository is still under development, and the API might change in the future.
Compared to other heterogeneous quantization approaches, such as the QKeras counterpart, HGQ provides the following advantages:

.. _application: https://arxiv.org/abs/2006.10159
.. _AutoQkeras: https://arxiv.org/abs/2006.10159
.. _challenging tasks: https://arxiv.org/abs/2202.04976
.. _QKeras project: https://github.com/google/qkeras
- **High Granularity**: HGQ supports per-weight and per-activation bitwidth optimization, or any coarser granularity.
- **Automatic Quantization**: By setting a resource regularization term, HGQ can automatically optimize the bitwidths of all parameters during training. Pruning happens naturally when a bitwidth is reduced to 0.
- **Bit-accurate conversion** to `hls4ml`: what you get from the `Keras` model is exactly what you get from the `hls4ml` model. HGQ provides a bit-accurate conversion interface, the proxy model, for converting to `hls4ml` (still subject to machine float precision limitations).
- **Accurate Resource Estimation**: The BOPs metric estimated by HGQ corresponds roughly to #LUTs + 55 × #DSPs of the actual (post place & route) FPGA resource consumption. This metric is available during training, so the resource consumption of the final model can be estimated at a very early stage.

Depending on the specific `application <https://arxiv.org/abs/2006.10159>`_, HGQ could achieve up to 20x resource reduction compared to the `AutoQkeras` approach, while maintaining the same accuracy. For some more challenging `tasks <https://arxiv.org/abs/2202.04976>`_, where the model is already under-fitted, HGQ could still improve the performance under the same on-board resource consumption. For more details, please refer to our paper (link coming soon).

Index
=========================================================
10 changes: 1 addition & 9 deletions docs/install.md
@@ -1,19 +1,11 @@
# Installation

Use `pip install --pre HGQ` to install the latest version from PyPI. You will need a environment with `python>=3.10` installed. Currently, only `python3.10 and 3.11` are tested.
Use `pip install HGQ` to install the latest version from PyPI. You will need an environment with `python>=3.10` installed. Currently, only `python3.10` and `3.11` are tested.

```{warning}
This framework requires an **unmerged** [PR](https://github.com/fastmachinelearning/hls4ml/pull/914) of hls4ml. Please install it by running `pip install "git+https://github.com/calad0i/hls4ml@HGQ-integration"`. Otherwise, conversion will fail with an unsupported layer error.
```

```{note}
The current version requires an **unmerged** version of hls4ml. Please install it by running `pip install git+https://github.com/calad0i/hls4ml`.
```

```{warning}
HGQ v0.2 requires `tensorflow>=2.13,<2.16` (tested on 2.13 and 2.15; 2.16 untested but may work) and `python>=3.10`. Please make sure that you have the correct version of python and tensorflow installed.
```

```{warning}
Due to broken dependency declarations, you will need to specify the version of tensorflow manually. Otherwise, there will likely be version conflicts.
```
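
For reference, a typical installation in a fresh `python>=3.10` environment, combining the commands quoted above, might look like:

```bash
# Pin tensorflow explicitly (see the dependency warning above)
pip install "tensorflow>=2.13,<2.16"
# Install HGQ from PyPI
pip install HGQ
# hls4ml with HGQ support (unmerged PR, required for conversion)
pip install "git+https://github.com/calad0i/hls4ml@HGQ-integration"
```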
2 changes: 1 addition & 1 deletion docs/reference.md
@@ -53,7 +53,7 @@ Heterogeneous layers (`H-` prefix):
- (New in 0.2) `HActivation` with **arbitrary unary function**. (See note below.)

```{note}
`HActivation` will be converted to a general `unary LUT` in `to_proxy_model` when
`HActivation` will be converted to a general `unaryLUT` in `to_proxy_model` when
- the required table size is smaller than or equal to `unary_lut_max_table_size`.
- the corresponding function is not `relu`.
