diff --git a/README.md b/README.md
index da3b0fd..ea9cc38 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,19 @@
-## Dataflow Accelerator Examples
-*for PYNQ on Zynq and Alveo*
+# Dataflow Accelerator Examples
+
for PYNQ on Zynq and Alveo
+
@@ -18,34 +32,81 @@ Need help with a problem in this repo, or got a question? Feel free to ask for h
In the past, we also had a [Gitter channel](https://gitter.im/xilinx-finn/community). Please be aware that this is no longer maintained by us but can still be used to search for questions previous users had.
## Quickstart
+*We recommend PYNQ version 3.0.1, but older installations of PYNQ should also work. For PYNQ v2.6.1, please refer to [FINN-examples v0.0.5](https://github.com/Xilinx/finn-examples/tree/v0.0.5) for set-up instructions.*
+### Zynq
+*For Zynq boards, all commands below must be prefixed with `sudo` or run after first switching to a root shell with `sudo su`.*
-*For Alveo we recommend setting up everything inside a virtualenv as described [here](https://pynq.readthedocs.io/en/v2.6.1/getting_started/alveo_getting_started.html?highlight=alveo#install-conda).*
-*For PYNQ boards, all commands below must be prefixed with `sudo` or by first going into `sudo su`. We recommend PYNQ version 2.6.1 as some installation issues have been reported for PYNQ version 2.7.*
+First, source the PYNQ and XRT virtual environment:
+
+```shell
+source /etc/profile.d/pynq_venv.sh
+source /etc/profile.d/xrt_setup.sh
+```
-First, ensure that your `pip` and `setuptools` installations are up-to-date
-on your PYNQ board or Alveo server:
+Next, ensure that your `pip` and `setuptools` installations are up-to-date
+on your PYNQ board:
```shell
-python3 -m pip install --upgrade pip setuptools
+python3 -m pip install pip==23.0 setuptools==67.1.0
+```
+
+Since we are going to install finn-examples without build isolation, we need to ensure all build dependencies are already installed. For that, install `setuptools_scm` as well:
+
+```shell
+python3 -m pip install setuptools_scm==7.1.0
```
Install the `finn-examples` package using `pip`:
```shell
# remove previous versions with: pip3 uninstall finn-examples
-pip3 install finn-examples
+pip3 install finn-examples --no-build-isolation
# to install particular git branch:
-# pip3 install git+https://github.com/Xilinx/finn-examples.git@dev
+# pip3 install git+https://github.com/Xilinx/finn-examples.git@dev --no-build-isolation
```
-Retrieve the example Jupyter notebooks using the PYNQ get-notebooks command:
+Retrieve the example Jupyter notebooks using the PYNQ get-notebooks command. An example of how to run the Jupyter notebook server, assuming we are forwarding port 8888 from the target to some port on our local machine, is also shown below:
```shell
# on PYNQ boards, first cd /home/xilinx/jupyter_notebooks
pynq get-notebooks --from-package finn-examples -p . --force
+jupyter-notebook --no-browser --allow-root --port=8888
+```
+
+### Alveo
+*For Alveo we recommend setting up everything inside a virtualenv as described [here](https://pynq.readthedocs.io/en/v2.6.1/getting_started/alveo_getting_started.html?highlight=alveo#install-conda).*
+
+First, create and activate a virtual environment:
+```shell
+conda create -n <env-name> python=3.10
+conda activate <env-name>
+```
+
+Next, ensure that your `pip` and `setuptools` installations are up-to-date:
+```shell
+python3 -m pip install --upgrade pip==23.0 setuptools==67.2.0
+```
+
+Finally, install PYNQ, FINN-examples and Jupyter (make sure to source the XRT environment first):
+```shell
+pip3 install pynq==3.0.1
+python3 -m pip install setuptools_scm==7.1.0 ipython==8.9.0
+pip3 install finn-examples --no-build-isolation
+# to install particular git branch:
+# pip3 install git+https://github.com/Xilinx/finn-examples.git@dev --no-build-isolation
+python3 -m pip install jupyter==1.0.0
+```
+
+Retrieve the example Jupyter notebooks using the PYNQ get-notebooks command. An example of how to run the Jupyter notebook server is also shown below:
+
+```shell
+pynq get-notebooks --from-package finn-examples -p . --force
+jupyter-notebook --no-browser --port=8888
```
+***
+
You can now navigate the provided Jupyter notebook examples, or just use the
provided accelerators as part of your own Python program:
@@ -56,7 +117,7 @@ import numpy as np
# instantiate the accelerator
accel = models.cnv_w2a2_cifar10()
# generate an empty numpy array to use as input
-dummy_in = np.empty(accel.ishape_normal, dtype=np.uint8)
+dummy_in = np.empty(accel.ishape_normal(), dtype=np.uint8)
# perform inference and get output
dummy_out = accel.execute(dummy_in)
```
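
Note that `ishape_normal` is called as a method in the snippet above. A minimal sketch of a helper that reshapes an image to that shape before calling `execute` follows. It uses a hypothetical `StubAccel` class in place of the real driver so that it runs without an FPGA; the stub's input shape and dummy return value are assumptions for illustration, not the behavior of any particular accelerator.

```python
import numpy as np


class StubAccel:
    """Hypothetical stand-in for a finn-examples accelerator (not the real driver)."""

    def ishape_normal(self):
        # Assumed input shape for illustration: one 32x32 RGB image
        return (1, 32, 32, 3)

    def execute(self, x):
        # The stub only checks the input shape and returns a dummy result
        assert x.shape == self.ishape_normal()
        return np.zeros((1, 1), dtype=np.float32)


def run_single_image(accel, image):
    """Reshape a uint8 image to the accelerator's input shape and run it."""
    ishape = accel.ishape_normal()  # a method call, hence the parentheses
    x = np.asarray(image, dtype=np.uint8).reshape(ishape)
    return accel.execute(x)


image = np.zeros((32, 32, 3), dtype=np.uint8)
result = run_single_image(StubAccel(), image)
```

On real hardware, `StubAccel()` would be replaced by an accelerator object such as the `accel` instantiated above.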
@@ -64,15 +125,17 @@ dummy_out = accel.execute(dummy_in)
## Example Neural Network Accelerators
| Dataset | Topology | Quantization | Supported boards | Supported build flows |
|----------------------------------------------------------------|-------------------------|------------------------------------------------------------|------------------|------------------|
-| CIFAR-10 | CNV (VGG-11-like) | several variants:<br>1/2-bit weights/activations | all | Pynq-Z1<br>ZCU104<br>Ultra96 |
-| MNIST | 3-layer fully-connected | several variants:<br>1/2-bit weights/activations | all | Pynq-Z1<br>ZCU104<br>Ultra96 |
-| ImageNet | MobileNet-v1 | 4-bit weights and activations<br>8-bit first layer weights | Alveo U250<br>ZCU104 | ZCU104 |
+| CIFAR-10 | CNV (VGG-11-like) | several variants:<br>1/2-bit weights/activations | Pynq-Z1<br>ZCU104<br>Ultra96<br>U250 | Pynq-Z1<br>ZCU104<br>Ultra96<br>U250 |
+| MNIST | 3-layer fully-connected | several variants:<br>1/2-bit weights/activations | Pynq-Z1<br>ZCU104<br>Ultra96<br>U250 | Pynq-Z1<br>ZCU104<br>Ultra96<br>U250 |
+| ImageNet | MobileNet-v1 | 4-bit weights & activations<br>8-bit first layer weights | Alveo U250 | Alveo U250 |
| ImageNet | ResNet-50 | 1-bit weights 2-bit activations<br>4-bit residuals<br>8-bit first/last layer weights | Alveo U250 | - |
-| RadioML 2018 | 1D CNN (VGG10) | 4-bit weights and activations | ZCU104 | ZCU104 |
-| MaskedFace-Net | [BinaryCoP](https://arxiv.org/pdf/2102.03456)<br>*Contributed by TU Munich+BMW* | 1-bit weights and activations | Pynq-Z1 | Pynq-Z1 |
-| Google Speech Commands v2 | 3-layer fully-connected | 3-bit weights and activations | Pynq-Z1 | Pynq-Z1 |
+| RadioML 2018 | 1D CNN (VGG10) | 4-bit weights & activations | ZCU104 | ZCU104 |
+| MaskedFace-Net | [BinaryCoP](https://arxiv.org/pdf/2102.03456)<br>*Contributed by TU Munich+BMW* | 1-bit weights & activations | Pynq-Z1 | Pynq-Z1 |
+| Google Speech Commands v2 | 3-layer fully-connected | 3-bit weights & activations | Pynq-Z1 | Pynq-Z1 |
+| UNSW-NB15 | 4-layer fully-connected | 2-bit weights & activations | Pynq-Z1<br>ZCU104<br>Ultra96 | Pynq-Z1<br>ZCU104<br>Ultra96 |
-*Please note that for the non-supported Alveo build flows, you can use the pre-built FPGA bitfiles generated with older versions of the Vitis/Vivado tools. These bitfiles target the following Alveo U250 platform: [xilinx_u250_xdma_201830_2](https://www.xilinx.com/products/boards-and-kits/alveo/package-files-archive/u250-2018-3-1.html).
+*Please note that the build flow for ResNet-50 on the Alveo U250 has known issues, which we are currently working to resolve. However, you can still execute the associated notebook, as we provide a pre-built FPGA bitfile generated with an older Vivado/FINN version targeting the [xilinx_u250_xdma_201830_2](https://www.xilinx.com/products/boards-and-kits/alveo/package-files-archive/u250-2018-3-1.html) platform.*
+*Furthermore, please note that you can target other boards (such as the Pynq-Z2 or ZCU102) by changing the build script manually, but these accelerators have not been tested.*
We welcome community contributions to add more examples to this repo!
@@ -82,7 +145,7 @@ We welcome community contributions to add more examples to this repo!
`finn-examples` provides pre-built FPGA bitfiles for the following boards:
-* **Edge:** Pynq-Z1, Pynq-Z2, Ultra96 and ZCU104
+* **Edge:** Pynq-Z1, Ultra96 and ZCU104
* **Datacenter:** Alveo U250
It's possible to generate Vivado IP for the provided examples to target *any*
diff --git a/build/bnn-pynq/README.md b/build/bnn-pynq/README.md
index 85b0fe1..0c2b65a 100644
--- a/build/bnn-pynq/README.md
+++ b/build/bnn-pynq/README.md
@@ -32,7 +32,7 @@ FINN_EXAMPLES=/path/to/finn-examples
# cd into finn submodule
cd $FINN_EXAMPLES/build/finn
# launch the build on the bnn-pynq folder
-./run-docker.sh build_custom /path/to/finn-examples/build/bnn-pynq
+./run-docker.sh build_custom $FINN_EXAMPLES/build/bnn-pynq
```
5. The generated outputs will be under `bnn-pynq/output__`. You can find a description of the generated files [here](https://finn-dev.readthedocs.io/en/latest/command_line.html#simple-dataflow-build-mode).
@@ -40,7 +40,7 @@ cd $FINN_EXAMPLES/build/finn
## Where did those ONNX model files come from?
The BNN-PYNQ networks are part of the
-[Brevitas examples](https://github.com/Xilinx/brevitas/tree/master/brevitas_examples/bnn_pynq). You can find the details on quantization, accuracy, layers used in the Brevitas repo, as well as the training scripts if you'd like to retrain them yourself.
+[Brevitas examples](https://github.com/Xilinx/brevitas/tree/master/src/brevitas_examples/bnn_pynq). You can find the details on quantization, accuracy, layers used in the Brevitas repo, as well as the training scripts if you'd like to retrain them yourself.
Subsequently, those trained networks are [exported to ONNX](https://github.com/Xilinx/finn/blob/master/notebooks/basics/1_brevitas_network_import.ipynb). In addition, the particular versions
used here have two additions, as described in the "Adding Pre- and Postprocessing" section of [this notebook](https://github.com/Xilinx/finn/blob/master/notebooks/end2end_example/bnn-pynq/tfc_end2end_example.ipynb):
diff --git a/build/build-all.sh b/build/build-all.sh
index 886eecb..4fc2652 100755
--- a/build/build-all.sh
+++ b/build/build-all.sh
@@ -33,7 +33,7 @@ SCRIPT=$(readlink -f "$0")
# absolute path this script is in, thus /home/user/bin
SCRIPTPATH=$(dirname "$SCRIPT")
# subdirs for all finn-examples build folders
-BUILD_FOLDERS="bnn-pynq kws mobilenet-v1 resnet50 vgg10-radioml"
+BUILD_FOLDERS="bnn-pynq kws mobilenet-v1 resnet50 vgg10-radioml cybersecurity-mlp"
# all HW platforms we build for
PLATFORMS="Pynq-Z1 Ultra96 ZCU104 U250"
diff --git a/build/cybersecurity-mlp/README.md b/build/cybersecurity-mlp/README.md
new file mode 100644
index 0000000..5a15d0e
--- /dev/null
+++ b/build/cybersecurity-mlp/README.md
@@ -0,0 +1,22 @@
+# The multilayer perceptron for cybersecurity use-cases
+The multilayer perceptron (MLP) for the cybersecurity use-case is based on the three-part tutorial for training a quantized MLP and deploying it with FINN, which is provided in the FINN [end-to-end example repository](https://github.com/Xilinx/finn/tree/main/notebooks/end2end_example). The MLP consists of four fully-connected layers in total: three hidden layers with 64 neurons each, and a final output layer with a single output, all using 2-bit weights. For more information on training the network, or on what is happening under the hood, the notebooks provided in the FINN end-to-end example repository serve as an excellent starting point.
+
+# Build bitfiles for MLP example
+0. Ensure you have performed the *Setup* steps in the top-level README for setting up the FINN requirements and environment variables.
+
+1. Edit `cybersecurity-mlp/build.py` to restrict the platform variables to the ones that you are interested in, e.g. `platforms_to_build = ["Pynq-Z1"]`. You can also change the other build configuration options; see the [FINN docs](https://finn-dev.readthedocs.io/en/latest/source_code/finn.util.html#finn.util.build_dataflow.DataflowBuildConfig) for a full explanation.
+
+2. Launch the build as follows:
+```shell
+# update this according to where you cloned this repo:
+FINN_EXAMPLES=/path/to/finn-examples
+# cd into finn submodule
+cd $FINN_EXAMPLES/build/finn
+# launch the build on the cybersecurity-mlp folder
+./run-docker.sh build_custom $FINN_EXAMPLES/build/cybersecurity-mlp
+```
+
+3. The generated outputs will be under `cybersecurity-mlp/output_<model_name>_<platform>`. You can find a description of the generated files [here](https://finn-dev.readthedocs.io/en/latest/command_line.html#simple-dataflow-build-mode).
+
+# Where did the ONNX model file come from?
+The ONNX model is created and exported before the build flow is launched. You can find the details of this process in the `cybersecurity-mlp/custom_steps.py` file.
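
The export script pads the first layer's weights from 593 to 600 input columns and shifts bipolar {-1, 1} inputs to binary {0, 1} before they reach the trained network. A minimal NumPy sketch of why this leaves the first layer's output unchanged is given here; it uses random toy weights rather than the trained ones, and the variable names and random data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy first-layer weights: 64 neurons, 593 input features
W_orig = rng.standard_normal((64, 593))
# Append seven zero columns on the input axis: 593 -> 600
W_new = np.pad(W_orig, [(0, 0), (0, 7)])

# A bipolar {-1, 1} input of the padded length; the last 7 entries are arbitrary
x_bipolar = rng.choice([-1.0, 1.0], size=600)
# Shift from {-1, 1} to {0, 1}, the input range the network was trained on
x_binary = (x_bipolar + 1.0) / 2.0

# The zero columns cancel whatever sits in the padded input positions
y_padded = W_new @ x_binary
y_orig = W_orig @ x_binary[:593]
print(np.allclose(y_padded, y_orig))
```

Because the extra weight columns are zero, the values carried in the seven padded input positions do not affect the result.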
\ No newline at end of file
diff --git a/build/cybersecurity-mlp/build.py b/build/cybersecurity-mlp/build.py
new file mode 100644
index 0000000..e5f4181
--- /dev/null
+++ b/build/cybersecurity-mlp/build.py
@@ -0,0 +1,78 @@
+import finn.builder.build_dataflow as build
+import finn.builder.build_dataflow_config as build_cfg
+from finn.util.basic import alveo_default_platform
+import os
+import shutil
+from custom_steps import custom_step_mlp_export
+
+# Which platforms to build the networks for
+zynq_platforms = ["Pynq-Z1", "Ultra96", "ZCU104"]
+alveo_platforms = []
+
+# Note: only zynq platforms currently tested
+platforms_to_build = zynq_platforms + alveo_platforms
+
+# determine which shell flow to use for a given platform
+def platform_to_shell(platform):
+    if platform in zynq_platforms:
+        return build_cfg.ShellFlowType.VIVADO_ZYNQ
+    elif platform in alveo_platforms:
+        return build_cfg.ShellFlowType.VITIS_ALVEO
+    else:
+        raise Exception("Unknown platform, can't determine ShellFlowType")
+
+# Define model name
+model_name = "unsw_nb15-mlp-w2a2"
+
+# Create a release dir, used for finn-examples release packaging
+os.makedirs("release", exist_ok=True)
+
+for platform_name in platforms_to_build:
+    shell_flow_type = platform_to_shell(platform_name)
+    if shell_flow_type == build_cfg.ShellFlowType.VITIS_ALVEO:
+        vitis_platform = alveo_default_platform[platform_name]
+        # for Alveo, use the Vitis platform name as the release name
+        # e.g. xilinx_u250_xdma_201830_2
+        release_platform_name = vitis_platform
+    else:
+        vitis_platform = None
+        # for Zynq, use the board name as the release name
+        # e.g. ZCU104
+        release_platform_name = platform_name
+    platform_dir = "release/%s" % release_platform_name
+    os.makedirs(platform_dir, exist_ok=True)
+    # Set up the build configuration for this model
+    cfg = build_cfg.DataflowBuildConfig(
+        output_dir = "output_%s_%s" % (model_name, release_platform_name),
+        mvau_wwidth_max = 80,
+        target_fps = 1000000,
+        synth_clk_period_ns = 10.0,
+        board = platform_name,
+        shell_flow_type = shell_flow_type,
+        vitis_platform = vitis_platform,
+        vitis_opt_strategy = build_cfg.VitisOptStrategyCfg.PERFORMANCE_BEST,
+        generate_outputs = [
+            build_cfg.DataflowOutputType.PYNQ_DRIVER,
+            build_cfg.DataflowOutputType.ESTIMATE_REPORTS,
+            build_cfg.DataflowOutputType.BITFILE,
+            build_cfg.DataflowOutputType.DEPLOYMENT_PACKAGE,
+        ],
+        save_intermediate_models=True
+    )
+
+    # Export MLP model to FINN-ONNX
+    model = custom_step_mlp_export(model_name)
+    # Launch FINN compiler to generate bitfile
+    build.build_dataflow_cfg(model, cfg)
+    # Copy bitfiles into release dir if found
+    bitfile_gen_dir = cfg.output_dir + "/bitfile"
+    files_to_check_and_copy = [
+        "finn-accel.bit",
+        "finn-accel.hwh",
+        "finn-accel.xclbin"
+    ]
+    for f in files_to_check_and_copy:
+        src_file = bitfile_gen_dir + "/" + f
+        dst_file = platform_dir + "/" + f.replace("finn-accel", model_name)
+        if os.path.isfile(src_file):
+            shutil.copy(src_file, dst_file)
diff --git a/build/cybersecurity-mlp/custom_steps.py b/build/cybersecurity-mlp/custom_steps.py
new file mode 100644
index 0000000..17a0d4f
--- /dev/null
+++ b/build/cybersecurity-mlp/custom_steps.py
@@ -0,0 +1,93 @@
+import numpy as np
+import pkg_resources as pk
+import os
+
+from brevitas.nn import QuantLinear, QuantReLU, QuantIdentity
+import torch
+import torch.nn as nn
+import brevitas.onnx as bo
+from brevitas.quant_tensor import QuantTensor
+
+# Define export wrapper
+class CybSecMLPForExport(nn.Module):
+    def __init__(self, my_pretrained_model):
+        super(CybSecMLPForExport, self).__init__()
+        self.pretrained = my_pretrained_model
+        self.qnt_output = QuantIdentity(
+            quant_type='binary',
+            scaling_impl_type='const',
+            bit_width=1,
+            min_val=-1.0,
+            max_val=1.0
+        )
+
+    def forward(self, x):
+        # assume x contains bipolar {-1,1} elems
+        # shift from {-1,1} -> {0,1} since that is the
+        # input range for the trained network
+        x = (x + torch.tensor([1.0]).to("cpu")) / 2.0
+        out_original = self.pretrained(x)
+        out_final = self.qnt_output(out_original)  # output as {-1, 1}
+        return out_final
+
+def custom_step_mlp_export(model_name):
+    # Define model parameters
+    input_size = 593
+    hidden1 = 64
+    hidden2 = 64
+    hidden3 = 64
+    weight_bit_width = 2
+    act_bit_width = 2
+    num_classes = 1
+
+    # Create model definition from Brevitas library
+    model = nn.Sequential(
+        QuantLinear(input_size, hidden1, bias=True, weight_bit_width=weight_bit_width),
+        nn.BatchNorm1d(hidden1),
+        nn.Dropout(0.5),
+        QuantReLU(bit_width=act_bit_width),
+        QuantLinear(hidden1, hidden2, bias=True, weight_bit_width=weight_bit_width),
+        nn.BatchNorm1d(hidden2),
+        nn.Dropout(0.5),
+        QuantReLU(bit_width=act_bit_width),
+        QuantLinear(hidden2, hidden3, bias=True, weight_bit_width=weight_bit_width),
+        nn.BatchNorm1d(hidden3),
+        nn.Dropout(0.5),
+        QuantReLU(bit_width=act_bit_width),
+        QuantLinear(hidden3, num_classes, bias=True, weight_bit_width=weight_bit_width)
+    )
+
+    # Load pre-trained weights
+    assets_dir = pk.resource_filename("finn.qnn-data", "cybsec-mlp/")
+    trained_state_dict = torch.load(assets_dir+"/state_dict.pth")["models_state_dict"][0]
+    model.load_state_dict(trained_state_dict, strict=False)
+
+    # Network surgery: pad input size from 593 to 600 and convert bipolar to binary
+    W_orig = model[0].weight.data.detach().numpy()
+    W_new = np.pad(W_orig, [(0, 0), (0, 7)])
+    model[0].weight.data = torch.from_numpy(W_new)
+
+    model_for_export = CybSecMLPForExport(model)
+
+    # Create directory to save model
+    os.makedirs("models", exist_ok=True)
+    ready_model_filename = "models/%s.onnx" % model_name
+    input_shape = (1, 600)
+
+    # create a QuantTensor instance to mark input as bipolar during export
+    input_a = np.random.randint(0, 1, size=input_shape).astype(np.float32)
+    input_a = 2 * input_a - 1
+    scale = 1.0
+    input_t = torch.from_numpy(input_a * scale)
+    input_qt = QuantTensor(
+        input_t, scale=torch.tensor(scale), bit_width=torch.tensor(1.0), signed=True
+    )
+
+    # Export to ONNX
+    bo.export_finn_onnx(
+        model_for_export, export_path=ready_model_filename, input_t=input_qt
+    )
+
+    return ready_model_filename
+
+
\ No newline at end of file
diff --git a/build/get-finn.sh b/build/get-finn.sh
index 1f9a67c..9708119 100755
--- a/build/get-finn.sh
+++ b/build/get-finn.sh
@@ -30,7 +30,7 @@
# URL for git repo to be cloned
REPO_URL=https://github.com/Xilinx/finn
# commit hash for repo
-REPO_COMMIT=96c0f5e3678abd7b1eaab2a2b4f8e937ac1f48b8
+REPO_COMMIT=cdc5ec4b0dde59d5d8de0a5359aae529816376af
# directory (under the same folder as this script) to clone to
REPO_DIR=finn
diff --git a/build/kws/README.md b/build/kws/README.md
index 9588204..3b5eb4b 100644
--- a/build/kws/README.md
+++ b/build/kws/README.md
@@ -1,4 +1,4 @@
-# The KWS examplee
+# The KWS example
The KWS example includes an MLP for the Google SpeechCommandsV2 dataset.
@@ -15,7 +15,7 @@ FINN_EXAMPLES=/path/to/finn-examples
# cd into finn submodule
cd $FINN_EXAMPLES/build/finn
# launch the build on the bnn-pynq folder
-bash run-docker.sh build_custom /path/to/finn-examples/build/kws
+bash run-docker.sh build_custom $FINN_EXAMPLES/build/kws
```
3. The generated outputs will be under `kws/_output__`.
diff --git a/build/kws/build.py b/build/kws/build.py
index 2b72de5..63f15c2 100644
--- a/build/kws/build.py
+++ b/build/kws/build.py
@@ -39,6 +39,7 @@
import datetime
from glob import glob
import os
+import shutil
# Inject the preprocessing step into FINN to enable json serialization later on
@@ -168,7 +169,7 @@ def step_preprocess(model: ModelWrapper, cfg: DataflowBuildConfig):
pre_processed_inputs = pre_processed_inputs.astype(np.int8)
# Save data
- export_path = "models/" + f_name.replace(".npz", "_{}_len_{}.npy")
+ export_path = f_name.replace(".npz", "_{}_len_{}.npy")
print(f"Saving data to: {export_path}")
np.save(
export_path.format("inputs", len(pre_processed_inputs)), pre_processed_inputs
diff --git a/build/mobilenet-v1/README.md b/build/mobilenet-v1/README.md
index 3ff1a92..f31ad57 100644
--- a/build/mobilenet-v1/README.md
+++ b/build/mobilenet-v1/README.md
@@ -17,7 +17,7 @@ It requires about 2 MB of weight storage and 1.1 GMACs per inference, yielding
Due to the depthwise separable convolutions in MobileNet-v1,
we use a specialized build script that replaces a few of the standard steps
in FINN with custom ones.
-**MobileNet-v1 is currently only supported on Alveo U250 and ZCU104.**
+**MobileNet-v1 is currently only supported on Alveo U250.**
We also provide a folding configuration for the **ZCU102**, but there is no pre-built Pynq image available for this board.
0. Ensure you have performed the *Setup* steps in the top-level README for setting up the FINN requirements and environment variables.
@@ -33,7 +33,7 @@ FINN_EXAMPLES=/path/to/finn-examples
# cd into finn submodule
cd $FINN_EXAMPLES/build/finn
# launch the build on the mobilenet-v1 folder
-./run-docker.sh build_custom /path/to/finn-examples/build/mobilenet-v1
+./run-docker.sh build_custom $FINN_EXAMPLES/build/mobilenet-v1
```
5. The generated outputs will be under `mobilenet-v1/output__`. You can find a description of the generated files [here](https://finn-dev.readthedocs.io/en/latest/command_line.html#simple-dataflow-build-mode).
@@ -49,5 +49,5 @@ Subsequently, the trained networks is [exported to ONNX](https://github.com/Xili
* A top-K node with k=5 is added at the output (to return the top-5 class indices instead of logits)
These modifications are done as part of the end2end MobileNet-v1 test in FINN.
-You can [see more here](https://github.com/Xilinx/finn/blob/bf9a67eee6ff5a797ea3a0bd866706d7518c3c6f/tests/end2end/test_end2end_mobilenet_v1.py#L102)
+You can [see more here](https://github.com/Xilinx/finn/blob/41740ed1a953c09dd2f87b03ebfde5f9d8a7d4f0/tests/end2end/test_end2end_mobilenet_v1.py#L91)
for further reference.
diff --git a/build/mobilenet-v1/build.py b/build/mobilenet-v1/build.py
index 5c8f58c..8433308 100644
--- a/build/mobilenet-v1/build.py
+++ b/build/mobilenet-v1/build.py
@@ -44,7 +44,8 @@
model_name = "mobilenetv1-w4a4"
# which platforms to build the networks for
-zynq_platforms = ["ZCU102", "ZCU104"]
+#zynq_platforms = ["ZCU102", "ZCU104"]
+zynq_platforms = ["ZCU102"]
#alveo_platforms = ["U50", "U200", "U250", "U280"]
alveo_platforms = ["U250"]
platforms_to_build = zynq_platforms + alveo_platforms
diff --git a/build/resnet50/README.md b/build/resnet50/README.md
index afd98e3..3eab06f 100644
--- a/build/resnet50/README.md
+++ b/build/resnet50/README.md
@@ -16,7 +16,7 @@ finn-experimental](https://github.com/Xilinx/finn-experimental). This allows 16x
1. Download the pretrained Resnet 50 ONNX model from the releases page, and extract
the zipfile under `resnet50/models`. You should have e.g. `resnet50/models/resnet50_w1a2_exported.onnx` as a result.
-You can use the provided `resnet50/models/download_resnet50.sh` script for this.
+You can use the provided `resnet50/models/download-model.sh` script for this.
2. Launch the build as follows:
```SHELL
@@ -25,7 +25,7 @@ FINN_EXAMPLES=/path/to/finn-examples
# cd into finn submodule
cd $FINN_EXAMPLES/build/finn
# launch the build on the resnet50 folder
-./run-docker.sh build_custom /path/to/finn-examples/build/resnet50
+./run-docker.sh build_custom $FINN_EXAMPLES/build/resnet50
```
5. The generated outputs will be under `resnet50/output__`. You can find a description of the generated files [here](https://finn-dev.readthedocs.io/en/latest/command_line.html#simple-dataflow-build-mode).
diff --git a/build/resnet50/build.py b/build/resnet50/build.py
index d09c26d..ca86cd9 100644
--- a/build/resnet50/build.py
+++ b/build/resnet50/build.py
@@ -113,7 +113,7 @@ def platform_to_shell(platform):
# throughput parameters (auto-folding)
mvau_wwidth_max = 24,
target_fps = target_fps,
- folding_config_file = folding_config_file,
+ folding_config_file = folding_config_file,
# enable extra performance optimizations (physopt)
vitis_opt_strategy=build_cfg.VitisOptStrategyCfg.PERFORMANCE_BEST,
generate_outputs=[
diff --git a/build/resnet50/custom_steps.py b/build/resnet50/custom_steps.py
index c9c2950..8e0b08c 100644
--- a/build/resnet50/custom_steps.py
+++ b/build/resnet50/custom_steps.py
@@ -107,6 +107,7 @@
from finn.transformation.fpgadataflow.set_fifo_depths import (
InsertAndSetFIFODepths,
RemoveShallowFIFOs,
+ SplitLargeFIFOs,
)
from finn.transformation.fpgadataflow.insert_dwc import InsertDWC
from finn.transformation.fpgadataflow.insert_fifo import InsertFIFO
@@ -265,8 +266,10 @@ def step_resnet50_set_fifo_depths(model: ModelWrapper, cfg: DataflowBuildConfig)
model = model.transform(GiveReadableTensorNames())
if cfg.folding_config_file is not None:
model = model.transform(ApplyConfig(cfg.folding_config_file))
- # remove any shallow FIFOs
- model = model.transform(RemoveShallowFIFOs())
+ # split large FIFOs into multiple FIFOs
+ model = model.transform(SplitLargeFIFOs())
+ # remove any shallow FIFOs
+ model = model.transform(RemoveShallowFIFOs())
# extract the final configuration and save it as json
hw_attrs = [
diff --git a/build/resnet50/folding_config/U250_folding_config.json b/build/resnet50/folding_config/U250_folding_config.json
index 7e096f1..da4f7da 100644
--- a/build/resnet50/folding_config/U250_folding_config.json
+++ b/build/resnet50/folding_config/U250_folding_config.json
@@ -1,8 +1,8 @@
{
"Defaults": {
- "outFIFODepth":[32,"all"],
- "inFIFODepth":[32,"all"],
- "mem_mode":["decoupled",["MatrixVectorActivation"]]
+ "outFIFODepths":[[32],"all"],
+ "inFIFODepths":[[32],"all"],
+ "mem_mode":["decoupled",["MatrixVectorActivation"]]
},
"ConvDoublePacked_Batch_0": {
"SIMD": 3,
@@ -19,7 +19,8 @@
"PE": 64
},
"DuplicateStreams_Batch_0": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"MatrixVectorActivation_1": {
"PE": 32,
@@ -44,13 +45,15 @@
"SIMD": 32
},
"AddStreams_Batch_0": {
- "PE": 32
+ "PE": 32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_0": {
"PE": 32
},
"DuplicateStreams_Batch_1": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_1": {
"PE": 32
@@ -77,13 +80,15 @@
"SIMD": 32
},
"AddStreams_Batch_1": {
- "PE": 32
+ "PE": 32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_3": {
"PE": 32
},
"DuplicateStreams_Batch_2": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_4": {
"PE": 32
@@ -110,7 +115,8 @@
"SIMD": 64
},
"AddStreams_Batch_2": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_6": {
"PE": 32
@@ -119,7 +125,8 @@
"PE": 32
},
"DuplicateStreams_Batch_3": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"DownSampler_0": {
"SIMD": 64
@@ -147,13 +154,15 @@
"SIMD": 32
},
"AddStreams_Batch_3": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_8": {
"PE": 32
},
"DuplicateStreams_Batch_4": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_9": {
"PE": 32
@@ -180,13 +189,15 @@
"SIMD": 32
},
"AddStreams_Batch_4": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_11": {
"PE": 32
},
"DuplicateStreams_Batch_5": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_12": {
"PE": 32
@@ -213,13 +224,15 @@
"SIMD": 32
},
"AddStreams_Batch_5": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_14": {
"PE": 32
},
"DuplicateStreams_Batch_6": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_15": {
"PE": 32
@@ -246,7 +259,8 @@
"SIMD": 32
},
"AddStreams_Batch_6": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_17": {
"PE": 32
@@ -255,7 +269,8 @@
"PE": 32
},
"DuplicateStreams_Batch_7": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"DownSampler_1": {
"SIMD": 64
@@ -283,13 +298,15 @@
"SIMD": 32
},
"AddStreams_Batch_7": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_19": {
"PE": 32
},
"DuplicateStreams_Batch_8": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_20": {
"PE": 32
@@ -316,13 +333,15 @@
"SIMD": 32
},
"AddStreams_Batch_8": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_22": {
"PE": 32
},
"DuplicateStreams_Batch_9": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_23": {
"PE": 32
@@ -349,13 +368,15 @@
"SIMD": 32
},
"AddStreams_Batch_9": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_25": {
"PE": 32
},
"DuplicateStreams_Batch_10": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_26": {
"PE": 32
@@ -382,13 +403,15 @@
"SIMD": 32
},
"AddStreams_Batch_10": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_28": {
"PE": 32
},
"DuplicateStreams_Batch_11": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_29": {
"PE": 32
@@ -415,13 +438,15 @@
"SIMD": 32
},
"AddStreams_Batch_11": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_31": {
"PE": 32
},
"DuplicateStreams_Batch_12": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_32": {
"PE": 32
@@ -448,7 +473,8 @@
"SIMD": 32
},
"AddStreams_Batch_12": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_34": {
"PE": 32
@@ -457,7 +483,8 @@
"PE": 32
},
"DuplicateStreams_Batch_13": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"DownSampler_2": {
"SIMD": 64
@@ -485,13 +512,15 @@
"SIMD": 32
},
"AddStreams_Batch_13": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_36": {
"PE": 32
},
"DuplicateStreams_Batch_14": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_37": {
"PE": 32
@@ -518,13 +547,15 @@
"SIMD": 32
},
"AddStreams_Batch_14": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_39": {
"PE": 32
},
"DuplicateStreams_Batch_15": {
- "PE": 32
+ "PE": 32,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_40": {
"PE": 32
@@ -551,7 +582,8 @@
"SIMD": 32
},
"AddStreams_Batch_15": {
- "PE":32
+ "PE":32,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_42": {
"PE": 32
diff --git a/build/resnet50/folding_config/U250_folding_config_no_doublepack_pe_folded_16.json b/build/resnet50/folding_config/U250_folding_config_no_doublepack_pe_folded_16.json
index 11573aa..09aa2dc 100644
--- a/build/resnet50/folding_config/U250_folding_config_no_doublepack_pe_folded_16.json
+++ b/build/resnet50/folding_config/U250_folding_config_no_doublepack_pe_folded_16.json
@@ -1,11 +1,11 @@
{
"Defaults": {
- "outFIFODepth": [
- 32,
+ "outFIFODepths": [
+ [32],
"all"
],
- "inFIFODepth": [
- 32,
+ "inFIFODepths": [
+ [32],
"all"
],
"mem_mode": [
@@ -35,7 +35,8 @@
"PE": 4
},
"DuplicateStreams_Batch_0": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"MatrixVectorActivation_2": {
"PE": 2,
@@ -60,13 +61,15 @@
"SIMD": 32
},
"AddStreams_Batch_0": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_0": {
"PE": 2
},
"DuplicateStreams_Batch_1": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_1": {
"PE": 2
@@ -93,13 +96,15 @@
"SIMD": 32
},
"AddStreams_Batch_1": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_3": {
"PE": 2
},
"DuplicateStreams_Batch_2": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_4": {
"PE": 2
@@ -126,7 +131,8 @@
"SIMD": 64
},
"AddStreams_Batch_2": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_6": {
"PE": 2
@@ -135,7 +141,8 @@
"PE": 2
},
"DuplicateStreams_Batch_3": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"DownSampler_0": {
"SIMD": 4
@@ -163,13 +170,15 @@
"SIMD": 32
},
"AddStreams_Batch_3": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_8": {
"PE": 2
},
"DuplicateStreams_Batch_4": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_9": {
"PE": 2
@@ -196,13 +205,15 @@
"SIMD": 32
},
"AddStreams_Batch_4": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_11": {
"PE": 2
},
"DuplicateStreams_Batch_5": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_12": {
"PE": 2
@@ -229,13 +240,15 @@
"SIMD": 32
},
"AddStreams_Batch_5": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_14": {
"PE": 2
},
"DuplicateStreams_Batch_6": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_15": {
"PE": 2
@@ -262,7 +275,8 @@
"SIMD": 32
},
"AddStreams_Batch_6": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_17": {
"PE": 2
@@ -271,7 +285,8 @@
"PE": 2
},
"DuplicateStreams_Batch_7": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"DownSampler_1": {
"SIMD": 4
@@ -299,13 +314,15 @@
"SIMD": 32
},
"AddStreams_Batch_7": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_19": {
"PE": 2
},
"DuplicateStreams_Batch_8": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_20": {
"PE": 2
@@ -332,13 +349,15 @@
"SIMD": 32
},
"AddStreams_Batch_8": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_22": {
"PE": 2
},
"DuplicateStreams_Batch_9": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_23": {
"PE": 2
@@ -365,13 +384,15 @@
"SIMD": 32
},
"AddStreams_Batch_9": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_25": {
"PE": 2
},
"DuplicateStreams_Batch_10": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_26": {
"PE": 2
@@ -398,13 +419,15 @@
"SIMD": 32
},
"AddStreams_Batch_10": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_28": {
"PE": 2
},
"DuplicateStreams_Batch_11": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_29": {
"PE": 2
@@ -431,13 +454,15 @@
"SIMD": 32
},
"AddStreams_Batch_11": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_31": {
"PE": 2
},
"DuplicateStreams_Batch_12": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_32": {
"PE": 2
@@ -464,7 +489,8 @@
"SIMD": 32
},
"AddStreams_Batch_12": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_34": {
"PE": 2
@@ -473,7 +499,8 @@
"PE": 2
},
"DuplicateStreams_Batch_13": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"DownSampler_2": {
"SIMD": 4
@@ -501,13 +528,15 @@
"SIMD": 32
},
"AddStreams_Batch_13": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_36": {
"PE": 2
},
"DuplicateStreams_Batch_14": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_37": {
"PE": 2
@@ -534,13 +563,15 @@
"SIMD": 32
},
"AddStreams_Batch_14": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_39": {
"PE": 2
},
"DuplicateStreams_Batch_15": {
- "PE": 2
+ "PE": 2,
+ "outFIFODepths": [32, 32]
},
"Thresholding_Batch_40": {
"PE": 2
@@ -567,7 +598,8 @@
"SIMD": 32
},
"AddStreams_Batch_15": {
- "PE": 2
+ "PE": 2,
+ "inFIFODepths": [32, 32]
},
"Thresholding_Batch_42": {
"PE": 2
diff --git a/docs/img/unsw-nb15.jpg b/docs/img/unsw-nb15.jpg
new file mode 100644
index 0000000..6941fe4
Binary files /dev/null and b/docs/img/unsw-nb15.jpg differ
diff --git a/finn_examples/bitfiles/bitfiles.zip.link b/finn_examples/bitfiles/bitfiles.zip.link
index 9e93b92..c8d3762 100644
--- a/finn_examples/bitfiles/bitfiles.zip.link
+++ b/finn_examples/bitfiles/bitfiles.zip.link
@@ -1,22 +1,22 @@
{
"Pynq-Z1": {
- "url": "https://github.com/Xilinx/finn-examples/releases/download/kws/Pynq-Z1.zip",
- "md5sum": "bf1df783bae7a1a477797d2eaa61eb9f"
- },
- "Pynq-Z2": {
- "url": "https://github.com/Xilinx/finn-examples/releases/download/v0.0.1a/Pynq-Z2.zip",
- "md5sum": "8b6519073bcc830ca1660dc647dabd79"
+ "url": "https://github.com/Xilinx/finn-examples/releases/download/v0.0.6/Pynq-Z1.zip",
+ "md5sum": "54b4691e5195ff10fb0c5339b78d6ea5"
},
"Ultra96": {
- "url": "https://github.com/Xilinx/finn-examples/releases/download/v0.0.1a/Ultra96.zip",
- "md5sum": "59598d7f36ffdc74a0a0262f5b67423c"
+ "url": "https://github.com/Xilinx/finn-examples/releases/download/v0.0.6/Ultra96.zip",
+ "md5sum": "a297d1f5657f624ffce16c901f1f3f30"
},
"ZCU104": {
- "url": "https://github.com/Xilinx/finn-examples/releases/download/radioml/ZCU104.zip",
- "md5sum": "9b7edad0511da9cb3c834a289d6797a2"
+ "url": "https://github.com/Xilinx/finn-examples/releases/download/v0.0.6/ZCU104.zip",
+ "md5sum": "17dd6ba3d24002275b6622ae0c098378"
},
"xilinx_u250_xdma_201830_2": {
"url": "https://github.com/Xilinx/finn-examples/releases/download/rn50-u250/xilinx_u250_xdma_201830_2.zip",
"md5sum": "042cc5602c8a39d7541f1d79946c0b68"
+ },
+ "xilinx_u250_gen3x16_xdma_2_1_202010_1": {
+ "url": "https://github.com/Xilinx/finn-examples/releases/download/v0.0.6/xilinx_u250_gen3x16_xdma_2_1_202010_1.zip",
+ "md5sum": "59a61f233376ab0e340835034db50449"
}
}
diff --git a/finn_examples/driver.py b/finn_examples/driver.py
index b28321e..2096760 100644
--- a/finn_examples/driver.py
+++ b/finn_examples/driver.py
@@ -31,12 +31,9 @@
import time
from pynq import Overlay, allocate
from pynq.ps import Clocks
-
from qonnx.core.datatype import DataType
-from qonnx.util.basic import (
- gen_finn_dt_tensor,
- roundup_to_integer_multiple
-)
+from qonnx.util.basic import gen_finn_dt_tensor
+
from finn.util.data_packing import (
finnpy_to_packed_bytearray,
packed_bytearray_to_finnpy,
@@ -88,24 +85,27 @@ def __init__(
self.platform = platform
self.batch_size = batch_size
self.fclk_mhz = fclk_mhz
- if self.platform == "alveo":
- if "input_dma_name" in io_shape_dict.keys():
- self.idma = getattr(self, io_shape_dict["input_dma_name"])
- else:
- self.idma = self.idma0
- self.odma = self.odma0
- self.odma_handle = None
- elif self.platform == "zynq-iodma":
- if "input_dma_name" in io_shape_dict.keys():
- self.idma = getattr(self, io_shape_dict["input_dma_name"])
- else:
- self.idma = self.idma0
- self.odma = self.odma0
+ self.idma = []
+ self.odma = []
+ self.odma_handle = []
+ if "input_dma_name" in io_shape_dict.keys():
+ for idma_name in io_shape_dict["input_dma_name"]:
+ self.idma.append(getattr(self, idma_name))
+ else:
+ self.idma = [self.idma0]
+ if "output_dma_name" in io_shape_dict.keys():
+ for odma_name in io_shape_dict["output_dma_name"]:
+ self.odma.append(getattr(self, odma_name))
+ if self.platform == "alveo":
+ self.odma_handle.append(None)
+ else:
+ self.odma = [self.odma0]
+ if self.platform == "alveo":
+ self.odma_handle.append(None)
+ if self.platform == "zynq-iodma":
# set the clock frequency as specified by user during transformations
if self.fclk_mhz > 0:
Clocks.fclk0_mhz = self.fclk_mhz
- else:
- raise ValueError("Supported platforms are zynq-iodma alveo")
# load any external + runtime weights
self.load_external_weights()
self.load_runtime_weights()
@@ -207,50 +207,50 @@ def load_runtime_weights(self, flush_accel=True, verify=True):
# run accelerator to flush any stale weights from weight streamer FIFOs
self.execute_on_buffers()
- @property
- def idt(self):
- return self._io_shape_dict["idt"]
+ def idt(self, ind=0):
+ return self._io_shape_dict["idt"][ind]
- @property
- def odt(self):
- return self._io_shape_dict["odt"]
+ def odt(self, ind=0):
+ return self._io_shape_dict["odt"][ind]
- @property
- def ishape_normal(self):
- ret = list(self._io_shape_dict["ishape_normal"])
+ def ishape_normal(self, ind=0):
+ ret = list(self._io_shape_dict["ishape_normal"][ind])
ret[0] = self.batch_size
return tuple(ret)
- @property
- def oshape_normal(self):
- ret = list(self._io_shape_dict["oshape_normal"])
+ def oshape_normal(self, ind=0):
+ ret = list(self._io_shape_dict["oshape_normal"][ind])
ret[0] = self.batch_size
return tuple(ret)
- @property
- def ishape_folded(self):
- ret = list(self._io_shape_dict["ishape_folded"])
+ def ishape_folded(self, ind=0):
+ ret = list(self._io_shape_dict["ishape_folded"][ind])
ret[0] = self.batch_size
return tuple(ret)
- @property
- def oshape_folded(self):
- ret = list(self._io_shape_dict["oshape_folded"])
+ def oshape_folded(self, ind=0):
+ ret = list(self._io_shape_dict["oshape_folded"][ind])
ret[0] = self.batch_size
return tuple(ret)
- @property
- def ishape_packed(self):
- ret = list(self._io_shape_dict["ishape_packed"])
+ def ishape_packed(self, ind=0):
+ ret = list(self._io_shape_dict["ishape_packed"][ind])
ret[0] = self.batch_size
return tuple(ret)
- @property
- def oshape_packed(self):
- ret = list(self._io_shape_dict["oshape_packed"])
+ def oshape_packed(self, ind=0):
+ ret = list(self._io_shape_dict["oshape_packed"][ind])
ret[0] = self.batch_size
return tuple(ret)
+ @property
+ def num_inputs(self):
+ return self._io_shape_dict["num_inputs"]
+
+ @property
+ def num_outputs(self):
+ return self._io_shape_dict["num_outputs"]
+
@property
def batch_size(self):
return self._batch_size
@@ -264,68 +264,72 @@ def batch_size(self, value):
self.ibuf_packed_device = None
if self.obuf_packed_device is not None:
self.obuf_packed_device = None
- if self.platform == "alveo":
- self.ibuf_packed_device = allocate(shape=self.ishape_packed, dtype=np.uint8)
- self.obuf_packed_device = allocate(shape=self.oshape_packed, dtype=np.uint8)
- else:
- self.ibuf_packed_device = allocate(
- shape=self.ishape_packed, dtype=np.uint8, cacheable=True
+ cacheable = {"alveo": False, "zynq-iodma": True}[self.platform]
+ self.ibuf_packed_device = []
+ self.obuf_packed_device = []
+ self.obuf_packed = []
+ for i in range(self.num_inputs):
+ new_packed_ibuf = allocate(
+ shape=self.ishape_packed(i), dtype=np.uint8, cacheable=cacheable
)
- self.obuf_packed_device = allocate(
- shape=self.oshape_packed, dtype=np.uint8, cacheable=True
+ self.ibuf_packed_device.append(new_packed_ibuf)
+ for o in range(self.num_outputs):
+ new_packed_obuf = allocate(
+ shape=self.oshape_packed(o), dtype=np.uint8, cacheable=cacheable
)
- self.obuf_packed = np.empty_like(self.obuf_packed_device)
+ self.obuf_packed_device.append(new_packed_obuf)
+ self.obuf_packed.append(np.empty_like(new_packed_obuf))
- def fold_input(self, ibuf_normal):
+ def fold_input(self, ibuf_normal, ind=0):
"""Reshapes input in desired shape.
Gets input data (ibuf_normal), checks if data is in expected normal shape.
Returns folded input."""
# ensure that shape is as expected
- assert ibuf_normal.shape == self.ishape_normal
+ assert ibuf_normal.shape == self.ishape_normal(ind)
# convert to folded form
- ibuf_folded = ibuf_normal.reshape(self.ishape_folded)
+ ibuf_folded = ibuf_normal.reshape(self.ishape_folded(ind))
return ibuf_folded
- def pack_input(self, ibuf_folded):
+ def pack_input(self, ibuf_folded, ind=0):
"""Packs folded input and reverses both SIMD dim and endianness.
Gets input data in folded shape and returns packed input data."""
ibuf_packed = finnpy_to_packed_bytearray(
ibuf_folded,
- self.idt,
+ self.idt(ind),
reverse_endian=True,
reverse_inner=True,
fast_mode=True,
)
return ibuf_packed
- def unpack_output(self, obuf_packed):
+ def unpack_output(self, obuf_packed, ind=0):
"""Unpacks the packed output buffer from accelerator.
Gets packed output and returns output data in folded shape."""
obuf_folded = packed_bytearray_to_finnpy(
obuf_packed,
- self.odt,
- self.oshape_folded,
+ self.odt(ind),
+ self.oshape_folded(ind),
reverse_endian=True,
reverse_inner=True,
fast_mode=True,
)
return obuf_folded
- def unfold_output(self, obuf_folded):
+ def unfold_output(self, obuf_folded, ind=0):
"""Unfolds output data to normal shape.
Gets folded output data and returns output data in normal shape."""
- obuf_normal = obuf_folded.reshape(self.oshape_normal)
+ obuf_normal = obuf_folded.reshape(self.oshape_normal(ind))
return obuf_normal
- def copy_input_data_to_device(self, data):
+ def copy_input_data_to_device(self, data, ind=0):
"""Copies given input data to PYNQ buffer."""
- np.copyto(self.ibuf_packed_device, data)
- self.ibuf_packed_device.flush()
+ np.copyto(self.ibuf_packed_device[ind], data)
+ self.ibuf_packed_device[ind].flush()
- def copy_output_data_from_device(self, data):
+ def copy_output_data_from_device(self, data, ind=0):
"""Copies PYNQ output buffer from device."""
- self.obuf_packed_device.invalidate()
- np.copyto(data, self.obuf_packed_device)
+ self.obuf_packed_device[ind].invalidate()
+ np.copyto(data, self.obuf_packed_device[ind])
def execute_on_buffers(self, asynch=False, batch_size=None):
"""Executes accelerator by setting up the DMA(s) on pre-allocated buffers.
@@ -341,24 +345,36 @@ def execute_on_buffers(self, asynch=False, batch_size=None):
batch_size = self.batch_size
assert batch_size <= self.batch_size, "Specified batch_size is too large."
if self.platform == "zynq-iodma":
- assert self.odma.read(0x00) & 0x4 != 0, "Output DMA is not idle"
+ for o in range(self.num_outputs):
+ assert (
+ self.odma[o].read(0x00) & 0x4 != 0
+ ), "Output DMA %d is not idle" % (o)
# manually launch IODMAs since signatures are missing
for iwdma, iwbuf, iwdma_name in self.external_weights:
iwdma.write(0x10, iwbuf.device_address)
iwdma.write(0x1C, batch_size)
iwdma.write(0x00, 1)
- self.idma.write(0x10, self.ibuf_packed_device.device_address)
- self.idma.write(0x1C, batch_size)
- self.odma.write(0x10, self.obuf_packed_device.device_address)
- self.odma.write(0x1C, batch_size)
- self.idma.write(0x00, 1)
- self.odma.write(0x00, 1)
+ for o in range(self.num_outputs):
+ self.odma[o].write(0x10, self.obuf_packed_device[o].device_address)
+ self.odma[o].write(0x1C, batch_size)
+ self.odma[o].write(0x00, 1)
+ for i in range(self.num_inputs):
+ self.idma[i].write(0x10, self.ibuf_packed_device[i].device_address)
+ self.idma[i].write(0x1C, batch_size)
+ self.idma[i].write(0x00, 1)
elif self.platform == "alveo":
- assert self.odma_handle is None, "Output DMA is already running"
- self.idma.start(self.ibuf_packed_device, batch_size)
+ for o in range(self.num_outputs):
+ assert self.odma_handle[o] is None, (
+ "Output DMA %d is already running" % o
+ )
+ for i in range(self.num_inputs):
+ self.idma[i].start(self.ibuf_packed_device[i], batch_size)
for iwdma, iwbuf, iwdma_name in self.external_weights:
iwdma.start(iwbuf, batch_size)
- self.odma_handle = self.odma.start(self.obuf_packed_device, batch_size)
+ for o in range(self.num_outputs):
+ self.odma_handle[o] = self.odma[o].start(
+ self.obuf_packed_device[o], batch_size
+ )
else:
raise Exception("Unrecognized platform: %s" % self.platform)
# blocking behavior depends on asynch parameter
@@ -366,31 +382,48 @@ def execute_on_buffers(self, asynch=False, batch_size=None):
self.wait_until_finished()
def wait_until_finished(self):
- "Block until the output DMA has finished writing."
+ "Block until all output DMAs have finished writing."
if self.platform == "zynq-iodma":
# check if output IODMA is finished via register reads
- status = self.odma.read(0x00)
- while status & 0x2 == 0:
- status = self.odma.read(0x00)
+ for o in range(self.num_outputs):
+ status = self.odma[o].read(0x00)
+ while status & 0x2 == 0:
+ status = self.odma[o].read(0x00)
elif self.platform == "alveo":
- assert self.odma_handle is not None, "No odma_handle to wait on"
- self.odma_handle.wait()
- self.odma_handle = None
+ assert all(
+ [x is not None for x in self.odma_handle]
+ ), "No odma_handle to wait on"
+ for o in range(self.num_outputs):
+ self.odma_handle[o].wait()
+ self.odma_handle[o] = None
else:
raise Exception("Unrecognized platform: %s" % self.platform)
def execute(self, input_npy):
- """Given input numpy array, first perform necessary packing and copying
- to device buffers, execute on accelerator, then unpack output and return
- output numpy array from accelerator."""
- ibuf_folded = self.fold_input(input_npy)
- ibuf_packed = self.pack_input(ibuf_folded)
- self.copy_input_data_to_device(ibuf_packed)
+ """Given a single input numpy array or a list of them, first perform necessary
+ packing and copying to device buffers, execute on accelerator, then unpack and
+ return the output numpy array (or a list of arrays for multiple outputs)."""
+ # if single input, convert to list to normalize how we process the input
+ if not isinstance(input_npy, list):
+ input_npy = [input_npy]
+ assert self.num_inputs == len(
+ input_npy
+ ), "Not all accelerator inputs are specified."
+ for i in range(self.num_inputs):
+ ibuf_folded = self.fold_input(input_npy[i], ind=i)
+ ibuf_packed = self.pack_input(ibuf_folded, ind=i)
+ self.copy_input_data_to_device(ibuf_packed, ind=i)
self.execute_on_buffers()
- self.copy_output_data_from_device(self.obuf_packed)
- obuf_folded = self.unpack_output(self.obuf_packed)
- obuf_normal = self.unfold_output(obuf_folded)
- return obuf_normal
+ outputs = []
+ for o in range(self.num_outputs):
+ self.copy_output_data_from_device(self.obuf_packed[o], ind=o)
+ obuf_folded = self.unpack_output(self.obuf_packed[o], ind=o)
+ obuf_normal = self.unfold_output(obuf_folded, ind=o)
+ outputs.append(obuf_normal)
+ if self.num_outputs == 1:
+ return outputs[0]
+ else:
+ return outputs
def throughput_test(self):
"""Run accelerator with empty inputs to measure throughput and other metrics.
@@ -403,19 +436,17 @@ def throughput_test(self):
runtime = end - start
res["runtime[ms]"] = runtime * 1000
res["throughput[images/s]"] = self.batch_size / runtime
- # the _packed arrays always consist of bytes so no need to
- # take i/o bitwidths into account (which is baked into the shape)
- res["DRAM_in_bandwidth[MB/s]"] = (
- np.prod(self.ishape_packed) * 0.000001 / runtime
- )
- res["DRAM_out_bandwidth[MB/s]"] = (
- np.prod(self.oshape_packed) * 0.000001 / runtime
- )
+ total_in = 0
+ for i in range(self.num_inputs):
+ total_in += np.prod(self.ishape_packed(i))
+ res["DRAM_in_bandwidth[MB/s]"] = total_in * 0.000001 / runtime
+ total_out = 0
+ for o in range(self.num_outputs):
+ total_out += np.prod(self.oshape_packed(o))
+ res["DRAM_out_bandwidth[MB/s]"] = total_out * 0.000001 / runtime
for iwdma, iwbuf, iwdma_name in self.external_weights:
- # Bit-width of the elements of the array are always 8-bit;
- # see function load_external_weights
res["DRAM_extw_%s_bandwidth[MB/s]" % iwdma_name] = (
- self.batch_size * np.prod(iwbuf.shape) * 0.000001 / runtime * 8
+ self.batch_size * np.prod(iwbuf.shape) * 0.000001 / runtime
)
if self.platform == "zynq-iodma":
res["fclk[mhz]"] = Clocks.fclk0_mhz
@@ -423,11 +454,11 @@ def throughput_test(self):
res["fclk[mhz]"] = self.clock_dict["clock0"]["frequency"]
res["batch_size"] = self.batch_size
# also benchmark driver-related overheads
- input_npy = gen_finn_dt_tensor(self.idt, self.ishape_normal)
+ input_npy = gen_finn_dt_tensor(self.idt(), self.ishape_normal())
# provide as int8/uint8 to support fast packing path where possible
- if self.idt == DataType["UINT8"]:
+ if self.idt() == DataType["UINT8"]:
input_npy = input_npy.astype(np.uint8)
- elif self.idt == DataType["INT8"]:
+ elif self.idt() == DataType["INT8"]:
input_npy = input_npy.astype(np.int8)
start = time.time()
ibuf_folded = self.fold_input(input_npy)
@@ -448,13 +479,13 @@ def throughput_test(self):
res["copy_input_data_to_device[ms]"] = runtime * 1000
start = time.time()
- self.copy_output_data_from_device(self.obuf_packed)
+ self.copy_output_data_from_device(self.obuf_packed[0])
end = time.time()
runtime = end - start
res["copy_output_data_from_device[ms]"] = runtime * 1000
start = time.time()
- obuf_folded = self.unpack_output(self.obuf_packed)
+ obuf_folded = self.unpack_output(self.obuf_packed[0])
end = time.time()
runtime = end - start
res["unpack_output[ms]"] = runtime * 1000
diff --git a/finn_examples/models.py b/finn_examples/models.py
index fa83010..09264af 100644
--- a/finn_examples/models.py
+++ b/finn_examples/models.py
@@ -36,85 +36,133 @@
from finn_examples.driver import FINNExampleOverlay
_mnist_fc_io_shape_dict = {
- "idt": DataType["UINT8"],
- "odt": DataType["UINT8"],
- "ishape_normal": (1, 784),
- "oshape_normal": (1, 1),
- "ishape_folded": (1, 1, 784),
- "oshape_folded": (1, 1, 1),
- "ishape_packed": (1, 1, 784),
- "oshape_packed": (1, 1, 1),
+ "idt" : [DataType['UINT8']],
+ "odt" : [DataType['UINT8']],
+ "ishape_normal" : [(1, 784)],
+ "oshape_normal" : [(1, 1)],
+ "ishape_folded" : [(1, 16, 49)],
+ "oshape_folded" : [(1, 1, 1)],
+ "ishape_packed" : [(1, 16, 49)],
+ "oshape_packed" : [(1, 1, 1)],
+ "input_dma_name" : ['idma0'],
+ "output_dma_name" : ['odma0'],
+ "number_of_external_weights": 0,
+ "num_inputs" : 1,
+ "num_outputs" : 1,
}
_cifar10_cnv_io_shape_dict = {
- "idt": DataType["UINT8"],
- "odt": DataType["UINT8"],
- "ishape_normal": (1, 32, 32, 3),
- "oshape_normal": (1, 1),
- "ishape_folded": (1, 1, 32, 32, 1, 3),
- "oshape_folded": (1, 1, 1),
- "ishape_packed": (1, 1, 32, 32, 1, 3),
- "oshape_packed": (1, 1, 1),
+ "idt" : [DataType['UINT8']],
+ "odt" : [DataType['UINT8']],
+ "ishape_normal" : [(1, 32, 32, 3)],
+ "oshape_normal" : [(1, 1)],
+ "ishape_folded" : [(1, 32, 32, 3, 1)],
+ "oshape_folded" : [(1, 1, 1)],
+ "ishape_packed" : [(1, 32, 32, 3, 1)],
+ "oshape_packed" : [(1, 1, 1)],
+ "input_dma_name" : ['idma0'],
+ "output_dma_name" : ['odma0'],
+ "number_of_external_weights": 0,
+ "num_inputs" : 1,
+ "num_outputs" : 1,
}
_bincop_cnv_io_shape_dict = {
- "idt": DataType["UINT8"],
- "odt": DataType["UINT8"],
- "ishape_normal": (1, 72, 72, 3),
- "oshape_normal": (1, 1),
- "ishape_folded": (1, 1, 72, 72, 1, 3),
- "oshape_folded": (1, 1, 1),
- "ishape_packed": (1, 1, 72, 72, 1, 3),
- "oshape_packed": (1, 1, 1),
+ "idt": [DataType["UINT8"]],
+ "odt": [DataType["UINT8"]],
+ "ishape_normal": [(1, 72, 72, 3)],
+ "oshape_normal": [(1, 1)],
+ "ishape_folded": [(1, 1, 72, 72, 1, 3)],
+ "oshape_folded": [(1, 1, 1)],
+ "ishape_packed": [(1, 1, 72, 72, 1, 3)],
+ "oshape_packed": [(1, 1, 1)],
+ "input_dma_name" : ["idma0"],
+ "output_dma_name" : ["odma0"],
+ "number_of_external_weights": 0,
+ "num_inputs" : 1,
+ "num_outputs" : 1,
}
_imagenet_top5inds_io_shape_dict = {
- "idt": DataType["UINT8"],
- "odt": DataType["UINT16"],
- "ishape_normal": (1, 224, 224, 3),
- "oshape_normal": (1, 1, 1, 5),
- "ishape_folded": (1, 224, 224, 1, 3),
- "oshape_folded": (1, 1, 1, 1, 5),
- "ishape_packed": (1, 224, 224, 1, 3),
- "oshape_packed": (1, 1, 1, 1, 10),
+ "idt" : [DataType['UINT8']],
+ "odt" : [DataType['UINT16']],
+ "ishape_normal" : [(1, 224, 224, 3)],
+ "oshape_normal" : [(1, 1, 1, 5)],
+ "ishape_folded" : [(1, 224, 224, 3, 1)],
+ "oshape_folded" : [(1, 1, 1, 5, 1)],
+ "ishape_packed" : [(1, 224, 224, 3, 1)],
+ "oshape_packed" : [(1, 1, 1, 5, 2)],
+ "input_dma_name" : ['idma0'],
+ "output_dma_name" : ['odma0'],
+ "number_of_external_weights": 0,
+ "num_inputs" : 1,
+ "num_outputs" : 1,
}
# resnet50 uses a different io_shape_dict due to
# external weights for last layer
_imagenet_resnet50_top5inds_io_shape_dict = {
- "idt": DataType["UINT8"],
- "odt": DataType["UINT16"],
- "ishape_normal": (1, 224, 224, 3),
- "oshape_normal": (1, 5),
- "ishape_folded": (1, 224, 224, 3),
- "oshape_folded": (1, 5, 1),
- "ishape_packed": (1, 224, 224, 3),
- "oshape_packed": (1, 5, 2),
- "input_dma_name": "idma1",
+ "idt": [DataType["UINT8"]],
+ "odt": [DataType["UINT16"]],
+ "ishape_normal": [(1, 224, 224, 3)],
+ "oshape_normal": [(1, 5)],
+ "ishape_folded": [(1, 224, 224, 3)],
+ "oshape_folded": [(1, 5, 1)],
+ "ishape_packed": [(1, 224, 224, 3)],
+ "oshape_packed": [(1, 5, 2)],
+ "input_dma_name" : ['idma1'],
+ "output_dma_name" : ['odma1'],
"number_of_external_weights": 1,
+ "num_inputs" : 1,
+ "num_outputs" : 1,
}
_radioml_io_shape_dict = {
- "idt": DataType["INT8"],
- "odt": DataType["UINT8"],
- "ishape_normal": (1, 1024, 1, 2),
- "oshape_normal": (1, 1),
- "ishape_folded": (1, 1024, 1, 1, 2),
- "oshape_folded": (1, 1, 1),
- "ishape_packed": (1, 1024, 1, 1, 2),
- "oshape_packed": (1, 1, 1),
+ "idt" : [DataType['INT8']],
+ "odt" : [DataType['UINT8']],
+ "ishape_normal" : [(1, 1024, 1, 2)],
+ "oshape_normal" : [(1, 1)],
+ "ishape_folded" : [(1, 1024, 1, 1, 2)],
+ "oshape_folded" : [(1, 1, 1)],
+ "ishape_packed" : [(1, 1024, 1, 1, 2)],
+ "oshape_packed" : [(1, 1, 1)],
+ "input_dma_name" : ['idma0'],
+ "output_dma_name" : ['odma0'],
+ "number_of_external_weights": 0,
+ "num_inputs" : 1,
+ "num_outputs" : 1,
}
_gscv2_mlp_io_shape_dict = {
- "idt" : DataType["INT8"],
- "odt" : DataType["UINT8"],
- "ishape_normal" : (1, 490),
- "oshape_normal" : (1, 1),
- "ishape_folded" : (1, 49, 10),
- "oshape_folded" : (1, 1, 1),
- "ishape_packed" : (1, 49, 10),
- "oshape_packed" : (1, 1, 1),
- "input_dma_name" : 'idma0',
+ "idt" : [DataType['INT8']],
+ "odt" : [DataType['UINT8']],
+ "ishape_normal" : [(1, 490)],
+ "oshape_normal" : [(1, 1)],
+ "ishape_folded" : [(1, 49, 10)],
+ "oshape_folded" : [(1, 1, 1)],
+ "ishape_packed" : [(1, 49, 10)],
+ "oshape_packed" : [(1, 1, 1)],
+ "input_dma_name" : ['idma0'],
+ "output_dma_name" : ['odma0'],
+ "number_of_external_weights": 0,
+ "num_inputs" : 1,
+ "num_outputs" : 1,
+}
+
+_unsw_nb15_mlp_io_shape_dict = {
+ "idt" : [DataType['BIPOLAR']],
+ "odt" : [DataType['BIPOLAR']],
+ "ishape_normal" : [(1, 600)],
+ "oshape_normal" : [(1, 1)],
+ "ishape_folded" : [(1, 15, 40)],
+ "oshape_folded" : [(1, 1, 1)],
+ "ishape_packed" : [(1, 15, 5)],
+ "oshape_packed" : [(1, 1, 1)],
+ "input_dma_name" : ['idma0'],
+ "output_dma_name" : ['odma0'],
+ "number_of_external_weights": 0,
+ "num_inputs" : 1,
+ "num_outputs" : 1,
}
# from https://github.com/Xilinx/PYNQ-HelloWorld/blob/master/setup.py
@@ -296,3 +344,17 @@ def vgg10_w4a4_radioml(target_platform=None):
_radioml_io_shape_dict,
fclk_mhz=fclk_mhz,
)
+
+
+def mlp_w2a2_unsw_nb15(target_platform=None):
+ target_platform = resolve_target_platform(target_platform)
+ driver_mode = get_driver_mode()
+ model_name = "unsw_nb15-mlp-w2a2"
+ filename = find_bitfile(model_name, target_platform)
+ fclk_mhz = 100.0
+ return FINNExampleOverlay(
+ filename,
+ driver_mode,
+ _unsw_nb15_mlp_io_shape_dict,
+ fclk_mhz=fclk_mhz
+ )
diff --git a/finn_examples/notebooks/0_mnist_with_fc_networks.ipynb b/finn_examples/notebooks/0_mnist_with_fc_networks.ipynb
index a9d6cb7..0c140ce 100644
--- a/finn_examples/notebooks/0_mnist_with_fc_networks.ipynb
+++ b/finn_examples/notebooks/0_mnist_with_fc_networks.ipynb
@@ -2,91 +2,88 @@
"cells": [
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Initialize the accelerator"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 1,
- "source": [
- "from finn_examples import models\n",
- "print(list(filter(lambda x: \"mnist\" in x, dir(models))))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "display_data",
"data": {
"application/javascript": "\ntry {\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%microblaze/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n} catch (e) {};\n"
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "display_data",
"data": {
"application/javascript": "\ntry {\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%pybind11/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n} catch (e) {};\n"
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"['_mnist_fc_io_shape_dict', 'tfc_w1a1_mnist', 'tfc_w1a2_mnist', 'tfc_w2a2_mnist']\n"
]
}
],
- "metadata": {}
+ "source": [
+ "from finn_examples import models\n",
+ "print(list(filter(lambda x: \"mnist\" in x, dir(models))))"
+ ]
},
{
"cell_type": "code",
"execution_count": 2,
+ "metadata": {},
+ "outputs": [],
"source": [
"accel = models.tfc_w1a1_mnist()"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 3,
- "source": [
- "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal), str(accel.idt)))\n",
- "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal), str(accel.odt)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Expected input shape and datatype: (1, 784) DataType.UINT8\n",
"Expected output shape and datatype: (1, 1) DataType.UINT8\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal()), str(accel.idt())))\n",
+ "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal()), str(accel.odt())))"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Load the MNIST dataset\n",
"\n",
"Use the `dataset_loading` package to get easy Python access to MNIST dataset:"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 4,
- "source": [
- "from dataset_loading import mnist\n",
- "trainx, trainy, testx, testy, valx, valy = mnist.load_mnist_data(\"/tmp\", download=True, one_hot=False)"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Looking for Train Imgs\n",
"Download URL: http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz\n",
@@ -107,158 +104,172 @@
]
}
],
- "metadata": {}
+ "source": [
+ "from dataset_loading import mnist\n",
+ "trainx, trainy, testx, testy, valx, valy = mnist.load_mnist_data(\"/tmp\", download=True, one_hot=False)"
+ ]
},
{
"cell_type": "code",
"execution_count": 5,
- "source": [
- "testx.shape"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"(10000, 28, 28, 1)"
]
},
+ "execution_count": 5,
"metadata": {},
- "execution_count": 5
+ "output_type": "execute_result"
}
],
- "metadata": {}
+ "source": [
+ "testx.shape"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Classify a single image"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 6,
+ "metadata": {},
+ "outputs": [],
"source": [
"test_single_x = testx[0].reshape(28, 28)\n",
"test_single_y = testy[0]"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 8,
- "source": [
- "%matplotlib inline\n",
- "from matplotlib import pyplot as plt\n",
- "\n",
- "plt.imshow(test_single_x, cmap='gray')\n",
- "plt.show()"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "display_data",
"data": {
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAP8AAAD8CAYAAAC4nHJkAAAABHNCSVQICAgIfAhkiAAAAAlwSFlzAAALEgAACxIB0t1+/AAAADl0RVh0U29mdHdhcmUAbWF0cGxvdGxpYiB2ZXJzaW9uIDIuMS4xLCBodHRwOi8vbWF0cGxvdGxpYi5vcmcvAOZPmwAADQNJREFUeJzt3W+MVfWdx/HPZylNjPQBWLHEgnQb3bgaAzoaE3AzamxYbYKN1NQHGzbZMH2AZps0ZA1PypMmjemfrU9IpikpJtSWhFbRGBeDGylRGwejBYpQICzMgkAzJgUT0yDfPphDO8W5v3u5/84dv+9XQube8z1/vrnhM+ecOefcnyNCAPL5h7obAFAPwg8kRfiBpAg/kBThB5Ii/EBShB9IivADSRF+IKnP9HNjtrmdEOixiHAr83W057e9wvZB24dtP9nJugD0l9u9t9/2LEmHJD0gaVzSW5Iei4jfF5Zhzw/0WD/2/HdJOhwRRyPiz5J+IWllB+sD0EedhP96SSemvB+vpv0d2yO2x2yPdbAtAF3WyR/8pju0+MRhfUSMShqVOOwHBkkne/5xSQunvP+ipJOdtQOgXzoJ/1uSbrT9JduflfQNSdu70xaAXmv7sD8iLth+XNL/SJolaVNE7O9aZwB6qu1LfW1tjHN+oOf6cpMPgJmL8ANJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiApwg8kRfiBpAg/kBThB5Ii/EBShB9IivADSRF+ICnCDyRF+IGkCD+QFOEHkiL8QFKEH0iK8ANJEX4gKcIPJEX4gaTaHqJbkmwfk3RO0seSLkTEUDeaAtB7HYW/cm9E/LEL6wHQRxz2A0l1Gv6QtMP2Htsj3WgIQH90eti/LCJO2p4v6RXb70XErqkzVL8U+MUADBhHRHdWZG+QdD4ivl+YpzsbA9BQRLiV+do+7Ld9te3PXXot6SuS9rW7PgD91clh/3WSfm370np+HhEvd6UrAD3XtcP+ljbGYT/Qcz0/7AcwsxF+ICnCDyRF+IGkCD+QFOEHkurGU30prFq1qmFtzZo1xWVPnjxZrH/00UfF+pYtW4r1999/v2Ht8OHDxWWRF3t+ICnCDyRF+IGkCD+QFOEHkiL8QFKEH0iKR3pbdPTo0Ya1xYsX96+RaZw7d65hbf/+/X3sZLCMj483rD311FPFZcfGxrrdTt/wSC+AIsIPJEX4gaQIP5AU4QeSIvxAUoQfSIrn+VtUemb/tttuKy574MCBYv3mm28u1m+//fZifXh4uGHt7rvvLi574sSJYn3hwoXFeicuXLhQrJ89e7ZYX7BgQdvbPn78eLE+k6/zt4o9P5AU4QeSIvxAUoQfSIrwA0kRfiApwg8k1fR5ftubJH1V0pmIuLWaNk/SLyUtlnRM0qMR8UHTjc3g5/kH2dy5cxvWlixZUlx2z549xfqdd97ZVk+taDZewaFDh4r1ZvdPzJs3r2Ft7dq1xWU3btxYrA+ybj7P/zNJKy6b9qSknRFxo6Sd1XsAM0jT8EfELkkTl01eKWlz9XqzpIe73BeAHmv3nP+6iDglSdXP+d1rCUA/9PzeftsjkkZ6vR0AV6bdPf9p2wskqfp5ptGMETEaEUMRMdTmtgD0QLvh3y5pdfV6taTnu9MOgH5pGn7bz0p6Q9I/2R63/R+SvifpAdt/kPRA9R7ADML39mNgPfLII8X61q1bi/V9+/Y1rN17773FZScmLr/ANXPwvf0Aigg/kBThB5Ii/EBShB9IivADSXGpD7WZP7/8SMjevXs7Wn7VqlUNa9u2bSsuO5NxqQ9AEeEHkiL8QFKEH0iK8ANJEX4gKcIPJMUQ3ahNs6/Pvvbaa4v1Dz4of1v8wYMHr7inTNjzA0kRfiApwg8kRfiBpAg/kBThB5Ii/EBSPM+Pnlq2bFnD2quvvlpcdvbs2cX68PBwsb5r165i/dOK5/kBFBF+IC
nCDyRF+IGkCD+QFOEHkiL8QFJNn+e3vUnSVyWdiYhbq2kbJK2RdLaabX1EvNSrJjFzPfjggw1rza7j79y5s1h/44032uoJk1rZ8/9M0opppv8oIpZU/wg+MMM0DX9E7JI00YdeAPRRJ+f8j9v+ne1Ntud2rSMAfdFu+DdK+rKkJZJOSfpBoxltj9gesz3W5rYA9EBb4Y+I0xHxcURclPQTSXcV5h2NiKGIGGq3SQDd11b4bS+Y8vZrkvZ1px0A/dLKpb5nJQ1L+rztcUnfkTRse4mkkHRM0jd72COAHuB5fnTkqquuKtZ3797dsHbLLbcUl73vvvuK9ddff71Yz4rn+QEUEX4gKcIPJEX4gaQIP5AU4QeSYohudGTdunXF+tKlSxvWXn755eKyXMrrLfb8QFKEH0iK8ANJEX4gKcIPJEX4gaQIP5AUj/Si6KGHHirWn3vuuWL9ww8/bFhbsWK6L4X+mzfffLNYx/R4pBdAEeEHkiL8QFKEH0iK8ANJEX4gKcIPJMXz/Mldc801xfrTTz9drM+aNatYf+mlxgM4cx2/Xuz5gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiCpps/z214o6RlJX5B0UdJoRPzY9jxJv5S0WNIxSY9GxAdN1sXz/H3W7Dp8s2vtd9xxR7F+5MiRYr30zH6zZdGebj7Pf0HStyPiZkl3S1pr+58lPSlpZ0TcKGln9R7ADNE0/BFxKiLerl6fk3RA0vWSVkraXM22WdLDvWoSQPdd0Tm/7cWSlkr6raTrIuKUNPkLQtL8bjcHoHdavrff9hxJ2yR9KyL+ZLd0WiHbI5JG2msPQK+0tOe3PVuTwd8SEb+qJp+2vaCqL5B0ZrplI2I0IoYiYqgbDQPojqbh9+Qu/qeSDkTED6eUtktaXb1eLen57rcHoFdaudS3XNJvJO3V5KU+SVqvyfP+rZIWSTou6esRMdFkXVzq67ObbrqpWH/vvfc6Wv/KlSuL9RdeeKGj9ePKtXqpr+k5f0TsltRoZfdfSVMABgd3+AFJEX4gKcIPJEX4gaQIP5AU4QeS4qu7PwVuuOGGhrUdO3Z0tO5169YV6y+++GJH60d92PMDSRF+ICnCDyRF+IGkCD+QFOEHkiL8QFJc5/8UGBlp/C1pixYt6mjdr732WrHe7PsgMLjY8wNJEX4gKcIPJEX4gaQIP5AU4QeSIvxAUlznnwGWL19erD/xxBN96gSfJuz5gaQIP5AU4QeSIvxAUoQfSIrwA0kRfiCpptf5bS+U9IykL0i6KGk0In5se4OkNZLOVrOuj4iXetVoZvfcc0+xPmfOnLbXfeTIkWL9/Pnzba8bg62Vm3wuSPp2RLxt+3OS9th+par9KCK+37v2APRK0/BHxClJp6rX52wfkHR9rxsD0FtXdM5ve7GkpZJ+W0163PbvbG+yPbfBMiO2x2yPddQpgK5qOfy250jaJulbEfEnSRslfVnSEk0eGfxguuUiYjQihiJiqAv9AuiSlsJve7Ymg78lIn4lSRFxOiI+joiLkn4i6a7etQmg25qG37Yl/VTSgYj44ZTpC6bM9jVJ+7rfHoBeaeWv/csk/Zukvbbfqaatl/SY7SWSQtIxSd/sSYfoyLvvvlus33///cX6xMREN9vBAGnlr/27JXmaEtf0gRmMO/yApAg/kBThB5Ii/EBShB9IivADSbmfQyzbZjxnoMciYrpL85/Anh9IivADSRF+ICnCDyRF+IGkCD+QFOEHkur3EN1/lPR/U95/vpo2iAa1t0HtS6K3dnWztxtanbGvN/l8YuP22KB+t9+g9jaofUn01q66euOwH0iK8ANJ1R3+0Zq3XzKovQ1qXxK9tauW3mo95wdQn7r3/ABqUkv4ba+wfdD2YdtP1tFDI7aP2d5r+526hxirhkE7Y3vflGnzbL9i+w/Vz2mHSauptw22/7/67N6x/WBNvS20/b+2D9jeb/s/q+m1fnaFvmr53Pp+2G97lqRDkh6QNC7pLU
mPRcTv+9pIA7aPSRqKiNqvCdv+F0nnJT0TEbdW056SNBER36t+cc6NiP8akN42SDpf98jN1YAyC6aOLC3pYUn/rho/u0Jfj6qGz62OPf9dkg5HxNGI+LOkX0haWUMfAy8idkm6fNSMlZI2V683a/I/T9816G0gRMSpiHi7en1O0qWRpWv97Ap91aKO8F8v6cSU9+MarCG/Q9IO23tsj9TdzDSuq4ZNvzR8+vya+7lc05Gb++mykaUH5rNrZ8Trbqsj/NN9xdAgXXJYFhG3S/pXSWurw1u0pqWRm/tlmpGlB0K7I153Wx3hH5e0cMr7L0o6WUMf04qIk9XPM5J+rcEbffj0pUFSq59nau7nrwZp5ObpRpbWAHx2gzTidR3hf0vSjba/ZPuzkr4haXsNfXyC7aurP8TI9tWSvqLBG314u6TV1evVkp6vsZe/MygjNzcaWVo1f3aDNuJ1LTf5VJcy/lvSLEmbIuK7fW9iGrb/UZN7e2nyicef19mb7WclDWvyqa/Tkr4j6TlJWyUtknRc0tcjou9/eGvQ27AmD13/OnLzpXPsPve2XNJvJO2VdLGavF6T59e1fXaFvh5TDZ8bd/gBSXGHH5AU4QeSIvxAUoQfSIrwA0kRfiApwg8kRfiBpP4CIJjqosJxHysAAAAASUVORK5CYII=",
"text/plain": [
""
]
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
}
],
- "metadata": {}
+ "source": [
+ "%matplotlib inline\n",
+ "from matplotlib import pyplot as plt\n",
+ "\n",
+ "plt.imshow(test_single_x, cmap='gray')\n",
+ "plt.show()"
+ ]
},
{
"cell_type": "code",
"execution_count": 9,
- "source": [
- "print(\"Expected class is %d\" % test_single_y)"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Expected class is 7\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"Expected class is %d\" % test_single_y)"
+ ]
},
{
"cell_type": "code",
"execution_count": 10,
- "source": [
- "accel_in = test_single_x.reshape(accel.ishape_normal)\n",
- "print(\"Input buffer shape is %s and datatype is %s\" % (str(accel_in.shape), str(accel_in.dtype)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Input buffer shape is (1, 784) and datatype is uint8\n"
]
}
],
- "metadata": {}
+ "source": [
+ "accel_in = test_single_x.reshape(accel.ishape_normal())\n",
+ "print(\"Input buffer shape is %s and datatype is %s\" % (str(accel_in.shape), str(accel_in.dtype)))"
+ ]
},
{
"cell_type": "code",
"execution_count": 11,
+ "metadata": {},
+ "outputs": [],
"source": [
"accel_out = accel.execute(accel_in)"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 12,
- "source": [
- "print(\"Returned class is %d\" % accel_out)"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Returned class is 7\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"Returned class is %d\" % accel_out)"
+ ]
},
{
"cell_type": "code",
"execution_count": 13,
- "source": [
- "%%timeit\n",
- "accel_out = accel.execute(accel_in)"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"1000 loops, best of 3: 808 µs per loop\n"
]
}
],
- "metadata": {}
+ "source": [
+ "%%timeit\n",
+ "accel_out = accel.execute(accel_in)"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Validate accuracy on entire MNIST test set"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Ready to run validation, test images tensor has shape (10, 1000, 784)\n",
+ "Accelerator buffer shapes are (1000, 1, 784) for input, (1000, 1, 1) for output\n"
+ ]
+ }
+ ],
"source": [
"import numpy as np\n",
"batch_size = 1000\n",
@@ -270,39 +281,17 @@
"batch_labels = testy.reshape(n_batches, batch_size)\n",
"obuf_normal = np.empty_like(accel.obuf_packed_device)\n",
"print(\"Ready to run validation, test images tensor has shape %s\" % str(batch_imgs.shape))\n",
- "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_packed), str(accel.oshape_packed)) )"
- ],
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "Ready to run validation, test images tensor has shape (10, 1000, 784)\n",
- "Accelerator buffer shapes are (1000, 1, 784) for input, (1000, 1, 1) for output\n"
- ]
- }
- ],
- "metadata": {}
+ "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_packed()), str(accel.oshape_packed())) )"
+ ]
},
{
"cell_type": "code",
"execution_count": 22,
- "source": [
- "ok = 0\n",
- "nok = 0\n",
- "for i in range(n_batches):\n",
- "    ibuf_normal = batch_imgs[i].reshape(accel.ishape_normal)\n",
- "    exp = batch_labels[i]\n",
- "    obuf_normal = accel.execute(ibuf_normal)\n",
- "    ret = np.bincount(obuf_normal.flatten() == exp.flatten())\n",
- "    nok += ret[0]\n",
- "    ok += ret[1]\n",
- "    print(\"batch %d / %d : total OK %d NOK %d\" % (i, n_batches, ok, nok))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"batch 0 / 10 : total OK 913 NOK 87\n",
"batch 1 / 10 : total OK 1800 NOK 200\n",
@@ -317,89 +306,97 @@
]
}
],
- "metadata": {}
+ "source": [
+ "ok = 0\n",
+ "nok = 0\n",
+ "for i in range(n_batches):\n",
+ "    ibuf_normal = batch_imgs[i].reshape(accel.ishape_normal())\n",
+ "    exp = batch_labels[i]\n",
+ "    obuf_normal = accel.execute(ibuf_normal)\n",
+ "    ret = np.bincount(obuf_normal.flatten() == exp.flatten())\n",
+ "    nok += ret[0]\n",
+ "    ok += ret[1]\n",
+ "    print(\"batch %d / %d : total OK %d NOK %d\" % (i, n_batches, ok, nok))"
+ ]
},
{
"cell_type": "code",
"execution_count": 23,
- "source": [
- "acc = 100.0 * ok / (total)\n",
- "print(\"Final accuracy: {}%\".format(acc))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Final accuracy: 92.96%\n"
]
}
],
- "metadata": {}
+ "source": [
+ "acc = 100.0 * ok / (total)\n",
+ "print(\"Final accuracy: {}%\".format(acc))"
+ ]
},
{
"cell_type": "code",
"execution_count": 26,
+ "metadata": {},
+ "outputs": [],
"source": [
"def run_validation():\n",
"    for i in range(n_batches):\n",
- "        ibuf_normal = batch_imgs[i].reshape(accel.ishape_normal)\n",
+ "        ibuf_normal = batch_imgs[i].reshape(accel.ishape_normal())\n",
"        exp = batch_labels[i]\n",
"        accel.execute(ibuf_normal)"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 27,
- "source": [
- "full_validation_time = %timeit -n 5 -o run_validation()"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"5 loops, best of 3: 22.6 ms per loop\n"
]
}
],
- "metadata": {}
+ "source": [
+ "full_validation_time = %timeit -n 5 -o run_validation()"
+ ]
},
{
"cell_type": "code",
"execution_count": 28,
- "source": [
- "print(\"%f images per second including data movement\" % (total / float(full_validation_time.best)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"441567.114603 images per second including data movement\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"%f images per second including data movement\" % (total / float(full_validation_time.best)))"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"## Run some more built-in benchmarks"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 29,
- "source": [
- "accel.throughput_test()"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"{'DRAM_in_bandwidth[Mb/s]': 656.2231762123328,\n",
@@ -416,18 +413,21 @@
" 'unpack_output[ms]': 0.0006036758422851562}"
]
},
+ "execution_count": 29,
"metadata": {},
- "execution_count": 29
+ "output_type": "execute_result"
}
],
- "metadata": {}
+ "source": [
+ "accel.throughput_test()"
+ ]
},
{
"cell_type": "code",
"execution_count": null,
- "source": [],
+ "metadata": {},
"outputs": [],
- "metadata": {}
+ "source": []
}
],
"metadata": {
@@ -451,4 +451,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
-}
\ No newline at end of file
+}
diff --git a/finn_examples/notebooks/1_cifar10_with_cnv_networks.ipynb b/finn_examples/notebooks/1_cifar10_with_cnv_networks.ipynb
index af305ff..39ade56 100644
--- a/finn_examples/notebooks/1_cifar10_with_cnv_networks.ipynb
+++ b/finn_examples/notebooks/1_cifar10_with_cnv_networks.ipynb
@@ -2,91 +2,99 @@
"cells": [
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Initialize the accelerator"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 1,
- "source": [
- "from finn_examples import models\n",
- "print(list(filter(lambda x: \"cifar10\" in x, dir(models))))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "display_data",
"data": {
"application/javascript": "\ntry {\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%microblaze/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n} catch (e) {};\n"
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "display_data",
"data": {
"application/javascript": "\ntry {\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%pybind11/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n} catch (e) {};\n"
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"['_cifar10_cnv_io_shape_dict', 'cnv_w1a1_cifar10', 'cnv_w1a2_cifar10', 'cnv_w2a2_cifar10']\n"
]
}
],
- "metadata": {}
+ "source": [
+ "from finn_examples import models\n",
+ "print(list(filter(lambda x: \"cifar10\" in x, dir(models))))"
+ ]
},
{
"cell_type": "code",
"execution_count": 2,
+ "metadata": {},
+ "outputs": [],
"source": [
"accel = models.cnv_w1a1_cifar10()"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 3,
- "source": [
- "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal), str(accel.idt)))\n",
- "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal), str(accel.odt)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Expected input shape and datatype: (1, 32, 32, 3) DataType.UINT8\n",
"Expected output shape and datatype: (1, 1) DataType.UINT8\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal()), str(accel.idt())))\n",
+ "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal()), str(accel.odt())))"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Load the CIFAR-10 dataset\n",
"\n",
"Use the `dataset_loading` package to get easy Python access to CIFAR-10 dataset:"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
- "execution_count": 5,
+ "execution_count": null,
+ "metadata": {},
+ "outputs": [],
"source": [
- "from dataset_loading import cifar\n",
- "trainx, trainy, testx, testy, valx, valy = cifar.load_cifar_data(\"/tmp\", download=True, one_hot=False)"
- ],
+ "# Uncomment the following lines to disable SSL verification if you encounter an SSLCertVerificationError when downloading the CIFAR-10 dataset.\n",
+ "#import ssl\n",
+ "#ssl._create_default_https_context = ssl._create_unverified_context"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 5,
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Tar File found in dest_dir. Not Downloading again\n",
"Extracting Python CIFAR10 data.\n",
@@ -94,159 +102,173 @@
]
}
],
- "metadata": {}
+ "source": [
+ "from dataset_loading import cifar\n",
+ "trainx, trainy, testx, testy, valx, valy = cifar.load_cifar_data(\"/tmp\", download=True, one_hot=False)"
+ ]
},
{
"cell_type": "code",
"execution_count": 6,
- "source": [
- "testx.shape"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"(10000, 32, 32, 3)"
]
},
+ "execution_count": 6,
"metadata": {},
- "execution_count": 6
+ "output_type": "execute_result"
}
],
- "metadata": {}
+ "source": [
+ "testx.shape"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Classify a single image"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 7,
+ "metadata": {},
+ "outputs": [],
"source": [
"test_single_x = testx[0]\n",
"test_single_y = testy[0]\n",
"cifar10_class_names = ['Airplane', 'Automobile', 'Bird', 'Cat', 'Deer', 'Dog', 'Frog', 'Horse', 'Ship', 'Truck']\n"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 9,
- "source": [
- "%matplotlib inline\n",
- "from matplotlib import pyplot as plt\n",
- "\n",
- "plt.imshow(test_single_x)\n",
- "plt.show()"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "display_data",
"data": {
"image/png": "",
"text/plain": [
""
]
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
}
],
- "metadata": {}
+ "source": [
+ "%matplotlib inline\n",
+ "from matplotlib import pyplot as plt\n",
+ "\n",
+ "plt.imshow(test_single_x)\n",
+ "plt.show()"
+ ]
},
{
"cell_type": "code",
"execution_count": 10,
- "source": [
- "print(\"Expected class is %d (%s)\" % (test_single_y, cifar10_class_names[test_single_y]))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Expected class is 3 (Cat)\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"Expected class is %d (%s)\" % (test_single_y, cifar10_class_names[test_single_y]))"
+ ]
},
{
"cell_type": "code",
"execution_count": 11,
- "source": [
- "accel_in = test_single_x.reshape(accel.ishape_normal)\n",
- "print(\"Input buffer shape is %s and datatype is %s\" % (str(accel_in.shape), str(accel_in.dtype)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Input buffer shape is (1, 32, 32, 3) and datatype is uint8\n"
]
}
],
- "metadata": {}
+ "source": [
+ "accel_in = test_single_x.reshape(accel.ishape_normal())\n",
+ "print(\"Input buffer shape is %s and datatype is %s\" % (str(accel_in.shape), str(accel_in.dtype)))"
+ ]
},
{
"cell_type": "code",
"execution_count": 12,
+ "metadata": {},
+ "outputs": [],
"source": [
"accel_out = accel.execute(accel_in)"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 13,
- "source": [
- "print(\"Returned class is %d\" % accel_out)"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Returned class is 3\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"Returned class is %d\" % accel_out)"
+ ]
},
{
"cell_type": "code",
"execution_count": 14,
- "source": [
- "%%timeit\n",
- "accel_out = accel.execute(accel_in)"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"100 loops, best of 3: 2.34 ms per loop\n"
]
}
],
- "metadata": {}
+ "source": [
+ "%%timeit\n",
+ "accel_out = accel.execute(accel_in)"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Validate accuracy on entire CIFAR-10 test set"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 15,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Ready to run validation, test images tensor has shape (10, 1000, 3072)\n",
+ "Accelerator buffer shapes are (1000, 1, 32, 32, 1, 3) for input, (1000, 1, 1) for output\n"
+ ]
+ }
+ ],
"source": [
"import numpy as np\n",
"\n",
@@ -259,39 +281,17 @@
"batch_labels = testy.reshape(n_batches, batch_size)\n",
"obuf_normal = np.empty_like(accel.obuf_packed_device)\n",
"print(\"Ready to run validation, test images tensor has shape %s\" % str(batch_imgs.shape))\n",
- "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_packed), str(accel.oshape_packed)) )"
- ],
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "Ready to run validation, test images tensor has shape (10, 1000, 3072)\n",
- "Accelerator buffer shapes are (1000, 1, 32, 32, 1, 3) for input, (1000, 1, 1) for output\n"
- ]
- }
- ],
- "metadata": {}
+ "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_packed()), str(accel.oshape_packed())))"
+ ]
},
{
"cell_type": "code",
"execution_count": 16,
- "source": [
- "ok = 0\n",
- "nok = 0\n",
- "for i in range(n_batches):\n",
- "    ibuf_normal = batch_imgs[i].reshape(accel.ishape_normal)\n",
- "    exp = batch_labels[i]\n",
- "    obuf_normal = accel.execute(ibuf_normal)\n",
- "    ret = np.bincount(obuf_normal.flatten() == exp.flatten())\n",
- "    nok += ret[0]\n",
- "    ok += ret[1]\n",
- "    print(\"batch %d / %d : total OK %d NOK %d\" % (i, n_batches, ok, nok))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"batch 0 / 10 : total OK 851 NOK 149\n",
"batch 1 / 10 : total OK 1683 NOK 317\n",
@@ -306,89 +306,97 @@
]
}
],
- "metadata": {}
+ "source": [
+ "ok = 0\n",
+ "nok = 0\n",
+ "for i in range(n_batches):\n",
+ "    ibuf_normal = batch_imgs[i].reshape(accel.ishape_normal())\n",
+ "    exp = batch_labels[i]\n",
+ "    obuf_normal = accel.execute(ibuf_normal)\n",
+ "    ret = np.bincount(obuf_normal.flatten() == exp.flatten())\n",
+ "    nok += ret[0]\n",
+ "    ok += ret[1]\n",
+ "    print(\"batch %d / %d : total OK %d NOK %d\" % (i, n_batches, ok, nok))"
+ ]
},
{
"cell_type": "code",
"execution_count": 17,
- "source": [
- "acc = 100.0 * ok / (total)\n",
- "print(\"Final accuracy: {}%\".format(acc))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Final accuracy: 84.19%\n"
]
}
],
- "metadata": {}
+ "source": [
+ "acc = 100.0 * ok / (total)\n",
+ "print(\"Final accuracy: {}%\".format(acc))"
+ ]
},
{
"cell_type": "code",
"execution_count": 18,
+ "metadata": {},
+ "outputs": [],
"source": [
"def run_validation():\n",
"    for i in range(n_batches):\n",
- "        ibuf_normal = batch_imgs[i].reshape(accel.ishape_normal)\n",
+ "        ibuf_normal = batch_imgs[i].reshape(accel.ishape_normal())\n",
"        exp = batch_labels[i]\n",
"        accel.execute(ibuf_normal)"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 20,
- "source": [
- "full_validation_time = %timeit -n 1 -o run_validation()"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"1 loop, best of 3: 3.34 s per loop\n"
]
}
],
- "metadata": {}
+ "source": [
+ "full_validation_time = %timeit -n 1 -o run_validation()"
+ ]
},
{
"cell_type": "code",
"execution_count": 21,
- "source": [
- "print(\"%f images per second including data movement\" % (total / float(full_validation_time.best)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"2995.076851 images per second including data movement\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"%f images per second including data movement\" % (total / float(full_validation_time.best)))"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"## More benchmarking"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 23,
- "source": [
- "accel.throughput_test()"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"{'DRAM_in_bandwidth[Mb/s]': 9.278965852281484,\n",
@@ -405,18 +413,21 @@
" 'unpack_output[ms]': 0.0006213188171386719}"
]
},
+ "execution_count": 23,
"metadata": {},
- "execution_count": 23
+ "output_type": "execute_result"
}
],
- "metadata": {}
+ "source": [
+ "accel.throughput_test()"
+ ]
},
{
"cell_type": "code",
"execution_count": null,
- "source": [],
+ "metadata": {},
"outputs": [],
- "metadata": {}
+ "source": []
}
],
"metadata": {
@@ -440,4 +451,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
-}
\ No newline at end of file
+}
diff --git a/finn_examples/notebooks/2_imagenet_with_cnns.ipynb b/finn_examples/notebooks/2_imagenet_with_cnns.ipynb
index 04a80b1..7e653ad 100755
--- a/finn_examples/notebooks/2_imagenet_with_cnns.ipynb
+++ b/finn_examples/notebooks/2_imagenet_with_cnns.ipynb
@@ -68,8 +68,8 @@
}
],
"source": [
- "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal), str(accel.idt)))\n",
- "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal), str(accel.odt)))"
+ "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal()), str(accel.idt())))\n",
+ "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal()), str(accel.odt())))"
]
},
{
@@ -226,7 +226,7 @@
}
],
"source": [
- "accel_in = test_single_x.reshape(accel.ishape_normal)\n",
+ "accel_in = test_single_x.reshape(accel.ishape_normal())\n",
"print(\"Input buffer shape is %s and datatype is %s\" % (str(accel_in.shape), str(accel_in.dtype)))"
]
},
@@ -297,7 +297,7 @@
"source": [
"batch_size = 100\n",
"accel.batch_size = batch_size\n",
- "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_packed), str(accel.oshape_packed)) )"
+ "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_packed()), str(accel.oshape_packed())))"
]
},
{
@@ -837,7 +837,7 @@
" imgs = np.array(imgs)\n",
" exp = np.array(lbls)\n",
" \n",
- " ibuf_normal = imgs.reshape(accel.ishape_normal)\n",
+ " ibuf_normal = imgs.reshape(accel.ishape_normal())\n",
" obuf_normal = accel.execute(ibuf_normal)\n",
" obuf_normal = obuf_normal.reshape(batch_size, -1)[:,0]\n",
" ret = np.bincount(obuf_normal.flatten() == exp.flatten())\n",
diff --git a/finn_examples/notebooks/3_binarycop_mask_detection.ipynb b/finn_examples/notebooks/3_binarycop_mask_detection.ipynb
index 15688c1..43f490f 100644
--- a/finn_examples/notebooks/3_binarycop_mask_detection.ipynb
+++ b/finn_examples/notebooks/3_binarycop_mask_detection.ipynb
@@ -3,74 +3,76 @@
{
"cell_type": "code",
"execution_count": 1,
- "source": [
- "from finn_examples import models\n",
- "import os\n",
- "from PIL import Image\n",
- "import numpy as np\n",
- "import cv2"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "display_data",
"data": {
"application/javascript": "\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%microblaze/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n"
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
}
],
- "metadata": {}
+ "source": [
+ "from finn_examples import models\n",
+ "import os\n",
+ "from PIL import Image\n",
+ "import numpy as np\n",
+ "import cv2"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Initialize the Accelerator"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 2,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [],
"source": [
"# Note: the face mask detection example is only available on Pynq-Z1 at the moment\n",
"accel = models.bincop_cnv()"
- ],
- "outputs": [],
- "metadata": {
- "scrolled": true
- }
+ ]
},
{
"cell_type": "code",
"execution_count": 3,
- "source": [
- "class_dict = {0: \"Correctly Masked\", 1: \"Incorrectly Worn\", 2: \"No Mask\"}\n",
- "\n",
- "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal), str(accel.idt)))\n",
- "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal), str(accel.odt)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Expected input shape and datatype: (1, 72, 72, 3) DataType.UINT8\n",
"Expected output shape and datatype: (1, 1) DataType.UINT8\n"
]
}
],
- "metadata": {}
+ "source": [
+ "class_dict = {0: \"Correctly Masked\", 1: \"Incorrectly Worn\", 2: \"No Mask\"}\n",
+ "\n",
+ "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal()), str(accel.idt())))\n",
+ "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal()), str(accel.odt())))"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Load Mask Examples"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 4,
+ "metadata": {},
+ "outputs": [],
"source": [
"mask_examples_dir = \"/tmp/mask_examples\"\n",
"if not os.path.exists(mask_examples_dir):\n",
@@ -79,20 +81,20 @@
"for i in range(6):\n",
"    if \"{}.jpg\".format(i+1) not in os.listdir(mask_examples_dir):\n",
"        os.system(\"wget -P \" + mask_examples_dir + \" https://github.com/NaelF/BinaryCoP/raw/master/notebook/pictures/{}.jpg\".format(i+1))"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Run Inference"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 5,
+ "metadata": {},
+ "outputs": [],
"source": [
"def resize(img):\n",
"    img = np.array(img)\n",
@@ -100,153 +102,153 @@
"        resized_img = cv2.resize(img,(72,72))\n",
"        return (resized_img) \n",
"    else: return img"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 6,
- "source": [
- "im = Image.open(mask_examples_dir + '/1.jpg')\n",
- "im = resize(im)\n",
- "accel_in = im.reshape(accel.ishape_normal)\n",
- "im = Image.fromarray(im, 'RGB')\n",
- "display(im)\n",
- "accel_out = accel.execute(accel_in)\n",
- "print(\"Returned class is: \" + class_dict[int(accel_out)])\n",
- "\n",
- "im = Image.open(mask_examples_dir + '/2.jpg')\n",
- "im = resize(im)\n",
- "accel_in = im.reshape(accel.ishape_normal)\n",
- "im = Image.fromarray(im, 'RGB')\n",
- "display(im)\n",
- "accel_out = accel.execute(accel_in)\n",
- "print(\"Returned class is: \" + class_dict[int(accel_out)])\n",
- "\n",
- "im = Image.open(mask_examples_dir + '/3.jpg')\n",
- "im = resize(im)\n",
- "accel_in = im.reshape(accel.ishape_normal)\n",
- "im = Image.fromarray(im, 'RGB')\n",
- "display(im)\n",
- "accel_out = accel.execute(accel_in)\n",
- "print(\"Returned class is: \" + class_dict[int(accel_out)])\n",
- "\n",
- "im = Image.open(mask_examples_dir + '/4.jpg')\n",
- "im = resize(im)\n",
- "accel_in = im.reshape(accel.ishape_normal)\n",
- "im = Image.fromarray(im, 'RGB')\n",
- "display(im)\n",
- "accel_out = accel.execute(accel_in)\n",
- "print(\"Returned class is: \" + class_dict[int(accel_out)])\n",
- "\n",
- "im = Image.open(mask_examples_dir + '/5.jpg')\n",
- "im = resize(im)\n",
- "accel_in = im.reshape(accel.ishape_normal)\n",
- "im = Image.fromarray(im, 'RGB')\n",
- "display(im)\n",
- "accel_out = accel.execute(accel_in)\n",
- "print(\"Returned class is: \" + class_dict[int(accel_out)])"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "display_data",
"data": {
"image/png": "",
"text/plain": [
""
]
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Returned class is: Correctly Masked\n"
]
},
{
- "output_type": "display_data",
"data": {
"image/png": "",
"text/plain": [
""
]
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Returned class is: Correctly Masked\n"
]
},
{
- "output_type": "display_data",
"data": {
"image/png": "",
"text/plain": [
""
]
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Returned class is: Incorrectly Worn\n"
]
},
{
- "output_type": "display_data",
"data": {
"image/png": "",
"text/plain": [
""
]
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Returned class is: Incorrectly Worn\n"
]
},
{
- "output_type": "display_data",
"data": {
"image/png": "",
"text/plain": [
""
]
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Returned class is: No Mask\n"
]
}
],
- "metadata": {}
+ "source": [
+ "im = Image.open(mask_examples_dir + '/1.jpg')\n",
+ "im = resize(im)\n",
+ "accel_in = im.reshape(accel.ishape_normal())\n",
+ "im = Image.fromarray(im, 'RGB')\n",
+ "display(im)\n",
+ "accel_out = accel.execute(accel_in)\n",
+ "print(\"Returned class is: \" + class_dict[int(accel_out)])\n",
+ "\n",
+ "im = Image.open(mask_examples_dir + '/2.jpg')\n",
+ "im = resize(im)\n",
+ "accel_in = im.reshape(accel.ishape_normal())\n",
+ "im = Image.fromarray(im, 'RGB')\n",
+ "display(im)\n",
+ "accel_out = accel.execute(accel_in)\n",
+ "print(\"Returned class is: \" + class_dict[int(accel_out)])\n",
+ "\n",
+ "im = Image.open(mask_examples_dir + '/3.jpg')\n",
+ "im = resize(im)\n",
+ "accel_in = im.reshape(accel.ishape_normal())\n",
+ "im = Image.fromarray(im, 'RGB')\n",
+ "display(im)\n",
+ "accel_out = accel.execute(accel_in)\n",
+ "print(\"Returned class is: \" + class_dict[int(accel_out)])\n",
+ "\n",
+ "im = Image.open(mask_examples_dir + '/4.jpg')\n",
+ "im = resize(im)\n",
+ "accel_in = im.reshape(accel.ishape_normal())\n",
+ "im = Image.fromarray(im, 'RGB')\n",
+ "display(im)\n",
+ "accel_out = accel.execute(accel_in)\n",
+ "print(\"Returned class is: \" + class_dict[int(accel_out)])\n",
+ "\n",
+ "im = Image.open(mask_examples_dir + '/5.jpg')\n",
+ "im = resize(im)\n",
+ "accel_in = im.reshape(accel.ishape_normal())\n",
+ "im = Image.fromarray(im, 'RGB')\n",
+ "display(im)\n",
+ "accel_out = accel.execute(accel_in)\n",
+ "print(\"Returned class is: \" + class_dict[int(accel_out)])"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Run Webcam"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 7,
+ "metadata": {},
+ "outputs": [],
"source": [
"from IPython.display import clear_output\n",
"\n",
@@ -260,7 +262,7 @@
" if flag:\n",
" frame = webcam_rev(frame)\n",
" img = Image.fromarray(frame, 'RGB')\n",
- " frame = frame.reshape(accel.ishape_normal)\n",
+ " frame = frame.reshape(accel.ishape_normal())\n",
" return frame, img\n",
"\n",
" else:\n",
@@ -277,40 +279,39 @@
" img_resized = cv2.resize(img_cropped,(72,72))\n",
" img_rev = cv2.cvtColor(img_resized, cv2.COLOR_BGR2RGB)\n",
" return img_rev"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 8,
- "source": [
- "cap = cv2.VideoCapture(0)\n",
- "while not cap.isOpened():\n",
- " cap = cv2.VideoCapture(0)\n",
- " cv2.waitKey(1)\n",
- " print (\"Wait for the device\")\n",
- "\n",
- "# set small capture resolution for faster processing\n",
- "cap.set(cv2.CAP_PROP_FRAME_WIDTH, 160)\n",
- "cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 120)"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"True"
]
},
+ "execution_count": 8,
"metadata": {},
- "execution_count": 8
+ "output_type": "execute_result"
}
],
- "metadata": {}
+ "source": [
+ "cap = cv2.VideoCapture(0)\n",
+ "while not cap.isOpened():\n",
+ " cap = cv2.VideoCapture(0)\n",
+ " cv2.waitKey(1)\n",
+ " print (\"Wait for the device\")\n",
+ "\n",
+ "# set small capture resolution for faster processing\n",
+ "cap.set(cv2.CAP_PROP_FRAME_WIDTH, 160)\n",
+ "cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 120)"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Classify Webcam Input\n",
"* Make sure you are in a well-lit environment\n",
@@ -318,119 +319,118 @@
"* Position your face in the center of the frame, close to camera (see examples)\n",
"\n",
"This notebook is a basic proof-of-concept. Model was trained on simple blue-mask augmentation of Flickr-Faces-HQ (FFHQ). For better results, more mask-types can be supported (e.g. https://github.com/aqeelanwar/MaskTheFace)"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 9,
- "source": [
- "clear_output()\n",
- "frame, img = producer_live(cap)\n",
- "consumer_live(accel, frame)\n",
- "img"
- ],
+ "metadata": {
+ "scrolled": true
+ },
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Class name: Correctly Masked\n"
]
},
{
- "output_type": "execute_result",
"data": {
"image/png": "",
"text/plain": [
""
]
},
+ "execution_count": 9,
"metadata": {},
- "execution_count": 9
+ "output_type": "execute_result"
}
],
- "metadata": {
- "scrolled": true
- }
- },
- {
- "cell_type": "code",
- "execution_count": 10,
"source": [
"clear_output()\n",
"frame, img = producer_live(cap)\n",
"consumer_live(accel, frame)\n",
"img"
- ],
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Class name: Incorrectly Worn\n"
]
},
{
- "output_type": "execute_result",
"data": {
"image/png": "",
"text/plain": [
""
]
},
+ "execution_count": 10,
"metadata": {},
- "execution_count": 10
+ "output_type": "execute_result"
}
],
- "metadata": {}
- },
- {
- "cell_type": "code",
- "execution_count": 11,
"source": [
"clear_output()\n",
"frame, img = producer_live(cap)\n",
"consumer_live(accel, frame)\n",
"img"
- ],
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 11,
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Class name: No Mask\n"
]
},
{
- "output_type": "execute_result",
"data": {
"image/png": "",
"text/plain": [
""
]
},
+ "execution_count": 11,
"metadata": {},
- "execution_count": 11
+ "output_type": "execute_result"
}
],
- "metadata": {}
+ "source": [
+ "clear_output()\n",
+ "frame, img = producer_live(cap)\n",
+ "consumer_live(accel, frame)\n",
+ "img"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Release Webcam"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 12,
+ "metadata": {},
+ "outputs": [],
"source": [
"cap.release()"
- ],
- "outputs": [],
- "metadata": {}
+ ]
}
],
"metadata": {
@@ -454,4 +454,4 @@
},
"nbformat": 4,
"nbformat_minor": 1
-}
\ No newline at end of file
+}
diff --git a/finn_examples/notebooks/4_keyword_spotting.ipynb b/finn_examples/notebooks/4_keyword_spotting.ipynb
index c7a200e..4898b76 100644
--- a/finn_examples/notebooks/4_keyword_spotting.ipynb
+++ b/finn_examples/notebooks/4_keyword_spotting.ipynb
@@ -62,30 +62,6 @@
"print(\"Label shape: \" + str(golden_out_data.shape))"
]
},
- {
- "cell_type": "markdown",
- "metadata": {},
- "source": [
- "If you get an error for the cell above, it's possible something went wrong with the sample data download/extraction and you can manually download and unzip files by uncommenting the function call in the cell below:"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "import pkg_resources as pk\n",
- "import json\n",
- "\n",
- "#data_folder = pk.resource_filename(\"finn_examples\", \"data\")\n",
- "#zip_file_link = pk.resource_filename(\"finn_examples\", \"data/all_validation_kws_data_preprocessed_py_speech.zip.link\")\n",
- "#file = open(zip_file_link)\n",
- "#zip_file = json.load(file)\n",
- "#! wget {zip_file[\"url\"]} -P {data_folder}\n",
- "#! unzip {data_folder+\"/all_validation_kws_data_preprocessed_py_speech.zip\"} -d {data_folder}"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -149,8 +125,8 @@
}
],
"source": [
- "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal), str(accel.idt)))\n",
- "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal), str(accel.odt)))"
+ "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal()), str(accel.idt())))\n",
+ "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal()), str(accel.odt())))"
]
},
{
@@ -484,21 +460,6 @@
"!ls $audio_samples_folder"
]
},
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {},
- "outputs": [],
- "source": [
- "# uncomment below if you get an error for the sample files\n",
- "#data_folder = pk.resource_filename(\"finn_examples\", \"data\")\n",
- "#zip_file_link = pk.resource_filename(\"finn_examples\", \"data/audio_samples.zip.link\")\n",
- "#file = open(zip_file_link)\n",
- "#zip_file = json.load(file)\n",
- "#! wget {zip_file[\"url\"]} -P {data_folder}\n",
- "#! unzip {data_folder+\"/audio_samples.zip\"} -d {data_folder}"
- ]
- },
{
"cell_type": "markdown",
"metadata": {},
@@ -618,7 +579,7 @@
}
],
"source": [
- "res_label = tf_dataset_labels[res_acc[0,0].astype(np.int)]\n",
+ "res_label = tf_dataset_labels[res_acc[0,0].astype(int)]\n",
"print(f\"The audio file was classified as: {res_label}\")"
]
},
@@ -644,7 +605,7 @@
" quant_mfcc_feat = quantize_input(mfcc_feat_py)\n",
" accel.batch_size = 1\n",
" res_acc = accel.execute(quant_mfcc_feat)\n",
- " res_label = tf_dataset_labels[res_acc[0,0].astype(np.int)]\n",
+ " res_label = tf_dataset_labels[res_acc[0,0].astype(int)]\n",
" print(f\"The audio file for {sample_class} was classified as: {res_label}\")"
]
}
diff --git a/finn_examples/notebooks/5_radioml_with_cnns.ipynb b/finn_examples/notebooks/5_radioml_with_cnns.ipynb
index ed3093b..b6b79b7 100755
--- a/finn_examples/notebooks/5_radioml_with_cnns.ipynb
+++ b/finn_examples/notebooks/5_radioml_with_cnns.ipynb
@@ -2,93 +2,118 @@
"cells": [
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Initialize the accelerator"
- ],
- "metadata": {}
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Remember to install the following dependencies:"
+ ]
},
{
"cell_type": "code",
"execution_count": 1,
- "source": [
- "# remember to install the following dependencies\n",
- "#! apt-get install libhdf5-dev -y\n",
- "#! pip3 install versioned-hdf5"
- ],
+ "metadata": {},
"outputs": [],
- "metadata": {}
+ "source": [
+ "! apt-get install libhdf5-dev -y\n",
+ "! pip3 install versioned-hdf5"
+ ]
},
{
"cell_type": "code",
"execution_count": 2,
- "source": [
- "from finn_examples import models\n",
- "print(list(filter(lambda x: \"radioml\" in x, dir(models))))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "display_data",
"data": {
"application/javascript": "\ntry {\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%microblaze/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n} catch (e) {};\n"
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "display_data",
"data": {
"application/javascript": "\ntry {\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%pybind11/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n} catch (e) {};\n"
},
- "metadata": {}
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"['_radioml_io_shape_dict', 'vgg10_w4a4_radioml']\n"
]
}
],
- "metadata": {}
+ "source": [
+ "from finn_examples import models\n",
+ "print(list(filter(lambda x: \"radioml\" in x, dir(models))))"
+ ]
},
{
"cell_type": "code",
"execution_count": 3,
+ "metadata": {},
+ "outputs": [],
"source": [
"# Note: the RadioML example is only available on the ZCU104 at the moment\n",
"accel = models.vgg10_w4a4_radioml()"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 4,
- "source": [
- "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal), str(accel.idt)))\n",
- "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal), str(accel.odt)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Expected input shape and datatype: (1, 1024, 1, 2) DataType.INT8\n",
"Expected output shape and datatype: (1, 1) DataType.UINT8\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal()), str(accel.idt())))\n",
+ "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal()), str(accel.odt())))"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Load RadioML 2018 dataset"
- ],
- "metadata": {}
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "Please note that you will have to manually download the RadioML 2018 dataset and set the `dataset_dir` variable to point to its path."
+ ]
},
{
"cell_type": "code",
"execution_count": 5,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "/home/xilinx/datasets/radioml_2018\n"
+ ]
+ }
+ ],
"source": [
"import numpy as np\n",
"import math\n",
@@ -98,21 +123,13 @@
"\n",
"dataset_dir = \"/home/xilinx/datasets/radioml_2018\"\n",
"print(dataset_dir)"
- ],
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "/home/xilinx/datasets/radioml_2018\n"
- ]
- }
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 6,
+ "metadata": {},
+ "outputs": [],
"source": [
"h5_file = h5py.File(dataset_dir + \"/GOLD_XYZ_OSC.0001_1024.hdf5\",'r')\n",
"data_h5 = h5_file['X']\n",
@@ -142,23 +159,16 @@
"'16APSK','32APSK','64APSK','128APSK','16QAM','32QAM','64QAM','128QAM','256QAM',\n",
"'AM-SSB-WC','AM-SSB-SC','AM-DSB-WC','AM-DSB-SC','FM','GMSK','OQPSK']\n",
"snr_classes = np.arange(-20., 32., 2) # -20dB to 30dB"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 7,
- "source": [
- "print(data_h5.shape)\n",
- "print(label_mod.shape)\n",
- "print(label_snr.shape)\n",
- "print(len(test_indices))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"(2555904, 1024, 2)\n",
"(2555904,)\n",
@@ -167,18 +177,33 @@
]
}
],
- "metadata": {}
+ "source": [
+ "print(data_h5.shape)\n",
+ "print(label_mod.shape)\n",
+ "print(label_snr.shape)\n",
+ "print(len(test_indices))"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Inspect a single frame"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 8,
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Modulation: 16QAM, SNR: 30.0 dB\n"
+ ]
+ }
+ ],
"source": [
"%matplotlib inline\n",
"from matplotlib import pyplot as plt\n",
@@ -193,29 +218,21 @@
"plt.figure()\n",
"plt.plot(data)\n",
"print(\"Modulation: %s, SNR: %.1f dB\" % (mod_classes[mod], snr))"
- ],
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "Modulation: 16QAM, SNR: 30.0 dB\n"
- ]
- }
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Input quantization\n",
"Quantize input data on-the-fly in software before feeding it to the accelerator. Use the uniform quantization range on which the model was trained."
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 9,
+ "metadata": {},
+ "outputs": [],
"source": [
"def quantize(data):\n",
" quant_min = -2.0\n",
@@ -226,102 +243,94 @@
" data_quant = np.clip(data_quant, -128, 127)\n",
" data_quant = data_quant.astype(np.int8)\n",
" return data_quant"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Classify a single frame"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 10,
- "source": [
- "accel_in = quantize(data).reshape(accel.ishape_normal)\n",
- "print(\"Input buffer shape is %s and datatype is %s\" % (str(accel_in.shape), str(accel_in.dtype)))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Input buffer shape is (1, 1024, 1, 2) and datatype is int8\n"
]
}
],
- "metadata": {}
+ "source": [
+ "accel_in = quantize(data).reshape(accel.ishape_normal())\n",
+ "print(\"Input buffer shape is %s and datatype is %s\" % (str(accel_in.shape), str(accel_in.dtype)))"
+ ]
},
{
"cell_type": "code",
"execution_count": 11,
+ "metadata": {},
+ "outputs": [],
"source": [
"accel_out = accel.execute(accel_in)"
- ],
- "outputs": [],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 12,
- "source": [
- "print(\"Result: \" + str(accel_out))\n",
- "print(\"Top-1 class predicted by the accelerator: \" + mod_classes[int(accel_out)])"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Result: [[12.]]\n",
"Top-1 class predicted by the accelerator: 16QAM\n"
]
}
],
- "metadata": {}
+ "source": [
+ "print(\"Result: \" + str(accel_out))\n",
+ "print(\"Top-1 class predicted by the accelerator: \" + mod_classes[int(accel_out)])"
+ ]
},
{
"cell_type": "code",
"execution_count": 13,
- "source": [
- "%%timeit\n",
- "accel_out = accel.execute(accel_in)"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"1000 loops, best of 3: 822 µs per loop\n"
]
}
],
- "metadata": {}
+ "source": [
+ "%%timeit\n",
+ "accel_out = accel.execute(accel_in)"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"# Validate accuracy on entire test set"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 14,
- "source": [
- "batch_size = 1024\n",
- "accel.batch_size = batch_size\n",
- "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_packed), str(accel.oshape_packed)) )\n",
- "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_folded), str(accel.oshape_folded)) )\n",
- "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_normal), str(accel.oshape_normal)) )"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Accelerator buffer shapes are (1024, 1024, 1, 1, 2) for input, (1024, 1, 1) for output\n",
"Accelerator buffer shapes are (1024, 1024, 1, 1, 2) for input, (1024, 1, 1) for output\n",
@@ -329,11 +338,38 @@
]
}
],
- "metadata": {}
+ "source": [
+ "batch_size = 1024\n",
+ "accel.batch_size = batch_size\n",
+ "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_packed()), str(accel.oshape_packed())))\n",
+ "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_folded()), str(accel.oshape_folded())))\n",
+ "print(\"Accelerator buffer shapes are %s for input, %s for output\" % (str(accel.ishape_normal()), str(accel.oshape_normal())))"
+ ]
},
{
"cell_type": "code",
"execution_count": 15,
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "batch 0 : total OK 1018 NOK 6\n",
+ "batch 1 : total OK 2041 NOK 7\n",
+ "batch 2 : total OK 3059 NOK 13\n",
+ "batch 3 : total OK 4082 NOK 14\n",
+ "batch 4 : total OK 4948 NOK 172\n",
+ "batch 5 : total OK 5682 NOK 462\n",
+ "batch 6 : total OK 6314 NOK 854\n",
+ "batch 7 : total OK 7039 NOK 1153\n",
+ "batch 8 : total OK 8024 NOK 1192\n",
+ "batch 9 : total OK 8648 NOK 1192\n"
+ ]
+ }
+ ],
"source": [
"ok = 0\n",
"nok = 0\n",
@@ -346,7 +382,7 @@
" batch_indices = test_indices[i_frame:i_frame+batch_size]\n",
" data, mod, snr = data_h5[batch_indices], label_mod[batch_indices], label_snr[batch_indices]\n",
"\n",
- " ibuf = quantize(data).reshape(accel.ishape_normal)\n",
+ " ibuf = quantize(data).reshape(accel.ishape_normal())\n",
" obuf = accel.execute(ibuf)\n",
"\n",
" pred = obuf.reshape(batch_size).astype(int)\n",
@@ -355,64 +391,39 @@
" nok += np.not_equal(pred, mod).sum().item()\n",
" \n",
" print(\"batch %d : total OK %d NOK %d\" % (i_batch, ok, nok))"
- ],
- "outputs": [
- {
- "output_type": "stream",
- "name": "stdout",
- "text": [
- "batch 0 : total OK 1018 NOK 6\n",
- "batch 1 : total OK 2041 NOK 7\n",
- "batch 2 : total OK 3059 NOK 13\n",
- "batch 3 : total OK 4082 NOK 14\n",
- "batch 4 : total OK 4948 NOK 172\n",
- "batch 5 : total OK 5682 NOK 462\n",
- "batch 6 : total OK 6314 NOK 854\n",
- "batch 7 : total OK 7039 NOK 1153\n",
- "batch 8 : total OK 8024 NOK 1192\n",
- "batch 9 : total OK 8648 NOK 1192\n"
- ]
- }
- ],
- "metadata": {
- "scrolled": true
- }
+ ]
},
{
"cell_type": "code",
"execution_count": 16,
- "source": [
- "acc = 100.0 * ok / (total)\n",
- "print(\"Overall top-1 accuracy: {}%\".format(acc))"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "stream",
"name": "stdout",
+ "output_type": "stream",
"text": [
"Overall top-1 accuracy: 87.88617886178862%\n"
]
}
],
- "metadata": {}
+ "source": [
+ "acc = 100.0 * ok / (total)\n",
+ "print(\"Overall top-1 accuracy: {}%\".format(acc))"
+ ]
},
{
"cell_type": "markdown",
+ "metadata": {},
"source": [
"## More benchmarking"
- ],
- "metadata": {}
+ ]
},
{
"cell_type": "code",
"execution_count": 17,
- "source": [
- "accel.batch_size = 1024\n",
- "accel.throughput_test()"
- ],
+ "metadata": {},
"outputs": [
{
- "output_type": "execute_result",
"data": {
"text/plain": [
"{'DRAM_in_bandwidth[Mb/s]': 473.18806940706867,\n",
@@ -429,18 +440,22 @@
" 'unpack_output[ms]': 0.6284713745117188}"
]
},
+ "execution_count": 17,
"metadata": {},
- "execution_count": 17
+ "output_type": "execute_result"
}
],
- "metadata": {}
+ "source": [
+ "accel.batch_size = 1024\n",
+ "accel.throughput_test()"
+ ]
},
{
"cell_type": "code",
"execution_count": null,
- "source": [],
+ "metadata": {},
"outputs": [],
- "metadata": {}
+ "source": []
}
],
"metadata": {
@@ -464,4 +479,4 @@
},
"nbformat": 4,
"nbformat_minor": 2
-}
\ No newline at end of file
+}
diff --git a/finn_examples/notebooks/6_cybersecurity_with_mlp.ipynb b/finn_examples/notebooks/6_cybersecurity_with_mlp.ipynb
new file mode 100644
index 0000000..7d946e5
--- /dev/null
+++ b/finn_examples/notebooks/6_cybersecurity_with_mlp.ipynb
@@ -0,0 +1,501 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "id": "adf1903e",
+ "metadata": {},
+ "source": [
+ "# Initialize the accelerator"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "id": "5f012d0d",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "['_unsw_nb15_mlp_io_shape_dict', 'mlp_w2a2_unsw_nb15']\n"
+ ]
+ }
+ ],
+ "source": [
+ "from finn_examples import models\n",
+ "print(list(filter(lambda x: \"unsw_nb15\" in x, dir(models))))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1e8446d7",
+ "metadata": {},
+ "source": [
+    "Specify a batch size and create the FINN overlay. Note that the batch size must evenly divide 82000, the number of test records used below."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "id": "ac7e32e0",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/javascript": "\ntry {\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%microblaze/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n} catch (e) {};\n"
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ },
+ {
+ "data": {
+ "application/javascript": "\ntry {\nrequire(['notebook/js/codecell'], function(codecell) {\n codecell.CodeCell.options_default.highlight_modes[\n 'magic_text/x-csrc'] = {'reg':[/^%%pybind11/]};\n Jupyter.notebook.events.one('kernel_ready.Kernel', function(){\n Jupyter.notebook.get_cells().map(function(cell){\n if (cell.cell_type == 'code'){ cell.auto_highlight(); } }) ;\n });\n});\n} catch (e) {};\n"
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "batch_size = 1\n",
+ "accel = models.mlp_w2a2_unsw_nb15()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "id": "912943c1",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Expected input shape and datatype: (1, 600) BIPOLAR\n",
+ "Expected output shape and datatype: (1, 1) BIPOLAR\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"Expected input shape and datatype: %s %s\" % (str(accel.ishape_normal()), str(accel.idt())))\n",
+ "print(\"Expected output shape and datatype: %s %s\" % (str(accel.oshape_normal()), str(accel.odt())))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "d47e899e",
+ "metadata": {},
+ "source": [
+ "# Load the binarized UNSW-NB15 test dataset"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "id": "69ba8798",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "File ‘unsw_nb15_binarized.npz’ already there; not retrieving.\r\n"
+ ]
+ }
+ ],
+ "source": [
+ "! wget -nc -O unsw_nb15_binarized.npz https://zenodo.org/record/4519767/files/unsw_nb15_binarized.npz?download=1"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "ebc2ae98",
+ "metadata": {},
+ "source": [
+ "Note that the generated design expects inputs of length 600. As explained in the [end-to-end notebook](https://github.com/Xilinx/finn/blob/main/notebooks/end2end_example/cybersecurity/1-train-mlp-with-brevitas.ipynb) in the FINN repository, padding the input data from length 593 to 600 enables SIMD parallelization for the first layer.\n",
+ "Thus, we'll have to pad our dataset before feeding it to the accelerator."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "id": "6631508a",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "import numpy as np\n",
+ "\n",
+ "def make_unsw_nb15_test_batches(bsize):\n",
+ " unsw_nb15_data = np.load(\"unsw_nb15_binarized.npz\")[\"test\"][:82000]\n",
+ " test_imgs = unsw_nb15_data[:, :-1]\n",
+    "    test_imgs = np.pad(test_imgs, [(0, 0), (0, 7)], mode=\"constant\")\n",
+    "    test_labels = unsw_nb15_data[:, -1]\n",
+    "    n_batches = test_imgs.shape[0] // bsize\n",
+ " test_imgs = test_imgs.reshape(n_batches, bsize, -1)\n",
+ " test_labels = test_labels.reshape(n_batches, bsize)\n",
+ " return (test_imgs, test_labels)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "69dbb1d3",
+ "metadata": {},
+ "source": [
+ "# Classify a single attack"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "id": "a92e1b11",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "(test_imgs, test_labels) = make_unsw_nb15_test_batches(bsize=1)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 39,
+ "id": "48f308f9",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Expected label is 0 (normal data)\n"
+ ]
+ }
+ ],
+ "source": [
+ "test_single = test_imgs[-1]\n",
+ "test_single_label = test_labels[-1].astype(np.float32)\n",
+ "\n",
+    "print(\"Expected label is: %d (%s data)\" % (test_single_label, \"normal\" if test_single_label == 0 else \"abnormal\"))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 40,
+ "id": "fc6e39fa",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Note: the accelerator expects binary input data presented in bipolar form (i.e. {-1, 1})\n",
+ "accel_in = 2 * test_single - 1\n",
+ "accel_out = accel.execute(accel_in)\n",
+ "# To convert back to the original label (i.e. {0, 1}), we'll have to map the bipolar output to binary\n",
+ "accel_out_binary = (accel_out + 1) / 2"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 42,
+ "id": "5c273dd1",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Returned label is 0 (normal data)\n"
+ ]
+ }
+ ],
+ "source": [
+    "print(\"Returned label is: %d (%s data)\" % (accel_out_binary, \"normal\" if accel_out_binary == 0 else \"abnormal\"))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "1149e06f",
+ "metadata": {},
+ "source": [
+    "# Validate accuracy on 82000 (out of 82332) records from the UNSW-NB15 test set"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "64b841f2",
+ "metadata": {},
+ "source": [
+    "To increase throughput, let's increase the batch size. The FINN accelerator itself processes one sample at a time, but copying a larger chunk of the test set to the device buffer in one go keeps the compute pipeline filled."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 66,
+ "id": "f51e3e49",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "batch_size = 1000\n",
+ "accel.batch_size = batch_size\n",
+ "(test_imgs, test_labels) = make_unsw_nb15_test_batches(batch_size)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 67,
+ "id": "f8c88f49",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ok = 0\n",
+ "nok = 0\n",
+ "n_batches = test_imgs.shape[0]\n",
+ "total = batch_size*n_batches"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 68,
+ "id": "a26358f9",
+ "metadata": {
+ "scrolled": true
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "batch 1 / 82 : total OK 866 NOK 134\n",
+ "batch 2 / 82 : total OK 1706 NOK 294\n",
+ "batch 3 / 82 : total OK 2607 NOK 393\n",
+ "batch 4 / 82 : total OK 3490 NOK 510\n",
+ "batch 5 / 82 : total OK 4438 NOK 562\n",
+ "batch 6 / 82 : total OK 5380 NOK 620\n",
+ "batch 7 / 82 : total OK 6290 NOK 710\n",
+ "batch 8 / 82 : total OK 7261 NOK 739\n",
+ "batch 9 / 82 : total OK 8174 NOK 826\n",
+ "batch 10 / 82 : total OK 9109 NOK 891\n",
+ "batch 11 / 82 : total OK 10026 NOK 974\n",
+ "batch 12 / 82 : total OK 10963 NOK 1037\n",
+ "batch 13 / 82 : total OK 11955 NOK 1045\n",
+ "batch 14 / 82 : total OK 12950 NOK 1050\n",
+ "batch 15 / 82 : total OK 13948 NOK 1052\n",
+ "batch 16 / 82 : total OK 14948 NOK 1052\n",
+ "batch 17 / 82 : total OK 15947 NOK 1053\n",
+ "batch 18 / 82 : total OK 16947 NOK 1053\n",
+ "batch 19 / 82 : total OK 17947 NOK 1053\n",
+ "batch 20 / 82 : total OK 18946 NOK 1054\n",
+ "batch 21 / 82 : total OK 19946 NOK 1054\n",
+ "batch 22 / 82 : total OK 20945 NOK 1055\n",
+ "batch 23 / 82 : total OK 21942 NOK 1058\n",
+ "batch 24 / 82 : total OK 22939 NOK 1061\n",
+ "batch 25 / 82 : total OK 23938 NOK 1062\n",
+ "batch 26 / 82 : total OK 24938 NOK 1062\n",
+ "batch 27 / 82 : total OK 25938 NOK 1062\n",
+ "batch 28 / 82 : total OK 26938 NOK 1062\n",
+ "batch 29 / 82 : total OK 27938 NOK 1062\n",
+ "batch 30 / 82 : total OK 28938 NOK 1062\n",
+ "batch 31 / 82 : total OK 29938 NOK 1062\n",
+ "batch 32 / 82 : total OK 30938 NOK 1062\n",
+ "batch 33 / 82 : total OK 31938 NOK 1062\n",
+ "batch 34 / 82 : total OK 32938 NOK 1062\n",
+ "batch 35 / 82 : total OK 33938 NOK 1062\n",
+ "batch 36 / 82 : total OK 34938 NOK 1062\n",
+ "batch 37 / 82 : total OK 35938 NOK 1062\n",
+ "batch 38 / 82 : total OK 36938 NOK 1062\n",
+ "batch 39 / 82 : total OK 37937 NOK 1063\n",
+ "batch 40 / 82 : total OK 38937 NOK 1063\n",
+ "batch 41 / 82 : total OK 39880 NOK 1120\n",
+ "batch 42 / 82 : total OK 40845 NOK 1155\n",
+ "batch 43 / 82 : total OK 41807 NOK 1193\n",
+ "batch 44 / 82 : total OK 42640 NOK 1360\n",
+ "batch 45 / 82 : total OK 43252 NOK 1748\n",
+ "batch 46 / 82 : total OK 43917 NOK 2083\n",
+ "batch 47 / 82 : total OK 44605 NOK 2395\n",
+ "batch 48 / 82 : total OK 45358 NOK 2642\n",
+ "batch 49 / 82 : total OK 46111 NOK 2889\n",
+ "batch 50 / 82 : total OK 46901 NOK 3099\n",
+ "batch 51 / 82 : total OK 47700 NOK 3300\n",
+ "batch 52 / 82 : total OK 48504 NOK 3496\n",
+ "batch 53 / 82 : total OK 49355 NOK 3645\n",
+ "batch 54 / 82 : total OK 50179 NOK 3821\n",
+ "batch 55 / 82 : total OK 51106 NOK 3894\n",
+ "batch 56 / 82 : total OK 51988 NOK 4012\n",
+ "batch 57 / 82 : total OK 52928 NOK 4072\n",
+ "batch 58 / 82 : total OK 53801 NOK 4199\n",
+ "batch 59 / 82 : total OK 54701 NOK 4299\n",
+ "batch 60 / 82 : total OK 55548 NOK 4452\n",
+ "batch 61 / 82 : total OK 56393 NOK 4607\n",
+ "batch 62 / 82 : total OK 57198 NOK 4802\n",
+ "batch 63 / 82 : total OK 58004 NOK 4996\n",
+ "batch 64 / 82 : total OK 58766 NOK 5234\n",
+ "batch 65 / 82 : total OK 59543 NOK 5457\n",
+ "batch 66 / 82 : total OK 60307 NOK 5693\n",
+ "batch 67 / 82 : total OK 61290 NOK 5710\n",
+ "batch 68 / 82 : total OK 62205 NOK 5795\n",
+ "batch 69 / 82 : total OK 63128 NOK 5872\n",
+ "batch 70 / 82 : total OK 64082 NOK 5918\n",
+ "batch 71 / 82 : total OK 65053 NOK 5947\n",
+ "batch 72 / 82 : total OK 66017 NOK 5983\n",
+ "batch 73 / 82 : total OK 66978 NOK 6022\n",
+ "batch 74 / 82 : total OK 67839 NOK 6161\n",
+ "batch 75 / 82 : total OK 68703 NOK 6297\n",
+ "batch 76 / 82 : total OK 69624 NOK 6376\n",
+ "batch 77 / 82 : total OK 70600 NOK 6400\n",
+ "batch 78 / 82 : total OK 71577 NOK 6423\n",
+ "batch 79 / 82 : total OK 72551 NOK 6449\n",
+ "batch 80 / 82 : total OK 73459 NOK 6541\n",
+ "batch 81 / 82 : total OK 74418 NOK 6582\n",
+ "batch 82 / 82 : total OK 75357 NOK 6643\n"
+ ]
+ }
+ ],
+ "source": [
+ "for i in range(n_batches):\n",
+ " inp = test_imgs[i].astype(np.float32)\n",
+ " exp = test_labels[i].astype(np.float32)\n",
+ " inp = 2 * inp - 1\n",
+ " exp = 2 * exp - 1\n",
+ " out = accel.execute(inp)\n",
+ " matches = np.count_nonzero(out.flatten() == exp.flatten())\n",
+ " nok += batch_size - matches\n",
+ " ok += matches\n",
+ " print(\"batch %d / %d : total OK %d NOK %d\" % (i + 1, n_batches, ok, nok))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 69,
+ "id": "98af33f1",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Final accuracy: 91.898780\n"
+ ]
+ }
+ ],
+ "source": [
+ "acc = 100.0 * ok / (total)\n",
+ "print(\"Final accuracy: {:.2f}%\".format(acc))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 71,
+ "id": "c7d63354",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "def run_validation():\n",
+ "    for i in range(n_batches):\n",
+ "        ibuf_normal = test_imgs[i].reshape(accel.ishape_normal())\n",
+ "        accel.execute(ibuf_normal)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 72,
+ "id": "e6a7fa9d",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "33.3 s ± 698 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
+ ]
+ }
+ ],
+ "source": [
+ "full_validation_time = %timeit -n 1 -o run_validation()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 73,
+ "id": "e2dde028",
+ "metadata": {},
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "2542.157784 images per second including data movement\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(\"%f images per second including data movement\" % (total / float(full_validation_time.best)))"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "f95c0d3a",
+ "metadata": {},
+ "source": [
+ "# More benchmarking"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 74,
+ "id": "f79491c3",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "text/plain": [
+ "{'runtime[ms]': 1.0788440704345703,\n",
+ " 'throughput[images/s]': 926918.0110497237,\n",
+ " 'DRAM_in_bandwidth[MB/s]': 69.51885082872928,\n",
+ " 'DRAM_out_bandwidth[MB/s]': 0.9269180110497238,\n",
+ " 'fclk[mhz]': 100.0,\n",
+ " 'batch_size': 1000,\n",
+ " 'fold_input[ms]': 0.09775161743164062,\n",
+ " 'pack_input[ms]': 71.11644744873047,\n",
+ " 'copy_input_data_to_device[ms]': 2.642393112182617,\n",
+ " 'copy_output_data_from_device[ms]': 0.2548694610595703,\n",
+ " 'unpack_output[ms]': 355.4694652557373,\n",
+ " 'unfold_output[ms]': 0.05626678466796875}"
+ ]
+ },
+ "execution_count": 74,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "accel.throughput_test()"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "9ca5c916",
+ "metadata": {},
+ "source": [
+ "The measured `throughput` of the accelerator, excluding any software and data movement overhead, is influenced by the batch size. The more we fill the compute pipeline, the higher the throughput.\n",
+ "Note that the total runtime consists of the overhead of packing/unpacking the inputs/outputs to convert from numpy arrays to the bit-contiguous data representation our accelerator expects (`pack_input`/`unpack_output`), the cost of moving data between the CPU and accelerator memories (`copy_input_data_to_device`/`copy_output_data_from_device`), as well as the accelerator's execution time."
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.10.4"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
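Note on the notebook's last markdown cell: the runtime breakdown it describes can be checked against the `accel.throughput_test()` output recorded in the notebook. The sketch below copies those recorded numbers into a plain dict and sums every `[ms]` entry. `end_to_end_breakdown` is a hypothetical helper, not part of finn-examples, and it assumes the stages run back to back without overlap and that the listed keys cover the whole path.

```python
# Values copied from the accel.throughput_test() output recorded in the notebook.
res = {
    "runtime[ms]": 1.0788440704345703,
    "batch_size": 1000,
    "fold_input[ms]": 0.09775161743164062,
    "pack_input[ms]": 71.11644744873047,
    "copy_input_data_to_device[ms]": 2.642393112182617,
    "copy_output_data_from_device[ms]": 0.2548694610595703,
    "unpack_output[ms]": 355.4694652557373,
    "unfold_output[ms]": 0.05626678466796875,
}


def end_to_end_breakdown(res):
    """Hypothetical helper: sum all '[ms]' stages and compute each stage's share.

    Assumes the stages execute sequentially (no overlap), which is an
    assumption of this sketch, not something the notebook states.
    """
    stages = {k: v for k, v in res.items() if k.endswith("[ms]")}
    total_ms = sum(stages.values())
    shares = {k: v / total_ms for k, v in stages.items()}
    images_per_s = res["batch_size"] / (total_ms / 1000.0)
    return total_ms, shares, images_per_s


total_ms, shares, images_per_s = end_to_end_breakdown(res)
# Print the stages from most to least expensive.
for name, share in sorted(shares.items(), key=lambda kv: -kv[1]):
    print(f"{name:36s} {100 * share:6.2f}%")
print(f"total {total_ms:.1f} ms, {images_per_s:.0f} images/s end to end")
```

Under these assumptions the software-side packing and unpacking dominate the total, which is consistent with the gap between the accelerator-only `throughput[images/s]` figure and the rate measured with `%timeit`.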
diff --git a/finn_examples/qonnx/core/datatype.py b/finn_examples/qonnx/core/datatype.py
index 1fac858..7adc272 100644
--- a/finn_examples/qonnx/core/datatype.py
+++ b/finn_examples/qonnx/core/datatype.py
@@ -78,7 +78,6 @@ def max(self):
@abstractmethod
def allowed(self, value):
"""Check whether given value is allowed for this DataType.
-
* value (float32): value to be checked"""
pass
@@ -105,12 +104,12 @@ def get_hls_datatype_str(self):
@abstractmethod
def to_numpy_dt(self):
- "Return an appropriate numpy datatype that can represent this FINN DataType."
+ "Return an appropriate numpy datatype that can represent this QONNX DataType."
pass
@abstractmethod
def get_canonical_name(self):
- "Return a canonical string representation of this FINN DataType."
+ "Return a canonical string representation of this QONNX DataType."
class FloatType(BaseDataType):
@@ -339,8 +338,8 @@ def __getitem__(self, name):
class DataType(Enum, metaclass=DataTypeMeta):
- """Enum class that contains FINN data types to set the quantization annotation.
- ONNX does not support data types smaller than 8-bit integers, whereas in FINN we are
+ """Enum class that contains QONNX data types to set the quantization annotation.
+ ONNX does not support data types smaller than 8-bit integers, whereas in QONNX we are
interested in smaller integers down to ternary and bipolar."""
@staticmethod
@@ -362,4 +361,4 @@ def get_smallest_possible(value):
dt = DataType[cand]
if (dt.min() <= value) and (value <= dt.max()):
return dt
- raise Exception("Could not find a suitable int datatype for " + str(value))
+ raise Exception("Could not find a suitable int datatype for " + str(value))
\ No newline at end of file
diff --git a/finn_examples/qonnx/util/basic.py b/finn_examples/qonnx/util/basic.py
index cc467b1..ee948dd 100644
--- a/finn_examples/qonnx/util/basic.py
+++ b/finn_examples/qonnx/util/basic.py
@@ -34,6 +34,33 @@
from qonnx.core.datatype import DataType
+# TODO: solve by moving onnx-dependent functions to onnx.py.
+# finn-examples uses parts of qonnx without having
+# onnx installed and doesn't use this functionality.
+# Workaround to avoid import errors when onnx isn't
+# installed:
+try:
+ from onnx.helper import make_model, make_opsetid
+except ModuleNotFoundError:
+ make_model = None
+ make_opsetid = None
+
+
+def get_preferred_onnx_opset():
+ "Return preferred ONNX opset version for QONNX"
+ return 11
+
+
+def qonnx_make_model(graph_proto, **kwargs):
+    "Wrapper around ONNX make_model with preferred qonnx opset version"
+    opset_imports = kwargs.pop("opset_imports", None)
+    if opset_imports is None:
+        opset_imports = [make_opsetid("", get_preferred_onnx_opset())]
+        kwargs["opset_imports"] = opset_imports
+    else:
+        kwargs["opset_imports"] = opset_imports
+    return make_model(graph_proto, **kwargs)
+
def is_finn_op(op_type):
"Return whether given op_type string is a QONNX or FINN custom op"
@@ -181,35 +208,22 @@ def pad_tensor_to_multiple_of(ndarray, pad_to_dims, val=0, distr_pad=False):
return ret
-def calculate_matvec_accumulator_range(matrix, vec_dt):
+def calculate_matvec_accumulator_range(matrix: np.ndarray, vec_dt: DataType):
"""Calculate the minimum and maximum possible result (accumulator) values
for a dot product x * A, given matrix A of dims (MW, MH), and vector (1, MW)
with datatype vec_dt. Returns (acc_min, acc_max).
"""
- min_weight = matrix.min()
- max_weight = matrix.max()
- perceptive_field_elems = matrix.shape[0]
- min_input = vec_dt.min()
- max_input = vec_dt.max()
- # calculate minimum and maximum values of accumulator
- # assume inputs span the whole range of the input datatype
- acc_min = perceptive_field_elems * min(
- min_weight * max_input,
- min_weight * min_input,
- max_weight * max_input,
- max_weight * min_input,
- )
- acc_max = perceptive_field_elems * max(
- min_weight * max_input,
- min_weight * min_input,
- max_weight * max_input,
- max_weight * min_input,
- )
- return (acc_min, acc_max)
+ max_weight = abs(matrix).sum(axis=0).max()
+ max_input = max(abs(vec_dt.min()), abs(vec_dt.max()))
+ max_value = max_input * max_weight
+ # If either the weights or the input datatype is signed, the minimum value
+ # of the accumulated product is -max_value. Otherwise, it is 0.
+ min_value = -max_value if (matrix.min() < 0) or vec_dt.signed() else 0
+ return (min_value, max_value)
def gen_finn_dt_tensor(finn_dt, tensor_shape):
- """Generates random tensor in given shape and with given FINN DataType."""
+ """Generates random tensor in given shape and with given QONNX DataType."""
if type(tensor_shape) == list:
tensor_shape = tuple(tensor_shape)
if finn_dt == DataType["BIPOLAR"]:
@@ -255,10 +269,8 @@ def sanitize_quant_values(model, node_tensors, execution_context, check_values=F
that are supposed to be integers (as indicated by their quantization
annotation). Will raise an assertion if the amount of rounding is too large.
Returns the sanitized execution context.
-
If check_values is specified, an extra DataType.allowed() check will be
performed on any rounded tensors.
-
Background:
QONNX uses floating point tensors as a carrier data type to represent
integers. Floating point arithmetic can introduce rounding errors, e.g.
@@ -305,9 +317,9 @@ def sanitize_quant_values(model, node_tensors, execution_context, check_values=F
execution_context[tensor_name] = updated_values
else:
raise Exception(
- """Rounding error is too high to match set FINN
+ """Rounding error is too high to match set QONNX
datatype ({}) for input {}""".format(
dtype, tensor_name
)
)
- return execution_context
+ return execution_context
\ No newline at end of file
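Note on the `calculate_matvec_accumulator_range` hunk above: the new bound takes the largest column-wise sum of absolute weights and multiplies it by the largest input magnitude, where the old code multiplied the extreme weight by the row count. A minimal sketch comparing the two follows. To stay self-contained without qonnx, the hypothetical parameters `in_min`, `in_max` and `in_signed` stand in for `vec_dt.min()`, `vec_dt.max()` and `vec_dt.signed()`, and both function names are illustrative only.

```python
import numpy as np


def new_accumulator_range(matrix, in_min, in_max, in_signed):
    """Sketch of the revised bound from the hunk above.

    in_min / in_max / in_signed are stand-ins (assumptions of this sketch)
    for vec_dt.min(), vec_dt.max() and vec_dt.signed().
    """
    # Worst-case column: sum of absolute weights feeding one output.
    max_weight = np.abs(matrix).sum(axis=0).max()
    max_input = max(abs(in_min), abs(in_max))
    max_value = max_input * max_weight
    # A negative result is only possible if weights or inputs can be negative.
    min_value = -max_value if (matrix.min() < 0) or in_signed else 0
    return (min_value, max_value)


def old_accumulator_range(matrix, in_min, in_max):
    """Sketch of the removed bound: extreme weight times extreme input,
    scaled by the number of rows (MW)."""
    n = matrix.shape[0]
    products = [w * x for w in (matrix.min(), matrix.max()) for x in (in_min, in_max)]
    return (n * min(products), n * max(products))


# Illustrative weights of shape (MW=3, MH=2) and a signed 4-bit input range.
A = np.array([[1, -2], [3, 0], [-1, 2]])
new_range = new_accumulator_range(A, -8, 7, True)
old_range = old_accumulator_range(A, -8, 7)
print("new:", new_range, "old:", old_range)
```

For this matrix the revised bound is narrower than the old one, and it is symmetric around zero whenever a sign is involved, so the lower bound can be more conservative than strictly necessary.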
diff --git a/setup.py b/setup.py
index d7aef96..4708827 100644
--- a/setup.py
+++ b/setup.py
@@ -17,7 +17,7 @@
import os
import zipfile
from distutils.command.build import build as dist_build
-from pynq.utils import build_py as _build_py
+from pynqutils.setup_utils import build_py as _build_py
__author__ = "Yaman Umuroglu"
__copyright__ = "Copyright 2020-2021, Xilinx"