From 8601a67dbde1ca040ba4746e9e05bc4b3667115c Mon Sep 17 00:00:00 2001
From: Mwiza <43536864+kundaMwiza@users.noreply.github.com>
Date: Wed, 24 May 2023 09:57:20 +0100
Subject: [PATCH] Deberta notebook fix: parallelize pipelined model (#43)

Parallelize must be called in order to add poptorch block annotations to
model layers. Without this the model will only run on 1 IPU.

---------

Co-authored-by: Alexandre Payot <18074599+payoto@users.noreply.github.com>
---
 .../deberta-blog-notebook.ipynb | 21 +++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/natural-language-processing/other-use-cases/deberta-blog-notebook.ipynb b/natural-language-processing/other-use-cases/deberta-blog-notebook.ipynb
index d82d80f..c7eb101 100644
--- a/natural-language-processing/other-use-cases/deberta-blog-notebook.ipynb
+++ b/natural-language-processing/other-use-cases/deberta-blog-notebook.ipynb
@@ -1,6 +1,7 @@
 {
  "cells": [
   {
+   "attachments": {},
    "cell_type": "markdown",
    "id": "2f37d919-8e25-4149-9f94-6aeebce8d2cd",
    "metadata": {},
@@ -34,14 +35,18 @@
     "oracle(question=\"Where do I live?\", context=\"My name is Wolfgang and I live in Berlin\")\n",
     "```\n",
     "\n",
-    "However in some cases such as MNLI, there is no off-the-shelf pipeline ready to use. In this case, you could simply instantiate the model, use the optimum-specific call `to_pipelined` to pipeline the model according to the `IPUConfig`, and prepare it for inference using `poptorch.inferenceModel()`.\n",
+    "However in some cases such as MNLI, there is no off-the-shelf pipeline ready to use. In this case, you could simply:\n",
+    "- Instantiate the model with the correct execution mode\n",
+    "- Use the optimum-specific call `to_pipelined` to return the model with changes and annotations for running on the IPU\n",
+    "- Set the model to run in `eval` mode and use the `parallelize` method on the new model to parallelize it across IPUs\n",
+    "- Prepare it for inference using `poptorch.inferenceModel()`\n",
     "\n",
     "```\n",
     "model = DebertaForQuestionAnswering.from_pretrained(\"Palak/microsoft_deberta-base_squad\")\n",
     "\n",
     "ipu_config = IPUConfig(ipus_per_replica=2, matmul_proportion=0.2, executable_cache_dir=\"./exe_cache\")\n",
-    "pipelined_model = to_pipelined(model, ipu_config)\n",
-    "pipelined_model = poptorch.inferenceModel(pipelined_model)\n",
+    "pipelined_model = to_pipelined(model, ipu_config).eval().parallelize()\n",
+    "pipelined_model = poptorch.inferenceModel(pipelined_model, options=ipu_config.to_options(for_inference=True))\n",
     "```\n",
     "\n",
     "This method is demoed in this notebook, as Huggingface do not natively support the MNLI inference task."
@@ -151,6 +156,14 @@
     "model.half()"
    ]
   },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "3bd484d3",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
   {
    "attachments": {},
    "cell_type": "markdown",
@@ -295,7 +308,7 @@
    "outputs": [],
    "source": [
     "ipu_config = IPUConfig(ipus_per_replica=2, matmul_proportion=0.2, executable_cache_dir=executable_cache_dir)\n",
-    "pipelined_model = to_pipelined(model, ipu_config).parallelize()\n",
+    "pipelined_model = to_pipelined(model, ipu_config).eval().parallelize()\n",
     "pipelined_model = poptorch.inferenceModel(pipelined_model, options=ipu_config.to_options(for_inference=True))"
    ]
   },
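
For reference, a minimal end-to-end sketch of the corrected flow described in this patch. The checkpoint name, `IPUConfig` values, and the `.eval().parallelize()` / `poptorch.inferenceModel(...)` calls come from the notebook diff above; the import paths (in particular `to_pipelined` from `optimum.graphcore.modeling_utils`) are assumptions and require an IPU environment with `poptorch` and `optimum-graphcore` installed.

```python
# Sketch only: assumes poptorch and optimum-graphcore are available;
# import locations are assumptions, not taken from the patch itself.
import poptorch
from transformers import DebertaForQuestionAnswering
from optimum.graphcore import IPUConfig
from optimum.graphcore.modeling_utils import to_pipelined

model = DebertaForQuestionAnswering.from_pretrained("Palak/microsoft_deberta-base_squad")

ipu_config = IPUConfig(ipus_per_replica=2, matmul_proportion=0.2, executable_cache_dir="./exe_cache")

# parallelize() is the fix this patch makes: it adds the poptorch block
# annotations so the layers are pipelined across the 2 IPUs requested by
# ipus_per_replica; without it the whole model runs on a single IPU.
pipelined_model = to_pipelined(model, ipu_config).eval().parallelize()
pipelined_model = poptorch.inferenceModel(
    pipelined_model, options=ipu_config.to_options(for_inference=True)
)
```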