InstantID notebook (openvinotoolkit#1698)

* InstantID notebook
* formatting and spell check
* readme upd
* Apply suggestions from code review
* add examples for gradio, update requirements
* add table of contents
* move notebook
* Update .ci/ignore_convert_execution.txt
nikita-savelyevv · Feb 13, 2024 · 4d251af
1 parent bbd325e

Showing 9 changed files with 2,420 additions and 3 deletions.
diff --git a/.ci/ignore_convert_execution.txt b/.ci/ignore_convert_execution.txt
@@ -62,4 +62,5 @@ notebooks/271-sdxl-turbo/271-sdxl-turbo.ipynb
 notebooks/272-paint-by-example/272-paint-by-example.ipynb
 notebooks/273-stable-zephyr-3b-chatbot/273-stable-zephyr-3b-chatbot.ipynb
 notebooks/275-llm-question-answering/275-llm-question-answering.ipynb
+notebooks/286-instant-id/286-instant-id.ipynb
 notebooks/404-style-transfer-webcam/404-style-transfer.ipynb
diff --git a/.ci/ignore_treon_docker.txt b/.ci/ignore_treon_docker.txt
@@ -48,6 +48,7 @@
 281-kosmos2-multimodal-large-language-model
 283-photo-maker
 285-surya-line-level-text-detection
+286-instant-id
 301-tensorflow-training-openvino
 305-tensorflow-quantization-aware-training
 404-style-transfer-webcam
diff --git a/.ci/ignore_treon_linux.txt b/.ci/ignore_treon_linux.txt
@@ -51,4 +51,5 @@
 281-kosmos2-multimodal-large-language-model
 283-photo-maker
 285-surya-line-level-text-detection
+286-instant-id
 404-style-transfer-webcam
diff --git a/.ci/ignore_treon_mac.txt b/.ci/ignore_treon_mac.txt
@@ -50,4 +50,5 @@
 283-photo-maker
 284-openvoice
 285-surya-line-level-text-detection
+286-instant-id
 404-style-transfer-webcam
diff --git a/.ci/ignore_treon_win.txt b/.ci/ignore_treon_win.txt
@@ -49,4 +49,5 @@
 276-stable-diffusion-torchdynamo-backend
 281-kosmos2-multimodal-large-language-model
 283-photo-maker
-285-surya-line-level-text-detection
+285-surya-line-level-text-detection
+286-instant-id
diff --git a/.ci/spellcheck/.pyspelling.wordlist.txt b/.ci/spellcheck/.pyspelling.wordlist.txt
@@ -16,8 +16,10 @@ AlpacaEval
 aMUSEd
 analytics
 AnimeGAN
+Antelopev
 api
 APIs
+Arcface
 argmax
 artstation
 ASPP
@@ -92,6 +94,7 @@ ContentVec
 Contrastive
 contrastive
 ControlNet
+ControlNets
 controlnet
 ConvE
 conve
@@ -179,6 +182,7 @@ downsample
 downsampled
 DPO
 dpredictor
+DreamBooth
 Dreamshaper
 dropdown
 ECCV
@@ -256,6 +260,7 @@ hyperparameters
 ICIP
 ICPR
 iGPU
+IdentityNet
 iGPUs
 Ilija
 ImageBind
@@ -272,7 +277,9 @@ inferencing
 InferRequest
 InferRequests
 inpainting
+InsightFace
 installable
+InstantID
 instantiation
 InstructGPT
 InstructPix
@@ -447,6 +454,7 @@ OpenVINO
 openvino
 OpenVino
 OpenVINO's
+openvoice
 opset
 optimizable
 Orca
@@ -569,6 +577,7 @@ rescaling
 Rescaling
 ResNet
 resnet
+RetinaFace
 RGB
 Riffusion
 riffusion

diff --git a/README.md b/README.md
@@ -46,7 +46,6 @@ Check out the latest notebooks that show how to optimize and deploy popular mode
 | [FILM](notebooks/269-film-slowmo)<br> | Frame interpolation with FILM and OpenVINO™ | <img src="https://github.com/openvinotoolkit/openvino_notebooks/assets/1720147/8ac5178d-4a92-4a86-a3df-dd494434fed6" width=300> |
 | [Audio LDM 2](notebooks/270-sound-generation-audioldm2/)<br> | Sound Generation with AudioLDM2 and OpenVINO™ | <img src="https://github.com/openvinotoolkit/openvino_notebooks/assets/76463150/c93a0f86-d9cf-4bd1-93b9-e27532170d75" width=300> |
 | [SDXL-Turbo](notebooks/271-sdxl-turbo/)<br> | Single-step image generation using SDXL-turbo and OpenVINO | <img src="https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/79b625c7-0f0a-4f19-8e38-e6f896f75c3e" width=300> |
-| [Segmind-VegaRT](notebooks/248-stable-diffusion-xl/248-segmind-vegart.ipynb)<br> | High-resolution image generation with Segmind-VegaRT and OpenVINO | <img src="https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/66bfe823-01c8-4749-a8aa-419a1d78a070" width=300> |
 | [Stable-Zephyr chatbot](notebooks/273-stable-zephyr-3b-chatbot/)<br> | Use Stable-Zephyr as chatbot assistant with OpenVINO | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/cfac6ddb-6f22-4343-855c-e513269cf2bf width=300> |
 | [Efficient-SAM](notebooks/274-efficient-sam)<br> | Object segmentation with EfficientSAM and OpenVINO | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/15d0a23a-0550-43c6-9ca9-f689e772a79a width=300> |
 | [LLM Instruction following pipeline](notebooks/275-llm-question-answering)<br> | Usage variety of LLM models for answering questions using OpenVINO | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/daafd702-5a42-4f54-ae72-2e4480d73501 width=300> |
@@ -56,6 +55,7 @@ Check out the latest notebooks that show how to optimize and deploy popular mode
 | [Kosmos-2: Grounding Multimodal Large Language Models](notebooks/281-kosmos2-multimodal-large-language-model)<br> | Kosmos-2: Grounding Multimodal Large Language Model and OpenVINO™ | <img src=https://huggingface.co/microsoft/kosmos-2-patch14-224/resolve/main/annotated_snowman.jpg width=225> |
 | [PhotoMaker](notebooks/283-photo-maker)<br> | Text-to-image generation using PhotoMaker and OpenVINO | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/91237924/88bccc4a-5789-42ca-8a68-f402c3e7c5a4 width=300> |
 | [OpenVoice](notebooks/284-openvoice)<br>[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/openvinotoolkit/openvino_notebooks/HEAD?filepath=notebooks%2F284-openvoice%2F284-openvoice.ipynb)[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/284-openvoice/284-openvoice.ipynb) | OpenVoice a versatile instant voice tone transferring and generating speech in various languages. |<img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/5703039/ca7eab80-148d-45b0-84e8-a5a279846b51 width=300> |
+| [InstantID](notebooks/286-instant-id)<br> | InstantID: Zero-shot Identity-Preserving Image Generation using OpenVINO| <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/082b3da7-6bb6-4551-bfa6-0e43d8e80b51 width=300> |
 
 ## Table of Contents
 
@@ -238,7 +238,9 @@ Demos that demonstrate inference on a particular model.
 | [281-kosmos2-multimodal-large-language-model](notebooks/281-kosmos2-multimodal-large-language-model)<br> | Kosmos-2: Multimodal Large Language Model and OpenVINO™ | <img src=https://huggingface.co/microsoft/kosmos-2-patch14-224/resolve/main/annotated_snowman.jpg width=225> |
 | [282-siglip-zero-shot-image-classification](notebooks/282-siglip-zero-shot-image-classification)<br>[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/282-siglip-zero-shot-image-classification/282-siglip-zero-shot-image-classification.ipynb) | Zero-shot Image Classification with SigLIP | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/67365453/c4eb782c-0fef-4a89-a5c6-5cc43518490b width=500> |
 | [283-photo-maker](notebooks/283-photo-maker)<br> | Text-to-image generation using PhotoMaker and OpenVINO | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/91237924/88bccc4a-5789-42ca-8a68-f402c3e7c5a4 width=225> | 
-| [285-surya-line-level-text-detection](notebooks/285-surya-line-level-text-detection)<br>[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/285-surya-line-level-text-detection/285-surya-line-level-text-detection.ipynb) | Line-level text detection with Surya | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/67365453/7672eb6d-fafb-4ae3-b894-9f98acfeb53a width=225> | 
+| [284-openvoice](notebooks/284-openvoice)<br>[![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/openvinotoolkit/openvino_notebooks/HEAD?filepath=notebooks%2F284-openvoice%2F284-openvoice.ipynb)[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/284-openvoice/284-openvoice.ipynb) | OpenVoice a versatile instant voice tone transferring and generating speech in various languages. |<img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/5703039/ca7eab80-148d-45b0-84e8-a5a279846b51 width=300> |
+| [285-surya-line-level-text-detection](notebooks/285-surya-line-level-text-detection)<br>[![Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/openvinotoolkit/openvino_notebooks/blob/main/notebooks/285-surya-line-level-text-detection/285-surya-line-level-text-detection.ipynb) | Line-level text detection with Surya | <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/67365453/7672eb6d-fafb-4ae3-b894-9f98acfeb53a width=225> |
+| [286-instant-id](notebooks/286-instant-id)<br> | InstantID: Zero-shot Identity-Preserving Image Generation using OpenVINO| <img src=https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/082b3da7-6bb6-4551-bfa6-0e43d8e80b51 width=225> |
 
 <div id='-model-training'></div>