Visual Question Answering (VQA) is the task of answering open-ended questions based on an image. The input to models supporting this task is typically a combination of an image and a question, and the output is an answer expressed in natural language.
Some noteworthy use case examples for VQA include:
- Accessibility applications for visually impaired individuals.
- Education: posing questions about visual materials presented in lectures or textbooks. VQA can also be utilized in interactive museum exhibits or historical sites.
- Customer service and e-commerce: VQA can enhance user experience by letting users ask questions about products.
- Image retrieval: VQA models can be used to retrieve images with specific characteristics. For example, the user can ask “Is there a dog?” to find all images with dogs from a set of images.
The general architecture of VQA is shown below:
The VisualQnA example is implemented using the component-level microservices defined in GenAIComps. The flow chart below shows the information flow between different microservices for this example.
```mermaid
---
config:
  flowchart:
    nodeSpacing: 400
    rankSpacing: 100
    curve: linear
  themeVariables:
    fontSize: 50px
---
flowchart LR
    %% Colors %%
    classDef blue fill:#ADD8E6,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orange fill:#FBAA60,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef orchid fill:#C26DBC,stroke:#ADD8E6,stroke-width:2px,fill-opacity:0.5
    classDef invisible fill:transparent,stroke:transparent;
    style VisualQnA-MegaService stroke:#000000

    %% Subgraphs %%
    subgraph VisualQnA-MegaService["VisualQnA MegaService "]
        direction LR
        LVM([LVM MicroService]):::blue
    end
    subgraph UserInterface[" User Interface "]
        direction LR
        a([User Input Query]):::orchid
        Ingest([Ingest data]):::orchid
        UI([UI server<br>]):::orchid
    end

    LVM_gen{{LVM Service <br>}}
    GW([VisualQnA GateWay<br>]):::orange
    NG([Nginx MicroService]):::blue

    %% Questions interaction
    direction LR
    Ingest[Ingest data] --> UI
    a[User Input Query] --> |Need Proxy Server|NG
    a[User Input Query] --> UI
    NG --> UI
    UI --> GW
    GW <==> VisualQnA-MegaService

    %% Embedding service flow
    direction LR
    LVM <-.-> LVM_gen
```
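To make the flow above concrete, the sketch below sends a single question-plus-image request through the VisualQnA gateway with curl, following the UI → Gateway → MegaService path. The port (8888), the `/v1/visualqna` route, and the OpenAI-style message payload are assumptions for illustration only; check the `compose.yaml` and gateway configuration of your deployment for the actual values. `${host_ip}` is the host address of the deployment, which is set in the environment setup steps later in this guide.

```bash
# Hypothetical request to the VisualQnA gateway; port 8888 and the
# /v1/visualqna route are assumptions -- verify them against your deployment.
curl http://${host_ip}:8888/v1/visualqna \
    -H "Content-Type: application/json" \
    -d '{
          "messages": [
            {
              "role": "user",
              "content": [
                { "type": "text", "text": "What is in this image?" },
                { "type": "image_url", "image_url": { "url": "https://example.com/sample.jpg" } }
              ]
            }
          ],
          "max_tokens": 128
        }'
```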
This example guides you through deploying a LLaVA-NeXT (Open Large Multimodal Models) model on Intel Gaudi2 and Intel Xeon Scalable Processors. We invite contributions from other hardware vendors to expand the OPEA ecosystem.
By default, the model is set to `llava-hf/llava-v1.6-mistral-7b-hf`. To use a different model, update the `LVM_MODEL_ID` variable in the `set_env.sh` file.

```bash
export LVM_MODEL_ID="llava-hf/llava-v1.6-mistral-7b-hf"
```

You can choose other llava-next models, such as `llava-hf/llava-v1.6-vicuna-13b-hf`, as needed.
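For instance, assuming `set_env.sh` exports the variable directly as shown above, switching to the Vicuna-based checkpoint mentioned here is a one-line change (a sketch, not the full file):

```bash
# In set_env.sh: point the LVM service at an alternative llava-next checkpoint
export LVM_MODEL_ID="llava-hf/llava-v1.6-vicuna-13b-hf"
```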
The VisualQnA service can be deployed on either Intel Gaudi2 or Intel Xeon Scalable Processors.
Currently, we support deploying VisualQnA services with Docker Compose.
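The deployment steps below assume the GenAIExamples repository is available locally. If you have not cloned it yet, a typical way to fetch it is:

```bash
# Clone the OPEA examples repository (contains the VisualQnA example)
git clone https://github.com/opea-project/GenAIExamples.git
cd GenAIExamples/VisualQnA
```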
To set up environment variables for deploying VisualQnA services, follow these steps:
- Set the required environment variables:

  ```bash
  # Example: host_ip="192.168.1.1"
  export host_ip="External_Public_IP"
  # Example: no_proxy="localhost, 127.0.0.1, 192.168.1.1"
  export no_proxy="Your_No_Proxy"
  ```

- If you are in a proxy environment, also set the proxy-related environment variables:

  ```bash
  export http_proxy="Your_HTTP_Proxy"
  export https_proxy="Your_HTTPs_Proxy"
  ```

- Set up other environment variables:

  Note that you should run only one of the commands below, matching your hardware; otherwise, the port numbers may be set incorrectly.

  ```bash
  # on Gaudi
  source ./docker_compose/intel/hpu/gaudi/set_env.sh
  # on Xeon
  source ./docker_compose/intel/cpu/xeon/set_env.sh
  ```
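Optionally, you can confirm that the variables are visible in the current shell before deploying. The quick check below simply echoes the values set above (`LVM_MODEL_ID` comes from `set_env.sh`):

```bash
# Sanity check: print the values the compose files will pick up
echo "host_ip=${host_ip}"
echo "no_proxy=${no_proxy}"
echo "LVM_MODEL_ID=${LVM_MODEL_ID}"
```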
Refer to the Gaudi Guide to build docker images from source.
Find the corresponding `compose.yaml`.

```bash
cd GenAIExamples/VisualQnA/docker_compose/intel/hpu/gaudi/
docker compose up -d
```
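After `docker compose up -d` returns, it is worth checking that all containers reached a running state. The commands below are standard Docker Compose checks and do not assume any particular service names:

```bash
# List the services of this deployment and their current status
docker compose ps
# Tail the logs of all services to spot startup errors (Ctrl+C to stop)
docker compose logs -f
```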
Refer to the Xeon Guide for more instructions on building docker images from source.
Find the corresponding `compose.yaml`.

```bash
cd GenAIExamples/VisualQnA/docker_compose/intel/cpu/xeon/
docker compose up -d
```
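The same status checks shown for the Gaudi deployment apply here. When you want to remove the deployment, the standard Docker Compose teardown from the same directory stops and removes the containers:

```bash
# Stop and remove the VisualQnA containers and their default network
docker compose down
```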