Skip to content

Latest commit

 

History

History
122 lines (97 loc) · 4.6 KB

File metadata and controls

122 lines (97 loc) · 4.6 KB
page_type languages products name urlFragment description azureDeploy
sample
csharp
azure
azure-cognitive-search
ai-services
Custom Vision integration sample skill for AI search
azure-customvision-sample
This custom skill extracts tags from a trained Custom Vision model (classification or object detection).

Custom Vision

This custom skill extracts tags from a trained Custom Vision model (classification or object detection).

Requirements

In addition to the common requirements described in the root README.md file, this function requires access to a Custom Vision resource.

You will need to train a model with your images before you can use this skill. Both classification and object detection models will work.

Settings

This function requires a CUSTOM_VISION_PREDICTION_URL and a CUSTOM_VISION_API_KEY settings set to a valid Custom Vision API key and to your Custom Vision prediction endpoint.

The function will attempt to send a binary representation of the input image to Custom Vision, so you should use the /image endpoint URL for Custom Vision as described here. If running locally, this can be set in your project's debug environment variables (go to project properties, in the debug tab). This ensures your key won't be accidentally checked in with your code. If running in an Azure function, this can be set in the application settings.

Optionally, you can set MAX_PAGES to control how many pages in the document will be sent to Custom Vision (default is 1, so only the first page will be sent).

Also, you can set MIN_PROBABILITY_THRESHOLD which will only return tags with a probability above the desired threshold (default is 0.5).

Deployment

Deploy to Azure

Sample Input:

{
    "values": [
        {
            "recordId": "record1",
            "data": { 
                "pages":  [
                    "Base64 encoding of first page image",
                    "Base64 encoding of second page image"
                    ...
                ]
            }
        }
    ]
}

Sample Output:

{
    "values": [
        {
            "recordId": "record1",
            "data": {
                "tags" : ["tag 1", "tag 2", ...]
            },
            "errors": null,
            "warnings": null
        }
    ]
}

Sample Skillset Integration

In order to use this skill in a AI search pipeline, you'll need to add a skill definition to your skillset. Here's a sample skill definition for this example (inputs and outputs should be updated to reflect your particular scenario and skillset environment):

{
    "@odata.type": "#Microsoft.Skills.Custom.WebApiSkill",
    "context": "/document",
    "uri": "[AzureFunctionEndpointUrl]/api/custom-vision?code=[AzureFunctionDefaultHostKey]",
    "batchSize": 1,
    "inputs": [
    {
        "name": "pages",
        "source": "/document/normalized_images/*/data"
    }
    ],
    "outputs": [
    {
        "name": "tags",
        "targetName": "tags"
    }
    ]
}

Generating normalized images for the skill to use

The skill requires an array of images - one for each page in the original document - in the pages input. This example uses the built-in document cracking pipeline to extract normalized images, one for each page in the document.

Below is an example of the indexer configuration for this step. Notice the use of the generateNormalizedImagePerPage image action.

    "parameters": {
        "configuration": {
            "dataToExtract": "contentAndMetadata",
            "imageAction": "generateNormalizedImagePerPage",
            "normalizedImageMaxWidth": 3000,
            "normalizedImageMaxHeight": 3000
        }
    }

As an alternative, you can use the built-in Document Extraction cognitive skill as part of your skillset.