diff --git a/docs/2-Portal-First/1-Overview/00.md b/docs/2-Portal-First/1-Overview/00.md
index d28f521..660fce4 100644
--- a/docs/2-Portal-First/1-Overview/00.md
+++ b/docs/2-Portal-First/1-Overview/00.md
@@ -1,19 +1,32 @@
-# 1.1 Learning Objectives
+# 1.1 Learning Roadmap
 
-!!! info "This workshop focuses on providing developers with an end-to-end workflow using the Azure AI Foundry Portal for development. It is assembled using a diverse set of resources from the official documentation."
+!!! info "This workshop teaches you the capabilities of the Azure AI Foundry Portal with a set of interactive labs that take you from catalog (model selection) to cloud (application deployment). The labs are derived from the documentation but assembled into an end-to-end narrative for an AI Engineer journey."
 
-This lab teaches you how to build a RAG-based copilot using the Azure AI Foundry **Portal** as the default developer environment. By completing this lab, you'll gain a complete understanding of the Azure AI Foundry Portal features and learn to do the following tasks:
+---
+
+## 1. Core Objectives
+
+This workshop has two core objectives:
+
+- Develop familiarity with the layout and capabilities of the Azure AI Foundry Portal (web UI)
+- Learn how to build, evaluate, and deploy a RAG-based generative AI app, portal-first.
+
+In this context, _portal-first_ means that we prioritize using the Azure AI Foundry portal for the end-to-end developer workflow. By comparison, the [Hybrid](./../../1-Hybrid-Workshop/1-Overview/00.md) approach uses the Azure AI Foundry Portal (low-code) for setup and the Azure AI Foundry SDK (code-first) for ideation and evaluation.
+
+---
+
+## 2. Learning Journey
+
+By completing the labs in this workshop, you will learn to do the following:
+
+1. **Model Selection** - use the Azure AI model catalog to discover and compare models.
+1. **Project Setup** - create an Azure AI hub & project with models and connected resources.
+1. **Ideation** - go from initial prompt to functional prototype using models (with & without data).
+1. **Evaluation** - learn about built-in and custom evaluators, run an evaluation flow & view results.
+1. **Observability** - learn about tracing and app insights, view run traces in the portal.
+1. **Deployment** - go from prototype to production by deploying an app and using the endpoint.
-1. **SETUP your Azure AI Foundry Project**
-    - create a new Azure AI project
-    - customize it by creating a new Azure AI hub resource
-    - customize it by adding an Azure AI Search resource
-    - customize it by adding an Application Insights resource
-1. **SELECT models from Azure AI model catalog**
-1. **ADD DATA to your application using the RAG pattern**
-1. **EVALUATE your application using built-in evaluators**
-1. **DEPLOY your application from the portal**
+
+Along the way, we'll also understand how to orchestrate complex workflows in the portal using the currently-provided tooling (prompt flow) and a retrieval-augmented generation (RAG) pattern to improve responses by grounding them in your data (a minimal sketch of the pattern follows at the end of this page).
 
 ---
 
-??? info "BONUS → Once you've completed this exercise, try the [Hybrid Workshop](./../../1-Hybrid-Workshop/1-Overview/00.md) to get your first experience with the Azure AI Foundry SDK for a code-first development workflow."
+??? quote "OPTIONAL → Once you've completed this exercise, try the [Hybrid Workshop](./../../1-Hybrid-Workshop/1-Overview/00.md) to get your first experience with the Azure AI Foundry SDK for a code-first development workflow in Python."
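To make the RAG pattern concrete before you see it in the portal labs, here is a minimal, illustrative Python sketch of the retrieve-augment-generate loop. The `search_client` and `chat_client` objects are hypothetical stand-ins for whatever retrieval store and chat model deployment you end up using - the labs build the real versions with Azure AI Search and a deployed chat model.

```python
# Minimal RAG sketch (illustrative only - not the workshop's actual code).
# search_client and chat_client are hypothetical stand-ins for a retrieval
# store (e.g., Azure AI Search) and a deployed chat model, respectively.

def answer(question: str, search_client, chat_client) -> str:
    # 1. RETRIEVE: find documents relevant to the user question
    docs = search_client.search(question, top=3)
    context = "\n".join(doc["content"] for doc in docs)

    # 2. AUGMENT: ground the prompt in the retrieved context
    prompt = (
        "Answer the question using ONLY the context below.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

    # 3. GENERATE: ask the chat model for a grounded response
    return chat_client.complete(prompt)
```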
diff --git a/docs/2-Portal-First/1-Overview/01.md b/docs/2-Portal-First/1-Overview/01.md
index ded0024..ac6dd39 100644
--- a/docs/2-Portal-First/1-Overview/01.md
+++ b/docs/2-Portal-First/1-Overview/01.md
@@ -1,18 +1,54 @@
-# 1.1 Application Scenario
+# 1.2 Application Lifecycle
 
-To simplify our walkthrough, we'll use the same Application Scenario and Application Data resources defined in the [Hybrid Workshop](./../../1-Hybrid-Workshop/1-Overview/00.md) path.
+## 1. Generative AI Operations
 
-- See: [Contoso Outdoor](./../../1-Hybrid-Workshop/1-Overview/01.md#2-contoso-outdoor-chat-ui) to understand the enterprise retail application scenario.
-- See: [Application Data](./../../1-Hybrid-Workshop/1-Overview/02.md) to understand customer, product & manual data formats.
+When we think about the AI Engineer's journey from prompt to production, we also need to understand [the paradigm shifts in Generative AI Operations](https://techcommunity.microsoft.com/blog/aiplatformblog/the-future-of-ai-the-paradigm-shifts-in-generative-ai-operations/4254216) based on the following challenges faced by customers:
+
+- **Complex Model Landscape** - how can I select the right model for my use case?
+- **Data Quality & Quantity** - how can I discover or generate quality datasets for use?
+- **Operational Performance** - how can I balance tokens, cost & performance optimization?
+- **Security & Compliance** - how can I meet regulatory requirements & deliver trustworthy AI?
+
+The result is a paradigm shift from traditional MLOps to LLMOps - and now **GenAIOps** - with a focus on a _comprehensive set of practices, tools, foundation models, and frameworks_ to integrate people, processes and platforms. The Azure AI platform offers a robust suite of tools and services to support this end-to-end developer journey, as shown below.
+
+![GenAIOps toolchain](./../img/overview-genaiops-toolchains.png)
+
+Let's look at how we can go from prompt to production using a **Portal-first** approach, where we prioritize usage of tools and processes in the browser-based UI for a low-code experience.
 
 ---
 
-![Contoso Chat](./../../1-Hybrid-Workshop/img/contoso-chat.png)
+## 2. E2E Development Workflow
+
+To put the labs in context, let's look at the end-to-end application lifecycle from an AI Engineer perspective. The process can be broken into three stages: _ideation_ (prompt to prototype), _augmentation_ (prototype to production) and _operationalization_ (performance optimization).
+
+![GenAIOps toolchain](./../img/overview-genaiops-flow.png)
+
+These stages map loosely onto GenAIOps toolchains as follows:
+
+1. **Ideation** = _Getting started_, _Customization_, _Prompt Management_ → prompt to prototype.
+1. **Augmentation** = _Evaluation_ and _Orchestration_ → prototype to production.
+1. **Operationalization** = _Automation_ and _Monitoring_ → usage to optimization.
-Assume that you are building the retail copilot chat AI (backend) that can be accessed from the Contoso Outdoor UI (frontend) shown below. Your chat AI needs to do the following:
+
+In the **Hybrid** track we used the Azure AI Foundry portal for the initial setup but prioritized the SDK for the development and production stages. In **this track** we'll instead look at each of the toolchain steps with a **Portal-first** approach.
+
+---
+
+## 3. Retail RAG Scenario
+
+While we can explore the development journey in the abstract, it can help to have an application scenario to contextualize and frame the discussion. For convenience, let's repurpose the same application scenario used in the **Hybrid Track** (summarized below).
+
+!!! quote "Some labs (e.g., Model Selection) may be general-purpose and not reflect this specific scenario."
+
+_Assume that you are building the retail copilot chat AI (backend) that can be accessed from the Contoso Outdoor UI (frontend) shown below. Your chat AI needs to do the following:_
 
 - Answer customer queries in natural language (= generative AI)
 - Give answers grounded in product data (= RAG design pattern)
 - Give answers that are **also** coherent, fluent & relevant (= evaluators)
 - Block customer requests that have harmful intent (= content safety)
 - Block customer requests that break the rules (= jailbreak protection)
+
+![Contoso Chat](./../../1-Hybrid-Workshop/img/contoso-chat.png)
+
+- See: [Contoso Outdoor](./../../1-Hybrid-Workshop/1-Overview/01.md#2-contoso-outdoor-chat-ui) to understand the enterprise retail application scenario.
+- See: [Application Data](./../../1-Hybrid-Workshop/1-Overview/02.md) to understand customer, product & manual data formats.
+
+---
\ No newline at end of file
diff --git a/docs/2-Portal-First/1-Overview/02.md b/docs/2-Portal-First/1-Overview/02.md
index c3c2e90..7e315c9 100644
--- a/docs/2-Portal-First/1-Overview/02.md
+++ b/docs/2-Portal-First/1-Overview/02.md
@@ -1,19 +1,24 @@
-# 1.2 Azure AI Foundry Portal
+# 1.3 Azure AI Foundry Portal
 
-In this workshop track we will prioritize the use of the browser-based web UI (Azure AI Foundry portal). Before we begin, let's understand the architecture and some key concepts. Use the following resources for self-guided exploration of the docs:
+In the previous section, we spoke broadly about the Generative AI Operations toolchains in Azure AI. In this section, we'll dive into more details about the **Azure AI Foundry** platform that streamlines this experience for AI Engineers and application developers.
+
+![GenAIOps toolchain](./../img/overview-genaiops-toolchains.png)
+
+Want a deeper dive into the details of the platform? Start with these two resources:
 
 - [Azure AI Foundry Documentation](https://learn.microsoft.com/en-us/azure/ai-studio/) - canonical source.
 - [Azure AI Foundry Documentation Markmap](https://markmap.js.org/full#?d=github%3Anitya%2Flearns-with-markmaps%40refs%3Aheads%2Fmain%2Fdocs%2Fazure-ai-foundry.mm.md) - interactive visualization
 
 ---
+
 ## 1. Azure AI Foundry Architecture
 
-!!! info "UNDERSTAND THE KEY COMPONENTS OF THE AZURE AI FOUNDRY ARCHITECTURE"
+!!! info "Azure AI Foundry is the recommended platform for E2E development of _customizable_ gen AI apps on Azure"
 
-    ![Landing](https://learn.microsoft.com/en-us/azure/ai-studio/media/concepts/ai-studio-architecture.png)
+![Landing](https://learn.microsoft.com/en-us/azure/ai-studio/media/concepts/ai-studio-architecture.png)
 
-1. The Azure AI Foundry provides _a unified experience_ for building, evaluating, and deploying, AI models and applications on Azure.
+1. The Azure AI Foundry platform provides _a unified experience_ for building, evaluating, and deploying AI models and applications on Azure.
 
 1. Developers can build applications end-to-end using the web portal (low-code), the SDK (code-first) or the CLI (code-agnostic) based on preferences.
 
@@ -63,7 +68,7 @@ You can accomplish these tasks from this page:
 
 An Azure resource provider is a set of REST operations that enable functionality for a specific Azure service. _Registering_ resource providers helps you define the Azure resources you can deploy to your account (subscription).
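As a quick sanity check, you can inspect the registration state of these providers programmatically. The sketch below is illustrative, assuming the `azure-identity` and `azure-mgmt-resource` packages and a placeholder subscription ID; the provider list mirrors the REVIEW sections that follow.

```python
# Hedged sketch: check (and optionally kick off) registration of the
# resource providers this workshop relies on. Assumes
# `pip install azure-identity azure-mgmt-resource` and a signed-in identity.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

REQUIRED_PROVIDERS = [
    "Microsoft.MachineLearningServices",  # AI hub & project workspaces
    "Microsoft.CognitiveServices",        # AI Services & Azure OpenAI
    "Microsoft.Storage",                  # artifact storage
    "Microsoft.KeyVault",                 # secrets
    "Microsoft.Search",                   # search & retrieval (RAG)
]

client = ResourceManagementClient(
    DefaultAzureCredential(), "<your-subscription-id>"  # placeholder value
)

for namespace in REQUIRED_PROVIDERS:
    provider = client.providers.get(namespace)
    print(f"{namespace}: {provider.registration_state}")
    if provider.registration_state != "Registered":
        client.providers.register(namespace)  # starts async registration
```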
-??? info "Review this section to learn about Azure & AI resource providers _(click to expand)_"
+??? info "REVIEW: Learn about Azure & AI resource providers _(click to expand)_"
 
     A resource type's name follows the format: _{resource-provider}/{resource-type}_.
 
@@ -75,7 +80,7 @@ An Azure resource provider is a set of REST operations that enable functionality
 
 The Azure AI Foundry is built on the Azure Machine Learning resource provider, and takes a dependency on several other Azure services.
 
-??? info "Review list of REQUIRED resource providers for Azure AI. _(expand to view)_"
+??? info "REVIEW: list of _required_ resource providers for Azure AI. _(click to expand)_"
 
     1. `Microsoft.MachineLearningServices/workspace (kind=hub)` - for hub
     1. `Microsoft.MachineLearningServices/workspace (kind=project)` - for project
     1. `Microsoft.CognitiveServices/account (kind=AIServices)` - for AI Services
     1. `Microsoft.CognitiveServices/account (kind=OpenAI)` - for Azure OpenAI
     1. `Microsoft.Storage/storageAccounts` - for storing artifacts
     1. `Microsoft.KeyVault/vaults` - for storing secrets
 
-??? info "Review list of ADDITIONAL resource providers useful for RAG. _(click to expand)_"
+??? info "REVIEW: list of _additional_ resource providers useful for RAG. _(click to expand)_"
 
     1. `Microsoft.Search/searchServices` - for search & retrieval
     1. `Microsoft.ContainerRegistry/registries` - for registering docker images
 
@@ -114,7 +119,7 @@ Review these links to accomplish these tasks using the Azure Portal (in browser)
 
 1. Every project **must** have a parent hub. Every hub **may** have one or more child projects.
 1. Hubs are **collaboration** environments (team). Projects are **development** environments (app).
 
-??? quote "FIGURE: Understand how AI hub, project, and services, resources interact _(click to expand)_"
+??? info "FIGURE: Understand how AI hub, project, and services resources interact _(click to expand)_"
     ![Landing](https://learn.microsoft.com/en-us/azure/ai-studio/media/concepts/resource-provider-connected-resources.svg)
 
 ---
diff --git a/docs/2-Portal-First/2-Setup/01.md b/docs/2-Portal-First/2-Setup/01.md
deleted file mode 100644
index b4c2eed..0000000
--- a/docs/2-Portal-First/2-Setup/01.md
+++ /dev/null
@@ -1,155 +0,0 @@
-# 1. Model Selection
-
-Models are the brains of our generative AI applications. The first step of your end-to-end developer workflow is _model selection_. This consists of three steps:
-
-1. Discovery - see if there exists an AI model for your need.
-1. Selection - choose the right model from available matches.
-1. Usage - deploy, customize, and evaluate, the model for fit.
-
-The Azure AI Foundry portal helps support the model selection journey with three features:
-
-1. [Model catalog](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/model-catalog-overview) - for discovery
-1. [Model benchmarks](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/model-benchmarks) - for comparison
-1. [Model deployment](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/deployments-overview) - for evaluation
-
-Let's explore some of these features. By the end of this section you should know how to:
-
-- Filter the model catalog to discover relevant models
-- Compare selected models using the model benchmarks
-- Select a model and explore its model card for details
-- Deploy a model to Azure AI Foundry for hands-on usage
-
-
----
-
-## 1.1 Model catalog
-
-The Azure AI Foundry model catalog is the starting point for model selection. It currently has 1800+ frontier, industry, and open-source, models that can be filtered by collection, industy, deployment option, inference task, and license. You can also take advantage of the built-in search capability to find models by name or other criteria. Let's explore this.
-
-Start by opening a new private browser in guest mode and navigating to the [Azure AI Model catalog](https://ai.azure.com/explore/models) page in Azure AI Foundry. You should see this. **Note model count** (ex: 1819 models)
-
-![Selection](./../img/setup-selection-01.png)
-
----
-
-### 1.1.1 Filter By Inference Task
-
-The first step is to see if the catalog has _any_ models that will fit your specific needs. Typically, this will involve knowing the **inference task** you want to perform, and **filtering** the catalog to see matching options. Inference tasks can fall under various categories like:
-
-- natural language processing (e.g., text generation, question answering),
-- computer vision (e.g., image classification, image segmentation)
-- audio (text-to-speech, audio generation)
-- multimodal (visual question answering, document question answering) etc.
-
-!!! task "Filter the catalog by a specific inference task to see matching models"
-
-1. Filter by [Text generation](https://ai.azure.com/explore/models?selectedTask=text-generation) → see: 375+ models
-1. Filter by [Embeddings](https://ai.azure.com/explore/models?selectedTask=embeddings) → see: 11+ models
-1. Filter by [Chat completion](https://ai.azure.com/explore/models?selectedTask=chat-completion) → see: 62+ models
-
----
-
-### 1.1.2 Filter By Deployment Type
-
-Now let's look at the first filter (text generation) - this gives us 375+ results that match. How can we filter this down further? One way is to filter by deployment options.
-
-- [Managed compute](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-managed) - provides a managed online endpoint (API) in a provisioned VM.
-- [Serverless API](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-models-serverless?view=azureml-api-2&tabs=azure-studio) - provide pay-as-you-go billing and a models-as-a-service (MaaS) approach.
-
-The serverless API option can be more cost-effective and does not consume your model quota while still providing enterprise security and compliance guarantees. Let's try this out:
-
-!!! task "Filter the catalog by a inference task & deployment type to see matching models"
-
-1. First, Filter by [Text generation](https://ai.azure.com/explore/models?selectedTask=text-generation) → see: 375+ models
-1. Then, Filter By [Serverless API](https://ai.azure.com/explore/models?selectedTask=text-generation&selectedDeploymentTypes=serverless-inference) deployment → see: 3 models (manageable subset)
-
----
-
-### 1.1.3 Filter By Collection Type
-
-Another way to filter models is by _collection_. At a high level, there are 3 key collections:
-
-- [Curated by AI](https://ai.azure.com/explore/models?selectedCollection=-curated-by-azure-ai-,aoai,phi,meta,mistral,nvidia,ai21,deci,nixtla,core42,cohere,databricks,snowflake,sdaia,paige,bria,nttdata,saifr,rockwell,bayer,cerence,sightmachine) - frontier models that have been scanned for vulnerabilities.
-- [Hugging Face](https://ai.azure.com/explore/models?selectedCollection=huggingface) - open-source model variants from the community
-- [Benchmark Results](https://ai.azure.com/explore/models?selectedCollection=-benchmark-results-) - models that we can compare benchmarks on
-
-You can also select a specific model provider in the collections filter, to see only models from that provider. This is a particularly useful filter to use if you want to prioritize using an **open-source** model, or want to pick models that you can compare benchmarks on. Let's try it.
-
-!!! task "Filter the catalog by inference task and benchmark results collection to see matching models"
-
-1. First, Filter by [Text generation](https://ai.azure.com/explore/models?selectedTask=text-generation) → see: 375+ models
-1. Then, Filter By [Benchmark Results](https://ai.azure.com/explore/models?selectedTask=text-generation&selectedCollection=-benchmark-results-) collection → see: 22 models (that I can compare)
-1. OR Filter by [Hugging Face](https://ai.azure.com/explore/models?selectedTask=text-generation&selectedCollection=huggingface) → see: 322 models (that are open-source)
-
----
-
-### 1.1.4 Filter By Industry Domain
-
-Last but not least, we now have a specialized filter for Industry, allowing you to select models that have been specifically curated and tailored for use in vertical domains like Health and Life Sciences, Financial Services etc. Because these are industry-specific, they can be more effective as the _first_ filter for discovery. Let's try it.
-
-
-!!! task "Filter the catalog by a industry to see matching models"
-
-1. Filter by [Financial Services](https://ai.azure.com/explore/models?selectedIndustryFilter=financial-services) → see: 10 models _including Saifr_ → Clear results
-1. First,Filter by [Health & Life Sciences Industry](https://ai.azure.com/explore/models?selectedIndustryFilter=health-and-life-sciences) → see: 20 models
-1. Then, Filter by [Embeddings Inference Task](https://ai.azure.com/explore/models?selectedIndustryFilter=health-and-life-sciences,financial-services) → see: 2 models
-
-
----
-
-### 1.1.5 Search By Keyword
-
-Sometimes, the predefined filters are not sufficient to reduce the model subset to a manageable level for manual evaluation. This can be for various reasons:
-
-1. You want to see if the catalog _has a specific model name_.
-1. The model inference task _may not be a standard option_.
-1. You want to see if there are models _with a specific capability_.
-
-
-!!! task "Example 1 (Taxonomy mismatch) - search by category name"
-
-
-1. First, look for [Embeddings](https://ai.azure.com/explore/models?selectedTask=embeddings) inference task. → see: 10 models (no Hugging Face)
-1. Now, search for "Sentence Similarity" ([HF taxonomy](https://huggingface.co/tasks/sentence-similarity)) → see: 7 open-source models
-
-
-!!! task "Example 2 (Known entity) - search by name"
-
-1. Search for "smol" → see: 1 model = flagship SLM from Hugging Face
-1. Search for "unsloth" → see: 2 models = from specific community creator
-
-
-!!! task "Example 3 (Other keywords) - search by capability"
-
-1. Search for "sql" → see: 2 models = create sql queries using natural language
-1. Search for "biomed" → see: models = focus on biomedical applications & data
-
----
-
-## 1.2 Model Benchmarks
-
----
-
-## 1.3 Data, Privacy & Security
-
-When you deploy and customize models in the Azure AI Foundry portal, the service processes data in different contexts. These include:
-
-1. **Prompts** - the initial user request, as well as metaprompts and RAG-enhanced context.
-1. **Generated Content** - the response generated by the model, potentially as chat history.
-1. **Uploaded Data** - loaded into a datastore for fine-tuning or other AI customization.
-1. **Content Filters** - analyze prompts and responses to detect and filter harmful content.
-
-Some of the common questions on data security and privacy are:
-
-- Is the data stored, shared with providers, or reused for training?
-- Who is processing the data, and what are their data commitments?
-
-To learn more, check out these two core resources:
-
-1. [Data, privacy, and security for use of models through the model catalog](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/concept-data-privacy)
-1. [Data, privacy, and security for Azure AI Content Safety](https://learn.microsoft.com/en-us/legal/cognitive-services/content-safety/data-privacy)
-
-
----
-
-_This completes the guest tour of the Azure AI Foundry Portal. To explore further capabilities, you will need to login with an Azure subscription (as explored in next section). First, three things to know: Azure AI Foundry architecture, Azure AI Project resource, and Management Center_.
\ No newline at end of file
diff --git a/docs/2-Portal-First/2-Setup/Lab-01.md b/docs/2-Portal-First/2-Setup/Lab-01.md
new file mode 100644
index 0000000..085ad37
--- /dev/null
+++ b/docs/2-Portal-First/2-Setup/Lab-01.md
@@ -0,0 +1,313 @@
+# 1. Lab: Model Selection
+
+??? info "GENAIOPS: Model Selection is part of the _Getting Started_ stage. (click to view figure)"
+    ![GenAIOps toolchain](./../img/overview-genaiops-toolchains.png)
+
+Models are the brains of our generative AI applications. The first step of your end-to-end developer workflow is _model selection_. This consists of three steps:
+
+1. Discovery - see if there exists an AI model for your need.
+1. Selection - choose the right model from available matches.
+1. Usage - deploy, customize, and evaluate the model for fit.
+
+The Azure AI Foundry portal helps support the model selection journey with three features:
+
+1. [Model catalog](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/model-catalog-overview) - for discovery
+1. [Model benchmarks](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/model-benchmarks) - for comparison
+1. [Model deployment](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/deployments-overview) - for evaluation
+
+!!! quote "By the end of this section you should know how to:"
+
+    - [X] Filter the model catalog to discover relevant models
+    - [X] Compare selected models using the model benchmarks
+    - [X] Select a model and explore its model card for details
+    - [X] Start a model deployment to experiment with it in Azure
+
+---
+
+## 1.1 Model catalog
+
+The Azure AI Foundry model catalog is the starting point for model selection. It currently has 1800+ frontier, industry, and open-source models that can be filtered by collection, industry, deployment option, inference task, and license. You can also take advantage of the built-in search capability to find models by name or other criteria. Let's explore this.
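Before clicking through the filters below, note that each catalog filter is simply encoded as a query parameter on the catalog page URL - you can see this in the links used throughout this lab. An illustrative Python helper like the one below can build shareable filtered-catalog links; the parameter names (`selectedTask`, `selectedCollection`, `selectedDeploymentTypes`, `selectedIndustryFilter`, `selectedFineTuningTask`) are taken from those links.

```python
# Illustrative helper: build filtered Azure AI model catalog URLs.
# Parameter names come from the catalog links used in this lab.
from urllib.parse import urlencode

CATALOG = "https://ai.azure.com/explore/models"

def catalog_url(**filters: str) -> str:
    """Return a catalog URL with the given filter query parameters."""
    return f"{CATALOG}?{urlencode(filters)}" if filters else CATALOG

# Example: chat-completion models in the benchmark results collection.
print(catalog_url(selectedTask="chat-completion",
                  selectedCollection="-benchmark-results-"))
```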
+
+Start by opening a new private browser in guest mode and navigating to the [Azure AI Model catalog](https://ai.azure.com/explore/models) page in Azure AI Foundry. You should see this:
+
+??? info "FIGURE: click to expand for example screenshot. **Note model count** (ex: 1819 models)"
+    ![Selection](./../img/setup-selection-01.png)
+
+---
+
+### 1.1.1 Filter By Inference Task
+
+The first step is to see if the catalog has _any_ models that will fit your specific needs. Typically, this will involve knowing the **inference task** you want to perform, and **filtering** the catalog to see matching options. Inference tasks can fall under various categories like:
+
+- natural language processing (e.g., text generation, question answering)
+- computer vision (e.g., image classification, image segmentation)
+- audio (text-to-speech, audio generation)
+- multimodal (visual question answering, document question answering) etc.
+
+!!! task "Filter the catalog by a specific inference task to see matching models"
+
+1. Filter by [Text generation](https://ai.azure.com/explore/models?selectedTask=text-generation) → see: 375+ models
+1. Filter by [Embeddings](https://ai.azure.com/explore/models?selectedTask=embeddings) → see: 11+ models
+1. Filter by [Chat completion](https://ai.azure.com/explore/models?selectedTask=chat-completion) → see: 62+ models
+
+---
+
+### 1.1.2 Filter By Deployment Type
+
+Now let's look at the first filter (text generation) - this gives us 375+ results that match. How can we filter this down further? One way is to filter by deployment options.
+
+- [Managed compute](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/deploy-models-managed) - provides a managed online endpoint (API) in a provisioned VM.
+- [Serverless API](https://learn.microsoft.com/en-us/azure/machine-learning/how-to-deploy-models-serverless?view=azureml-api-2&tabs=azure-studio) - provides pay-as-you-go billing and a models-as-a-service (MaaS) approach.
+
+The serverless API option can be more cost-effective and does not consume your model quota, while still providing enterprise security and compliance guarantees. Let's try this out:
+
+!!! task "Filter the catalog by an inference task & deployment type to see matching models"
+
+1. First, Filter by [Text generation](https://ai.azure.com/explore/models?selectedTask=text-generation) → see: 375+ models
+1. Then, Filter By [Serverless API](https://ai.azure.com/explore/models?selectedTask=text-generation&selectedDeploymentTypes=serverless-inference) deployment → see: 3 models (manageable subset)
+
+---
+
+### 1.1.3 Filter By Collection Type
+
+Another way to filter models is by _collection_. At a high level, there are 3 key collections:
+
+- [Curated by Azure AI](https://ai.azure.com/explore/models?selectedCollection=-curated-by-azure-ai-,aoai,phi,meta,mistral,nvidia,ai21,deci,nixtla,core42,cohere,databricks,snowflake,sdaia,paige,bria,nttdata,saifr,rockwell,bayer,cerence,sightmachine) - frontier models that have been scanned for vulnerabilities.
+- [Hugging Face](https://ai.azure.com/explore/models?selectedCollection=huggingface) - open-source model variants from the community
+- [Benchmark Results](https://ai.azure.com/explore/models?selectedCollection=-benchmark-results-) - models that we can compare benchmarks on
+
+You can also select a specific model provider in the collections filter, to see only models from that provider. This is a particularly useful filter to use if you want to prioritize using an **open-source** model, or want to pick models that you can compare benchmarks on. Let's try it.
+
+!!! task "Filter the catalog by inference task and benchmark results collection to see matching models"
+
+1. First, Filter by [Text generation](https://ai.azure.com/explore/models?selectedTask=text-generation) → see: 375+ models
+1. Then, Filter By [Benchmark Results](https://ai.azure.com/explore/models?selectedTask=text-generation&selectedCollection=-benchmark-results-) collection → see: 22 models (that I can compare)
+1. OR Filter by [Hugging Face](https://ai.azure.com/explore/models?selectedTask=text-generation&selectedCollection=huggingface) → see: 322 models (that are open-source)
+
+---
+
+### 1.1.4 Filter By Industry Domain
+
+Last but not least, we now have a specialized filter for Industry, allowing you to select models that have been specifically curated and tailored for use in vertical domains like Health and Life Sciences, Financial Services, etc. Because these are industry-specific, they can be more effective as the _first_ filter for discovery. Let's try it.
+
+!!! task "Filter the catalog by an industry to see matching models"
+
+1. Filter by [Financial Services](https://ai.azure.com/explore/models?selectedIndustryFilter=financial-services) → see: 10 models _including Saifr_ → Clear results
+1. First, Filter by [Health & Life Sciences Industry](https://ai.azure.com/explore/models?selectedIndustryFilter=health-and-life-sciences) → see: 20 models
+1. Then, Filter by [Embeddings Inference Task](https://ai.azure.com/explore/models?selectedIndustryFilter=health-and-life-sciences,financial-services) → see: 2 models
+
+---
+
+### 1.1.5 Filter By Fine-Tuning Task
+
+Model selection is typically followed by model **customization** - using prompt engineering, retrieval augmented generation, or fine-tuning - to improve the model response to suit your application quality and safety criteria. [Fine-tuning](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/fine-tuning-overview) works by performing _additional training_ on an existing pre-trained model, using a relevant new dataset to enhance performance or add new skills.
+
+Currently only a subset of models in the catalog can be fine-tuned, and these may have [added constraints](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/fine-tuning-overview#supported-models-for-fine-tuning) like regional availability for fine-tuning. Let's see how this works.
+
+!!! task "Filter the catalog for a fine-tuning model for text generation"
+
+1. Filter by [Text generation for INFERENCE](https://ai.azure.com/explore/models?selectedTask=text-generation) → see: 375+ models → Clear results
+1. Filter by [Text generation for FINE-TUNING](https://ai.azure.com/explore/models?selectedFineTuningTask=text-generation) → see: 14 models
+1. Then, Filter by [Serverless API Deployment](https://ai.azure.com/explore/models?selectedFineTuningTask=text-generation&selectedDeploymentTypes=serverless-inference) → see: 3 models (Llama-2)
+
+---
+
+### 1.1.6 Search By Keyword
+
+Sometimes, the predefined filters are not sufficient to reduce the model subset to a manageable level for manual evaluation. This can be for various reasons:
+
+1. You want to see if the catalog _has a specific model name_.
+1. The model inference task _may not be a standard option_.
+1. You want to see if there are models _with a specific capability_.
+
+!!! task "Example 1 (Taxonomy mismatch) - search by category name"
+
+1. First, look for [Embeddings](https://ai.azure.com/explore/models?selectedTask=embeddings) inference task. → see: 10 models (no Hugging Face)
+1. Now, search for "Sentence Similarity" ([HF taxonomy](https://huggingface.co/tasks/sentence-similarity)) → see: 7 open-source models
+
+!!! task "Example 2 (Known entity) - search by name"
+
+1. Search for "smol" → see: 1 model = flagship SLM from Hugging Face
+1. Search for "unsloth" → see: 2 models = from specific community creator
+
+!!! task "Example 3 (Other keywords) - search by capability"
+
+1. Search for "sql" → see: 2 models = create sql queries using natural language
+1. Search for "biomed" → see: models = focus on biomedical applications & data
+
+---
+
+## 1.2 Model Benchmarks
+
+### 1.2.1 Filter By Benchmarks
+
+For RAG architectures, we need a _chat completion_ model and an _embedding_ model. To select a model for prototyping, we'll filter by inference task, then look for models with benchmarks, then compare a few by available metrics to make a decision. **Let's find our chat model**:
+
+1. Filter by [Chat Completion](https://ai.azure.com/explore/models?selectedTask=chat-completion) → see: 62 models
+1. Now, Filter by [Benchmark Results](https://ai.azure.com/explore/models?selectedTask=chat-completion&selectedCollection=-benchmark-results-) → see: 51 models
+1. You should see something like this:
+
+    ??? info "FIGURE: click to expand for example screenshot"
+        ![](./../img/setup-filter-benchmarks.png)
+
+1. Click **Compare Models** → see: [Assess model performance with evaluated metrics](https://ai.azure.com/explore/models/benchmarks)
+
+Let's use this page to compare the model options by available benchmarks.
+
+---
+
+### 1.2.2 Compare By Benchmarks
+
+In the previous step, we saw 51 choices that included the 4 models below.
+
+- _gpt-4o, gpt-4o-mini, AI21-Jamba-1.5-Mini, and Phi-3-mini-128k-instruct_
+
+Let's use these as a sample for an exercise in using benchmarks for model selection.
+
+1. The [Benchmarks Compare View](https://ai.azure.com/explore/models/benchmarks) will have default models selected. **Delete the defaults.**
+1. Now, add the 4 models above (one at a time) using the `+ Model to compare` button.
+1. You should see something like this:
+
+    ??? info "FIGURE: click to expand for example screenshot"
+        ![](./../img/setup-model-benchmarks.png)
+
+1. Explore the available criteria for comparisons (click each drop-down in the chart)
+    - Criteria include: _quality_, _embeddings_, _cost_ and _latency_.
+1. Select _Accuracy_ for the x-axis and _Cost_ for the y-axis as shown in the figure above
+    - The chart will update to show where models fit on this comparison
+    - Higher accuracy values - and lower cost values - are better.
+1. Observe the chart. We can see:
+    - the `AI21-Jamba-1.5-Mini` model costs the least but is also the least accurate
+    - the `gpt-4o` model has the highest accuracy but also the highest cost.
+    - the `gpt-4o-mini` has comparable cost to (1) and is second in accuracy to (2).
+1. Make an informed decision: select `gpt-4o-mini` - we'll review the Model card in the next section to determine next steps. (A sketch of this decision rule follows below.)
+
+!!! task "HOMEWORK: Walk through a similar process to select an **embedding** model."
+
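The informed decision above boils down to a simple rule: among the models whose cost you can afford, pick the most accurate. Here is an illustrative Python sketch of that rule. The metric values are hypothetical placeholders (read the real ones off the benchmark chart); only the relative ordering mirrors what we observed above.

```python
# Illustrative decision rule for section 1.2.2: pick the most accurate
# model within a cost budget. Values are HYPOTHETICAL placeholders;
# only the relative ordering mirrors the portal's benchmark chart.
candidates = {
    # model name:               (accuracy, relative cost)
    "gpt-4o":                   (0.90, 10.0),  # highest accuracy, highest cost
    "gpt-4o-mini":              (0.85, 1.0),   # near-best accuracy, low cost
    "AI21-Jamba-1.5-Mini":      (0.70, 0.8),   # lowest cost, least accurate
    "Phi-3-mini-128k-instruct": (0.75, 0.9),
}

budget = 2.0  # hypothetical cost ceiling for the prototype
affordable = {m: v for m, v in candidates.items() if v[1] <= budget}
best = max(affordable, key=lambda m: affordable[m][0])
print(best)  # → gpt-4o-mini, matching the choice made above
```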
+---
+
+### 1.2.3 List By Benchmarks
+
+The compare view above lets you assess model choices relative to each other based on specific criteria like accuracy, cost and other metrics. The **list view** provides more detailed metrics for each model, giving insights into their effectiveness for various tasks. Learn more:
+
+1. [Benchmarking of LLMs and SLMs](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/model-benchmarks#benchmarking-of-llms-and-slms).
+
+1. [Benchmarking of embedding models](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/model-benchmarks#benchmarking-of-embedding-models).
+
+Let's explore this briefly for the `gpt-4o-mini` model we selected earlier.
+
+1. Search for the model by name as shown below. You should see:
+    - A row of benchmarks for that model, each with a model version and associated dataset
+    - Each row has columns for relevant quality metrics (with values, where assessed)
+    - The top row provides the _average_ for each metric, across all assessed benchmarks
+
+    ??? info "FIGURE: click to expand for example screenshot"
+        ![](./../img/setup-benchmark-listview.png)
+
+1. We see this model ranks well on accuracy and prompt-based metrics like coherence, fluency, and groundedness - but does less well on GPTSimilarity. See: [Quality docs](https://learn.microsoft.com/en-us/azure/ai-studio/concepts/model-benchmarks#quality) for explainers on what each metric means. _Overall, we see the selected model quality is acceptable_.
+1. Each row of benchmarks for a model defines a _dataset_ and a _task_. The dataset contains examples of inputs relevant to the task, along with information to assess the quality of the model response to that input. The resulting quality metrics are listed in that row. Click on a dataset to get more details on what it does, and how.
+
+    - Ex 1: Click `human_eval`, which assesses _accuracy_ for Text generation tasks
+        - it assesses functional correctness of _code generation_ from a given word problem.
+        - it assesses this model at 0.841 accuracy for this text generation task.
+
+    ??? info "FIGURE: (click to expand) Dataset details for `HumanEval`"
+        ![Human eval](./../img/setup-benchmark-humaneval.png)
+
+    - Ex 2: Click `squad_v2`, which assesses _groundedness_ and _relevance_ for QA tasks
+        - it assesses _reading comprehension_ using questions on a Wikipedia dataset.
+        - it assesses this model at 4.146 for Groundedness and 3.753 for GPTSimilarity.
+
+    ??? info "FIGURE: (click to expand) Dataset details for `squad_v2`"
+        ![squad](./../img/setup-benchmark-squadv2.png)
+
+This allows us to get a quick sense of the _general suitability_ of the selected model based on benchmarks. The next step is to explore the model card.
+
+---
+
+## 1.3 Model card
+
+The model card for a selected model provides all the necessary information to help you understand its capabilities, pricing, quality and more. And, it provides the starting point for **deploying** the model to explore it interactively.
+
+### 1.3.1 Overview
+
+1. Click the [gpt-4o-mini](https://ai.azure.com/explore/models/gpt-4o-mini/version/2024-07-18/registry/azure-openai) result to navigate to the model card in Azure.
+
+    - You should see this - note the links to pricing and estimated cost.
+
+    ??? info "FIGURE: (click to expand) Model Card Overview (Details tab - top)"
+        ![Details](./../img/setup-card-0-details.png)
+
+    - Scroll down. You see model provider details on tasks and benchmarks of relevance.
+
+    ??? info "FIGURE: (click to expand) Model Card Overview (Details tab - bottom)"
+        ![More details](./../img/setup-card-0-more-details.png)
+
+### 1.3.2 Benchmarks
+
+1. Click the `Benchmarks` tab in the model card.
+
+    - The top half of the page provides this view. Clicking [Compare with other models](https://ai.azure.com/explore/models/benchmarks?modelId=gpt-4o-mini) takes you to the Benchmarks view from earlier, but with this model as the main focus (and other example models for comparison).
+
+    ??? info "FIGURE: (click to expand) Benchmarks tab - compare other models"
+        ![Benchmarks](./../img/setup-card-1-benchmarks.png)
+
+    - Scroll down. You should see options to try evaluating the model with your own data.
+
+    ??? info "FIGURE: (click to expand) Benchmarks tab - try with your own data"
+        ![Benchmarks](./../img/setup-card-2-benchmarks.png)
+
+### 1.3.3 Deployment
+
+1. Click the `Code samples` tab in the model card. You should see code snippets for using this model _programmatically_ with the Azure AI Inference SDK, for various languages (a minimal Python sketch in this style appears at the end of this page). **But what if you want to explore this model in a playground in the portal?**
+
+    ??? info "FIGURE: (click to expand) Code Samples tab - pick your language"
+        ![Benchmarks](./../img/setup-card-3-code.png)
+
+1. Click the `Details` tab to get back to the overview (guest mode). Note the `Create a subscription to deploy` button, indicating we need to log into Azure before we can proceed. Let's do that next.
+
+    ??? info "FIGURE: (click to expand) Model Card Overview (Guest Mode)"
+        ![Details](./../img/setup-card-0-details.png)
+
+1. Logging in gives us a `Deploy` button as shown. Clicking that now gives you the choice of deploying the model to an _existing_ project, or _creating a new project_ for this purpose.
+
+    ??? info "FIGURE: (click to expand) Model Card Overview (Authenticated)"
+        ![Details](./../img/setup-card-4-deploy.png)
+
+In the [next lab](Lab-02.md) we'll continue from this point to explore project setup and model deployment. But first, a quick note on data, privacy, and security considerations when working with models in the Azure AI model catalog.
+
+---
+
+## 1.4 Data, Privacy & Security
+
+When you deploy and customize models in the Azure AI Foundry portal, the service processes data in different contexts. These include:
+
+1. **Prompts** - the initial user request, as well as metaprompts and RAG-enhanced context.
+1. **Generated Content** - the response generated by the model, potentially as chat history.
+1. **Uploaded Data** - loaded into a datastore for fine-tuning or other AI customization.
+1. **Content Filters** - analyze prompts and responses to detect and filter harmful content.
+
+Some of the common questions on data security and privacy are:
+
+- Is the data stored, shared with providers, or reused for training?
+- Who is processing the data, and what are their data commitments?
+
+To learn more, check out these two core resources:
+
+1. [Data, privacy, and security for use of models through the model catalog](https://learn.microsoft.com/en-us/azure/ai-studio/how-to/concept-data-privacy)
+1. [Data, privacy, and security for Azure AI Content Safety](https://learn.microsoft.com/en-us/legal/cognitive-services/content-safety/data-privacy)
+
+---
+
+_This completes the guest tour of the Azure AI Foundry Portal. To explore further capabilities, you will need to log in with an Azure subscription (as explored in the next section). But first, three things to know: Azure AI Foundry architecture, Azure AI Project resource, and Management Center_.
\ No newline at end of file
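To close the loop on the `Code samples` tab mentioned in the lab above: once you have a deployed model endpoint, a minimal chat completion call with the Azure AI Inference SDK (`pip install azure-ai-inference`) looks roughly like the sketch below. The environment variable names are placeholders - substitute the endpoint and key values from your own deployment.

```python
# A minimal sketch of calling a deployed model with the Azure AI
# Inference SDK, in the spirit of the model card's Code samples tab.
# AZURE_AI_ENDPOINT and AZURE_AI_KEY are placeholder variable names -
# copy the real endpoint and key from your deployment details.
import os
from azure.ai.inference import ChatCompletionsClient
from azure.ai.inference.models import SystemMessage, UserMessage
from azure.core.credentials import AzureKeyCredential

client = ChatCompletionsClient(
    endpoint=os.environ["AZURE_AI_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["AZURE_AI_KEY"]),
)

response = client.complete(
    model="gpt-4o-mini",  # the model selected in this lab
    messages=[
        SystemMessage(content="You are a helpful retail assistant."),
        UserMessage(content="What tents do you carry for winter camping?"),
    ],
)
print(response.choices[0].message.content)
```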
diff --git a/docs/2-Portal-First/2-Setup/Lab-02.md b/docs/2-Portal-First/2-Setup/Lab-02.md
new file mode 100644
index 0000000..f24ac59
--- /dev/null
+++ b/docs/2-Portal-First/2-Setup/Lab-02.md
@@ -0,0 +1,19 @@
+# 2. Lab: Model Deployment
+
+??? info "GENAIOPS: Model Deployment is part of the _Getting Started_ stage. (click to view figure)"
+    ![GenAIOps toolchain](./../img/overview-genaiops-toolchains.png)
+
+The first step of your end-to-end developer workflow is _model selection_, where you **discover** one or more models that can **meet** your application requirements.
+
+This second step focuses on _model deployment_, where you create an Azure AI project with a managed model deployment, and use the Azure AI Foundry portal features to **experiment** with the model (using scenario-specific prompts) to verify that it fits your application needs.
+
+!!! quote "By the end of this section you should know how to:"
+
+    - [X] Deploy a model from an Azure AI model catalog card
+    - [X] Create an Azure AI project with a customized hub for this purpose
+    - [X] View model deployment details in the Azure AI project
+    - [X] View project and hub resource details in the Management center
+    - [X] Launch the chat playground for the deployed model (and use it)
+    - [X] Explore additional model card tabs to understand capabilities
+
+---
\ No newline at end of file
diff --git a/docs/2-Portal-First/7-Troubleshooting/01.md b/docs/2-Portal-First/7-Troubleshooting/01.md
deleted file mode 100644
index e69de29..0000000
diff --git a/docs/2-Portal-First/img/overview-genaiops-flow.png b/docs/2-Portal-First/img/overview-genaiops-flow.png
new file mode 100644
index 0000000..18e95a7
Binary files /dev/null and b/docs/2-Portal-First/img/overview-genaiops-flow.png differ
diff --git a/docs/2-Portal-First/img/overview-genaiops-toolchains.png b/docs/2-Portal-First/img/overview-genaiops-toolchains.png
new file mode 100644
index 0000000..849875e
Binary files /dev/null and b/docs/2-Portal-First/img/overview-genaiops-toolchains.png differ
diff --git a/docs/2-Portal-First/img/setup-benchmark-humaneval.png b/docs/2-Portal-First/img/setup-benchmark-humaneval.png
new file mode 100644
index 0000000..56f051e
Binary files /dev/null and b/docs/2-Portal-First/img/setup-benchmark-humaneval.png differ
diff --git a/docs/2-Portal-First/img/setup-benchmark-listview.png b/docs/2-Portal-First/img/setup-benchmark-listview.png
new file mode 100644
index 0000000..6012ee2
Binary files /dev/null and b/docs/2-Portal-First/img/setup-benchmark-listview.png differ
diff --git a/docs/2-Portal-First/img/setup-benchmark-squadv2.png b/docs/2-Portal-First/img/setup-benchmark-squadv2.png
new file mode 100644
index 0000000..f96b89b
Binary files /dev/null and b/docs/2-Portal-First/img/setup-benchmark-squadv2.png differ
diff --git a/docs/2-Portal-First/img/setup-card-0-details.png b/docs/2-Portal-First/img/setup-card-0-details.png
new file mode 100644
index 0000000..27d305d
Binary files /dev/null and b/docs/2-Portal-First/img/setup-card-0-details.png differ
diff --git a/docs/2-Portal-First/img/setup-card-0-more-details.png b/docs/2-Portal-First/img/setup-card-0-more-details.png
new file mode 100644
index 0000000..e480283
Binary files /dev/null and b/docs/2-Portal-First/img/setup-card-0-more-details.png differ
diff --git a/docs/2-Portal-First/img/setup-card-1-benchmarks.png b/docs/2-Portal-First/img/setup-card-1-benchmarks.png
new file mode 100644
index 0000000..47fd9bd
Binary files /dev/null and b/docs/2-Portal-First/img/setup-card-1-benchmarks.png differ
diff --git a/docs/2-Portal-First/img/setup-card-2-benchmarks.png b/docs/2-Portal-First/img/setup-card-2-benchmarks.png
new file mode 100644
index 0000000..d593ea8
Binary files /dev/null and b/docs/2-Portal-First/img/setup-card-2-benchmarks.png differ
diff --git a/docs/2-Portal-First/img/setup-card-3-code.png b/docs/2-Portal-First/img/setup-card-3-code.png
new file mode 100644
index 0000000..c5311c8
Binary files /dev/null and b/docs/2-Portal-First/img/setup-card-3-code.png differ
diff --git a/docs/2-Portal-First/img/setup-card-4-deploy.png b/docs/2-Portal-First/img/setup-card-4-deploy.png
new file mode 100644
index 0000000..b28939b
Binary files /dev/null and b/docs/2-Portal-First/img/setup-card-4-deploy.png differ
diff --git a/docs/2-Portal-First/img/setup-filter-benchmarks.png b/docs/2-Portal-First/img/setup-filter-benchmarks.png
new file mode 100644
index 0000000..0ba051e
Binary files /dev/null and b/docs/2-Portal-First/img/setup-filter-benchmarks.png differ
diff --git a/docs/2-Portal-First/img/setup-model-benchmarks.png b/docs/2-Portal-First/img/setup-model-benchmarks.png
new file mode 100644
index 0000000..98bf324
Binary files /dev/null and b/docs/2-Portal-First/img/setup-model-benchmarks.png differ