From 02550bb9e02c84209d84f507ee51e38ae52f2a29 Mon Sep 17 00:00:00 2001
From: "David B. Kinder"
Date: Tue, 30 Jul 2024 13:19:53 -0700
Subject: [PATCH] doc: update 2024-2025 roadmap

Replace roadmap with updated content approved by OPEA TSC

Signed-off-by: David B. Kinder
---
 roadmap/2024-2025.md | 366 ++++++++++++++++++++++++++++---------------
 1 file changed, 244 insertions(+), 122 deletions(-)

diff --git a/roadmap/2024-2025.md b/roadmap/2024-2025.md
index bc60accc..874581e2 100644
--- a/roadmap/2024-2025.md
+++ b/roadmap/2024-2025.md
@@ -1,130 +1,252 @@
 # OPEA 2024 - 2025 Roadmap
 
-## Milestone 1 (May, Done)
-
-- [x] Components contribution
-  - [x] ASR
-  - [x] Data Prep
-  - [x] Embedding
-  - [x] Guardrails
-  - [x] LLM(Gaudi TGI)
-  - [x] RAG Rerank
-  - [x] RAG Retrieval
-  - [x] TTS
-  - [x] RAG VectorDB
-  - [x] Open Telemetry support
-
-- [x] UseCases/Examples
-  - [x] ChatQnA
-  - [x] CodeGen
-  - [x] CodeTrans
-
-- [x] Cloud Native
-  - [x] OneClickOPEA on ChatQnA
-  - [x] OneClickOPEA on CodeGen
-
-- [x] Evaluation & Others
-  - [x] CICD & Validation
-  - [x] lm-eval-harness
-  - [x] bigcode-eval-harness
-  - [x] End2End evaluation on GenAIComps & GenAIExamples
-
-## Milestone 2 (June)
-
-- [ ] Components contribution
-  - [ ] LLM on Xeon by vLLM + Ray, Ollama
-  - [ ] OVMS
-  - [ ] Prompting
-  - [ ] User Feedback Management
-  - [ ] MI6 Mega Components(MI6 RAG Service)
-
-- [ ] UseCases/Examples
-  - [ ] DocSum
-  - [ ] SearchQnA
-  - [ ] FAQGen
-  - [ ] End-to-end RAG example using OPEA on Xeon and cloud
-
-- [ ] Cloud Native
-  - [ ] OneClickOPEA on 2 or more examples
-
-- [ ] Evaluation & Others
-  - [ ] CICD & Validation
-  - [ ] End2End evaluation on GenAIComps & GenAIExamples
-  - [ ] RAG evaluation
-
-## Milestone 3 (July)
-
-- [ ] Components contribution
-  - [ ] LLM on Gaudi by vLLM + Ray
-  - [ ] LVM on Gaudi by vLLM + Ray
-  - [ ] VectorDB(svs)
-  - [ ] Telemetry
-
-- [ ] UseCases/Examples
-  - [ ] VisualQnA
-  - [ ] Windows Desktop App for AIPC
-
-- [ ] Cloud Native
-  - [ ] OpenShift enablement
-  - [ ] GenAI Microservice Connector
-  - [ ] OneClickOPEA on 3 or more examples
-
-- [ ] Evaluation & Others
-  - [ ] CICD & Validation
-  - [ ] End2End evaluation on GenAIComps & GenAIExamples
-
-## Milestone 4 (Aug)
-
-- [ ] Components contribution
-  - [ ] Documentation
-  - [ ] Automation test script
-
-- [ ] UseCases/Examples
-  - [ ] Documentation
-  - [ ] Automation test script
-
-- [ ] Cloud Native
-  - [ ] K8s Resource Management
-  - [ ] Documentation
-  - [ ] AutoScaler Analysis
-
-- [ ] Evaluation & Others
-  - [ ] CICD & Validation
-  - [ ] End2End evaluation on GenAIComps & GenAIExamples
-
-## Milestone 5 (from Sep to Dec)
-
-- [ ] Components contribution
-  - [ ] More micro service components for image and video
-  - [ ] Fine-tuning support
-  - [ ] Knowledge graph support
-  - [ ] OPEA Playground support
-
-- [ ] UseCases/Examples
-  - [ ] More use cases like language translation and AudioQnA
-
-- [ ] Cloud Native
-  - [ ] Docker Containerization through Docker Composer
-  - [ ] Static tuning on Resource management for deployment
-
-- [ ] Evaluation & Others
-  - [ ] CICD & Validation
-  - [ ] End2End evaluation
+## May 2024
 
+### Contribution
 
-## Milestone 6 (2025)
+- **Components**
+  - ASR
+  - Data Prep
+  - Embedding
+  - Guardrails
+  - LLM (Gaudi TGI)
+  - Rerank
+  - Retrieval
+  - TTS
+  - VectorDB
 
-- [ ] Components contribution
-  - [ ] More micro service components per community request
-  - [ ] AI Agent support
+- **Use Cases/Examples**
+  - ChatQnA
+  - CodeGen
+  - CodeTrans
 
-- [ ] UseCases/Examples
-  - [ ] No code solution (GenAIStudio)
-  - [ ] OPEA Model Hub
+- **Cloud Native**
+  - OneClick OPEA on ChatQnA
+  - OneClick OPEA on CodeGen
+  - GenAI microservice connector
 
-- [ ] Cloud Native
-  - [ ] Dynamic tuning on Resource management through K8s
+- **Evaluation & Others**
+  - CICD & Validation
+  - Eval: E2E (GenAIComps & GenAIExamples), lm-eval-harness, bigcode-eval-harness
+  - RAGAS evaluation service
 
-- [ ] Evaluation & Others
-  - [ ] CICD & Validation
-  - [ ] End2End evaluation
+### AI Models
+
+- LLM: llama2 (7b, 13b, 70b), llama3 (8b, 70b), code-llama, Llama guard
+- Embedding: BGE-base
+
+### AI Tools Integration
+
+- VectorDB: Chroma
+- Framework: Langchain
+
+### Deployment Type
+
+- On Prem, IDC (Xeon, Gaudi)
+
+## June 2024
+
+### Contribution
+
+- **Components**
+  - LLM (Xeon vLLM & Ray, Ollama)
+  - OVMS
+  - prompting
+  - user feedback management
+  - Mega Component (MI6 RAG service)
+
+- **Use Cases/Examples**
+  - DocSum
+  - SearchQnA
+
+- **Cloud Native**
+  - OneClick OPEA for 2 more examples
+  - GMC with switch support (dynamic pipelines)
+  - Helm charts/templates for custom yamls (refactoring)
+
+- **Evaluation & Others**
+  - CICD & Validation
+  - Eval: E2E (GenAIComps & GenAIExamples) Gaudi (2) and CPUs in CICD cluster
+
+### AI Models
+
+- LLM: mistral-7B, mixtral-8x7B
+- Embedding: E5-mistral-7b-instruct, all-mpnet-base-v2
+
+### AI Tools Integration
+
+- VectorDB: Pinecone, Redis
+- Framework: Llamaindex, Haystack
+
+### Deployment Type
+
+- On Prem, IDC (Xeon, Gaudi)
+
+## July 2024
+
+### Contribution
+
+- **Components**
+  - LVM (Gaudi vLLM & Ray)
+  - vectordb (svs)
+  - Gateway guardrail, Auth Z/N
+
+- **Use Cases/Examples**
+  - FAQGen
+
+- **Cloud Native**
+  - OpenShift enablement for OPEA
+  - OneClick OPEA for 3 more examples
+  - Security (Service Mesh, guardrails)
+
+- **Evaluation & Others**
+  - CICD & Validation
+  - Eval: E2E (GenAIComps & GenAIExamples)
+
+### AI Models
+
+- LLM: Phi, Gemma
+- Embedding: all-MiniLM-L6-v2, paraphrase-albert-small-v2
+
+### AI Tools Integration
+
+- VectorDB: PGVector, Qdrant
+
+### Deployment Type
+
+## Aug 2024
+
+### Contribution
+
+- **Components**
+  - Documentation
+  - Test automation script
+  - Telemetry
+
+- **Use Cases/Examples**
+  - Documentation
+  - Test automation script
+
+- **Cloud Native**
+  - Demo K8s resource management
+  - Documentation on autoscaler analysis
+
+- **Evaluation & Others**
+  - CICD & Validation
+  - Eval: E2E (GenAIComps & GenAIExamples)
+
+### AI Models
+
+- Vision: llava
+- Mixtral-8x22B
+
+### AI Tools Integration
+
+- VectorDB: Milvus
+
+### Deployment Type
+
+- Public Cloud AWS (Xeon CPU & NV GPU)
+
+## Sep 2024
+
+### Contribution
+
+- **Components**
+  - Microservice for Image and Video
+
+- **Use Cases/Examples**
+  - Text to Image generation
+  - Image to Video generation
+  - Playground (composable and configurable)
+
+- **Cloud Native**
+
+- **Evaluation & Others**
+  - CICD & Validation
+  - Eval: E2E (GenAIComps & GenAIExamples)
+
+### AI Models
+
+- Diffusion model:
+  - Stable Diffusion XL
+  - Stable Diffusion 3M
+  - Stable Video Diffusion
+
+### AI Tools Integration
+
+- VectorDB: Weaviate
+
+### Deployment Type
+
+## Q4 2024
+
+### Contribution
+
+- **Components**
+  - Fine-tuning E2E pipeline
+  - Knowledge Graph
+
+- **Use Cases/Examples**
+  - Fine-tuning (LoRA)
+  - AI Agent (single Agent with text and Audio as user interface)
+  - Closed source LLM
+  - GraphRAG
+
+- **Cloud Native**
+  - Static tuning on Resource management for deployment
+
+- **Evaluation & Others**
+  - CICD & Validation
+  - Eval: E2E (GenAIComps & GenAIExamples)
+
+### AI Models
+
+- LLM open: Grok 1
+- LLM closed: GPT3.5/4/4o, Claude 3/3.5
+- AWS Bedrock endpoint
+
+### AI Tools Integration
+
+- Knowledge graph: Neo4j
+- Agent: LangGraph
+
+### Deployment Type
+
+- Public Cloud (Azure, GCP, Oracle, AWS)
+- AI PC (Intel)
+
+## Q1 2025
+
+### Contribution
+
+- **Components**
+  - more Microservice request from community
+  - Confidential Container
+
+- **Use Cases/Examples**
+  - AI Agent (Multi Agent)
+  - Fine-tuning (Adaptive)
+  - Long context window (>1M)
+  - GenAI Studio
+
+- **Cloud Native**
+  - Dynamic tuning on Resource management through K8s
+
+- **Evaluation & Others**
+  - CICD & Validation
+  - Eval: E2E (GenAIComps & GenAIExamples)
+
+### AI Models
+
+- LLM: SetFit
+- More to be defined
+
+### AI Tools Integration
+
+- AutoGen, CrewAI
+
+### Deployment Type
+
+- Public Cloud (tier2 CSP)
+- AI PC (others)