Add .pre-commit-config.yaml for codespell
Signed-off-by: Sun, Xuehao <[email protected]>
XuehaoSun committed May 27, 2024
1 parent 635ab7a commit 10cf5f1
Showing 5 changed files with 17 additions and 7 deletions.
10 changes: 10 additions & 0 deletions .pre-commit-config.yaml
@@ -0,0 +1,10 @@
+ci:
+  autofix_prs: true
+  autoupdate_schedule: quarterly
+
+repos:
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.2.6
+    hooks:
+      - id: codespell
+        args: [-w]
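
For orientation, the kind of in-place correction that the `codespell` hook with `-w` applies in the rest of this commit can be sketched as a whole-word dictionary lookup. This is a minimal illustration only, not codespell's actual implementation: the `CORRECTIONS` table and the `fix_spelling` helper are hypothetical names, and the table holds just a few of the misspellings fixed in this commit rather than codespell's own dictionary.

```python
import re

# Hypothetical mini-dictionary (assumption): a few misspellings fixed in this
# commit. codespell ships its own, much larger dictionary.
CORRECTIONS = {
    "possbile": "possible",
    "informations": "information",
    "registery": "registry",
    "componenet": "component",
    "applicaitons": "applications",
    "privides": "provides",
}

# Match runs of ASCII letters so that only whole words are looked up.
WORD = re.compile(r"[A-Za-z]+")


def fix_spelling(text: str) -> str:
    """Replace known misspellings, keeping a leading capital letter."""

    def repl(match: re.Match) -> str:
        word = match.group(0)
        fixed = CORRECTIONS.get(word.lower())
        if fixed is None:
            return word  # not a known misspelling: leave untouched
        return fixed.capitalize() if word[0].isupper() else fixed

    return WORD.sub(repl, text)


if __name__ == "__main__":
    print(fix_spelling("* Privides support and instrumentation"))
    print(fix_spelling("2. component registery"))
```

Looking up whole words rather than substrings keeps correctly spelled words that merely contain a misspelling from being altered.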
4 changes: 2 additions & 2 deletions community/rfc_template.md
@@ -33,11 +33,11 @@ List other alternatives if have, and corresponding pros/cons to each proposal.

### Compatibility

-list possbile incompatible interface or workflow changes if exists.
+list possible incompatible interface or workflow changes if exists.

### Miscs

-List other informations user and developer may care about, such as:
+List other information user and developer may care about, such as:

- Performance Impact, such as speed, memory, accuracy.
- Engineering Impact, such as binary size, startup time, build time, test times.
4 changes: 2 additions & 2 deletions community/rfcs/24-05-16-001-OPEA-Overall-Design.md
@@ -16,7 +16,7 @@ The requirements include but not limited to:

have the ability of offer config-based definition or low-code for constructing complex LLM applications.

-2. component registery
+2. component registry

allow user to register new service for building complex GenAI applications

@@ -30,7 +30,7 @@ The requirements include but not limited to:

**Motivation**

-This RFC is used to present the OPEA overall design philosophy, including overall architecture, working flow, componenet design, for community discussion.
+This RFC is used to present the OPEA overall design philosophy, including overall architecture, working flow, component design, for community discussion.

**Design Proposal**

@@ -5,7 +5,7 @@
Under Review

# Objective
-This RFC aims to introduce the OPEA microservice design and demonstrate its application to Retrieval-Augmented Generation (RAG). The objective is to address the challenge of designing a flexible architecture for Enterprise AI applicaitons by adopting a microservice approach. This approach facilitates easier deployment, enabling one or multiple microservices to form a megaservice. Each megaservice interfaces with a gateway, allowing users to access services through endpoints exposed by the gateway. The architecture is general and RAG is the first example that we want to apply.
+This RFC aims to introduce the OPEA microservice design and demonstrate its application to Retrieval-Augmented Generation (RAG). The objective is to address the challenge of designing a flexible architecture for Enterprise AI applications by adopting a microservice approach. This approach facilitates easier deployment, enabling one or multiple microservices to form a megaservice. Each megaservice interfaces with a gateway, allowing users to access services through endpoints exposed by the gateway. The architecture is general and RAG is the first example that we want to apply.


# Motivation
4 changes: 2 additions & 2 deletions framework.md
@@ -396,7 +396,7 @@ and applying the Linux Foundation licensing considerations._
| ---------- | ----------- | ------------ | -------------------- |
| Agent framework | Orchestration software for building and deploying workflows combining information retrieval components with LLMs for building AI agents with contextualized information | Langchain, LlamaIndex, Haystack, Semantic Kernel
| Ingest/Data Processing | Software components that can be used to enhance the data that is indexed for retrieval. For example: process, clean, normalization, information extraction, chunking, tokenization, meta data enhancement. | NLTK, spaCY, HF Tokenizers, tiktoken, SparkNLP
-| Embedding models/service | Models or services that covert text chunks into embedding vectors to be stored in a vector database | HF Transformers, S-BERT | HF TEI, OpenAI, Cohere, GCP, Azure embedding APIs, JinaAI
+| Embedding models/service | Models or services that convert text chunks into embedding vectors to be stored in a vector database | HF Transformers, S-BERT | HF TEI, OpenAI, Cohere, GCP, Azure embedding APIs, JinaAI
| Indexing/Vector store | A software for indexing information (sparse/vector) and for retrieving given a query | Elasticsearch, Qdrant, Milvus, ChromaDB, Weaviate, FAISS, Vespa, HNSWLib, SVS, PLAID | Pinecone, Redis
| Retrieval/Ranking | A SW component that can re-evaluate existing contexts relevancy order | S-BERT, HF Transformers, Bi/Cross-encoders, ColBERT | Cohere
| Prompt engine | A component that creates task specific prompts given queries and contexts, tracks user sessions (maintain history/memory) | Langchain hub
@@ -688,7 +688,7 @@ more capabilities than necessary. OWASP container best practices.
* High availability
* Replication & Data/Instance Protection
* Resiliency – time to relaunch an instance when burned down to zero.
-* Privides support and instrumentation for enterprise 24/7 support
+* Provides support and instrumentation for enterprise 24/7 support
* Licensing model and SW Distribution
* Scalable from small to large customers
* Ability to customize for specific enterprise needs