Add .pre-commit-config.yaml for codespell

Signed-off-by: Sun, Xuehao <[email protected]>
opea-project · May 27, 2024 · 10cf5f1
1 parent 635ab7a

Showing 5 changed files with 17 additions and 7 deletions.
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -0,0 +1,10 @@
+ci:
+  autofix_prs: true
+  autoupdate_schedule: quarterly
+
+repos:
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.2.6
+    hooks:
+      - id: codespell
+        args: [-w]
diff --git a/community/rfc_template.md b/community/rfc_template.md
@@ -33,11 +33,11 @@ List other alternatives if have, and corresponding pros/cons to each proposal.
 
 ### Compatibility
 
-list possbile incompatible interface or workflow changes if exists.
+list possible incompatible interface or workflow changes if exists.
 
 ### Miscs
 
-List other informations user and developer may care about, such as:
+List other information user and developer may care about, such as:
 
 - Performance Impact, such as speed, memory, accuracy.
 - Engineering Impact, such as binary size, startup time, build time, test times.

diff --git a/community/rfcs/24-05-16-001-OPEA-Overall-Design.md b/community/rfcs/24-05-16-001-OPEA-Overall-Design.md
@@ -16,7 +16,7 @@ The requirements include but not limited to:
 
     have the ability of offer config-based definition or low-code for constructing complex LLM applications.
 
-2. component registery
+2. component registry
 
     allow user to register new service for building complex GenAI applications
 
@@ -30,7 +30,7 @@ The requirements include but not limited to:
 
 **Motivation**
 
-This RFC is used to present the OPEA overall design philosophy, including overall architecture, working flow, componenet design, for community discussion.
+This RFC is used to present the OPEA overall design philosophy, including overall architecture, working flow, component design, for community discussion.
 
 **Design Proposal**
 

diff --git a/...nity/rfcs/GenAIExamples-24-05-16-001-Using_MicroService_to_implement_ChatQnA.md b/...nity/rfcs/GenAIExamples-24-05-16-001-Using_MicroService_to_implement_ChatQnA.md
@@ -5,7 +5,7 @@
 Under Review
 
 # Objective
-This RFC aims to introduce the OPEA microservice design and demonstrate its application to Retrieval-Augmented Generation (RAG). The objective is to address the challenge of designing a flexible architecture for Enterprise AI applicaitons by adopting a microservice approach. This approach facilitates easier deployment, enabling one or multiple microservices to form a megaservice. Each megaservice interfaces with a gateway, allowing users to access services through endpoints exposed by the gateway. The architecture is general and RAG is the first example that we want to apply.
+This RFC aims to introduce the OPEA microservice design and demonstrate its application to Retrieval-Augmented Generation (RAG). The objective is to address the challenge of designing a flexible architecture for Enterprise AI applications by adopting a microservice approach. This approach facilitates easier deployment, enabling one or multiple microservices to form a megaservice. Each megaservice interfaces with a gateway, allowing users to access services through endpoints exposed by the gateway. The architecture is general and RAG is the first example that we want to apply.
 
 
 # Motivation

diff --git a/framework.md b/framework.md
@@ -396,7 +396,7 @@ and applying the Linux Foundation licensing considerations._
 | ---------- | ----------- | ------------ | -------------------- |
 | Agent framework | Orchestration software for building and deploying workflows combining information retrieval components with LLMs for building AI agents with contextualized information | Langchain, LlamaIndex, Haystack, Semantic Kernel
 | Ingest/Data Processing | Software components that can be used to enhance the data that is indexed for retrieval. For example: process, clean, normalization, information extraction, chunking, tokenization, meta data enhancement.  | NLTK, spaCY, HF Tokenizers, tiktoken, SparkNLP
-| Embedding models/service | Models or services that covert text chunks into embedding vectors to be stored in a vector database | HF Transformers, S-BERT | HF TEI, OpenAI, Cohere, GCP, Azure embedding APIs, JinaAI
+| Embedding models/service | Models or services that convert text chunks into embedding vectors to be stored in a vector database | HF Transformers, S-BERT | HF TEI, OpenAI, Cohere, GCP, Azure embedding APIs, JinaAI
 | Indexing/Vector store | A software for indexing information (sparse/vector) and for retrieving given a query | Elasticsearch, Qdrant, Milvus, ChromaDB, Weaviate, FAISS, Vespa, HNSWLib, SVS, PLAID | Pinecone, Redis
 | Retrieval/Ranking | A SW component that can re-evaluate existing contexts relevancy order | S-BERT, HF Transformers, Bi/Cross-encoders, ColBERT | Cohere
 | Prompt engine | A component that creates task specific prompts given queries and contexts, tracks user sessions (maintain history/memory) | Langchain hub
@@ -688,7 +688,7 @@ more capabilities than necessary. OWASP container best practices.
   *	High availability
     * Replication & Data/Instance Protection 
     * Resiliency – time to relaunch an instance when burned down to zero.
-    * Privides support and instrumentation for enterprise 24/7 support
+    * Provides support and instrumentation for enterprise 24/7 support
   *	Licensing model and SW Distribution
     * Scalable from small to large customers
     * Ability to customize for specific enterprise needs