doc: fix missing references to README.md
Cross-repo references using a URL that only has the folder name cause
links in the github.io site to go to the github.com site. Most of these
references are meant to link to the documentation (README.md), so add that
where appropriate so links in the github.io site stay there.

Now that we've got those links fixed, we can reduce the complexity in
the fix-github-md-refs script that was trying to add a README.md to
folder names. Also added an area to keep test files for future work.

Also updates the charter.md to indicate we've moved TSC information into
the docs folder in the community/TSC.rst file.

Signed-off-by: David B. Kinder <[email protected]>
dbkinder committed Sep 23, 2024
1 parent 421f050 commit cf69ef8
Showing 9 changed files with 90 additions and 14 deletions.
6 changes: 3 additions & 3 deletions community/CONTRIBUTING.md
@@ -86,7 +86,7 @@ Thanks for considering contributing to OPEA project. The contribution process is
```
- **File Descriptions**:
- `embedding_tei.py`: This file defines and registers the microservice. It serves as the entrypoint of the Docker container. Refer to [whisper ASR](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/whisper) for a simple example or [TGI](https://github.com/opea-project/GenAIComps/blob/main/comps/llms/text-generation/tgi/llm.py) for a more complex example that required adapting to the OpenAI API.
- `embedding_tei.py`: This file defines and registers the microservice. It serves as the entrypoint of the Docker container. Refer to [whisper ASR](https://github.com/opea-project/GenAIComps/tree/main/comps/asr/whisper/README.md) for a simple example or [TGI](https://github.com/opea-project/GenAIComps/blob/main/comps/llms/text-generation/tgi/llm.py) for a more complex example that required adapting to the OpenAI API.
- `requirements.txt`: This file is used by Docker to install the necessary dependencies.
- `Dockerfile`: Used to generate the service container image. Please follow naming conventions:
- Dockerfile: `Dockerfile.[vendor]_[hardware]`, vendor and hardware in lower case (i.e., Dockerfile.amd_gpu)
@@ -103,9 +103,9 @@ Thanks for considering contributing to OPEA project. The contribution process is
### Contribute a GenAI Example
Each of the samples in OPEA GenAIExamples is a common, oft-used solution. They each have scripts to ease deployment, and have been tested for performance and scalability with Docker Compose and Kubernetes. When contributing an example, a Docker Compose deployment is the minimum requirement. However, since OPEA is intended for enterprise applications, supporting Kubernetes deployment is highly encouraged. You can find [examples for Kubernetes deployment](https://github.com/opea-project/GenAIExamples?tab=readme-ov-file#deploy-examples) using manifests, Helm Charts, and the [GenAI Microservices Connector (GMC)](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector). GMC offers additional enterprise features, such as the ability to dynamically adjust pipelines on Kubernetes (e.g., switching to a different LLM on the fly, adding guardrails), composing pipelines that include external services hosted in public cloud or on-premises via URL, and supporting sequential, parallel, and conditional flows in the pipelines.
Each of the samples in OPEA GenAIExamples is a common, oft-used solution. They each have scripts to ease deployment, and have been tested for performance and scalability with Docker Compose and Kubernetes. When contributing an example, a Docker Compose deployment is the minimum requirement. However, since OPEA is intended for enterprise applications, supporting Kubernetes deployment is highly encouraged. You can find [examples for Kubernetes deployment](https://github.com/opea-project/GenAIExamples/tree/main/README.md#deploy-examples) using manifests, Helm Charts, and the [GenAI Microservices Connector (GMC)](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector/README.md). GMC offers additional enterprise features, such as the ability to dynamically adjust pipelines on Kubernetes (e.g., switching to a different LLM on the fly, adding guardrails), composing pipelines that include external services hosted in public cloud or on-premises via URL, and supporting sequential, parallel, and conditional flows in the pipelines.
- Navigate to [OPEA GenAIExamples](https://github.com/opea-project/GenAIExamples/tree/main) and check the catalog of examples. If you find one that is very similar to what you are looking for, you can contribute your variation of it to that particular example folder. If you are bringing a completely new application, you will need to create a separate example folder.
- Navigate to [OPEA GenAIExamples](https://github.com/opea-project/GenAIExamples/tree/main/README.md) and check the catalog of examples. If you find one that is very similar to what you are looking for, you can contribute your variation of it to that particular example folder. If you are bringing a completely new application, you will need to create a separate example folder.
- Before stitching together all the microservices to build your application, let's make sure all the required building blocks are available! Take a look at this **ChatQnA Flow Chart**:
2 changes: 1 addition & 1 deletion community/charter.md
@@ -14,7 +14,7 @@ This Charter sets forth the responsibilities and procedures for technical contri
## 2. Technical Steering Committee

1. The Technical Steering Committee (the “TSC”) will be responsible for all technical oversight of the open source Project.
2. The TSC voting members are initially those individuals listed as voting members of the TSC in the GOVERNANCE.MD file in the Project’s governance repo. At the inception of the project, the Maintainers of the Project will be as set forth within the “CONTRIBUTING” file within the Project’s code repository. The TSC may choose an alternative approach for determining the voting members of the TSC, and any such alternative approach will be documented in the GOVERNANCE file. The Project intends to determine additional details on composition of the TSC to enable increased diversity of organizations represented on the TSC within 12 months following the inception of the Project, or such other time as determined by the TSC (the “Steady State Transition”). The TSC expects to have no one company employing more than 50% of the voting members of the TSC by the Steady State Transition. It is expected that the terms of TSC voting members will vary initially (with roughly half 1 year and the remainder 2 years) so that elections will be staggered. Any meetings of the Technical Steering Committee are intended to be open to the public, and can be conducted electronically, via teleconference, or in person.
2. The TSC voting members are initially those individuals listed as voting members of the TSC in the GOVERNANCE.MD file in the Project’s governance repo (moved to community/TSC.rst in the docs repo). At the inception of the project, the Maintainers of the Project will be as set forth within the “CONTRIBUTING” file within the Project’s code repository. The TSC may choose an alternative approach for determining the voting members of the TSC, and any such alternative approach will be documented in the GOVERNANCE file (now the TSC.rst file). The Project intends to determine additional details on composition of the TSC to enable increased diversity of organizations represented on the TSC within 12 months following the inception of the Project, or such other time as determined by the TSC (the “Steady State Transition”). The TSC expects to have no one company employing more than 50% of the voting members of the TSC by the Steady State Transition. It is expected that the terms of TSC voting members will vary initially (with roughly half 1 year and the remainder 2 years) so that elections will be staggered. Any meetings of the Technical Steering Committee are intended to be open to the public, and can be conducted electronically, via teleconference, or in person.
3. TSC projects generally will involve Contributors and Maintainers. The TSC may adopt or modify roles so long as the roles are documented in the CONTRIBUTING file. Unless otherwise documented:
1. Contributors include anyone in the technical community that contributes code, documentation, or other technical artifacts to the Project;
2. Maintainers are Contributors who have earned the ability to modify (“commit” or merge pull requests) source code, documentation or other technical artifacts in a project’s repository; and
2 changes: 1 addition & 1 deletion conf.py
@@ -110,7 +110,7 @@
'docs_title': docs_title,
'is_release': is_release,
'versions': ( ("latest", "/latest/"),
# ("1.0", "/1.0/"), # No doc versions yet...
("1.0", "/1.0/"),
)
}

2 changes: 1 addition & 1 deletion guide/installation/gmc_install/gmc_install.md
@@ -11,7 +11,7 @@ GMC can be used to compose and adjust GenAI pipelines dynamically on Kubernetes.
**Prerequisites**

- For the ChatQnA example, ensure you have a running Kubernetes cluster with at least 16 CPUs, 32GB of memory, and 100GB of disk space. To install a Kubernetes cluster refer to:
["Kubernetes installation"](../k8s_install/)
["Kubernetes installation"](../k8s_install/README.md)

**Download the GMC github repository**

8 changes: 8 additions & 0 deletions guide/installation/k8s_install/README.md
@@ -0,0 +1,8 @@
# Kubernetes Installation Options

Here are a variety of ways to install Kubernetes:

* [Using AWS EKS Cluster](k8s_instal_aws_eks.md)
* [Using kubeadm](k8s_install_kubeadm.md)
* [Using Kubespray](k8s_install_kubespray.md)

12 changes: 4 additions & 8 deletions scripts/fix-github-md-refs.sh
@@ -21,16 +21,12 @@ mdfiles=`grep -ril --include="*.md" 'github.com/opea-project.*\/[^\)]*'`
# subsequent path to the md file \1 is repo \3 is file path \4 is an optional #xxx target

#sed -i 's/(https:\/\/github.com\/opea-project\/\([^\/]*\)\/\(blob\|tree\)\/main\/\([^)]*\.md\)/(\/\1\/\3/g' $mdfiles
sed -i 's/(https:\/\/github.com\/opea-project\/\([^\/]*\)\/\(blob\|tree\)\/main\/\([^#)]*\)\(#[^)]*\)*)/(\/\1\/\3\/README.md\4)/g' $mdfiles
#sed -i 's/(https:\/\/github.com\/opea-project\/\([^\/]*\)\/\(blob\|tree\)\/main\/\([^#)]*\)\(#[^)]*\)*)/(\/\1\/\3\/README.md\4)/g' $mdfiles
sed -i 's/(https:\/\/github.com\/opea-project\/\([^\/]*\)\/\(blob\|tree\)\/main\/\([^#)]*\.md\)\(#[^)]*\)*)/(\/\1\/\3\4)/g' $mdfiles

# That sed script might have introduced an error of "xxx.md/README.md", so
# clean that up just in case (keep the xxx.md)
# After that, links to the docs repo such as [blah](docs/...) should have the repo name removed since docs repo is the build root

sed -i 's/\(\/[^\.]*\.md\)\/README\.md/\1/g' $mdfiles

# links to the docs repo such as (docs/...) should have the repo name removed since docs repo is the build root

sed -i 's/(\/docs\//(\//g' $mdfiles
sed -i 's/](\/docs\//](\//g' $mdfiles

# links to a folder should instead be to the folder's README.md
# Not automating this for now since there are valid folder references
2 changes: 2 additions & 0 deletions scripts/rsync-include.txt
@@ -5,4 +5,6 @@
*.rst
*.md
*.rst
*.txt
CODEOWNERS
LICENSE
35 changes: 35 additions & 0 deletions scripts/test/test.md
@@ -0,0 +1,35 @@
# Test markdown file with cross-repo links

This folder contains a collection of Kubernetes manifest files for deploying the ChatQnA service across scalable nodes. It includes a comprehensive [benchmarking tool](/GenAIEval/evals/benchmark/README.md) that enables throughput analysis to assess inference performance.

ZePan110 (Collaborator) commented on Sep 24, 2024:

This doesn't seem like the correct link.

dbkinder (Author, Contributor) replied on Sep 24, 2024:

That's actually the correct link after being processed by the sed script that converts hard URL links into relative links. (This test.md is the processed version; test.md.saved is the before version.) In the doc building process, all five repos (GenAI* and docs) are copied into one folder so all the content can be processed into the github.io site. We want hard URL links to a README.md in another repo to turn into relative links, so the link stays on the github.io site and doesn't go to the README.md on the github.com site. The Sphinx processing handles relative links to .md and .rst files, turning them into references to the generated HTML pages.
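For illustration, here's a minimal demo of that conversion. The sed expression is the one from fix-github-md-refs.sh; the sample link line is made up:

```shell
# Demo: convert a hard cross-repo GitHub URL into a relative link so the
# reader stays on the github.io site. The input line here is hypothetical.
echo 'See the [benchmark tool](https://github.com/opea-project/GenAIEval/blob/main/evals/benchmark/README.md).' |
  sed 's/(https:\/\/github.com\/opea-project\/\([^\/]*\)\/\(blob\|tree\)\/main\/\([^#)]*\.md\)\(#[^)]*\)*)/(\/\1\/\3\4)/g'
# → See the [benchmark tool](/GenAIEval/evals/benchmark/README.md).
```

The repo name (`GenAIEval`) becomes the first path component, matching the layout after all repos are copied into one build folder.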

ZePan110 (Collaborator) replied on Sep 24, 2024:

Do you mean that there are now tools for checking the validity of paths and links?

dbkinder (Author, Contributor) replied on Sep 24, 2024:

There is a sphinx-build option to do link checking that can be run.

The sed script I mention looks for references to https://github.com/opea-project/ repos that folks are using for cross-repo linking and turns them into relative links, so we keep the reader within the github.io site when they click on a documentation link. That sed script doesn't validate whether the doc actually exists.
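The link checking mentioned above is Sphinx's linkcheck builder; a minimal sketch of invoking it (directory names here are placeholders, not necessarily this repo's layout):

```shell
# Run Sphinx's linkcheck builder over the doc sources; assumes sphinx is
# installed and the current directory contains conf.py (placeholder paths).
sphinx-build -b linkcheck . _build/linkcheck
# Broken and redirected links are reported in _build/linkcheck/output.txt
```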

dbkinder (Author, Contributor) commented on Sep 24, 2024:

I plan to talk about the github.io doc processing at this week's OPEA tech alignment meeting.

ZePan110 (Collaborator) replied on Sep 25, 2024:

This will cause integration testing to fail when checking the validity of paths/links. Will these links be shown directly to users? If so, that seems unreasonable, since a user clicking one will get a 404 page. If it is not user facing, I will exclude this part in the detection tool.


We have created the [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark) for single node, two nodes and four nodes K8s cluster. In order to apply, we need to check out and configure some values.

The test uses the [benchmark tool](https://github.com/opea-project/GenAIEval/tree/main/evals/benchmark) to do performance test. We need to set up benchmark tool at the master node of Kubernetes which is k8s-master.

This document outlines the deployment process for a CodeGen application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Gaudi2 server. The steps include Docker images creation, container deployment via Docker Compose, and service execution to integrate microservices such as `llm`. We will publish the Docker images to the Docker Hub soon, further simplifying the deployment process for this service.

Install GMC in your Kubernetes cluster, if you have not already done so, by following the steps in Section "Getting Started" at [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector#readme). We will soon publish images to Docker Hub, at which point no builds will be required, further simplifying install.

If you get errors like "Access Denied", [validate micro service](https://github.com/opea-project/GenAIExamples/tree/main/CodeGen/docker_compose/intel/cpu/xeon#validate-microservices) first.

Update Knowledge Base via Local File [nke-10k-2023.pdf](https://github.com/opea-project/GenAIComps/blob/main/comps/retrievers/redis/data/nke-10k-2023.pdf)

Please refer to [Xeon README](/GenAIExamples/AudioQnA/docker_compose/intel/cpu/xeon/README.md) or [Gaudi README](/GenAIExamples/AudioQnA/docker_compose/intel/hpu/gaudi/README.md) to build the OPEA images. These too will be available on Docker Hub soon to simplify use.

Here's a [Link](https://github.com/opea-project/GenAIComps/blob/main/comps/reranks/tei/Dockerfile) to a Docker file.

You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain#5-customize-agent-strategy).

Here's another [Link](https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/ui/docker/Dockerfile.react) to examine.

Here is a nice one [Docker Xeon README](/GenAIExamples/DocSum/docker_compose/intel/cpu/xeon/README.md) and that with a section reference [Docker Xeon README](/GenAIExamples/DocSum/docker_compose/intel/cpu/xeon/README.md#section)

And a reference to a python file [finetune_config](https://github.com/opea-project/GenAIComps/blob/main/comps/finetuning/finetune_config.py) to keep things interesting.

Here's an [issue](https://github.com/opea-project/GenAIExamples/issues/763)
reference and
[Actions](https://github.com/opea-project/GenAIExamples/actions) reference too.
Might as well test [PRs](https://github.com/opea-project/GenAIExamples/pulls)
and [Projects](https://github.com/opea-project/GenAIExamples/projects) too.

In release notes will find [88b3c1](https://github.com/opea-project/GenAIInfra/commit/88b3c108e5b5e3bfb6d9346ce2863b69f70cc2f1) commit references.
35 changes: 35 additions & 0 deletions scripts/test/test.md.saved
@@ -0,0 +1,35 @@
# Test markdown file with cross-repo links

This folder contains a collection of Kubernetes manifest files for deploying the ChatQnA service across scalable nodes. It includes a comprehensive [benchmarking tool](https://github.com/opea-project/GenAIEval/blob/main/evals/benchmark/README.md) that enables throughput analysis to assess inference performance.

We have created the [BKC manifest](https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA/benchmark) for single node, two nodes and four nodes K8s cluster. In order to apply, we need to check out and configure some values.

The test uses the [benchmark tool](https://github.com/opea-project/GenAIEval/tree/main/evals/benchmark) to do performance test. We need to set up benchmark tool at the master node of Kubernetes which is k8s-master.

This document outlines the deployment process for a CodeGen application utilizing the [GenAIComps](https://github.com/opea-project/GenAIComps.git) microservice pipeline on Intel Gaudi2 server. The steps include Docker images creation, container deployment via Docker Compose, and service execution to integrate microservices such as `llm`. We will publish the Docker images to the Docker Hub soon, further simplifying the deployment process for this service.

Install GMC in your Kubernetes cluster, if you have not already done so, by following the steps in Section "Getting Started" at [GMC Install](https://github.com/opea-project/GenAIInfra/tree/main/microservices-connector#readme). We will soon publish images to Docker Hub, at which point no builds will be required, further simplifying install.

If you get errors like "Access Denied", [validate micro service](https://github.com/opea-project/GenAIExamples/tree/main/CodeGen/docker_compose/intel/cpu/xeon#validate-microservices) first.

Update Knowledge Base via Local File [nke-10k-2023.pdf](https://github.com/opea-project/GenAIComps/blob/main/comps/retrievers/redis/data/nke-10k-2023.pdf)

Please refer to [Xeon README](https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker_compose/intel/cpu/xeon/README.md) or [Gaudi README](https://github.com/opea-project/GenAIExamples/blob/main/AudioQnA/docker_compose/intel/hpu/gaudi/README.md) to build the OPEA images. These too will be available on Docker Hub soon to simplify use.

Here's a [Link](https://github.com/opea-project/GenAIComps/blob/main/comps/reranks/tei/Dockerfile) to a Docker file.

You can take a look at the tools yaml and python files in this example. For more details, please refer to the "Provide your own tools" section in the instructions [here](https://github.com/opea-project/GenAIComps/tree/main/comps/agent/langchain#5-customize-agent-strategy).

Here's another [Link](https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/ui/docker/Dockerfile.react) to examine.

Here is a nice one [Docker Xeon README](https://github.com/opea-project/GenAIExamples/blob/main/DocSum/docker_compose/intel/cpu/xeon/README.md) and that with a section reference [Docker Xeon README](https://github.com/opea-project/GenAIExamples/blob/main/DocSum/docker_compose/intel/cpu/xeon/README.md#section)

And a reference to a python file [finetune_config](https://github.com/opea-project/GenAIComps/blob/main/comps/finetuning/finetune_config.py) to keep things interesting.

Here's an [issue](https://github.com/opea-project/GenAIExamples/issues/763)
reference and
[Actions](https://github.com/opea-project/GenAIExamples/actions) reference too.
Might as well test [PRs](https://github.com/opea-project/GenAIExamples/pulls)
and [Projects](https://github.com/opea-project/GenAIExamples/projects) too.

In release notes will find [88b3c1](https://github.com/opea-project/GenAIInfra/commit/88b3c108e5b5e3bfb6d9346ce2863b69f70cc2f1) commit references.
