[Codegen] Replace codegen default Model to Qwen/Qwen2.5-Coder-7B-Instruct. #1013

Merged 6 commits on Oct 28, 2024
Changes from 2 commits
10 changes: 5 additions & 5 deletions CodeGen/README.md
@@ -85,12 +85,12 @@ Currently we support two ways of deploying ChatQnA services with docker compose:

By default, the LLM model is set to a default value as listed below:

-| Service | Model |
-| ------------ | ------------------------------------------------------------------------------- |
-| LLM_MODEL_ID | [meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf) |
+| Service | Model |
+| ------------ | --------------------------------------------------------------------------------------- |
+| LLM_MODEL_ID | [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) |

-[meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf) is a gated model that requires submitting an access request through Hugging Face. You can replace it with another model.
-Change the `LLM_MODEL_ID` below for your needs, such as: [Qwen/CodeQwen1.5-7B-Chat](https://huggingface.co/Qwen/CodeQwen1.5-7B-Chat), [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct)
+[Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) may be a gated model that requires submitting an access request through Hugging Face. You can replace it with another model.
+Change the `LLM_MODEL_ID` below for your needs, such as: [deepseek-ai/deepseek-coder-6.7b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-6.7b-instruct)

If you choose to use `meta-llama/CodeLlama-7b-hf` as LLM model, you will need to visit [here](https://huggingface.co/meta-llama/CodeLlama-7b-hf), click the `Expand to review and access` button to ask for model access.
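The swap itself is a single environment variable. A minimal sketch, assuming any TGI-compatible code model (the IDs below are the ones named in this README):

```bash
# New default after this change:
export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
# Or the alternative suggested above:
export LLM_MODEL_ID="deepseek-ai/deepseek-coder-6.7b-instruct"
```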

2 changes: 1 addition & 1 deletion CodeGen/docker_compose/intel/cpu/xeon/README.md
@@ -105,7 +105,7 @@ export your_no_proxy=${your_no_proxy},"External_Public_IP"
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
-export LLM_MODEL_ID="meta-llama/CodeLlama-7b-hf"
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
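Once the TGI container is serving on port 8028 (per `TGI_LLM_ENDPOINT` above), a quick call to TGI's standard `/generate` route confirms the new model is loaded. A minimal smoke test, assuming the service is reachable from the host:

```bash
curl "http://${host_ip}:8028/generate" \
  -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs":"def quicksort(arr):","parameters":{"max_new_tokens":64}}'
```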
2 changes: 1 addition & 1 deletion CodeGen/docker_compose/intel/hpu/gaudi/README.md
@@ -85,7 +85,7 @@ Since the `compose.yaml` will consume some environment variables, you need to se
export no_proxy=${your_no_proxy}
export http_proxy=${your_http_proxy}
export https_proxy=${your_http_proxy}
-export LLM_MODEL_ID="meta-llama/CodeLlama-7b-hf"
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
export HUGGINGFACEHUB_API_TOKEN=${your_hf_api_token}
export MEGA_SERVICE_HOST_IP=${host_ip}
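With these variables exported, startup follows the usual compose flow. A sketch, assuming you run it from the directory containing `compose.yaml` and that the TGI service keeps the `tgi-service` name used elsewhere in this PR:

```bash
docker compose up -d
# Follow the model download and warm-up in the TGI logs:
docker compose logs -f tgi-service
```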
2 changes: 1 addition & 1 deletion CodeGen/docker_compose/set_env.sh
@@ -4,7 +4,7 @@
# SPDX-License-Identifier: Apache-2.0


-export LLM_MODEL_ID="meta-llama/CodeLlama-7b-hf"
+export LLM_MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
export TGI_LLM_ENDPOINT="http://${host_ip}:8028"
export MEGA_SERVICE_HOST_IP=${host_ip}
export LLM_SERVICE_HOST_IP=${host_ip}
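A usage note for this script: it expands `${host_ip}`, so export that first, then `source` the script so the variables persist in the calling shell. A sketch, assuming a Linux host:

```bash
export host_ip=$(hostname -I | awk '{print $1}')
source CodeGen/docker_compose/set_env.sh
echo "$LLM_MODEL_ID"  # expect: Qwen/Qwen2.5-Coder-7B-Instruct
```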
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/README.md
@@ -14,7 +14,7 @@
```
cd GenAIExamples/CodeGen/kubernetes/intel/cpu/xeon/manifests
export HUGGINGFACEHUB_API_TOKEN="YourOwnToken"
-export MODEL_ID="meta-llama/CodeLlama-7b-hf"
+export MODEL_ID="Qwen/Qwen2.5-Coder-7B-Instruct"
sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" codegen.yaml
sed -i "s/meta-llama\/CodeLlama-7b-hf/${MODEL_ID}/g" codegen.yaml
kubectl apply -f codegen.yaml
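# Hedged verification sketch, assuming the default namespace: confirm the
# sed substitution landed the new model ID before the pods come up.
grep "MODEL_ID" codegen.yaml  # should now show Qwen/Qwen2.5-Coder-7B-Instruct
kubectl get pods              # wait for the CodeGen pods to reach Running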
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/cpu/xeon/gmc/codegen_xeon.yaml
@@ -29,6 +29,6 @@ spec:
internalService:
serviceName: tgi-service
config:
-MODEL_ID: meta-llama/CodeLlama-7b-hf
+MODEL_ID: Qwen/Qwen2.5-Coder-7B-Instruct
endpoint: /generate
isDownstreamService: true
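The fragment above is part of a GMC custom resource. A hedged sketch of applying it, assuming the GMC controller is already installed and a hypothetical `codegen` namespace:

```bash
kubectl create ns codegen
kubectl apply -f codegen_xeon.yaml -n codegen
```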
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/cpu/xeon/manifest/codegen.yaml
@@ -64,7 +64,7 @@ metadata:
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
data:
-MODEL_ID: "meta-llama/CodeLlama-7b-hf"
+MODEL_ID: "Qwen/Qwen2.5-Coder-7B-Instruct"
PORT: "2080"
HF_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/hpu/gaudi/gmc/codegen_gaudi.yaml
@@ -29,6 +29,6 @@ spec:
internalService:
serviceName: tgi-gaudi-svc
config:
-MODEL_ID: meta-llama/CodeLlama-7b-hf
+MODEL_ID: Qwen/Qwen2.5-Coder-7B-Instruct
endpoint: /generate
isDownstreamService: true
2 changes: 1 addition & 1 deletion CodeGen/kubernetes/intel/hpu/gaudi/manifest/codegen.yaml
@@ -64,7 +64,7 @@ metadata:
app.kubernetes.io/version: "2.1.0"
app.kubernetes.io/managed-by: Helm
data:
-MODEL_ID: "meta-llama/CodeLlama-7b-hf"
+MODEL_ID: "Qwen/Qwen2.5-Coder-7B-Instruct"
PORT: "2080"
HF_TOKEN: "insert-your-huggingface-token-here"
http_proxy: ""
2 changes: 1 addition & 1 deletion ProductivitySuite/docker_compose/intel/cpu/xeon/README.md
@@ -137,7 +137,7 @@ export COLLECTION_NAME="Conversations"
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
-export LLM_MODEL_ID_CODEGEN="meta-llama/CodeLlama-7b-hf"
+export LLM_MODEL_ID_CODEGEN="Qwen/Qwen2.5-Coder-7B-Instruct"
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export TGI_LLM_ENDPOINT="http://${host_ip}:9009"
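The endpoints above speak different request shapes. A sketch of smoke tests once the containers are up, assuming the standard TEI and TGI request bodies:

```bash
# Embedding service (TEI) on port 6006:
curl "http://${host_ip}:6006/embed" -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs":"What is a vector database?"}'

# Chat LLM (TGI) on port 9009:
curl "http://${host_ip}:9009/generate" -X POST \
  -H 'Content-Type: application/json' \
  -d '{"inputs":"Hello","parameters":{"max_new_tokens":16}}'
```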
2 changes: 1 addition & 1 deletion ProductivitySuite/docker_compose/intel/cpu/xeon/set_env.sh
@@ -8,7 +8,7 @@ export COLLECTION_NAME="Conversations"
export EMBEDDING_MODEL_ID="BAAI/bge-base-en-v1.5"
export RERANK_MODEL_ID="BAAI/bge-reranker-base"
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"
-export LLM_MODEL_ID_CODEGEN="meta-llama/CodeLlama-7b-hf"
+export LLM_MODEL_ID_CODEGEN="Qwen/Qwen2.5-Coder-7B-Instruct"
export TEI_EMBEDDING_ENDPOINT="http://${host_ip}:6006"
export TEI_RERANKING_ENDPOINT="http://${host_ip}:8808"
export TGI_LLM_ENDPOINT="http://${host_ip}:9009"
2 changes: 1 addition & 1 deletion ProductivitySuite/kubernetes/intel/cpu/xeon/manifest/codegen.yaml
@@ -39,7 +39,7 @@ metadata:
app.kubernetes.io/version: "1.4"
app.kubernetes.io/managed-by: Helm
data:
-MODEL_ID: "meta-llama/CodeLlama-7b-hf"
+MODEL_ID: "Qwen/Qwen2.5-Coder-7B-Instruct"
PORT: "2080"
HUGGING_FACE_HUB_TOKEN: "insert-your-huggingface-token-here"
HF_TOKEN: "insert-your-huggingface-token-here"
6 changes: 3 additions & 3 deletions supported_examples.md
@@ -63,9 +63,9 @@ This document introduces the supported examples of GenAIExamples. The supported

[CodeGen](./CodeGen/README.md) is an example of copilot designed for code generation in Visual Studio Code.

-| Framework | LLM | Serving | HW | Description |
-| ------------------------------------------------------------------------------ | ------------------------------------------------------------------------------- | --------------------------------------------------------------- | ----------- | ----------- |
-| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [meta-llama/CodeLlama-7b-hf](https://huggingface.co/meta-llama/CodeLlama-7b-hf) | [TGI](https://github.com/huggingface/text-generation-inference) | Xeon/Gaudi2 | Copilot |
+| Framework | LLM | Serving | HW | Description |
+| ------------------------------------------------------------------------------ | --------------------------------------------------------------------------------------- | --------------------------------------------------------------- | ----------- | ----------- |
+| [LangChain](https://www.langchain.com)/[LlamaIndex](https://www.llamaindex.ai) | [Qwen/Qwen2.5-Coder-7B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct) | [TGI](https://github.com/huggingface/text-generation-inference) | Xeon/Gaudi2 | Copilot |

### CodeTrans
