update readme.md.

b4rtaz · Jul 31, 2024 · c66a4b8
1 parent 4938276
Showing 1 changed file with 7 additions and 6 deletions.
diff --git a/README.md b/README.md
@@ -13,12 +13,13 @@ Tensor parallelism is all you need. Run LLMs on weak devices or make powerful de
 
 Python 3 and C++ compiler required. The command will download the model and the tokenizer.
 
-| Model                     | Purpose   | Size     | Command                                     |
-| ------------------------- | --------- | -------- | ------------------------------------------- |
-| TinyLlama 1.1B 3T Q40     | Benchmark | 844 MB   | `python launch.py tinyllama_1_1b_3t_q40`    |
-| Llama 3 8B Q40            | Benchmark | 6.32 GB  | `python launch.py llama3_8b_q40`            |
-| Llama 3 8B Instruct Q40   | Chat, API | 6.32 GB  | `python launch.py llama3_8b_instruct_q40`   |
-| Llama 3.1 8B Instruct Q40 | Chat, API | 6.32 GB  | `python launch.py llama3_1_8b_instruct_q40` |
+| Model                       | Purpose   | Size     | Command                                       |
+| --------------------------- | --------- | -------- | --------------------------------------------- |
+| TinyLlama 1.1B 3T Q40       | Benchmark | 844 MB   | `python launch.py tinyllama_1_1b_3t_q40`      |
+| Llama 3 8B Q40              | Benchmark | 6.32 GB  | `python launch.py llama3_8b_q40`              |
+| Llama 3 8B Instruct Q40     | Chat, API | 6.32 GB  | `python launch.py llama3_8b_instruct_q40`     |
+| Llama 3.1 8B Instruct Q40   | Chat, API | 6.32 GB  | `python launch.py llama3_1_8b_instruct_q40`   |
+| Llama 3.1 405B Instruct Q40 | Chat, API | 238 GB   | `python launch.py llama3_1_405b_instruct_q40` |
 
 ### 🛠️ Convert Model Manually