
Commit 10524cc

[pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
pre-commit-ci[bot] committed Feb 29, 2024
1 parent adb9de6 commit 10524cc
Showing 1 changed file with 1 addition and 1 deletion.
neural_speed/models/llama/llama.cpp: 1 addition & 1 deletion
@@ -67,7 +67,7 @@ static bool llama_model_eval_internal(model_context* ctx, const model_input* inp
  // input shape will be [1, l_sum]
  if (batch_size > 1)
    MODEL_ASSERT(
-       ("llama arch only supports contiuous batching inference when giving multi prompts.", lctx.cont_batching));
+       ("llama arch only supports continuous batching inference when giving multi prompts.", lctx.cont_batching));
  const bool concat_multi_seqs = batch_size > 1 ? true : false;
  std::vector<int> n_tokens(batch_size);
  std::vector<int> n_pasts(batch_size);
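
The changed line uses a common C/C++ trick for attaching a message to an assertion: wrapping ("message", condition) in an extra pair of parentheses so the comma operator discards the string literal and yields the condition, while the message still appears in the stringified expression printed on failure. The following is a minimal sketch of that idiom; MODEL_ASSERT is assumed here to behave like the standard assert macro, and batch_size / cont_batching are placeholder values rather than the real lctx fields.

// assert_message_sketch.cpp -- illustrative only, not the real MODEL_ASSERT definition
#include <cassert>

// Hypothetical stand-in: the actual MODEL_ASSERT macro lives elsewhere in neural_speed.
#define MODEL_ASSERT(x) assert(x)

int main() {
  int batch_size = 4;          // placeholder for the batch size derived from the inputs
  bool cont_batching = true;   // placeholder for lctx.cont_batching

  if (batch_size > 1)
    // The outer parentheses make the argument a single comma-operator expression:
    // the string is evaluated and discarded, cont_batching decides pass/fail,
    // and the message shows up in the assert output if the condition is false.
    MODEL_ASSERT(
        ("llama arch only supports continuous batching inference when giving multi prompts.", cont_batching));

  return 0;
}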
