This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

Commit

remove useless code
Signed-off-by: Yu, Zhentao <[email protected]>
zhentaoyu committed Feb 29, 2024
1 parent 15c5c10 commit adb9de6
Showing 1 changed file with 0 additions and 4 deletions: neural_speed/models/llama/llama.cpp
@@ -268,10 +268,6 @@ static bool llama_model_eval_internal(model_context* ctx, const model_input* inp
     struct ne_tensor* const v_cache =
         ne_view_1d(ctx0, kv_self.v, n_ctx * n_embd_gqa * kv_n_ctx_block,
                    il * n_ctx * ne_element_size(kv_self.v) * n_embd_gqa * kv_n_ctx_block);
-    std::vector<ne_tensor*> Kcur_bs(batch_size);
-    std::vector<ne_tensor*> Vcur_bs(batch_size);
-    std::vector<ne_tensor*> k_bs(batch_size);
-    std::vector<ne_tensor*> v_bs(batch_size);
     // cache = [tokens, beams, requests, layers],
     // tokens = [head_dim, head_num, n_ctx] (may different orders)
     size_t off_N_i = 0;

0 comments on commit adb9de6
