llama: rwkv6: Add quantization tensor exclusion
Signed-off-by: Molly Sophia <[email protected]>
MollySophia committed Aug 13, 2024
1 parent 5245608 commit 57269c2
Showing 1 changed file with 3 additions and 0 deletions.
src/llama.cpp: 3 additions & 0 deletions
@@ -16547,6 +16547,9 @@ static void llama_model_quantize_internal(const std::string & fname_inp, const s
quantize &= name.find("ssm_x.weight") == std::string::npos;
quantize &= name.find("ssm_dt.weight") == std::string::npos;

// do not quantize RWKV's time_mix_first tensors
quantize &= name.find("time_mix_first.weight") == std::string::npos;

// do not quantize relative position bias (T5)
quantize &= name.find("attn_rel_b.weight") == std::string::npos;

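For readers outside the llama.cpp codebase, below is a minimal, self-contained sketch of the name-based exclusion pattern this commit extends. The helper function, tensor names, and the main() driver are illustrative assumptions for demonstration only, not llama.cpp's actual API; the real check lives inside llama_model_quantize_internal. The usual rationale for such exclusions is that these tensors are small relative to the rest of the model, so keeping them in full precision costs little file size while avoiding the accuracy loss quantization could introduce.

// exclusion_sketch.cpp -- illustrative sketch, not llama.cpp source
#include <iostream>
#include <string>
#include <vector>

// Hypothetical helper: returns false for tensors whose weights should
// stay unquantized, mirroring the accumulate-with-&= style of the diff.
static bool should_quantize(const std::string & name) {
    bool quantize = true;

    // SSM tensors excluded by earlier commits
    quantize &= name.find("ssm_x.weight")  == std::string::npos;
    quantize &= name.find("ssm_dt.weight") == std::string::npos;

    // RWKV's time_mix_first tensors (added by this commit)
    quantize &= name.find("time_mix_first.weight") == std::string::npos;

    // relative position bias (T5)
    quantize &= name.find("attn_rel_b.weight") == std::string::npos;

    return quantize;
}

int main() {
    // Example tensor names (hypothetical) to show the filter's effect.
    const std::vector<std::string> names = {
        "blk.0.attn_q.weight",
        "blk.0.time_mix_first.weight",
        "blk.0.ssm_dt.weight",
    };
    for (const auto & n : names) {
        std::cout << n << " -> "
                  << (should_quantize(n) ? "quantize" : "keep full precision")
                  << "\n";
    }
    return 0;
}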
