Commit

Big hammer for Mixtral sliding window config
cg123 committed Dec 18, 2023
1 parent 2939acd commit 66523c3
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions mergekit/scripts/mixtral_moe.py
@@ -162,6 +162,7 @@ def build(
     out_cfg = MixtralConfig(**base_cfg.to_dict())
     out_cfg.architectures = ["MixtralForCausalLM"]
     out_cfg.num_local_experts = len(config.experts)
+    out_cfg.sliding_window = None
     out_cfg.save_pretrained(out_path)
 
     if (out_cfg.num_local_experts & (out_cfg.num_local_experts - 1)) != 0:
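
Note on the last context line of the hunk: `(n & (n - 1)) != 0` is the usual bit trick for "n is not a power of two" when n is positive. A minimal sketch of that test follows. The helper name `is_power_of_two` and the sample expert counts are hypothetical and not part of mergekit. What the script does when the check fires is not shown in this hunk.

```python
def is_power_of_two(n: int) -> bool:
    """Return True when n is a positive power of two (1, 2, 4, 8, ...)."""
    # A power of two has exactly one set bit. Subtracting 1 clears that bit
    # and sets every lower bit, so the bitwise AND of n and n - 1 is zero.
    # The n > 0 guard is an addition for this sketch: the bare expression in
    # the diff would also treat 0 as passing, since 0 & -1 == 0.
    return n > 0 and (n & (n - 1)) == 0


# Hypothetical expert counts, for illustration only.
expert_counts = [1, 2, 3, 4, 6, 8]
flags = [is_power_of_two(n) for n in expert_counts]

for count, ok in zip(expert_counts, flags):
    print(f"{count} experts: power of two = {ok}")
```

With this helper, the condition in the diff corresponds to `not is_power_of_two(out_cfg.num_local_experts)` for any positive expert count.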
