Skip non-selected experts for mixtral and qwen2_moe #32429

Coco58323 · 2024-08-05T13:26:41Z

This PR avoids redundant computation for some MoE models (mixtral and qwen2_moe).
The current implementation loops over all the experts and therefore loads every expert's weights, including experts that no token was routed to, which adds extra IO cost.
@ArthurZucker
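
As a rough illustration of the idea (not the actual diff in this PR), the sketch below builds a toy top-k MoE block and visits only the experts that received at least one token. `TinySparseMoE`, its sizes, and the `skip_unselected` flag are hypothetical names made up for this example; the real mixtral and qwen2_moe blocks differ in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinySparseMoE(nn.Module):
    """Toy top-k MoE block. Names and sizes are illustrative only."""

    def __init__(self, hidden_dim=8, num_experts=6, top_k=2):
        super().__init__()
        self.num_experts = num_experts
        self.top_k = top_k
        self.gate = nn.Linear(hidden_dim, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Linear(hidden_dim, hidden_dim, bias=False) for _ in range(num_experts)
        )

    def forward(self, hidden_states, skip_unselected=True):
        # hidden_states: (num_tokens, hidden_dim)
        probs = F.softmax(self.gate(hidden_states), dim=-1)
        weights, selected = torch.topk(probs, self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)

        # expert_mask[e, k, t] == 1 when token t picked expert e in slot k.
        expert_mask = F.one_hot(selected, num_classes=self.num_experts).permute(2, 1, 0)

        if skip_unselected:
            # Data-dependent: only experts that were hit by some token.
            expert_ids = expert_mask.sum(dim=(1, 2)).nonzero().flatten().tolist()
        else:
            # Static: touch every expert, even those with no tokens.
            expert_ids = range(self.num_experts)

        out = torch.zeros_like(hidden_states)
        visited = []
        for e in expert_ids:
            visited.append(e)
            slot, token = torch.where(expert_mask[e])
            expert_out = self.experts[e](hidden_states[token])
            out.index_add_(0, token, expert_out * weights[token, slot, None])
        return out, visited
```

Both paths should produce the same output; the difference is only in how many experts' weights are touched, which matters most when the number of tokens is small relative to the number of experts.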

Coco58323 · 2024-08-06T06:21:02Z

@ArthurZucker, I ran into a conflict between torch.fx and the dynamic for loop in my implementation. I haven't found a concise solution for this issue yet. Could you help, or should we drop this change?

zhengyaowei · 2024-08-07T10:58:33Z

#30209

Coco58323 · 2024-08-07T14:08:45Z

#30209

Thanks for the info. I am not very familiar with 'torch.fx'. It seems there is a trade-off between enabling FX tracing and skipping experts.
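
To make the trade-off concrete, here is a minimal sketch using plain `torch.fx.symbolic_trace` (the transformers test suite uses its own tracer, so this is only an approximation of the failure seen in the PR). `DynamicExpertLoop`, `StaticExpertLoop`, and `fx_traceable` are hypothetical names for this example. The loop whose length depends on tensor values cannot be traced symbolically, while the fixed-length loop can.

```python
import torch
import torch.nn as nn
from torch.fx import symbolic_trace
from torch.fx.proxy import TraceError


class DynamicExpertLoop(nn.Module):
    """Loops only over experts that were hit; loop length depends on data."""

    def forward(self, mask):
        hit = mask.sum(dim=-1).nonzero().flatten()
        total = mask.new_zeros(())
        for e in hit.tolist():  # iterating over a traced value fails under fx
            total = total + e
        return total


class StaticExpertLoop(nn.Module):
    """Loops over every expert; loop length is known at trace time."""

    def __init__(self, num_experts=3):
        super().__init__()
        self.num_experts = num_experts

    def forward(self, mask):
        total = mask.new_zeros(())
        for e in range(self.num_experts):
            total = total + e * (mask[e].sum() > 0)
        return total


def fx_traceable(module):
    """Return True if symbolic tracing succeeds, False on a TraceError."""
    try:
        symbolic_trace(module)
        return True
    except TraceError:
        return False
```

In eager mode both modules compute the same value; only the static variant survives symbolic tracing, which is the tension between FX support and skipping experts.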

ArthurZucker

#31173 is related as well.
Yeah, if fx is failing I'm not 100% sure we want to do this, but we do need benchmarks of some sort for the reduced IO, because for qwen with a lot of experts it can be significant.

ArthurZucker

Still down to get this merged! Would you mind just producing small before/after benchmarks?
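
A before/after comparison along these lines could be as small as the timing helper sketched below. `bench` and `compare` are hypothetical helpers written for this example, not part of transformers, and the sketch measures wall-clock time only.

```python
import statistics
import time


def bench(fn, warmup=3, iters=20):
    """Return the median wall-clock seconds of fn() over `iters` timed runs."""
    for _ in range(warmup):
        fn()  # warm caches before timing
    samples = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(before_fn, after_fn, **kwargs):
    """Time two callables and report the before/after ratio."""
    before = bench(before_fn, **kwargs)
    after = bench(after_fn, **kwargs)
    speedup = before / after if after > 0 else float("inf")
    return {"before_s": before, "after_s": after, "speedup": speedup}
```

Here `before_fn` and `after_fn` would each wrap one forward pass of the same model and inputs, on the main branch and on this branch respectively. On GPU, each callable should end with `torch.cuda.synchronize()`, since CUDA kernels run asynchronously and the timer would otherwise stop too early.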

Coco58323 added 3 commits August 5, 2024 21:19

Skip non-selected experts for mixtral and qwen2_moe

00e19b4

Fix: tensor tolist()

774f2c8

WIP: tokenization test

7c092a6

ArthurZucker reviewed Aug 8, 2024


ArthurZucker reviewed Sep 27, 2024
