Does training qwen2.5-coder-33B consume more device memory than training qwen2.5-33B-instruct? #6370
Unanswered
yangyang6666
asked this question in
Q&A
Replies: 0 comments
Reminder
System Info
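Does training qwen2.5-coder-33B consume more device memory than training qwen2.5-33B-instruct?
In my current training runs at a 32k sequence length, 16 910B cards are enough to train qwen2.5-33B-instruct, but even 128 cards cannot train the coder model.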
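For reference, here is a rough back-of-the-envelope sketch of how model-state memory would scale with card count. It assumes mixed-precision Adam (about 16 bytes per parameter for weights, gradients, and optimizer states) with those states fully sharded across devices; neither assumption is confirmed by my actual setup. The parameter count is simply read off the "33B" in the model names, and `model_state_gib_per_device` is a hypothetical helper, not part of any library.

```python
def model_state_gib_per_device(n_params: float, n_devices: int,
                               bytes_per_param: int = 16) -> float:
    """Estimate per-device memory (GiB) for weights, gradients, and optimizer states.

    Assumes the states are evenly sharded across all devices (an assumption,
    not something the original setup states). Activations are NOT included.
    """
    return n_params * bytes_per_param / n_devices / 2**30


# Assumed parameter count, taken from the "33B" in the model names.
N_PARAMS = 33e9

for n_devices in (16, 128):
    gib = model_state_gib_per_device(N_PARAMS, n_devices)
    print(f"{n_devices} devices: ~{gib:.1f} GiB of model states per device")
```

Under these assumptions the sharded model states shrink as cards are added, so they alone would not explain a failure at 128 cards. Activation memory at 32k and any differences between the two models' configs are not captured by this estimate.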
Reproduction
1
Expected behavior
No response
Others
No response