[BesTLA] New thread pool and hybrid dispatcher #118
…e presets for CMake
Just a suggestion: the default CMake configuration disables our thread pool, so customers who install ns via pip won't get its performance benefit on client platforms.
@zhewang1-intc This PR initially enabled the thread pool, which gives good first-token speed. But it may slow down next-token generation, so it is disabled for now.
Some performance data on 12900K, input = 32:
@zhewang1-intc The next-token perf issue is fixed, but this new thread pool needs more validation, so it will stay disabled by default.
CPU: MTL 6P+8E, 65 W (using 12 threads to keep performance stable)
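For context on what a hybrid dispatcher has to decide on a P-core/E-core part like this one, here is a minimal sketch of weighted work splitting. It is not the BesTLA implementation: `split_work` and its parameters are hypothetical names, and the `p_weight` ratio is an illustrative assumption rather than a measured value.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of BesTLA): split `total` work items across
// n_p threads on P-cores followed by n_e threads on E-cores. Each P-core
// thread gets roughly `p_weight` times the share of an E-core thread.
std::vector<std::size_t> split_work(std::size_t total, std::size_t n_p,
                                    std::size_t n_e, double p_weight) {
  std::vector<std::size_t> counts(n_p + n_e, 0);
  if (counts.empty()) return counts;
  assert(p_weight > 0.0);  // assumption: P-cores are given a positive weight

  // Total capacity measured in "E-core units".
  const double units =
      static_cast<double>(n_p) * p_weight + static_cast<double>(n_e);

  std::size_t assigned = 0;
  for (std::size_t i = 0; i < counts.size(); ++i) {
    const double w = (i < n_p) ? p_weight : 1.0;
    // Proportional share, rounded down and clamped so we never over-assign.
    const auto share =
        static_cast<std::size_t>(static_cast<double>(total) * w / units);
    counts[i] = std::min(share, total - assigned);
    assigned += counts[i];
  }

  // Hand out the rounding remainder one item at a time, P-core threads first.
  for (std::size_t i = 0; assigned < total; i = (i + 1) % counts.size()) {
    ++counts[i];
    ++assigned;
  }
  return counts;
}
```

With an assumed weight of 2.0, `split_work(100, 6, 8, 2.0)` gives each of the 6 P-core threads 10 items and each of the 8 E-core threads 5 items. A real dispatcher would likely derive the ratio from measurement instead of a fixed constant.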
Type of Change