Please note that the performance of fp4_e2m1 and fp4_bnb is also poor on E-cores.
Please remember to use the int4 config as the default on hybrid CPUs.
@yuchengliu1 Can you check whether the new thread pool dispatches all jobs to the P-cores when the E-cores have poor performance?
f4 performance will be remeasured after #172.
Some parallel regions created with schedule2D are not dispatched, such as those for PrologueA and MHA reorder.
@yuchengliu1 Just run the benchmark. Focus on GEMM only.
Type of Change
Automatically adjust the thread count and disable running nf4 on E-cores when running inference with an nf4 model on a hybrid CPU.
Quantizing a model to nf4 on a hybrid CPU now emits a warning.
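The behavior described above can be illustrated with a minimal Python sketch. It is not the code in this PR: `CpuTopology`, `pick_inference_threads`, `check_quantization_target`, and `ECORE_SLOW_DTYPES` are hypothetical names, the warning text is invented, and the sketch assumes that capping the thread count at the number of P-cores is enough to keep work off the E-cores (a real implementation would also need thread affinity).

```python
import warnings
from dataclasses import dataclass

# Assumption: weight dtypes treated as slow on E-cores.
# The PR description names nf4 only.
ECORE_SLOW_DTYPES = frozenset({"nf4"})


@dataclass(frozen=True)
class CpuTopology:
    """Hypothetical description of the host CPU (not the project's API)."""
    p_cores: int
    e_cores: int

    @property
    def is_hybrid(self) -> bool:
        # A hybrid CPU has both performance and efficiency cores.
        return self.p_cores > 0 and self.e_cores > 0


def pick_inference_threads(requested: int, weight_dtype: str, cpu: CpuTopology) -> int:
    """Return the thread count to use, avoiding E-cores for slow dtypes."""
    total = cpu.p_cores + cpu.e_cores
    threads = max(1, min(requested, total))
    if cpu.is_hybrid and weight_dtype in ECORE_SLOW_DTYPES:
        # Cap at the P-core count so that no E-core is needed.
        threads = min(threads, cpu.p_cores)
    return threads


def check_quantization_target(weight_dtype: str, cpu: CpuTopology) -> None:
    """Warn when quantizing to a dtype that is slow on E-cores of a hybrid CPU."""
    if cpu.is_hybrid and weight_dtype in ECORE_SLOW_DTYPES:
        warnings.warn(
            f"quantizing to {weight_dtype} on a hybrid CPU: "
            "this dtype performs poorly on E-cores",
            RuntimeWarning,
        )
```

With 8 P-cores and 8 E-cores, `pick_inference_threads(16, "nf4", CpuTopology(p_cores=8, e_cores=8))` returns 8, while the same call with `"int4"` keeps all 16 threads.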
Description
detailed description
Issues: xxx
Expected Behavior & Potential Risk
the expected behavior that is triggered by this PR
How has this PR been tested?
how to reproduce the test (including hardware information)
Dependency Change?
any library dependency introduced or removed