[RFC] add heterogeneous computing capabilities to UADK #638
Synchronize the internal development code to keep the basic functional code consistent. Signed-off-by: Longfang Liu <[email protected]>
Synchronize the interface layer code to ensure that basic functions are consistent before adding new functions. Signed-off-by: Longfang Liu <[email protected]>
Synchronize the code of the test tool UADK Tools to ensure that the test tool works correctly before adding new functions. Signed-off-by: Longfang Liu <[email protected]>
Add a heterogeneous scheduling function to UADK. It combines hard computing acceleration with soft computing instruction acceleration and keeps both types of acceleration effective at the same time, which improves acceleration capability. Signed-off-by: Longfang Liu <[email protected]>
Add a scheduler for heterogeneous computing with a dynamic scheduling solution. It balances the load between soft and hard computing to maintain the best performance. Signed-off-by: Longfang Liu <[email protected]>
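The load-balancing idea in the scheduler commit can be illustrated with a minimal sketch. This is not the scheduler from this PR: the names `hetero_sched`, `sched_pick`, `sched_complete`, and the backend enum are hypothetical, and the policy (pick the backend with the lowest weighted in-flight count) is only one assumed way to balance hard and soft computing.

```c
#include <assert.h>

/* Hypothetical backend kinds; these are not UADK identifiers. */
enum backend_kind { BACKEND_HW = 0, BACKEND_CE = 1, BACKEND_COUNT };

struct backend_load {
    unsigned int inflight; /* requests submitted but not yet completed */
    unsigned int weight;   /* assumed relative throughput; larger means faster */
};

struct hetero_sched {
    struct backend_load load[BACKEND_COUNT];
};

/* Reset the counters and record the assumed relative throughput of each backend. */
void sched_init(struct hetero_sched *s, unsigned int hw_weight, unsigned int ce_weight)
{
    assert(s != 0 && hw_weight > 0 && ce_weight > 0);
    s->load[BACKEND_HW].inflight = 0;
    s->load[BACKEND_HW].weight = hw_weight;
    s->load[BACKEND_CE].inflight = 0;
    s->load[BACKEND_CE].weight = ce_weight;
}

/*
 * Choose the backend whose weighted load, (inflight + 1) / weight, is lowest.
 * The two ratios are compared by cross-multiplication to avoid division.
 * Ties go to the hardware backend.
 */
enum backend_kind sched_pick(struct hetero_sched *s)
{
    unsigned long hw_cost = (unsigned long)(s->load[BACKEND_HW].inflight + 1) *
                            s->load[BACKEND_CE].weight;
    unsigned long ce_cost = (unsigned long)(s->load[BACKEND_CE].inflight + 1) *
                            s->load[BACKEND_HW].weight;
    enum backend_kind k = (ce_cost < hw_cost) ? BACKEND_CE : BACKEND_HW;

    s->load[k].inflight++;
    return k;
}

/* Called when a request finishes so the backend's load estimate drops again. */
void sched_complete(struct hetero_sched *s, enum backend_kind k)
{
    assert(s->load[k].inflight > 0);
    s->load[k].inflight--;
}
```

With equal weights, successive picks alternate between the two backends until completions free one of them. A real scheduler would also have to handle per-algorithm support, request sizes, and thread safety, none of which are modelled here.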
Add heterogeneous hybrid computing support for cipher, digest and comp. Signed-off-by: Longfang Liu <[email protected]>
Add heterogeneous computing functions to the soft and hard computing drivers of UADK. The drivers are adapted so that different devices can perform heterogeneous computing at the same time and provide acceleration. To keep the build working, some drivers have been processed with hac mode and can be compiled directly through UADK_MK.SH. Signed-off-by: Longfang Liu <[email protected]>
In uadk tools, enable the heterogeneous computing function of init2 mode of cipher and digest. This allows the init2 interface to directly complete heterogeneous computing. Signed-off-by: Longfang Liu <[email protected]>
Performance test results of the new framework:
tds | init1(HW) | init1(HW + CE) | increase
In the current UADK framework, the hardware acceleration function and
the software acceleration function are merged so that software
instruction acceleration and hardware offload can run at the same time,
which gives users higher performance.
With the heterogeneous scheduling mode enabled in the current scheduler,
the test performance data is as follows:

Alg      Mode(1KB)     Performance(MB/s)     CPU
                       sync      async       sync     async
sm4-ecb  init1(HW)     454       1322        100%     200.00%
         init2(HW+CE)  1445.1    1864        100%     195.00%
         increase      218.30%   41.00%      0.00%    -2.50%
sm3      init1(HW)     153.1     1481        99%      199.80%
         init2(HW+CE)  431.5     508         100%     199.80%
         increase      181.84%   -65.70%     0.91%    0.00%

Alg      Mode(8KB)     Performance(MB/s)     CPU
                       sync      async       sync     async
sm4-ecb  init1(HW)     1407.5    9092        100%     198.00%
         init2(HW+CE)  3626.8    6021        100%     199.80%
         increase      157.68%   -33.78%     0.00%    0.91%
sm3      init1(HW)     960.4     5161.1      100%     183.80%
         init2(HW+CE)  549.6     530.1       100%     199.80%
         increase      -42.77%   -89.73%     -0.40%   8.71%
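The "increase" rows appear to be the relative change of the init2(HW+CE) figure over the init1(HW) baseline. A minimal sketch of that arithmetic, assuming this is how the rows were derived (the function name `percent_increase` is hypothetical):

```c
#include <assert.h>

/*
 * Relative change of the heterogeneous result over the hardware-only
 * baseline, in percent. Assumed formula: (hetero - baseline) / baseline * 100.
 */
double percent_increase(double baseline, double hetero)
{
    assert(baseline > 0.0);
    return (hetero - baseline) / baseline * 100.0;
}
```

For example, the sm4-ecb 1KB sync figures (454 and 1445.1 MB/s) give about 218.30%, and the sm3 1KB async figures (1481 and 508 MB/s) give about -65.70%, matching the table.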
Without increasing CPU usage, synchronous-mode throughput improves
substantially for sm4-ecb (218.30% at 1KB, 157.68% at 8KB) and for sm3
at 1KB (181.84%); sm3 at 8KB is the exception and drops by 42.77%.
In asynchronous mode, throughput drops in three of the four cases because
the CPU is also used for soft calculations. This can be addressed later by
creating dedicated calculation threads.