-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[STFT][CPU] Improve performance of STFT for CPU by reusage RDFT jit Executor #26967
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
… mitruska/stft_cpu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍🏼
}); | ||
if (m_transpose_frames) { | ||
const auto stft_transp_out_shape = VectorDims{batch_size, fft_out_shape[0], num_frames, fft_out_shape[1]}; | ||
transpose_out4d(reinterpret_cast<const uint8_t*>(dst), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: the simpliest way to perform such things like transpose in CPU plugin is to use the following interface: https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/cpu_memory.h#L190 with correctly initialized memory descriptors. In theory it should provides better performance.
Anyway I don't insist it should be refactored in bounds of this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for this hint! Worth to consider as a follow up, we have also briefly discussed potential further optimizations for the transpose here: #26967 (comment)
Overall the changes looks good to me. The only remaining question is about accuracy. @mitruska Do you have any updates? |
The input test data range for RDFT is defined as: openvino/src/tests/functional/shared_test_classes/include/shared_test_classes/base/utils/ranges.hpp Lines 79 to 80 in 8f0094d
For STFT it is: openvino/src/tests/functional/shared_test_classes/include/shared_test_classes/base/utils/ranges.hpp Line 243 in 8f0094d
The We can align the threshold and input data range (also modified in STFT GPU PR) |
The input data range for common layer tests has been updated on master by other PR: Currently custom threshold is not needed, I suggest to apply and review any additional ideas for tests adjustments separately as it will affect both CPU and GPU. |
…xecutor (openvinotoolkit#26967) ### Details: - Improve performance of STFT for CPU plugin by reusage RDFT jit Executor - Use parallel loops in stft - No changes in the logic of the existing RDFT executor, RDFTExecutor::build function has been added to keep the RDFT details hidden in cpp as is. - Perf numbers collected within the ticket ### Tickets: - 156115 --------- Co-authored-by: Michal Lukaszewski <[email protected]>
Details:
Tickets: