Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[STFT][CPU] Improve performance of STFT for CPU by reusage RDFT jit Executor #26967

Merged
merged 33 commits into from
Dec 19, 2024

Conversation

mitruska
Copy link
Contributor

@mitruska mitruska commented Oct 9, 2024

Details:

  • Improve performance of STFT for CPU plugin by reusage RDFT jit Executor
  • Use parallel loops in stft
  • No changes in the logic of the existing RDFT executor, RDFTExecutor::build function has been added to keep the RDFT details hidden in cpp as is.
  • Perf numbers collected within the ticket

Tickets:

  • 156115

@mitruska mitruska self-assigned this Oct 9, 2024
@github-actions github-actions bot added category: Core OpenVINO Core (aka ngraph) category: CPU OpenVINO CPU plugin category: TEMPLATE OpenVINO Template plugin labels Oct 9, 2024
@github-actions github-actions bot added the category: IE Tests OpenVINO Test: plugins and common label Oct 18, 2024
@mitruska mitruska changed the title [WIP] Enable STFT impl for CPU plugin [WIP] Enable STFT impl for CPU plugin by RDFT Executor Oct 18, 2024

This comment was marked as outdated.

@github-actions github-actions bot added the Stale label Nov 17, 2024
@github-actions github-actions bot removed the Stale label Nov 19, 2024
@github-actions github-actions bot removed category: Core OpenVINO Core (aka ngraph) category: IE Tests OpenVINO Test: plugins and common labels Nov 29, 2024
@github-actions github-actions bot removed the category: TEMPLATE OpenVINO Template plugin label Dec 3, 2024
@github-actions github-actions bot added the category: IE Tests OpenVINO Test: plugins and common label Dec 11, 2024
@mitruska mitruska marked this pull request as ready for review December 11, 2024 12:14
@mitruska mitruska requested a review from a-sidorova December 13, 2024 14:16
src/plugins/intel_cpu/src/nodes/stft.cpp Outdated Show resolved Hide resolved
src/plugins/intel_cpu/src/nodes/stft.cpp Outdated Show resolved Hide resolved
src/plugins/intel_cpu/src/nodes/rdft.h Outdated Show resolved Hide resolved
@mitruska mitruska requested a review from a-sidorova December 16, 2024 15:25
Copy link
Contributor

@a-sidorova a-sidorova left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍🏼

});
if (m_transpose_frames) {
const auto stft_transp_out_shape = VectorDims{batch_size, fft_out_shape[0], num_frames, fft_out_shape[1]};
transpose_out4d(reinterpret_cast<const uint8_t*>(dst),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: the simpliest way to perform such things like transpose in CPU plugin is to use the following interface: https://github.com/openvinotoolkit/openvino/blob/master/src/plugins/intel_cpu/src/cpu_memory.h#L190 with correctly initialized memory descriptors. In theory it should provides better performance.
Anyway I don't insist it should be refactored in bounds of this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this hint! Worth to consider as a follow up, we have also briefly discussed potential further optimizations for the transpose here: #26967 (comment)

@dmitry-gorokhov
Copy link
Contributor

Overall the changes looks good to me. The only remaining question is about accuracy. @mitruska Do you have any updates?
image

@mitruska
Copy link
Contributor Author

Overall the changes looks good to me. The only remaining question is about accuracy. @mitruska Do you have any updates?

@dmitry-gorokhov

The input test data range for RDFT is defined as:

{ov::op::v7::DFT::get_type_info_static(), Range({{0, 1}}, {{0, 1, 1000000}})},
{ov::op::v9::RDFT::get_type_info_static(), Range({{0, 1}}, {{0, 1, 1000000}})},

For STFT it is:
{ov::op::v15::STFT::get_type_info_static(), Range({{16, 24}, {1, 16}}, {{-100, 100}, {-100, 100}})},

The abs_threshold for existing STFT tests can be smaller (1e-5) if the same test data range as for RDFT is set, while the abs_threshold for RDFT test is currently set to (1e-4).

We can align the threshold and input data range (also modified in STFT GPU PR)

@github-actions github-actions bot removed the category: IE Tests OpenVINO Test: plugins and common label Dec 19, 2024
@mitruska
Copy link
Contributor Author

The input data range for common layer tests has been updated on master by other PR:

Currently custom threshold is not needed, I suggest to apply and review any additional ideas for tests adjustments separately as it will affect both CPU and GPU.

@mitruska mitruska enabled auto-merge December 19, 2024 19:00
@mitruska mitruska added this pull request to the merge queue Dec 19, 2024
Merged via the queue into openvinotoolkit:master with commit 13d60b1 Dec 19, 2024
178 checks passed
@mitruska mitruska deleted the mitruska/stft_cpu branch December 19, 2024 22:36
11happy pushed a commit to 11happy/openvino that referenced this pull request Dec 23, 2024
…xecutor (openvinotoolkit#26967)

### Details:
- Improve performance of STFT for CPU plugin by reusage RDFT jit
Executor
 - Use parallel loops in stft
- No changes in the logic of the existing RDFT executor,
RDFTExecutor::build function has been added to keep the RDFT details
hidden in cpp as is.
 - Perf numbers collected within the ticket
 
### Tickets:
 - 156115

---------

Co-authored-by: Michal Lukaszewski <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: CPU OpenVINO CPU plugin
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants