Commit
Add OpenMP SIMD pragmas to portable path.
The omp simd pragma is supported by OpenMP 4.0+, but several compilers also offer the SIMD portion of OpenMP behind a separate flag, since it doesn't require OpenMP support at runtime. If you want to use these pragmas without enabling full OpenMP support, make sure to define _ENABLE_OPENMP_SIMD at compile time.

In my own extremely limited testing, this patch results in a roughly 25% performance increase (GCC 8.3 with -O3 -march=native on a Xeon E3-1225 v3 running Fedora 29). It's still vastly slower than the hand-optimized AVX2 code path, but this should be portable.

I haven't really spent much time optimizing this yet. There is probably a fair amount of room for improvement by adding safelen and aligned clauses where appropriate.
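To illustrate the kind of guard described above, here is a minimal sketch, not the code from this patch. The EXAMPLE_OMP_SIMD macro and the example_xor_u32 function are hypothetical names made up for this example. Only _ENABLE_OPENMP_SIMD comes from the commit message, and the assumption is that it is checked alongside the standard _OPENMP version macro (201307 corresponds to OpenMP 4.0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper macro (not from the patch): expands to the pragma only
 * when full OpenMP 4.0+ is enabled, or when the build defines
 * _ENABLE_OPENMP_SIMD to signal that SIMD-only OpenMP support is on. */
#if (defined(_OPENMP) && (_OPENMP >= 201307L)) || defined(_ENABLE_OPENMP_SIMD)
#  define EXAMPLE_OMP_SIMD _Pragma("omp simd")
#else
#  define EXAMPLE_OMP_SIMD
#endif

/* Hypothetical portable loop: XOR two buffers element by element.
 * The pragma asks the compiler to vectorize the loop; without it the
 * function still compiles and produces the same result. */
void example_xor_u32(uint32_t *dst, const uint32_t *a, const uint32_t *b,
                     size_t n) {
  assert(dst != NULL && a != NULL && b != NULL);
  EXAMPLE_OMP_SIMD
  for (size_t i = 0; i < n; i++) {
    dst[i] = a[i] ^ b[i];
  }
}
```

With GCC or Clang, a build along the lines of `-O3 -fopenmp-simd -D_ENABLE_OPENMP_SIMD` would enable the pragma without linking the OpenMP runtime. Whether the pragma actually speeds up a given loop depends on the compiler and target, so it is worth measuring.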
Showing 1 changed file with 36 additions and 2 deletions.