Version 3: Investigate poor throughput on Skylake. #14

Open
Mysticial opened this issue Mar 30, 2017 · 2 comments

Comments

@Mysticial
Owner

The add/sub benchmark fails to achieve max throughput on Skylake when running single-threaded. Figure out why and fix it.

@ravenschade

What is the maximum throughput that you expect for Add/Sub on Skylake?

@Mysticial
Owner Author

On Skylake Desktop (not server), the Haswell binary (FMA3) only seems to reach about 80-90% of the theoretical flops for add/sub when running single-threaded. Multi-threaded runs are fine, since the second hyperthread seems to fill those pipeline bubbles.
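For reference, here is a rough sketch of how a single-core theoretical peak could be estimated. It assumes two 256-bit floating-point instructions issue per cycle for add, multiply, and FMA alike, and the clock speed `CLOCK_GHZ` is a hypothetical placeholder, not the clock of the machine that produced the numbers in this thread. The function name `peak_gflops` is also just for illustration.

```python
def peak_gflops(clock_ghz, lanes, ops_per_cycle=2, flops_per_op=1):
    """Estimate single-core peak GFlops.

    clock_ghz     -- core clock in GHz (assumed, not measured)
    lanes         -- elements per 256-bit vector (8 single, 4 double)
    ops_per_cycle -- vector instructions issued per cycle (assumed 2)
    flops_per_op  -- 1 for add/sub/mul, 2 for a fused multiply-add
    """
    return clock_ghz * lanes * ops_per_cycle * flops_per_op


CLOCK_GHZ = 3.0  # hypothetical clock, chosen only to make the arithmetic easy

sp_add = peak_gflops(CLOCK_GHZ, lanes=8)                  # single-precision add/sub
dp_add = peak_gflops(CLOCK_GHZ, lanes=4)                  # double-precision add/sub
sp_fma = peak_gflops(CLOCK_GHZ, lanes=8, flops_per_op=2)  # single-precision FMA

print(f"SP add/sub peak: {sp_add} GFlops")
print(f"DP add/sub peak: {dp_add} GFlops")
print(f"SP FMA peak:     {sp_fma} GFlops")
```

Under these assumptions, add/sub and multiply share the same peak, and FMA is twice that because each instruction counts as two flops.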

Single-Precision - 256-bit AVX - Add/Sub
    GFlops = 41.856
    Result = 5.37046e+06

Double-Precision - 256-bit AVX - Add/Sub
    GFlops = 21.664
    Result = 2.77755e+06

Single-Precision - 256-bit AVX - Multiply
    GFlops = 50.592
    Result = 6.41972e+06

Double-Precision - 256-bit AVX - Multiply
    GFlops = 26.016
    Result = 3.31828e+06

Single-Precision - 256-bit AVX - Multiply + Add
    GFlops = 45.12
    Result = 4.8147e+06

Double-Precision - 256-bit AVX - Multiply + Add
    GFlops = 22.224
    Result = 2.33547e+06

Single-Precision - 256-bit FMA3 - Fused Multiply Add
    GFlops = 107.328
    Result = 6.82334e+06

Double-Precision - 256-bit FMA3 - Fused Multiply Add
    GFlops = 55.392
    Result = 3.54084e+06

Add/Sub, Multiply, and Multiply+Add should all reach the same throughput for the same-sized datatype, but Add/Sub is about 17% lower than Multiply and Multiply+Add is about 11-15% lower.

So the problem is not limited to Add/Sub; Multiply+Add is affected as well.
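The gap can be checked directly from the numbers above. This is a small sketch that treats the Multiply result as the baseline for each precision; the dictionary layout and the helper name `shortfall_vs_multiply` are just for illustration.

```python
# GFlops figures copied from the 256-bit AVX results posted above.
results = {
    "single": {"Add/Sub": 41.856, "Multiply": 50.592, "Multiply + Add": 45.12},
    "double": {"Add/Sub": 21.664, "Multiply": 26.016, "Multiply + Add": 22.224},
}


def shortfall_vs_multiply(gflops):
    """Return how far each benchmark falls below Multiply, in percent."""
    base = gflops["Multiply"]
    return {
        name: 100.0 * (1.0 - value / base)
        for name, value in gflops.items()
        if name != "Multiply"
    }


for precision, gflops in results.items():
    for name, pct in shortfall_vs_multiply(gflops).items():
        print(f"{precision:6s} {name:15s} {pct:5.1f}% below Multiply")
```

Using Multiply as the baseline avoids having to know the exact clock speed during the run, since all three benchmarks are expected to have the same peak.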
