Amdahl's Law #3
Comments
What type of machines do you have access to, and what would you like access to?
The 88-vcore Broadwell-EP system I played with last year had 768GB of RAM, which is enough to run 100 billion digits of Pi entirely in RAM. That is large enough for me to see long parts of the computation in Task Manager using only 1 core of CPU (due to Amdahl's Law on the non-parallelized routines). The largest machine I have access to right now is a 10-core/20-thread Core i9 7900X with 128GB of RAM. While it's not large enough to see the non-parallelized parts in Task Manager with the naked eye (as single-core CPU usage), they are visible under a suitable profiler with millisecond granularity. Since I am able to see them under a profiler, I've been able to track down a number of Amdahl's Law offenders in the code and fix (parallelize) them. These fixes will come out in v0.7.5; the speedup on the 7900X is small and barely noticeable. The real benchmark is to see how much things have improved on a large system similar to the 88-vcore Broadwell I played with last year. A Knights Landing system with a lot of memory should also be a good benchmark for Amdahl's Law effects.
Maybe you could get in contact with one of the guys from Linus Tech Tips (a tech YouTube channel, if you haven't heard of them). They use y-cruncher as part of their standard benchmarking for CPUs and they are always playing around with high core count systems.
@MikeS159 True. Do they still use it? I have not heard much mention of it.
I saw it in his Ryzen and Skylake X reviews. Not sure if he used it in his Coffee Lake review since I haven't really paid attention to that line.
Amdahl's Law is apparent on very large machines.
I had the opportunity back in 2016 to play with an 88-vcore Broadwell system and it was obvious just by looking at Task Manager. This is caused by a number of unparallelized linear operations such as bignum addition and subtraction.
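To put rough numbers on why this shows up on an 88-vcore box but not on a 20-thread desktop, here is a minimal sketch of the Amdahl's Law formula. The 99% parallel fraction is a made-up figure for illustration only, not a measurement of y-cruncher, and `amdahl_speedup` is a hypothetical helper name.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup under Amdahl's Law.

    parallel_fraction: share of single-threaded runtime that can be
                       parallelized (0.0 to 1.0).
    workers:           number of hardware threads doing the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Assumed value for illustration only: 99% of the work is parallel.
    p = 0.99
    for workers in (20, 88):
        speedup = amdahl_speedup(p, workers)
        efficiency = speedup / workers
        print(f"{workers:3d} threads: speedup {speedup:5.1f}x, "
              f"efficiency {efficiency:.0%}")
```

With that assumed fraction, the same 1% of serial work costs far more efficiency at 88 threads than at 20, which is consistent with the single-core stretches only being visible to the naked eye on the larger machine.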
These operations have historically been memory-bound and were largely neglected as far as optimizations go. At this point, we're entering an era where a parallelized bignum multiply may actually be faster than an unparallelized bignum add.
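For readers wondering how an addition can be parallelized at all given the carry chain, one textbook approach is carry-select: split the limbs into chunks, add each chunk twice (once assuming carry-in 0, once assuming carry-in 1), then pick the right result per chunk in a short serial pass. The sketch below only illustrates that idea and is not how y-cruncher does it. The 32-bit limb size, the little-endian list layout, and the names `parallel_add` and `_add_chunk` are all assumptions, and Python threads will not give a real speedup here because of the GIL.

```python
from concurrent.futures import ThreadPoolExecutor

BASE = 1 << 32  # assumed limb size for this sketch


def _add_chunk(a, b, carry_in):
    """Ripple-carry add of two equal-length limb lists (little-endian)."""
    out = []
    carry = carry_in
    for x, y in zip(a, b):
        s = x + y + carry
        out.append(s % BASE)
        carry = s // BASE
    return out, carry


def _add_chunk_both(a, b):
    """Compute the chunk sum for both possible carry-in values."""
    return _add_chunk(a, b, 0), _add_chunk(a, b, 1)


def parallel_add(a, b, chunks=4):
    """Carry-select addition of two equal-length little-endian limb lists.

    Returns (sum_limbs, carry_out).
    """
    if len(a) != len(b):
        raise ValueError("operands must have the same number of limbs")
    if chunks < 1:
        raise ValueError("chunks must be at least 1")
    n = len(a)
    size = max(1, -(-n // chunks))  # ceiling division
    spans = [(i, min(i + size, n)) for i in range(0, n, size)]

    # Parallel phase: every chunk is independent of its neighbours.
    with ThreadPoolExecutor(max_workers=chunks) as pool:
        partial = list(pool.map(
            lambda s: _add_chunk_both(a[s[0]:s[1]], b[s[0]:s[1]]), spans))

    # Serial phase: one carry decision per chunk, not per limb.
    result = []
    carry = 0
    for no_carry, with_carry in partial:
        limbs, carry = with_carry if carry else no_carry
        result.extend(limbs)
    return result, carry
```

The trade-off is that carry-select does roughly twice the arithmetic in exchange for shrinking the serial part to one step per chunk, which only pays off when the operation is no longer purely memory-bound.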
Some work has been done in the v0.7.x releases to parallelize some of these linear operations, so this needs to be re-tested. Unfortunately I don't regularly have access to these types of machines to see what kind of progress has been made.