Prompt processing in A100 SXM 80GB #19
Comments
It’s definitely surprising, but based on the detailed benchmarks, I wouldn’t say the A100 SXM 80GB is 10x slower than the A100 PCIe 80GB. However, it is indeed performing unexpectedly slower.

One possible explanation is that the A100 SXM version is designed for NVIDIA's NVLink interconnects in specialized systems, which offer high-bandwidth communication between GPUs. When these SXM cards are converted to PCIe format (as with the sudden influx of SXM4-to-PCIe converted cards), you might encounter inefficiencies or bottlenecks that aren't present in native PCIe cards. These hardware modifications could lead to suboptimal performance due to limitations introduced during the conversion process.

Additionally, there could be compatibility issues with CUDA drivers or software optimizations that favor the PCIe version over the converted SXM cards. Newer or unconventional hardware setups often face challenges with software support, which can impact performance.

For now, opting for the A100 PCIe might offer a more straightforward, cost-effective solution until any compatibility issues with the SXM version are fully addressed.
Actually, I want to double-check the prompt processing table entries:

| A100 PCIe 80GB | 5800.48 | 7504.24 | 726.65 | OOM |
| A100 SXM 80GB | 5863.92 | 681.47 | 796.81 | OOM |
I'm sure you've noticed it too. A100 SXM 80GB prompt processing is like 10x SLOWER than A100 PCIe 80GB. Any insight into why?
Given the sudden flood of SXM4-to-PCIe converted cards, it's very interesting that it's so much slower.
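
To put a number on the gap, here is a minimal sketch that divides the PCIe row by the SXM row quoted in this issue, column by column. The values are copied from the two rows as written. The column meanings are not stated here, so the labels are positional only, and the names `pcie`, `sxm`, and `column_ratios` are made up for this illustration.

```python
# Throughput values copied from the two table rows quoted in this issue.
# The column meanings are not given here, so columns are labeled by position
# only (an assumption made for illustration). The trailing OOM column is skipped.
pcie = [5800.48, 7504.24, 726.65]  # A100 PCIe 80GB row
sxm = [5863.92, 681.47, 796.81]    # A100 SXM 80GB row


def column_ratios(a, b):
    """Return a[i] / b[i] for each column position."""
    return [x / y for x, y in zip(a, b)]


ratios = column_ratios(pcie, sxm)
for i, r in enumerate(ratios, start=1):
    print(f"column {i}: PCIe / SXM = {r:.2f}x")
```

Only the second column shows a large gap (roughly 11x), while the first and third are within about 10% of each other. That pattern may be worth ruling out as a data-entry issue in that single cell before attributing it to the hardware.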