Prompt processing in A100 SXM 80GB #19

Open
charleswg opened this issue Oct 27, 2024 · 2 comments
Comments

@charleswg

I'm sure you've noticed it too: A100 SXM 80GB prompt processing is about 10x SLOWER than A100 PCIe 80GB. Any insight into why?

Given the sudden flood of SXM4-to-PCIe converted cards, I'm curious why it's so much slower.

@XiongjieDai
Owner

It’s definitely surprising, but based on the detailed benchmarks, I wouldn’t say the A100 SXM 80GB is 10x slower than the A100 PCIe 80GB. It is, however, performing unexpectedly slowly.

One possible explanation is that the A100 SXM version is optimized for NVIDIA's NVLink interconnects in specialized systems, offering high-bandwidth communication between GPUs. When these SXM cards are converted to PCIe format, as with the sudden influx of SXM4-to-PCIe converted cards, you might encounter inefficiencies or bottlenecks that aren't present in native PCIe cards. The hardware modification itself could limit performance.
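One way to test the conversion hypothesis on a given machine is to compare the PCIe link each GPU has actually negotiated against the maximum it supports. The following is a minimal sketch, not something run for these benchmarks: it assumes `nvidia-smi` is on the PATH, and the helper names (`parse_links`, `is_degraded`, `query_links`) are made up for illustration. Note that the current link generation can drop at idle for power saving, so the check is most meaningful while a benchmark is running.

```python
import csv
import io
import subprocess

# Query fields understood by `nvidia-smi --query-gpu`.
FIELDS = [
    "name",
    "pcie.link.gen.current",
    "pcie.link.gen.max",
    "pcie.link.width.current",
    "pcie.link.width.max",
]


def parse_links(csv_text):
    """Parse `--format=csv,noheader` output into one dict per GPU."""
    rows = []
    for record in csv.reader(io.StringIO(csv_text), skipinitialspace=True):
        if record:  # skip blank lines
            rows.append(dict(zip(FIELDS, (v.strip() for v in record))))
    return rows


def is_degraded(row):
    """True if the negotiated link is below the card's maximum gen or width."""
    return (
        row["pcie.link.gen.current"] != row["pcie.link.gen.max"]
        or row["pcie.link.width.current"] != row["pcie.link.width.max"]
    )


def query_links():
    """Run nvidia-smi and return the parsed PCIe link info (requires an NVIDIA driver)."""
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=" + ",".join(FIELDS), "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return parse_links(out)
```

Calling `[r for r in query_links() if is_degraded(r)]` during a run would list any card whose link is running below its maximum. A full-speed link would not rule out the other explanations discussed here.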

Additionally, there could be compatibility issues with CUDA drivers or software optimizations that favor the PCIe version over the converted SXM cards. Newer or unconventional hardware setups often face challenges with software support, which can impact performance.

For now, opting for the A100 PCIe might offer a more straightforward, cost-effective solution until any compatibility issues with the SXM version are fully addressed.

@charleswg
Author

Actually, I want to double-check these prompt processing table entries:

A100 PCIe 80GB | 5800.48 | 7504.24 | 726.65 | OOM

A100 SXM 80GB | 5863.92 | 681.47 | 796.81 | OOM
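To put a number on the gap, here is a quick sketch that divides the PCIe row by the SXM row, column by column. The values are copied from the two rows above; the quoted rows do not include the table header, so the columns are only numbered, and `pcie_over_sxm` is an illustrative helper name.

```python
# Rows copied verbatim from the two table entries quoted above.
# None stands for an OOM entry. Column headers are not quoted in the
# thread, so columns are referred to by position only.
rows = {
    "A100 PCIe 80GB": [5800.48, 7504.24, 726.65, None],
    "A100 SXM 80GB": [5863.92, 681.47, 796.81, None],
}


def pcie_over_sxm(table):
    """Return the PCIe/SXM ratio per column, or None where either entry is OOM."""
    pcie = table["A100 PCIe 80GB"]
    sxm = table["A100 SXM 80GB"]
    return [
        p / s if p is not None and s is not None else None
        for p, s in zip(pcie, sxm)
    ]


ratios = pcie_over_sxm(rows)
for index, ratio in enumerate(ratios, start=1):
    label = "OOM" if ratio is None else f"{ratio:.2f}x"
    print(f"column {index}: PCIe / SXM = {label}")
```

Only the second column shows a gap of roughly 11x. In the first and third columns the two cards are within about 10% of each other, with the SXM card slightly ahead, which is why that single entry looks worth re-checking.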

@charleswg reopened this Nov 2, 2024