Add gemm-flops support of Ada Lovelace (L4, L40, L40s), compute capability: 8.9 #624

Closed
avnf opened this issue May 23, 2024 · 1 comment

Comments

@avnf

avnf commented May 23, 2024

What's the issue, what's expected?:
I ran superbenchmark on a server with an NVIDIA L40, and the gemm-flops benchmark failed with the error message "Unsupported architecture". The L40 and L4 are CUDA-capable NVIDIA GPUs with compute capability 8.9, as listed at https://developer.nvidia.com/cuda-gpus

How to reproduce it?:
sb run -f local.ini -c gemm-flops.yaml
where gemm-flops.yaml is default.yaml with enable: ['gemm-flops'] and proc_num: 1

Log message or snapshot?:

[2024-05-23 16:39:42,832 l40-server:365][executor.py:248][INFO] Executor is going to execute gemm-flops.
[2024-05-23 16:39:43,450 l40-server:365][cuda_gemm_flops_performance.py:77][ERROR] Unsupported architecture - benchmark: gemm-flops, compute capability: 8.9, supports 7.0 7.5 8.0 8.6 9.0
[2024-05-23 16:39:43,450 l40-server:365][executor.py:133][INFO] benchmark: gemm-flops, return code: 34, result: {'return_code': [34]}.
[2024-05-23 16:39:43,450 l40-server:365][executor.py:140][ERROR] Executor failed in gemm-flops.

Additional information:

$ nvidia-smi --query-gpu=compute_cap --format=csv
compute_cap
8.9
$ nvidia-smi --query-gpu=gpu_name --format=csv
name
NVIDIA L40
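
To check a machine before running the benchmark, the nvidia-smi output above can be compared against the supported list printed in the error log. This is only a minimal sketch: the helper names parse_compute_caps and unsupported_caps are made up for illustration and are not part of superbench, and the supported list is copied from the log message, not read from the code.

```python
# Compute capabilities copied from the "supports ..." part of the error log.
# This list is an assumption taken from the log, not read from superbench itself.
SUPPORTED_CAPS = ("7.0", "7.5", "8.0", "8.6", "9.0")


def parse_compute_caps(csv_text):
    """Return the capability values from `nvidia-smi --query-gpu=compute_cap --format=csv` output."""
    lines = [line.strip() for line in csv_text.splitlines() if line.strip()]
    # The first non-empty line is the CSV header ("compute_cap"); the rest are one value per GPU.
    return lines[1:]


def unsupported_caps(csv_text, supported=SUPPORTED_CAPS):
    """Return the capabilities reported by nvidia-smi that are missing from the supported list."""
    return [cap for cap in parse_compute_caps(csv_text) if cap not in supported]


if __name__ == "__main__":
    # Output captured on the L40 server, pasted in as text.
    sample = "compute_cap\n8.9\n"
    print(unsupported_caps(sample))
```

On a live system the text could come from running the same nvidia-smi command through subprocess; here it is pasted in so the sketch runs without a GPU.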

I think compute capability 8.9 should be added to __kernel_map in CudaGemmFlopsBenchmark (superbench/benchmarks/micro_benchmarks/cuda_gemm_flops_performance.py), using the same entries as 8.6, since AD10x GPUs resemble that group in having a limited FP64 TFLOPS rate. In addition, third_party/Makefile has two ARCHS lists for the CUDA Toolkit >= 11.8 case, containing 86 and 90; both should be extended with 89.
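
Roughly, the Python side of the proposed change could look like the sketch below. It is not the actual superbench code: the dictionary layout, the kernel names, and the function kernels_for are placeholders I made up to show the idea of reusing the 8.6 entry for 8.9. The real kernel lists are in cuda_gemm_flops_performance.py.

```python
# Hypothetical stand-in for CudaGemmFlopsBenchmark.__kernel_map.
# Keys are compute capabilities; values are placeholder kernel names, not the real ones.
KERNEL_MAP = {
    "8.0": ["placeholder_fp64_kernel", "placeholder_fp32_kernel", "placeholder_fp16_kernel"],
    "8.6": ["placeholder_fp32_kernel", "placeholder_fp16_kernel"],
    "9.0": ["placeholder_fp64_kernel", "placeholder_fp32_kernel", "placeholder_fp16_kernel"],
}

# Proposed change: treat 8.9 (L4, L40, L40S) like 8.6 by reusing its kernel list.
KERNEL_MAP["8.9"] = list(KERNEL_MAP["8.6"])


def kernels_for(capability):
    """Return the kernel list for a capability, or raise with a message like the one in the log."""
    if capability not in KERNEL_MAP:
        supported = " ".join(sorted(KERNEL_MAP))
        raise ValueError(
            "Unsupported architecture - compute capability: {}, supports {}".format(capability, supported)
        )
    return KERNEL_MAP[capability]


if __name__ == "__main__":
    print(kernels_for("8.9"))
```

The lookup alone is not enough: the kernels also have to be built for sm_89, which is why the ARCHS lists in third_party/Makefile need 89 as well.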

@yukirora
Contributor

Thanks for reporting the issue. We have created a PR (#634) to support compute capability 8.9. Please check whether it works for you, and let us know if you have more questions!

@cp5555 cp5555 closed this as completed Aug 2, 2024