What's the issue, what's expected?:
I ran superbenchmark on a server with an NVIDIA L40, and the gemm-flops benchmark failed with the error message "Unsupported architecture". The L40 and L4 are CUDA-capable NVIDIA GPUs with compute capability 8.9, as listed at https://developer.nvidia.com/cuda-gpus
How to reproduce it?:
sb run -f local.ini -c gemm-flops.yaml
where gemm-flops.yaml is default.yaml with enable: ['gemm-flops'] and proc_num: 1
Additional information:
I think compute capability 8.9 should be added to __kernel_map in CudaGemmFlopsBenchmark (superbench/benchmarks/micro_benchmarks/cuda_gemm_flops_performance.py), with the same entry as 8.6, since AD10x GPUs resemble that group in having a limited FP64 TFLOP rate. In addition, third_party/Makefile has two ARCHS lists for the CUDA Toolkit >= 11.8 case, containing 86 and 90; 89 should be added to both.
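To illustrate the proposed change, here is a minimal sketch of a capability-to-kernel lookup that rejects unknown architectures and then accepts 8.9 once it is mapped like 8.6. Everything in it is an assumption for illustration: the tuple keys, the placeholder kernel names, and the helpers `select_kernels` and `with_ada_support` are hypothetical and are not taken from cuda_gemm_flops_performance.py, whose real `__kernel_map` may be structured differently.

```python
from typing import Dict, List, Tuple

Capability = Tuple[int, int]  # (major, minor), e.g. (8, 9) for L40 / L4

# Hypothetical stand-in for the benchmark's kernel map.
# The kernel names are placeholders, not the real SuperBench entries.
KERNEL_MAP: Dict[Capability, List[str]] = {
    (8, 0): ["fp64_kernel", "fp32_kernel", "fp16_tensor_kernel"],
    (8, 6): ["fp32_kernel", "fp16_tensor_kernel"],
}


def select_kernels(capability: Capability,
                   kernel_map: Dict[Capability, List[str]]) -> List[str]:
    """Return the kernels for a capability, or fail like the reported error."""
    try:
        return kernel_map[capability]
    except KeyError:
        raise ValueError(
            "Unsupported architecture: %d.%d" % capability
        ) from None


def with_ada_support(
        kernel_map: Dict[Capability, List[str]]) -> Dict[Capability, List[str]]:
    """Return a copy of the map in which 8.9 reuses the 8.6 kernel list."""
    patched = dict(kernel_map)
    patched[(8, 9)] = list(kernel_map[(8, 6)])
    return patched
```

With the unpatched map, `select_kernels((8, 9), KERNEL_MAP)` raises `ValueError`, mirroring the reported failure. With `with_ada_support(KERNEL_MAP)`, the same call returns the 8.6 kernel list.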
Thanks for reporting the issue. We have created a PR (#634) to support compute capability 8.9. Please check whether it works for you, and let us know if you have more questions!