Micro optimization for softmax_forward_kernel5 #1684
ci.yml
on: pull_request
build-cuda-windows: 1m 19s
build-ubuntu20-04: 2m 41s
build-cuda-fp32: 1m 20s
build-cuda-bf16: 1m 21s
build-cuda-fp16: 1m 15s
build-cuda-kernels: 1m 36s
Matrix: build-and-test-cpu