Motivation
I'm currently profiling the CUDA code generated by NNFusion. To make the profile easier to read, I suggest using the NVIDIA Tools Extension (NVTX): by inserting `nvtxRangePush` and `nvtxRangePop` at the start and the end of each op function, we can see which low-level kernels cuDNN invokes for that op.
The output will contain the kernels used by the annotated op function.
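A rough sketch of what the inserted calls could look like. `run_op` and the op name `Convolution_0` are hypothetical stand-ins, not actual NNFusion output, and the `USE_NVTX` switch is my own assumption so the snippet also builds on a machine without the CUDA toolkit (the fallback stubs only log the ranges).

```cpp
// Sketch only: run_op and USE_NVTX are hypothetical, not part of NNFusion.
#include <cassert>
#include <cstdio>

#ifdef USE_NVTX
#include <nvToolsExt.h>  // real NVTX; link with -lnvToolsExt
#else
// Fallback stubs so the sketch builds without the CUDA toolkit.
static int g_depth = 0;  // current nesting level of open ranges

static int nvtxRangePushA(const char* name) {
    std::printf("%*spush %s\n", 2 * g_depth, "", name);
    return g_depth++;  // zero-based depth of the range being opened
}

static int nvtxRangePop() {
    assert(g_depth > 0 && "pop without a matching push");
    return --g_depth;  // zero-based depth of the range being closed
}
#endif

// Hypothetical stand-in for one generated op function.
void run_op(const char* op_name) {
    nvtxRangePushA(op_name);  // open the range at function entry
    // ... cuDNN calls / kernel launches for this op would go here ...
    nvtxRangePop();           // close the range at function exit
}
```

Calling `run_op("Convolution_0")` from the generated entry point and building with `-DUSE_NVTX` should group the kernels launched inside the function under a range with that name in the profiler timeline.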