ncu_rep only profiles the forward pass in fwd_bwd mode, and it fails for bwd tests.
To reproduce:
python run.py --op rms_norm --mode bwd --precision fp32 --metrics ncu_rep,kineto_trace --cudagraph
0%| | 0/6 [00:00<?, ?it/s]I1202 15:21:27.471187 1905825 DynoCmdLine.cpp:1393] Target Host: localhost (port 1777)
Failed to configure DCGM profiling, it may be DCGM is not supported or failed==PROF== Connected to process 1906325 (/home/yhao/.conda/envs/ptd/bin/python3.11)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.40it/s]
(M, H) llama_rms-_ncu_trace_in_task
------------ ------------------------------
(2048, 1024) success
==PROF== Disconnected from process 1906325
==WARNING== No kernels were profiled.
==WARNING== Note that specified NVTX include expressions match only push/pop ranges.
==WARNING== Refer https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#nvtx-filtering for NVTX Filtering usage.
I1202 15:21:34.731528 1906690 DynoCmdLine.cpp:1393] Target Host: localhost (port 1777)
Failed to configure DCGM profiling, it may be DCGM is not supported or failed==PROF== Connected to process 1906796 (/home/yhao/.conda/envs/ptd/bin/python3.11)
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:01<00:00, 1.08s/it]
(M, H) liger_rms-_ncu_trace_in_task
------------ ------------------------------
(2048, 1024) success
==PROF== Disconnected from process 1906796
==WARNING== No kernels were profiled.
==WARNING== Note that specified NVTX include expressions match only push/pop ranges.
==WARNING== Refer https://docs.nvidia.com/nsight-compute/NsightComputeCli/index.html#nvtx-filtering for NVTX Filtering usage.
I1202 15:21:42.512831 1907289 DynoCmdLine.cpp:1393] Target Host: localhost (port 1777)
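In the log above, each task reports `success` even though ncu then warns that no kernels were profiled. A minimal sketch of a check that flags such runs from captured console output follows; the helper name `find_empty_ncu_runs` and the `sample` text are hypothetical and not part of run.py, and the matching assumes the warning follows the disconnect line as it does in this log.

```python
import re

# Warning text copied from the ncu output in this report.
NO_KERNELS = "==WARNING== No kernels were profiled."


def find_empty_ncu_runs(log_text):
    """Return PIDs of ncu sessions that disconnected without profiling any kernel.

    Hypothetical helper: assumes the warning appears after the matching
    "Disconnected from process" line, as in the log above.
    """
    empty = []
    last_pid = None
    for line in log_text.splitlines():
        match = re.search(r"==PROF== Disconnected from process (\d+)", line)
        if match:
            last_pid = int(match.group(1))
        elif NO_KERNELS in line and last_pid is not None:
            empty.append(last_pid)
            last_pid = None
    return empty


# Excerpt of the log above, trimmed to the relevant lines.
sample = """\
(2048, 1024) success
==PROF== Disconnected from process 1906325
==WARNING== No kernels were profiled.
(2048, 1024) success
==PROF== Disconnected from process 1906796
==WARNING== No kernels were profiled.
"""

print(find_empty_ncu_runs(sample))
```

A check like this would let the benchmark report a failure instead of `success` when the resulting report contains no kernels.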