I want to deploy the YOLOv5 model on the GPU and the DLA at the same time. Will there be resource contention between the two? My previous understanding is that models like YOLOv5 contain layers the DLA does not support, and those layers fall back to CUDA, which significantly reduces efficiency.
Using cuDLA requires that every layer be supported by the DLA, so we moved the unsupported layers into post-processing; the network itself does not consume GPU resources at runtime. Compared to cuDLA Hybrid mode, cuDLA Standalone mode does not create a CUDA context, so there is no CUDA context-switching overhead in the multi-process case.
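For reference, here is a minimal sketch of how one could enforce this "DLA-only" constraint when building a TensorRT engine, assuming the TensorRT 8.x Python API on Jetson. This is not the repo's actual build script; the function name and ONNX path are placeholders. The key point is that GPU_FALLBACK is deliberately not set, so any DLA-incompatible layer fails the build instead of silently running on the GPU:

```python
import tensorrt as trt

LOGGER = trt.Logger(trt.Logger.INFO)

def build_dla_only_engine(onnx_path, engine_path, dla_core=0):
    """Build a TensorRT engine restricted entirely to the DLA (sketch)."""
    builder = trt.Builder(LOGGER)
    network = builder.create_network(
        1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
    parser = trt.OnnxParser(network, LOGGER)
    with open(onnx_path, "rb") as f:
        if not parser.parse(f.read()):
            for i in range(parser.num_errors):
                print(parser.get_error(i))
            raise RuntimeError("ONNX parse failed")

    config = builder.create_builder_config()
    config.set_flag(trt.BuilderFlag.FP16)            # DLA requires FP16 or INT8
    config.default_device_type = trt.DeviceType.DLA  # place all layers on DLA
    config.DLA_core = dla_core
    # Deliberately NOT setting trt.BuilderFlag.GPU_FALLBACK:
    # a layer the DLA cannot run will fail the build here,
    # rather than consuming GPU resources at runtime.

    engine_bytes = builder.build_serialized_network(network, config)
    if engine_bytes is None:
        raise RuntimeError("Build failed: some layer is not DLA-compatible")
    with open(engine_path, "wb") as f:
        f.write(engine_bytes)
```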
I ran some tests: with a build using USE_DLA_STANDALONE_MODE=1 and USE_DETERMINISTIC_SEMAPHORE=1, running it alongside another deep-learning model program increased the elapsed time significantly compared to running either one alone. It appears the two workloads still affect each other.
Then it is likely memory-bandwidth bound. The DLA and the GPU share the same resource: system DRAM. The more bandwidth-bound a workload is, the more likely both the DLA and the GPU become bottlenecked on memory access when running in parallel.
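A quick way to check this is to compare wall-clock time for each workload alone versus both launched together, while watching EMC (memory controller) utilization in `tegrastats`. Below is a minimal sketch; the two command names are placeholders for the DLA sample binary and the other model's program:

```python
import subprocess
import time

# Hypothetical commands -- replace with the actual DLA-sample binary
# and the other deep-learning model program being benchmarked.
DLA_CMD = ["./cudla_yolov5_app"]
GPU_CMD = ["./other_model_app"]

def wall_time(cmds):
    """Launch all commands at once and return total wall-clock seconds."""
    start = time.perf_counter()
    procs = [subprocess.Popen(cmd) for cmd in cmds]
    for p in procs:
        p.wait()
    return time.perf_counter() - start

solo_dla = wall_time([DLA_CMD])
solo_gpu = wall_time([GPU_CMD])
together = wall_time([DLA_CMD, GPU_CMD])

# If DRAM bandwidth is the bottleneck, `together` will be noticeably
# larger than max(solo_dla, solo_gpu), and EMC utilization in
# `tegrastats` will be near saturation during the concurrent run.
print(f"DLA alone: {solo_dla:.2f}s  GPU alone: {solo_gpu:.2f}s  "
      f"concurrent: {together:.2f}s")
```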