
Error during Model Conversion Process - Impact Inquiry #3

Open
liuweixue001 opened this issue Aug 23, 2023 · 7 comments

@liuweixue001

Hello,

I hope this message finds you well. I followed the tutorial and the model conversion completed, but an error was reported during the process. I would like to understand the potential impact of this error.

The specific error message I encountered is as follows:

[08/23/2023-10:06:30] [V] [TRT] Engine Layer Information:
Layer(DLA): {ForeignNode[/model.0/conv/Conv.../model.24/m.2/Conv]}, Tactic: 0x0000000000000003, images (Half[1,3:16,672,672]) -> s8 (Half[1,255:16,84,84]), s16 (Half[1,255:16,42,42]), s32 (Half[1,255:16,21,21])
[08/23/2023-10:06:30] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in building engine: CPU +14, GPU +0, now: CPU 14, GPU 0 (MiB)
[08/23/2023-10:06:30] [I] Engine built in 13.8529 sec.
[08/23/2023-10:06:30] [I] [TRT] Loaded engine size: 14 MiB
[08/23/2023-10:06:30] [E] Error[9]: Cannot deserialize serialized engine built with EngineCapability::kDLA_STANDALONE, use cuDLA APIs instead.
[08/23/2023-10:06:30] [E] Error[4]: [runtime.cpp::deserializeCudaEngine::65] Error Code 4: Internal Error (Engine deserialization failed.)
[08/23/2023-10:06:30] [E] Engine deserialization failed
[08/23/2023-10:06:30] [I] Skipped inference phase since --buildOnly is added.
&&&& PASSED TensorRT.trtexec [TensorRT v8502] # /usr/src/tensorrt/bin/trtexec --onnx=data/model/yolov5s_trimmed_reshape_tranpose.onnx --verbose --fp16 --saveEngine=data/loadable/yolov5.fp16.fp16chw16in.fp16chw16out.standalone.bin --inputIOFormats=fp16:chw16 --outputIOFormats=fp16:chw16 --buildDLAStandalone --useDLACore=0
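
For context, the log itself suggests the build succeeded: the engine is built and saved, and trtexec reports PASSED. The deserialization error only appears when trtexec afterwards tries to reload the file with the TensorRT runtime, which, per the message, cannot load engines built with EngineCapability::kDLA_STANDALONE; such a loadable is meant to be loaded through the cuDLA API instead. A minimal sketch of that loading path is below (illustrative only, not this repo's code; it assumes the cuDLA headers shipped with JetPack and reuses the loadable path and DLA core from the command above):

```cpp
// Sketch: load a DLA standalone loadable through cuDLA instead of the
// TensorRT runtime (which rejects kDLA_STANDALONE engines, as in the log).
#include <cudla.h>

#include <cstdint>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <vector>

int main() {
    // Loadable written by trtexec --buildDLAStandalone --saveEngine=...
    std::ifstream f("data/loadable/yolov5.fp16.fp16chw16in.fp16chw16out.standalone.bin",
                    std::ios::binary);
    std::vector<uint8_t> blob((std::istreambuf_iterator<char>(f)),
                              std::istreambuf_iterator<char>());

    cudlaDevHandle dev;
    // DLA core 0 (matches --useDLACore=0); CUDLA_CUDA_DLA selects hybrid mode,
    // where DLA tasks are later submitted on a CUDA stream.
    if (cudlaCreateDevice(0, &dev, CUDLA_CUDA_DLA) != cudlaSuccess) {
        std::fprintf(stderr, "cudlaCreateDevice failed\n");
        return 1;
    }

    cudlaModule module;
    if (cudlaModuleLoadFromMemory(dev, blob.data(), blob.size(), &module, 0) != cudlaSuccess) {
        std::fprintf(stderr, "cudlaModuleLoadFromMemory failed\n");
        return 1;
    }
    std::printf("DLA loadable loaded (%zu bytes)\n", blob.size());

    // Tensor registration and task submission would follow here.
    cudlaModuleUnload(module, 0);
    cudlaDestroyDevice(dev);
    return 0;
}
```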

Could you kindly clarify the potential consequences of this error? Does it affect the converted model's functionality or performance?

Thank you very much for your assistance.

@mrfsc

mrfsc commented Aug 25, 2023

I encountered the same problem, but the code still ran successfully. However, the inference time of the DLA model is much longer than the 4 ms in the tutorial; the actual time is about 30 ms:

././build/cudla_yolov5_app --engine ./data/loadable/yolov5.int8.int8hwc4in.fp16chw16out.standalone.bin --image ./data/images/image.jpg --backend cudla_int8

DLA CTX INIT !!!
ALL MEMORY REGISTERED SUCCESSFULLY
Run Yolov5 DLA pipeline for ./data/images/image.jpg
SUBMIT CUDLA TASK
Input Tensor Num: 1
Output Tensor Num: 3
SUBMIT IS DONE !!!
Inference time: 30.567 ms
Num object detect: 919
detect result has been write to result.jpg
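As a side note on where such a number can come from: in hybrid mode a cuDLA task is submitted on a CUDA stream, so the end-to-end time can be measured with CUDA events around the submission. A minimal sketch (illustrative only; the handle, task, and stream names are assumptions, not the sample's actual code):

```cpp
// Sketch: time one cuDLA task submission in hybrid (CUDA-DLA) mode.
// 'dev' is a cudlaDevHandle created with CUDLA_CUDA_DLA and 'task' is a
// fully populated cudlaTask (module handle plus registered I/O tensors).
#include <cudla.h>
#include <cuda_runtime.h>

float timeCudlaTask(cudlaDevHandle dev, const cudlaTask* task, cudaStream_t stream) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, stream);
    cudlaSubmitTask(dev, task, 1, stream, 0);  // enqueue the DLA task on the stream
    cudaEventRecord(stop, stream);
    cudaEventSynchronize(stop);                // wait for the DLA task to finish

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);
    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;  // wall-clock milliseconds for the submitted task
}
```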

@DC-Zhou

DC-Zhou commented Sep 6, 2023

I get the same error on Jetson Orin AGX. I think the TRT version may need to be 8.6.0, but for JetPack, TRT is only 8.5.3, so when trtexec handles the ONNX the DLA-standalone feature is not supported.
Maybe the repo needs to provide Docker images?

@zerollzeng
Collaborator

> I encountered the same problem, but the code still ran successfully. However, the inference time of the DLA model is much longer than the 4 ms in the tutorial; the actual time is about 30 ms: […]

Which DOS/JetPack are you using? You need DOS 6080+ or JP 6.0+ to get the perf in our README.

@mrfsc

mrfsc commented Sep 11, 2023

> Which DOS/JetPack are you using? You need DOS 6080+ or JP 6.0+ to get the perf in our README.

Thanks for the reply.
My JetPack version is 5.1.2, and it's the latest version available in the JetPack archive (https://developer.nvidia.com/embedded/jetpack-archive).
How can I get JetPack 6.0+? Or is there a Docker image to verify the performance?

@zerollzeng
Collaborator

Unfortunately no, you have to wait for its release :-(

zerollzeng self-assigned this Sep 11, 2023
@jinzhongxiao

> How can I get JetPack 6.0+? Or is there a Docker image to verify the performance?

I don't think JetPack 6.0+ works: I tried JetPack 6.0, and it had some other issues when running bash loadle.sh.

@lxzatwowone1

result.jpg has no result boxes!
