Replies: 2 comments 1 reply
-
I wonder the same.
The GPU I'm testing is a GeForce RTX 2080.
When comparing yolov8n against yolov10n, version 8 actually takes less time per iteration during inference (with a batch size of 1):
yolov8n averaged 6.0 ms per image (not including preprocessing/postprocessing) | allocated GPU memory 286 MB
yolov10n averaged 7.0 ms per image (not including preprocessing/postprocessing) | allocated GPU memory 286 MB
Now when comparing yolov8L against yolov10L (both optimized with TensorRT FP16):
TRT yolov8L averaged 20 ms per image (including preprocessing/postprocessing) | allocated GPU memory 422 MB
TRT yolov10L averaged 18 ms per image (including preprocessing/postprocessing) | allocated GPU memory 386 MB
So it seems that we might see the actual improvements by optimizing the .pt models with TensorRT.
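For anyone who wants to reproduce this comparison, here is a minimal sketch using the Ultralytics API. It is not the exact benchmark above: the weight names (`yolov8n.pt`, `yolov10n.pt`) are assumed to auto-download, `bus.jpg` stands in for any test image, and the per-stage timings come from the `results[0].speed` dict that Ultralytics attaches to each prediction.

```python
# Sketch: export to a TensorRT FP16 engine and read Ultralytics' per-stage timings.
# Assumes the `ultralytics` package, a CUDA GPU, and TensorRT are installed.
from ultralytics import YOLO

for name in ("yolov8n.pt", "yolov10n.pt"):  # use yolov8l.pt / yolov10l.pt for the L models
    engine_path = YOLO(name).export(format="engine", half=True)  # FP16 TensorRT build
    trt_model = YOLO(engine_path)

    trt_model.predict("bus.jpg", imgsz=640, verbose=False)   # warm-up run
    results = trt_model.predict("bus.jpg", imgsz=640, verbose=False)
    # speed is a dict of milliseconds: {'preprocess': ..., 'inference': ..., 'postprocess': ...}
    print(name, results[0].speed)
```

In practice you would average `speed` over many images rather than a single prediction, since the first few TensorRT calls can be slower.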
-
Thank you for the response.
My thoughts are as follows:
YOLOv10 seems to have additional elements in the model to eliminate the NMS step in post-processing. In other words, it is designed to simplify post-processing to reduce inference time.
However, since my dataset has only one object per image, the post-processing time for NMS etc. was already relatively short, so there was little to gain from removing it.
Therefore, I think that is why YOLOv10's inference time is actually longer.
Thank you for your good comments.
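One quick way to check that NMS is indeed cheap in the one-object-per-image case is to time it in isolation. A minimal sketch, assuming torchvision is available; the box counts below are arbitrary assumptions chosen to contrast a single detection with a crowded scene:

```python
# Hypothetical micro-benchmark: NMS cost with 1 candidate box vs. many.
import time
import torch
from torchvision.ops import nms

def time_nms(num_boxes, iters=100, device="cuda" if torch.cuda.is_available() else "cpu"):
    boxes = torch.rand(num_boxes, 4, device=device) * 640
    boxes[:, 2:] += boxes[:, :2]          # make (x1, y1, x2, y2) boxes valid
    scores = torch.rand(num_boxes, device=device)
    if device == "cuda":
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        nms(boxes, scores, iou_threshold=0.45)
    if device == "cuda":
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e3  # ms per call

for n in (1, 10, 100, 1000):
    print(f"{n:5d} boxes: {time_nms(n):.3f} ms")
```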
-
I normally use YOLOv5 and v8, and I've been researching the recently released YOLOv10.
Specifically, I'm trying to train the Nano model and deploy it on a Jetson Nano.
I have FLOPs figures at the 640 input size for YOLOv5, v8, and v10, but not for 416 and 224. Is there a way to calculate them?
For a plain classification model I can compute FLOPs with PyTorch profiling, but I'm not sure how to calculate them myself when the pipeline includes post-processing.
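Since the YOLO backbones are fully convolutional, FLOPs scale roughly with the input area, so FLOPs(416) ≈ FLOPs(640) × (416/640)² ≈ 8.2 G × 0.42 ≈ 3.5 G for YOLOv10n. To measure it directly, a counter such as `thop` can profile the underlying PyTorch module; note that NMS is data-dependent control flow rather than a fixed tensor op, so FLOP counters don't capture it, which is also why published FLOPs exclude post-processing. A minimal sketch, assuming `ultralytics` and `thop` are installed:

```python
# Sketch: count FLOPs of a YOLO model at several input sizes with thop.
# thop reports MACs for the forward pass; multiply by 2 to get FLOPs.
import torch
from thop import profile
from ultralytics import YOLO

model = YOLO("yolov10n.pt").model.eval()  # underlying nn.Module

for size in (640, 416, 224):
    dummy = torch.zeros(1, 3, size, size)
    macs, params = profile(model, inputs=(dummy,), verbose=False)
    print(f"{size}x{size}: {2 * macs / 1e9:.2f} GFLOPs, {params / 1e6:.2f} M params")
```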
I trained a v10 model with the YOLOv8 framework and actually ran inference on the Jetson Nano, and contrary to what the paper says, it actually took more time.
Can you explain these results that contradict the paper?
In particular, the FLOPs reported during training for YOLOv8n and YOLOv10n were 8.1G and 8.2G respectively, so v10 was actually higher.
Thank you