
Method to convert python model to c++ #47

Open
quocnhat opened this issue Nov 24, 2020 · 21 comments
@quocnhat

Hi NIST participants, I have been successfully submitting my solutions to NIST for several months. I use NCNN to convert Python models to C++, but it is quite slow at inference. I see that some teams have submitted larger models (ResNet-50, ResNet-100, ...). May I ask what the best method is to port a Python model for inference in C++?

@mlourencoeb

This heavily depends on your deep learning framework. Please provide more details.

@quocnhat

quocnhat commented Dec 2, 2020

Thanks for your reply, and sorry for my mistake. My model uses the PyTorch framework. When I use NCNN to port the PyTorch model so it can be run from C++, I cannot use ResNet-50 because of the large inference time, but other teams can. May I ask how they do it?

@mlourencoeb

I would simply convert to ONNX format and run it directly with the C++ ONNX Runtime interface, or (easier) with OpenCV's deep learning (DNN) module.

Either of the solutions above will get you there.
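To make the suggestion above concrete, here is a hedged sketch (not the thread participants' actual code) of exporting a PyTorch model to ONNX and running it through OpenCV's DNN module. `MyBackbone`, `aligned_face`, and `"resnet50.onnx"` are placeholder names, not names from this thread; the export/inference calls are shown as comments since they need a trained model:

```python
# Sketch, assuming a trained PyTorch model; library calls kept as comments:
#
#   import torch, cv2
#   model = MyBackbone().eval()                 # your trained nn.Module
#   dummy = torch.randn(1, 3, 112, 112)         # NCHW; 112x112 is common for face recognition
#   torch.onnx.export(model, dummy, "resnet50.onnx", opset_version=11)
#
#   net = cv2.dnn.readNetFromONNX("resnet50.onnx")
#   blob = cv2.dnn.blobFromImage(aligned_face)  # HWC image -> NCHW float blob
#   net.setInput(blob)
#   embedding = net.forward()

def nchw_shape(height, width, channels=3, batch=1):
    """Both torch.onnx.export dummy inputs and OpenCV DNN blobs use NCHW layout."""
    return (batch, channels, height, width)
```

The C++ side is symmetric: `cv::dnn::readNetFromONNX`, `net.setInput(blob)`, `net.forward()`, so no Python runtime is needed in the submitted library.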

@quocnhat

quocnhat commented Dec 3, 2020

Thanks very much, but ONNX (NCNN -> ONNX) is exactly what I did. It does not help for large models like ResNet-50 or ResNet-100 because of the large inference time. I will make time for the OpenCV deep learning solution. Thanks.

@mlourencoeb

What is your execution time? You can now go up to 1.5 s, and I am sure the OpenCV solution will work even in single-threaded mode.

@quocnhat

quocnhat commented Dec 7, 2020

Sorry, but we cannot measure the execution time exactly on NIST's server. The last submission (a ResNet-50 verification model) failed because it exceeded the execution-time limit (they do not report the execution time for failed submissions). ResNet-18 was accepted at 0.675 s. (Assume the detection phase needs 0.5 s, which leaves 1 s for verification.)

@mlourencoeb

0.5s for detection seems high to me. Which network are you using?

@quocnhat

quocnhat commented Dec 7, 2020

The detection net is a modified MobileNetV2 (backbone) + RetinaFace (head) and can be changed if needed. But let's assume the verification net has 1 s left for inference. Using ONNX, ResNet-50 (or 101) seems to exceed the execution-time limit. The OpenCV deep learning module is still on my schedule.

@xsacha

xsacha commented Jan 8, 2021

You should be able to test performance by using a Xeon processor (AVX-512 compatible) or Core i7 (AVX2 compatible) and building your project without threading or targeting a single core.

From my own testing on the sample images provided in this Github:
MobileNetV2 detector + 101 layer resnet should take approximately 5 minutes to detect and generate models for the 653 enrolment images using off-the-shelf frameworks such as ONNXRuntime or Torch. Or roughly half a second per image.
I was also able to run a custom DetNAS + modified 150 layer resnet in 6 minutes.

Speed should not be your limitation. The detector is most certainly the fast bit.

If you are taking longer than this, I would profile the app using a CPU profiler (Linux: perf record; Windows: Visual Studio) and look at where your CPU time is being spent. My guess is it is not spent generating models, but perhaps setting up or manipulating the images. Especially as you seem unfamiliar with C++, you may not recognise the redundant operations and how they affect performance. Do you have anyone at your company who specialises in C++?
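Before reaching for perf or Visual Studio, a coarse first pass can be done with a tiny stdlib timing harness like the sketch below (my own illustration, not code from this thread; `preprocess` and `infer` are toy stand-ins for the real pipeline stages):

```python
# Minimal, framework-agnostic timing harness to locate where wall-clock
# time goes in a pipeline, using only the standard library.
import time

def timed(label, fn, *args, timings=None, **kwargs):
    """Run fn, accumulate its elapsed time under `label`, return its result."""
    t0 = time.perf_counter()
    out = fn(*args, **kwargs)
    dt = time.perf_counter() - t0
    if timings is not None:
        timings[label] = timings.get(label, 0.0) + dt
    return out

# Toy stand-ins for real stages (detection preprocessing, model forward pass):
def preprocess(n):
    return sum(i * i for i in range(n))   # deliberately heavy

def infer(x):
    return x % 97                         # deliberately cheap

timings = {}
x = timed("preprocess", preprocess, 200_000, timings=timings)
y = timed("inference", infer, x, timings=timings)
slowest = max(timings, key=timings.get)   # here: "preprocess"
```

If the slowest label is image setup rather than the forward pass, that matches the guess above, and the deeper profilers can then be pointed at that stage.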

@cao-nv

cao-nv commented Mar 29, 2021

> You should be able to test performance by using a Xeon processor (AVX-512 compatible) or Core i7 (AVX2 compatible) […]

Thank you for the information. Would you mind providing more details about your configuration, e.g. input image size?
At present it takes me about 600 ms to run RetinaFace for face detection alone at a size of 480x480.

@xsacha

xsacha commented Mar 29, 2021

@cao-nv Make sure your detector isn't unnecessarily deep. Most of the images in this test are frontal. Judging by the timings, it sounds like you're using a ResNet-50 RetinaFace. If that's on MobileNet, you are doing too much work somewhere, and I'd refer back to my suggestion to profile the app.
I target 640x640 resolution, by the way.

@cao-nv

cao-nv commented Mar 30, 2021

> @cao-nv Make sure your detector isn't unnecessarily deep. […]

Yes, I used ResNet-50. Actually, this is my first try at FRVT. I'll reconsider the backbone. Thank you a lot.

@xsacha

xsacha commented May 6, 2021

@cao-nv Looks like your model successfully ran in just under 0.5 seconds and without much memory usage. Good work :)
Same with yours @quocnhat.
The time quota is 1.5 seconds, so as you can see, there's plenty to play with there.

Issue can probably be closed now.

@cao-nv

cao-nv commented May 7, 2021

Actually, I couldn't run RetinaFace with ResNet-50 as its backbone at an input size of 640x640 in less than a second. I chose to reduce the input size to 480x480, and then it took about 1 second just to detect faces.

@xsacha

xsacha commented May 7, 2021

Might just be your machine then because the NIST results show your company doing under 0.5 seconds.

@cao-nv

cao-nv commented May 7, 2021

> Might just be your machine then because the NIST results show your company doing under 0.5 seconds.

That result was from our first submission. We are preparing for the next one.
You are right about the machine: I compiled and tested everything on a laptop with a 10th-gen Intel Core i5, which supports AVX2 but not AVX-512 like Xeon E5 CPUs. I don't know whether the Core i5 is much slower than a Xeon E5.

@HYL-Dave

I have a similar question about converting Python inference to C++ inference. Is there any way to avoid modifying the Python inference code? I want to call the Python inference from C++. Could anyone give me some advice?

@RamatovInomjon

> Hi NIST participants, I have successfully submitted my solutions to NIST for several months. […]

Hi there, how do you convert a const FRVT::Image &image to the ncnn::Mat format?

@mlourencoeb

Hi @RamatovInomjon

From https://github.com/Tencent/ncnn/blob/master/src/mat.h there is a method, from_pixels, you can use to create one:

static Mat from_pixels(const unsigned char* pixels, int type, int w, int h, Allocator* allocator = 0);

This should be enough.

Best,
/M

@RamatovInomjon

RamatovInomjon commented Mar 28, 2023

@mlourencoeb
Thanks for replying!
I'm using this: ncnn::Mat enroll_ncnn_img = ncnn::Mat::from_pixels(image.data.get(), ncnn::Mat::PIXEL_RGB, image.width, image.height);

When I save this image (attachment: wface-11), it comes out rotated 90 degrees from the actual image (attachment: rotated90).
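A common cause of output that looks rotated or garbled like this (an assumption here, not a confirmed diagnosis of this case) is swapping width and height, or mixing row-major and column-major order, when a flat pixel buffer is reinterpreted: ncnn's from_pixels takes width first, then height, so the FRVT image's dimensions and row layout must match. A pure-Python illustration with a tiny grayscale buffer:

```python
# Reinterpreting a flat, row-major pixel buffer: passing (w, h) in the
# wrong order scrambles the rows, which on real images shows up as a
# rotated or corrupted result.

def buffer_to_rows(buf, width, height):
    """Interpret a flat row-major buffer as `height` rows of `width` pixels."""
    assert len(buf) == width * height
    return [buf[r * width:(r + 1) * width] for r in range(height)]

w, h = 3, 2                            # a 3-wide, 2-tall image
buf = [1, 2, 3,
       4, 5, 6]                        # row-major pixel values

correct = buffer_to_rows(buf, w, h)    # [[1, 2, 3], [4, 5, 6]]
swapped = buffer_to_rows(buf, h, w)    # [[1, 2], [3, 4], [5, 6]] -- wrong
```

So it is worth double-checking that image.width and image.height really mean pixel columns and rows in FRVT's buffer layout before passing them to from_pixels.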

@iqbalfarz

> I have a similar question about converting Python inference to C++ inference. […]

You may use pybind11, or simply Python.h (the CPython C API), to call the Python runtime.
