Unable to launch Triton Server with hps backend using latest HugeCTR and hugectr_backend repos #58
Comments
Hi @sezhiyanhari, I am moving this issue to the Hierarchical Parameter Server Backend team so they can better address it.
@sezhiyanhari According to the reply in NVIDIA-Merlin/HugeCTR#431, the HugeCTR Triton backend implementation has been completely removed from the current repo. You may have misunderstood: it is not that the HugeCTR backend has been merged into the HPS backend, but that the HugeCTR Triton backend has been completely deprecated. That is, a model trained through native HugeCTR (https://github.com/NVIDIA-Merlin/Merlin/blob/main/examples/scaling-criteo/03-Training-with-HugeCTR.ipynb) no longer supports deployment in Triton Server, and we will delete the examples about HugeCTR model deployment in the Merlin repo. Sorry again for the confusion. If you still need to use the HugeCTR backend to deploy a native HugeCTR model, you can only use an NGC image before 22.08. If you need to compile the HugeCTR backend from scratch, please note the Triton REPO_TAG release version( In addition, you can also directly use TF or PyTorch training models to deploy on the Triton server through the HPS plug-ins:
@yingcanw got it, thank you for the explanation! Do you know how I can rework this example (https://github.com/NVIDIA-Merlin/Merlin/blob/main/examples/scaling-criteo/03-Training-with-HugeCTR.ipynb) so it's compatible with inference through the HPS backend? I'm specifically trying to train the Criteo TB model using the DLRM architecture and run inference on the trained model using HPS, on an image where HPS is compiled from source. I'd also like HugeCTR to be at the latest version possible.
We have completely stopped supporting deployment of the native HugeCTR training model on Triton Server with the HugeCTR backend. If you insist on using the HugeCTR backend for deployment, the only way is to use an old version of the image, or to compile it from scratch to get the HugeCTR Triton backend (before v23.08).
Because the HPS inference implementation is decoupled from the training implementation of native HugeCTR, you can use the latest HugeCTR code for model training. However, if you want to deploy on Triton Server, you can only compile the old version of the HugeCTR Triton backend as described above. Our recommended deployment solution is to use the TRT Triton backend combined with the HPS TRT plugins to deploy on Triton Server. I don't think this will hinder your goal of using native HugeCTR to train a DLRM model on the Criteo dataset and deploying it on Triton Server. Please refer to demo_for_hugectr_trained_model.ipynb.
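As a rough sketch of that recommended path (the model name, file names, and layout here are illustrative assumptions, not taken from this thread — the demo notebook referenced above has the authoritative commands): the model is exported as a TensorRT engine built with the HPS plugin, and served through Triton's TensorRT backend rather than a `hugectr`/`hps` backend entry.

```
model_repository/
└── dlrm_trt/                 # hypothetical model name
    ├── 1/
    │   └── model.plan        # TRT engine built with the HPS TRT plugin
    └── config.pbtxt          # uses platform: "tensorrt_plan"; no hugectr/hps backend line
```

The HPS plugin library generally also has to be loaded into the tritonserver process at startup for the engine's plugin layers to deserialize; the notebook shows the exact launch command.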
Description
I'm unable to install and run the Triton server using the HPS backend.
Triton Information
Triton v23.06
To Reproduce
Steps to reproduce the behavior.
I'm following steps (1) and (2) under the "Build the HPS Backend from Scratch" section here: https://github.com/triton-inference-server/hugectr_backend. I follow all the steps exactly, inside a container built from this image: https://github.com/NVIDIA-Merlin/Merlin/blob/main/docker/dockerfile.ctr.
After trying to launch Triton using

```
tritonserver --model-repository=/opt/hugectr_testing/data/test_dask/output/model_inference --backend-config=hps,ps=/opt/hugectr_testing/data/test_dask/output/model_inference/ps.json
```

I get the following:

I trained my model using this example notebook: https://github.com/NVIDIA-Merlin/Merlin/blob/main/examples/scaling-criteo/03-Training-with-HugeCTR.ipynb
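For context, the `ps.json` passed via `--backend-config=hps,ps=...` tells HPS which embedding tables to load and how to cache them. A minimal sketch is below; the model name, file paths, and parameter values are hypothetical placeholders, and the exact set of required keys should be checked against the HPS configuration documentation for your HugeCTR version.

```json
{
  "supportlonglong": true,
  "models": [
    {
      "model": "dlrm",
      "sparse_files": ["/opt/model_repo/dlrm/1/dlrm0_sparse.model"],
      "embedding_table_names": ["sparse_embedding1"],
      "embedding_vecsize_per_table": [128],
      "maxnum_catfeature_query_per_table_per_sample": [26],
      "deployed_device_list": [0],
      "max_batch_size": 64,
      "gpucache": true,
      "gpucacheper": 0.5
    }
  ]
}
```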
However, this notebook came out before the HugeCTR backend was merged with the HPS backend. As a result, I needed to manually change a line in my config.pbtxt to switch from the hugectr backend to the hps backend, i.e. change backend: "hugectr" to backend: "hps".
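That config.pbtxt change amounts to a one-line substitution; a small Python sketch that applies it (the file path in the commented usage is hypothetical):

```python
from pathlib import Path

def switch_backend(text: str, old: str = "hugectr", new: str = "hps") -> str:
    """Replace the Triton backend name in a config.pbtxt string."""
    return text.replace(f'backend: "{old}"', f'backend: "{new}"')

# Usage (path is illustrative): rewrite a model's config.pbtxt in place.
# cfg = Path("/opt/model_repo/dlrm/config.pbtxt")
# cfg.write_text(switch_backend(cfg.read_text()))
```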
Expected behavior
First, when building the inference version, I expect hugectr's Python module to be installed, but it isn't. This is weird because when I turn off -DENABLE_INFERENCE=ON and install, import hugectr works.

Second, I expect the Triton server to start and accept requests.