NVIDIA TensorRT Deployment Workshop

This repository contains the code and resources for the 2-day NVIDIA TensorRT Deployment Workshop. The workshop focuses on deploying deep learning models efficiently using NVIDIA TensorRT, a high-performance deep learning inference library.

Workshop Overview:

During this hands-on workshop, participants will:

  • Learn how to optimize deep learning models for inference.
  • Understand the benefits of using TensorRT for deployment in real-world applications.
  • Explore various model formats (ONNX, PyTorch, TensorFlow) and their conversion to TensorRT.
  • Work with practical examples involving neural networks and their deployment on NVIDIA GPUs.

Key Topics Covered:

  1. Introduction to the Hugging Face ecosystem: Loading pretrained models on the fly with its libraries and adapting them to our use case.
  2. Introduction to NVIDIA TensorRT: Understanding its components and optimization strategies.
  3. Model Conversion: Converting trained models into TensorRT engines.
  4. Inference Optimization: Techniques for improving inference speed and efficiency.
  5. Hands-on Labs: Implementing real-world examples, including object detection and classification tasks.
  6. Performance Benchmarks: Measuring speed and accuracy across different hardware setups.
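
As a rough illustration of the benchmarking topic, the sketch below times repeated calls to an inference function and reports mean, median, and 95th-percentile latency plus throughput. It is a minimal, standard-library-only example, not the workshop's actual benchmark code: `benchmark` and `dummy_inference` are hypothetical names, and `dummy_inference` is a CPU stand-in for a real model call.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 10, iters: int = 100) -> Dict[str, float]:
    """Time repeated calls to fn and summarize latency in milliseconds."""
    # Warm-up calls are discarded so one-time costs (lazy init, caches) do not skew results.
    for _ in range(warmup):
        fn()

    latencies_ms = []
    for _ in range(iters):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    latencies_ms.sort()
    mean_ms = statistics.fmean(latencies_ms)
    # Nearest-rank style index for the 95th percentile, clamped to the last sample.
    p95_index = min(len(latencies_ms) - 1, int(0.95 * len(latencies_ms)))
    return {
        "mean_ms": mean_ms,
        "median_ms": statistics.median(latencies_ms),
        "p95_ms": latencies_ms[p95_index],
        "throughput_per_s": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


def dummy_inference() -> int:
    # Hypothetical stand-in for a real model call (assumption, not from the repository).
    return sum(i * i for i in range(10_000))


if __name__ == "__main__":
    stats = benchmark(dummy_inference, warmup=5, iters=50)
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

When timing GPU inference, the callable passed to `benchmark` should block until the device has finished its work (for example by synchronizing the stream); otherwise only the launch overhead is measured.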

This repository also contains a demo of an NVIDIA NIM RAG-based chatbot that uses the LLAMA 3.1B model and a vector store DB (pickled for up to 10x efficiency), hosted on Streamlit.
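
The idea of pickling a vector store can be sketched as follows: embeddings are computed once, saved to disk, and reloaded on later runs instead of being rebuilt. This is a minimal sketch under stated assumptions, not the code in NIM_NVIDIA_CHATBOT.py: `toy_embed`, `load_or_build_store`, and `retrieve` are hypothetical names, and the letter-frequency embedding stands in for a real embedding model.

```python
import math
import os
import pickle
import tempfile
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Store = List[Tuple[str, Vector]]


def toy_embed(text: str) -> Vector:
    # Hypothetical embedding: letter-frequency counts. A real app would call an embedding model.
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def load_or_build_store(docs: Sequence[str], cache_path: str,
                        embed: Callable[[str], Vector] = toy_embed) -> Store:
    """Load the pickled store if it exists; otherwise embed the docs and pickle the result."""
    if os.path.exists(cache_path):
        with open(cache_path, "rb") as f:
            return pickle.load(f)
    store = [(doc, embed(doc)) for doc in docs]
    with open(cache_path, "wb") as f:
        pickle.dump(store, f)
    return store


def cosine(a: Vector, b: Vector) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(store: Store, query: str, k: int = 1,
             embed: Callable[[str], Vector] = toy_embed) -> List[str]:
    """Return the k documents whose embeddings are most similar to the query."""
    q = embed(query)
    ranked = sorted(store, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "store.pkl")
        docs = ["tensorrt engine", "streamlit app"]
        load_or_build_store(docs, path)           # first run: builds and pickles
        cached = load_or_build_store([], path)    # later run: loads from disk
        print(retrieve(cached, "streamlit"))
```

Only unpickle files you created yourself: `pickle.load` can execute arbitrary code from an untrusted file.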

Installation

Before executing NIM_NVIDIA_CHATBOT.py, ensure you have the required dependencies installed. Run the following command:

pip install -r requirements.txt
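
To confirm the install worked before launching the chatbot, a small check like the one below can list any packages from requirements.txt that are still missing. It is a simplified sketch: `parse_requirement_names` and `missing_packages` are hypothetical helpers, the parser only handles plain `name` or `name<specifier>` lines (it skips option lines such as `-r other.txt` and does not handle URL-style requirements), and it does not verify version constraints.

```python
import re
from importlib import metadata
from typing import Iterable, List


def parse_requirement_names(lines: Iterable[str]) -> List[str]:
    """Extract distribution names from simple requirements.txt lines."""
    names = []
    for line in lines:
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options like -r / -e
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names: Iterable[str]) -> List[str]:
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


if __name__ == "__main__":
    try:
        with open("requirements.txt", encoding="utf-8") as f:
            names = parse_requirement_names(f)
    except FileNotFoundError:
        names = []
    print("Missing packages:", missing_packages(names) or "none")
```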

Prerequisites:

  • Basic knowledge of deep learning models (PyTorch or TensorFlow).