Create a Docker image for Morpheus models #1804

Merged: 27 commits into branch-24.10 from david-models-image on Jul 25, 2024

Changes from all commits (27 commits)
5780f1b
Remove most models from the release container, including only the dat…
dagardner-nv Jul 8, 2024
cba0564
First pass at a models container
dagardner-nv Jul 8, 2024
99fb97f
Copy the two example models to the container
dagardner-nv Jul 8, 2024
b537bdb
Strip the leading 'v' char from tag, don't auto tag as latest
dagardner-nv Jul 8, 2024
466b136
Update example to use the new models container
dagardner-nv Jul 8, 2024
e7e95b6
WIP
dagardner-nv Jul 8, 2024
0429922
WIP
dagardner-nv Jul 8, 2024
4d6d0d8
Add datasets fetch target which includes data needed for examples and…
dagardner-nv Jul 8, 2024
8cbef92
Replace models with datasets from fetch command
dagardner-nv Jul 8, 2024
99d7bf9
No need to mount the models in the release container
dagardner-nv Jul 8, 2024
18d2bbf
Update triton start commands
dagardner-nv Jul 8, 2024
0f8933c
Update sid viz to use new models container
dagardner-nv Jul 8, 2024
2767857
Update dev guide
dagardner-nv Jul 8, 2024
7a41c3c
Replace triton image with models image, remove redundant variable dec…
dagardner-nv Jul 8, 2024
21e819d
Move ransomware models from the examples dir to the models dir
dagardner-nv Jul 8, 2024
5400643
Update devcontainer to use models image
dagardner-nv Jul 8, 2024
27224f0
WIP: Updating docs
dagardner-nv Jul 9, 2024
7894415
Update fetch_data documentation to only fetch datasets instead of models
dagardner-nv Jul 9, 2024
68186eb
Ensure models are fetched, determine MORPHEUS_ROOT_HOST just after MO…
dagardner-nv Jul 9, 2024
3cfc1fb
Revert "Remove pre-built container section from `getting_started.md` …
dagardner-nv Jul 9, 2024
4c84fec
Adjust versions and add a tidbit about pulling morpheus-tritonserver-…
dagardner-nv Jul 9, 2024
a970c41
Add information about building the models container
dagardner-nv Jul 23, 2024
2025511
Add information about building the models container [no ci]
dagardner-nv Jul 23, 2024
c656fb1
Update morpheus model container version
dagardner-nv Jul 23, 2024
d59e680
Merge branch 'branch-24.10' into david-models-image
dagardner-nv Jul 23, 2024
87e79ed
Merge branch 'branch-24.10' of github.com:nv-morpheus/Morpheus into d…
dagardner-nv Jul 25, 2024
7b92ad4
Revert switching to morpheus models container for the devcontainer, p…
dagardner-nv Jul 25, 2024
1 change: 0 additions & 1 deletion .devcontainer/docker-compose.yml
@@ -27,7 +27,6 @@ services:
volumes:
- ${HOST_MORPHEUS_ROOT}/models:/models
- ${HOST_MORPHEUS_ROOT}/examples/abp_pcap_detection/abp-pcap-xgb:/models/triton-model-repo/abp-pcap-xgb
-- ${HOST_MORPHEUS_ROOT}/examples/ransomware_detection/models/ransomw-model-short-rf:/models/triton-model-repo/ransomw-model-short-rf

zookeeper:
image: bitnami/zookeeper:latest
13 changes: 13 additions & 0 deletions ci/release/update-version.sh
@@ -113,3 +113,16 @@ sed_runner "s/${CURRENT_SHORT_TAG}/${NEXT_SHORT_TAG}/g" docs/source/getting_star
# models/model-cards
sed_runner "s|blob/branch-${CURRENT_SHORT_TAG}|blob/branch-${NEXT_SHORT_TAG}|g" models/model-cards/*.md
sed_runner "s|tree/branch-${CURRENT_SHORT_TAG}|tree/branch-${NEXT_SHORT_TAG}|g" models/model-cards/*.md

+# Update the version of the Morpheus model container
+# We need to update several files, however we need to avoid symlinks as well as the build and .cache directories
+DOCS_MD_FILES=$(find -P ./docs/source/ -type f -iname "*.md")
+EXAMPLES_MD_FILES=$(find -P ./examples/ -type f -iname "*.md")
+sed_runner "s|morpheus-tritonserver-models:${CURRENT_SHORT_TAG}|morpheus-tritonserver-models:${NEXT_SHORT_TAG}|g" \
+${DOCS_MD_FILES} \
+${EXAMPLES_MD_FILES} \
+.devcontainer/docker-compose.yml \
+examples/sid_visualization/docker-compose.yml \
+models/triton-model-repo/README.md \
+scripts/validation/val-globals.sh \
+tests/benchmarks/README.md
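For reference, `sed_runner` is a helper defined near the top of `update-version.sh`. A minimal sketch of the pattern, assuming it simply wraps an in-place `sed` (the real definition may differ):

```bash
# Assumed shape of the sed_runner helper: apply an in-place
# sed expression to one or more files.
sed_runner() {
    local expression="$1"
    shift
    sed -i "${expression}" "$@"
}

# Example: bump the models container tag in a single file.
sed_runner "s|morpheus-tritonserver-models:24.06|morpheus-tritonserver-models:24.10|g" \
    tests/benchmarks/README.md
```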
2 changes: 1 addition & 1 deletion docker/Dockerfile
@@ -294,7 +294,7 @@ COPY "${MORPHEUS_ROOT_HOST}/conda/environments/*.yaml" "./conda/environments/"
COPY "${MORPHEUS_ROOT_HOST}/docker" "./docker"
COPY --from=build_docs "/workspace/build/docs/html" "./docs"
COPY "${MORPHEUS_ROOT_HOST}/examples" "./examples"
COPY "${MORPHEUS_ROOT_HOST}/models" "./models"
COPY "${MORPHEUS_ROOT_HOST}/models/datasets" "./models/datasets"
COPY "${MORPHEUS_ROOT_HOST}/scripts" "./scripts"
COPY "${MORPHEUS_ROOT_HOST}/*.md" "./"
COPY "${MORPHEUS_ROOT_HOST}/LICENSE" "./"
2 changes: 1 addition & 1 deletion docker/build_container_release.sh
@@ -27,7 +27,7 @@ export DOCKER_TARGET=${DOCKER_TARGET:-"runtime"}
popd &> /dev/null

# Fetch data
"${SCRIPT_DIR}/../scripts/fetch_data.py" fetch docs examples models
"${SCRIPT_DIR}/../scripts/fetch_data.py" fetch docs examples datasets

# Call the general build script
${SCRIPT_DIR}/build_container.sh
2 changes: 1 addition & 1 deletion docker/run_container_release.sh
@@ -37,7 +37,7 @@ DOCKER_EXTRA_ARGS=${DOCKER_EXTRA_ARGS:-""}

popd &> /dev/null

-DOCKER_ARGS="--runtime=nvidia --env WORKSPACE_VOLUME=${PWD} -v $PWD/models:/workspace/models --net=host --gpus=all --cap-add=sys_nice ${DOCKER_EXTRA_ARGS}"
+DOCKER_ARGS="--runtime=nvidia --env WORKSPACE_VOLUME=${PWD} --net=host --gpus=all --cap-add=sys_nice ${DOCKER_EXTRA_ARGS}"

if [[ -n "${SSH_AUTH_SOCK}" ]]; then
echo -e "${b}Setting up ssh-agent auth socket${x}"
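Because the script still appends `DOCKER_EXTRA_ARGS` to the Docker arguments, users who want the old behavior of mounting a local `models` checkout can restore it at launch time; a hypothetical invocation:

```bash
# Re-mount a local models directory into the release container,
# replicating the mount that the script no longer adds by default.
DOCKER_EXTRA_ARGS="-v $PWD/models:/workspace/models" ./docker/run_container_release.sh
```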
4 changes: 2 additions & 2 deletions docs/source/basics/building_a_pipeline.md
@@ -211,9 +211,9 @@ Pipeline visualization saved to .tmp/multi_monitor_throughput.png
This example shows an NLP Pipeline which uses several stages available in Morpheus. This example utilizes the Triton Inference Server to perform inference, and writes the output to a Kafka topic named `inference_output`. Both of which need to be started prior to launching Morpheus.

#### Launching Triton
-From the Morpheus repo root directory, run the following to launch Triton and load the `sid-minibert` model:
+Run the following to launch Triton and load the `sid-minibert` model:
```bash
-docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:23.06-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-onnx
+docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-onnx
```
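Once the container is up, Triton's standard KServe HTTP endpoints can be used to verify readiness; a quick check, assuming Triton's default HTTP port of 8000:

```bash
# Returns HTTP 200 once the Triton server is ready to serve requests
curl -s -o /dev/null -w "%{http_code}\n" localhost:8000/v2/health/ready

# Per-model readiness for the model loaded above
curl -s -o /dev/null -w "%{http_code}\n" localhost:8000/v2/models/sid-minibert-onnx/ready
```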

#### Launching Kafka
9 changes: 8 additions & 1 deletion docs/source/developer_guide/contributing.md
@@ -129,7 +129,7 @@ This workflow utilizes a Docker container to set up most dependencies ensuring a
```
1. The container tag follows the same rules as `build_container_dev.sh` and will default to the current `YYMMDD`. Specify the desired tag with `DOCKER_IMAGE_TAG`. i.e. `DOCKER_IMAGE_TAG=my_tag ./docker/run_container_dev.sh`
2. This will automatically mount the current working directory to `/workspace`.
-3. Some of the validation tests require launching a Triton Docker container within the Morpheus container. To enable this you will need to grant the Morpheus container access to your host OS's Docker socket file with:
+3. Some of the validation tests require launching the Morpheus models Docker container within the Morpheus container. To enable this, you will need to grant the Morpheus container access to your host OS's Docker socket file with:
```bash
DOCKER_EXTRA_ARGS="-v /var/run/docker.sock:/var/run/docker.sock" ./docker/run_container_dev.sh
```
@@ -235,6 +235,13 @@ git submodule update --init --recursive
```
At this point, Morpheus can be fully used. Any changes to Python code will not require a rebuild. Changes to C++ code will require calling `./scripts/compile.sh`. Installing Morpheus is only required once per virtual environment.

+### Build the Morpheus Models Container
+
+From the root of the Morpheus repository, run the following command:
+```bash
+models/docker/build_container.sh
+```

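After the build completes, the image can be confirmed locally; this is a quick sanity check (the exact repository name and tag depend on the build script's defaults):

```bash
# List locally available Morpheus models images
docker images | grep morpheus-tritonserver-models
```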
### Quick Launch Kafka Cluster

Launching a full production Kafka cluster is outside the scope of this project; however, if a quick cluster is needed for testing or development, one can be quickly launched via Docker Compose. The following commands outline that process. Refer to [this](https://medium.com/big-data-engineering/hello-kafka-world-the-complete-guide-to-kafka-with-docker-and-python-f788e2588cfc) guide for more in-depth information:
13 changes: 6 additions & 7 deletions docs/source/developer_guide/guides/2_real_world_phishing.md
@@ -221,22 +221,21 @@ In the above the `needed_columns` were provided to as an argument to the `stage`

## Predicting Fraudulent Emails with Accelerated Machine Learning

-Now we'll use the `RecipientFeaturesStage` that we just made in a real-world pipeline to detect fraudulent emails. The pipeline we will be building makes use of the `TritonInferenceStage` which is a pre-defined Morpheus stage designed to support the execution of Natural Language Processing (NLP) models via NVIDIA's [Triton Inference Server](https://developer.nvidia.com/nvidia-triton-inference-server). NVIDIA Triton Inference Server allows for GPU accelerated ML/DL and seamless co-location and execution of a wide variety of model frameworks. For our application, we will be using the `phishing-bert-onnx` model, which is included with Morpheus in the `models/triton-model-repo/` directory.
+Now we'll use the `RecipientFeaturesStage` that we just made in a real-world pipeline to detect fraudulent emails. The pipeline we will be building makes use of the `TritonInferenceStage` which is a pre-defined Morpheus stage designed to support the execution of Natural Language Processing (NLP) models via NVIDIA's [Triton Inference Server](https://developer.nvidia.com/nvidia-triton-inference-server). NVIDIA Triton Inference Server allows for GPU accelerated ML/DL and seamless co-location and execution of a wide variety of model frameworks. For our application, we will be using the `phishing-bert-onnx` model, which is included with the Morpheus models Docker container as well as in the `models/triton-model-repo/phishing-bert-onnx` directory.

It's important to note here that Triton is a service that is external to the Morpheus pipeline and often will not reside on the same machine(s) as the rest of the pipeline. The `TritonInferenceStage` will use HTTP and [gRPC](https://grpc.io/) network protocols to allow us to interact with the machine learning models that are hosted by the Triton server.

### Launching Triton

-Triton will need to be running while we execute our pipeline. For simplicity, we will launch it locally inside of a Docker container.
+Triton will need to be running while we execute our pipeline. For simplicity, we will be using the Morpheus models container which includes both Triton and the Morpheus models.

> **Note**: This step assumes you have both [Docker](https://docs.docker.com/engine/install/) and the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#installation-guide) installed.

-From the root of the Morpheus project we will launch a Triton Docker container with the `models` directory mounted into the container:
+We will launch a Triton Docker container with:

```shell
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
--v $PWD/models:/models \
-nvcr.io/nvidia/tritonserver:23.06-py3 \
+nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--log-info=true \
@@ -381,7 +380,7 @@ From this information, we note that the expected dimensions of the model inputs
### Defining our Pipeline
For this pipeline we will have several configuration parameters such as the paths to the input and output files, we will be using the (click)[https://click.palletsprojects.com/] library to expose and parse these parameters as command line arguments. We will also expose the choice of using the class or function based stage implementation via the `--use_stage_function` command-line flag.

> **Note**: For simplicity, we assume that the `MORPHEUS_ROOT` environment variable is set to the root of the Morpheus project repository.
> **Note**: For simplicity, we assume that the `MORPHEUS_ROOT` environment variable is set to the root of the Morpheus project repository.

To start, we will need to instantiate and set a few attributes of the `Config` class. This object is used for configuration options that are global to the pipeline as a whole. We will provide this object to each stage along with stage-specific configuration parameters.

@@ -402,7 +401,7 @@ The `feature_length` property needs to match the dimensions of the model inputs,

Ground truth classification labels are read from the `morpheus/data/labels_phishing.txt` file included in Morpheus.

Now that our config object is populated, we move on to the pipeline itself. We will be using the same input file from the previous example.
Now that our config object is populated, we move on to the pipeline itself. We will be using the same input file from the previous example.

Next, we will add our custom recipient features stage to the pipeline. We imported both implementations of the stage, allowing us to add the appropriate one based on the `use_stage_function` value provided by the command-line.

9 changes: 7 additions & 2 deletions docs/source/examples.md
@@ -33,9 +33,14 @@ Ensure the environment is set up by following [Getting Started with Morpheus](./


## Environments
-Morpheus supports multiple environments, each environment is intended to support a given use-case. Each example documents which environments it is able to run in. With the exception of the Morpheus Release Container, the examples require fetching the model and example datasets via the `fetch_data.sh` script:
+Morpheus supports multiple environments, each intended to support a given use-case. Each example documents which environments it is able to run in. With the exception of the Morpheus Release Container, the examples require fetching both the `datasets` and `examples` datasets via the `fetch_data.py` script:
```bash
-./scripts/fetch_data.py fetch examples models
+./scripts/fetch_data.py fetch examples datasets
```

+In addition to this, many of the examples utilize the Morpheus Triton Models container which can be obtained by running the following command:
+```bash
+docker pull nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10
+```

The following are the supported environments:
79 changes: 74 additions & 5 deletions docs/source/getting_started.md
@@ -17,10 +17,14 @@ limitations under the License.

# Getting Started with Morpheus

-There are two ways to get started with Morpheus:
+There are three ways to get started with Morpheus:
+- [Using pre-built Docker containers](#using-pre-built-docker-containers)
- [Building the Morpheus Docker container](#building-the-morpheus-container)
- [Building Morpheus from source](./developer_guide/contributing.md#building-from-source)

+The [pre-built Docker containers](#using-pre-built-docker-containers) are the easiest way to get started with the latest release of Morpheus. Released versions of Morpheus containers can be found on [NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/morpheus/collections/morpheus_).
+
+More advanced users, or those who are interested in using the latest pre-release features, will need to [build the Morpheus container](#building-the-morpheus-container) or [build from source](./developer_guide/contributing.md#building-from-source).

## Requirements
- Volta architecture GPU or better
@@ -33,6 +37,47 @@ There are two ways to get started with Morpheus:
>
> The Morpheus documentation and examples assume that the [Manage Docker as a non-root user](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user) post install step has been performed allowing Docker commands to be executed by a non-root user. This is not strictly necessary so long as the current user has `sudo` privileges to execute Docker commands.

+## Using pre-built Docker containers
+### Pull the Morpheus Image
+1. Go to [https://catalog.ngc.nvidia.com/orgs/nvidia/teams/morpheus/containers/morpheus/tags](https://catalog.ngc.nvidia.com/orgs/nvidia/teams/morpheus/containers/morpheus/tags)
+1. Choose a version
+1. Download the selected version, for example for `24.10`:
+```bash
+docker pull nvcr.io/nvidia/morpheus/morpheus:24.10-runtime
+```
+1. Optional: many of the examples require NVIDIA Triton Inference Server to be running with the included models. To download the Morpheus Triton Server Models container (ensure that the version number matches that of the Morpheus container you downloaded in the previous step):
+```bash
+docker pull nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10
+```
+
+> **Note about Morpheus versions:**
+>
+> Morpheus uses Calendar Versioning ([CalVer](https://calver.org/)). For each Morpheus release there will be an image tagged in the form of `YY.MM-runtime`; this tag will always refer to the latest point release for that version. In addition, there will be at least one point release tagged in the form of `vYY.MM.00-runtime`; this is the initial point release for that version (for example, `v24.10.00-runtime`). In the event of a major bug, we may release additional point releases (for example, `v24.10.01-runtime`, `v24.10.02-runtime`), and the `YY.MM-runtime` tag will be updated to reference that point release.
+>
+> Users who want to ensure they are running with the latest bug fixes should use a release image tag (`YY.MM-runtime`). Users who need to deploy a specific version into production should use a point release image tag (`vYY.MM.00-runtime`).
+
+### Starting the Morpheus Container
+1. Ensure that [The NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#docker) is installed.
+1. Start the container downloaded from the previous section:
+```bash
+docker run --rm -ti --runtime=nvidia --gpus=all --net=host -v /var/run/docker.sock:/var/run/docker.sock nvcr.io/nvidia/morpheus/morpheus:24.10-runtime bash
+```
+
+Note about some of the flags above:
+| Flag | Description |
+| ---- | ----------- |
+| `--runtime=nvidia` | Choose the NVIDIA Docker runtime; this enables access to the GPU inside the container. This flag isn't needed if the `nvidia` runtime is already set as the default runtime for Docker. |
+| `--gpus=all` | Specify which GPUs the container has access to. Alternately, a specific GPU could be chosen with `--gpus=<gpu-id>`. |
+| `--net=host` | Most of the Morpheus pipelines utilize [NVIDIA Triton Inference Server](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver), which will be running in another container. For simplicity we give the container access to the host system's network; production deployments may opt for an explicit network configuration. |
+| `-v /var/run/docker.sock:/var/run/docker.sock` | Enables access to the Docker socket file from within the running container; this allows launching other Docker containers from within the Morpheus container. This flag is required for launching Triton with access to the included Morpheus models; users with their own models can omit this. |
+
+Once launched, users wishing to run Triton with the included Morpheus models will need to install the Docker tools in the Morpheus container by running:
+```bash
+./external/utilities/docker/install_docker.sh
+```
+
+Skip ahead to the [Acquiring the Morpheus Models Container](#acquiring-the-morpheus-models-container) section.
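To confirm the Docker socket pass-through is working from inside the Morpheus container, a quick check (assuming the container was started with the socket mount shown above):

```bash
# Run inside the Morpheus container after install_docker.sh completes;
# this should list the host's running containers if the socket is accessible.
docker ps
```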

## Building the Morpheus Container
### Clone the Repository

@@ -57,6 +102,7 @@ scripts/fetch_data.py fetch <dataset> [<dataset>...]

At time of writing the defined datasets are:
* all - Metaset includes all others
+* datasets - Input files needed for many of the examples
* docs - Graphics needed for documentation
* examples - Data needed by scripts in the `examples` subdir
* models - Morpheus models (largest dataset)
@@ -100,14 +146,24 @@ The `./docker/run_container_release.sh` script accepts the same `DOCKER_IMAGE_NA
DOCKER_IMAGE_TAG="v24.10.00-runtime" ./docker/run_container_release.sh
```

-## Launching Triton Server
+## Acquiring the Morpheus Models Container

-Many of the validation tests and example workflows require a Triton server to function. In a new terminal, from the root of the Morpheus repo, use the following command to launch a Docker container for Triton loading all of the included pre-trained models:
+Many of the validation tests and example workflows require a Triton server to function. For simplicity, Morpheus provides a pre-built models container which contains both Triton and the Morpheus models. Users running a release version of Morpheus can download the corresponding Triton models container from NGC with the following command:
+```bash
+docker pull nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10
+```
+
+Users working with an unreleased development version of Morpheus can build the Triton models container from the Morpheus repository by running the following command from the repository root:
+```bash
+models/docker/build_container.sh
+```

+## Launching Triton Server

+In a new terminal, use the following command to launch a Docker container for Triton loading all of the included pre-trained models:
```bash
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
--v $PWD/models:/models \
-nvcr.io/nvidia/tritonserver:23.06-py3 \
+nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--log-info=true \
@@ -119,6 +175,19 @@ This will launch Triton using the default network ports (8000 for HTTP, 8001 for

Note: The above command is useful for testing out Morpheus, however it does load several models into GPU memory, which at time of writing consumes roughly 2GB of GPU memory. Production users should consider only loading the specific model(s) they plan on using with the `--model-control-mode=explicit` and `--load-model` flags. For example to launch Triton only loading the `abp-nvsmi-xgb` model:
```bash
+docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
+nvcr.io/nvidia/morpheus/morpheus-tritonserver-models:24.10 \
+tritonserver --model-repository=/models/triton-model-repo \
+--exit-on-error=false \
+--log-info=true \
+--strict-readiness=false \
+--disable-auto-complete-config \
+--model-control-mode=explicit \
+--load-model abp-nvsmi-xgb
+```
+
+Alternately, for users who have checked out the Morpheus git repository, launching the Triton server container directly mounting the models from the repository is an option. This approach is most useful for users training their own models. From the root of the Morpheus repo, use the following command to launch a Docker container for Triton loading all of the included pre-trained models:
+```bash
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
-v $PWD/models:/models \
nvcr.io/nvidia/tritonserver:23.06-py3 \