From 6de215f81cdd653ae8f69df714fa4e5838d77cd4 Mon Sep 17 00:00:00 2001
From: Brian Barrett
Date: Wed, 27 Sep 2023 15:47:31 -0700
Subject: [PATCH] Prep for 1.7.3 release

Update release notes with note about NVLS performance and change the
version number for final release.

Signed-off-by: Brian Barrett
---
 RELEASENOTES.md | 11 +++++++++++
 configure.ac    |  2 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/RELEASENOTES.md b/RELEASENOTES.md
index 739a6be7a..3421d24a8 100644
--- a/RELEASENOTES.md
+++ b/RELEASENOTES.md
@@ -21,9 +21,20 @@ maintaining backward compatibility with older NCCL versions ([NCCL v2.4.8](https
 It was tested with Libfabric versions up to
 [Libfabric v1.18.1](https://github.com/ofiwg/libfabric/releases/tag/v1.18.1).
 
+With NCCL 2.18.5 and v1.7.3-aws of the plugin,
+[NVLink SHARP](https://developer.nvidia.com/blog/upgrading-multi-gpu-interconnectivity-with-the-third-generation-nvidia-nvswitch/)
+is enabled for the first time on AWS platforms. NVLink SHARP offloads
+the computation part of Allreduce collectives to the NVLink fabric,
+and involves a different set of algorithms for multi-node parallelism
+than previously used. We have seen NVLink SHARP both help and hurt
+performance of applications. While NVLink SHARP is enabled by default
+if NCCL 2.18.5 or later is used, users may wish to disable it by
+setting `NCCL_NVLS_ENABLE=0` in the environment of your job.
+
 New Features:
 
 Bug Fixes:
+* Do not disable LL and LL128 protocols on P5 instances.
 * Add support for g5.48xlarge instance types.
 * Fix a block in use leak in the freelist implementation.
 * For NCCL 2.18.5 or later, don't disable NVLS support.
diff --git a/configure.ac b/configure.ac
index b5050f730..5bf93e386 100644
--- a/configure.ac
+++ b/configure.ac
@@ -6,7 +6,7 @@
 #
 
 # Initialization
-AC_INIT([aws-ofi-nccl], [1.7.3rc1-aws], [al-ofi-nccl-team@amazon.com], , [http://github.com/aws/aws-ofi-nccl])
+AC_INIT([aws-ofi-nccl], [1.7.3-aws], [al-ofi-nccl-team@amazon.com], , [http://github.com/aws/aws-ofi-nccl])
 AC_PREREQ([2.69])
 AC_CONFIG_SRCDIR([src/nccl_ofi_net.c])
 AC_CONFIG_AUX_DIR([build-aux])
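
Reviewer note: the release-note text in this patch tells users to set `NCCL_NVLS_ENABLE=0` in the job environment to opt out of NVLink SHARP. A minimal sketch of what that might look like in a job script follows. The launcher line is a hypothetical placeholder and is left commented out; only the variable name and value come from the patch itself.

```shell
#!/bin/sh
# Disable NVLink SHARP (NVLS) for every NCCL process started from this shell.
# The variable name and value are taken from the release note in this patch.
export NCCL_NVLS_ENABLE=0

# Hypothetical launch line (an assumption, not part of the patch); replace it
# with the launcher and application your job actually uses, and make sure the
# variable is forwarded to every rank:
# mpirun -x NCCL_NVLS_ENABLE ./my_training_app

# Print the value so the job log records which setting was in effect.
echo "NCCL_NVLS_ENABLE=${NCCL_NVLS_ENABLE}"
```

Because the release note reports that NVLink SHARP can either help or hurt application performance, it may be worth running the same job once with this setting and once without it before choosing a default.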