Extensible configuration of the superbuild location #2426

Merged
merged 169 commits into from
Sep 3, 2024
fed257d
Adding some flexibility in the customized_build_env script to make the
bvanessen Feb 14, 2024
81a664a
Adding code to explicitly get the hostname for the superbuild configu…
bvanessen Feb 15, 2024
949bffc
Updated to the latest ROCm versions.
bvanessen Apr 9, 2024
60cb632
Added some env variables for RCCL
bvanessen Apr 9, 2024
045c7b0
Add spack type for mi300a
bvanessen Apr 17, 2024
d38da1d
Only include the external CUDA libraries on cuda systems.
bvanessen Apr 29, 2024
2353251
Fixed the external modules for cray-mpich.
bvanessen Apr 29, 2024
0692c03
Ensure that the CMAKE_PREFIX_PATH is captured in the superbuild sugge…
bvanessen Jun 12, 2024
be22132
Automatically output the suggested cmake prefix path to the install d…
bvanessen Jun 12, 2024
edd2659
Forwarded the CMAKE_PREFIX_PATH to the LBANN build.
bvanessen Jun 12, 2024
6cc61aa
Added a flag to the build_lbann.sh script to specify a directory of s…
bvanessen Jun 12, 2024
3fc2169
Added the superbuild-prefix to the Pascal CI pipeline.
bvanessen Jun 12, 2024
5e345f1
Disable caliper and force [email protected]
bvanessen Jun 12, 2024
a07f99d
Switch back to using the system specific spack.
bvanessen Jun 13, 2024
8906a67
Force the use of normal zlib
bvanessen Jun 13, 2024
24615f4
Force the use of normal zlib
bvanessen Jun 13, 2024
ce70053
Temporarily disable half on Pascal.
bvanessen Jun 13, 2024
9f71345
Split the superbuild scripts into core dependencies and DHA dependenc…
bvanessen Jun 13, 2024
472d426
Added superbuild script for DHA with half.
bvanessen Jun 13, 2024
5f7fc4f
Updated the build scripts to allow for specific DHA compiled versions.
bvanessen Jun 13, 2024
91b9a71
Reenabled half on pascal CI test.
bvanessen Jun 13, 2024
e812fb3
Allow for newer gcc compilers.
bvanessen Jun 14, 2024
ab05970
Updated all of the pascal CI scripts to use the new stable dependencies.
bvanessen Jun 14, 2024
f609910
Updating the Tioga scripts to use the superbuild.
bvanessen Jun 14, 2024
6bdf0df
Fixed the sense of the shared variant on protobuf.
bvanessen Jun 15, 2024
899b824
Updated the AMD ROCm stack to 6.1.2
bvanessen Jun 15, 2024
0f825f6
Adding path for external HWLOC in superbuild stable dependencies. Ad…
bvanessen Jun 16, 2024
c31736e
Add aws-ofi-rccl to the superbuild externals.
bvanessen Jun 16, 2024
a97ed1c
Fixed typo
bvanessen Jun 17, 2024
ea77035
Fix how the CMAKE_PREFIX_PATH is forwarded to DHA libraries.
bvanessen Jun 17, 2024
9edde7a
Fix how the CMAKE_PREFIX_PATH is forwarded to DHA libraries.
bvanessen Jun 17, 2024
4cb5ccd
Updating the Tioga superbuild scripts to force the runpaths to be pro…
bvanessen Jun 17, 2024
82c0d76
Updating the Pascal superbuild scripts to force the runpaths to be pr…
bvanessen Jun 17, 2024
dabeac1
Added CMake flags to enable shared library builds.
bvanessen Jun 17, 2024
40f4b5c
Added a path to cuTensor for x86_64 platforms.
bvanessen Jun 17, 2024
0a5982e
Added a path to the correct miopen.
bvanessen Jun 17, 2024
21fa5cf
Mark the new MIOpen as develop.
bvanessen Jun 17, 2024
360837c
Disable the superbuild on Corona and Lassen
bvanessen Jun 17, 2024
16fd585
Fixed the install path.
bvanessen Jun 17, 2024
8fc2e73
Add some logic to clean up the initial CMAKE_INSTALL_RPATH
benson31 Jun 17, 2024
07a5de7
Remove system paths from build rpath
benson31 Jun 17, 2024
4b32fd6
Temporarily disable Corona and Lassen tests.
bvanessen Jun 18, 2024
4034c78
Fixed how the CMake environment sets up the PYTHONPATH and caches it
bvanessen Jun 19, 2024
ca797db
Revert back to ROCm 5.7.1
bvanessen Jun 19, 2024
69113df
Updated the superbuild scripts to use LLD and Gold linkers as
bvanessen Jun 19, 2024
a022170
Removing custom MIOpen build.
bvanessen Jun 19, 2024
e2e8a17
Added the build modules to the LBANN_DEPENDENT_MODULES so that they a…
bvanessen Jun 19, 2024
8ab2d82
Fixed how the LBANN_DEPENDENT_MODULES are composed.
bvanessen Jun 19, 2024
545ef71
Temporarily reduce the time for Tioga jobs
bvanessen Jun 20, 2024
161e799
Try a different set of modules for Tioga.
bvanessen Jun 20, 2024
2617c30
Trimming time/
bvanessen Jun 20, 2024
91997e7
Fixed grouping on link flags. Fixed RPATH issues for build and insta…
bvanessen Jun 24, 2024
a91fea6
Increasing the precision of the reported error for check metric.
bvanessen Jun 25, 2024
22108ac
Force the installation of pip packages in the installed location to
bvanessen Jun 25, 2024
7565e80
Fixed the time.
bvanessen Jun 25, 2024
a8e7c0c
Correctly set the --force-reinstall flag on the pip command.
bvanessen Jun 25, 2024
a218243
Correcting the nightly time limit.
bvanessen Jun 25, 2024
7ed3b5e
Set the CXX and CUDA flags to an optimized build.
bvanessen Jun 25, 2024
6919899
Updated the Tioga builds to include the PE_ENV field in the stable de…
bvanessen Jun 25, 2024
1ca4199
Updated the build path so that the source files can be saved for debu…
bvanessen Jun 26, 2024
4a3501b
Updated the build path so that the source files can be saved for debu…
bvanessen Jun 26, 2024
ad64cb9
Removed the pip force-reinstall
bvanessen Jun 26, 2024
997d4ed
Fixed pascal build path.
bvanessen Jun 26, 2024
a7c6c6f
Fixed the quotes around the linker flags.
bvanessen Jun 26, 2024
352a8d3
Do not use gold linker for core dependencies because protobuf fails.
bvanessen Jun 26, 2024
1860023
Fixed typo
bvanessen Jun 26, 2024
1b08f50
Updated the version of half to 2.2.0
bvanessen Jun 26, 2024
440dc06
Did not set the loaded modules in the LBANN module file.
bvanessen Jun 26, 2024
3e18d25
Include ROCM_PATH/lib to RPATH. Switch Pascal back to gcc/10.3.1.
bvanessen Jun 26, 2024
8ece61d
Switch Pascal CI to using Clang 14. Added compiler into the CI
bvanessen Jun 26, 2024
1d6d4ff
Fixed compiler paths and typos.
bvanessen Jun 26, 2024
0d8c7ca
Fixed typo.
bvanessen Jun 26, 2024
0866178
Commented out unused variable.
bvanessen Jun 26, 2024
d5176b1
Log file for superbuild shell script is now defined in the environmen…
bvanessen Jun 26, 2024
dec483a
Fixed the extra RPATH on cray.
bvanessen Jun 27, 2024
2f60f1b
Switched back to half v2.1.0. Added logging for the modules used to
bvanessen Jun 27, 2024
4272c86
Fixing the extra RPATHs field to handle multiple entries.
bvanessen Jun 27, 2024
d39d5fe
Add an updated time limit for the reconstruction loss unit test.
bvanessen Jun 27, 2024
bfea5e6
Add EnsureComm calls to truncation selection algo
benson31 Jun 27, 2024
4a8ecaf
Use a vertical | to avoid issues propagating ;.
bvanessen Jun 27, 2024
d133abe
Constrain version of NumPy to 1.22.3
bvanessen Jun 27, 2024
4fe80e0
Removed the -O2 optimization flags from the pascal and tioga
bvanessen Jun 28, 2024
831adc5
Added superbuild scripts for Corona. Added hipTT to build_lbann.sh
bvanessen Jun 28, 2024
84ad3bf
Moved the definition of the external hiptt to a ROCm only section.
bvanessen Jun 28, 2024
14cf442
Update Corona to ROCm 6.0.2
bvanessen Jul 1, 2024
8755cb7
Changed the Corona externals to use variable for ROCm version.
bvanessen Jul 2, 2024
dc3f1f3
Exporting the shell variable.
bvanessen Jul 2, 2024
00833eb
Moved when the ROCm version is defined.
bvanessen Jul 2, 2024
d5639da
Back to 6.0.2
bvanessen Jul 2, 2024
9a74dbe
Trying a unified single pipeline for Pascal CI.
bvanessen Jul 2, 2024
62d5bec
Working on updating the CI builds to use a more direct script setup.
bvanessen Jul 30, 2024
3972df7
Added configure scripts for LBANN and a script to run the unit and in…
bvanessen Jul 30, 2024
661b511
Cleaning up the CI scripts.
bvanessen Jul 31, 2024
0f180d2
Added GitLab CI yaml files.
bvanessen Jul 31, 2024
4771fec
Lowered the git depth.
bvanessen Jul 31, 2024
84aa772
Fix the submodule strategy.
bvanessen Jul 31, 2024
20281ca
Fixed the CI tests to use 2 nodes. Better error handling.
bvanessen Jul 31, 2024
60e264c
Fixed the name of the test result files so that they would be picked …
bvanessen Jul 31, 2024
4564112
Added a test pascal pipeline.
bvanessen Aug 1, 2024
396dbdb
Fixed how the DistConv flag is propagated.
bvanessen Aug 1, 2024
18bf20d
Added external flags for building with HALF and FFT support. Limited…
bvanessen Aug 1, 2024
9c9119f
Cleaning up code.
bvanessen Aug 1, 2024
07f9fce
Added distconv pascal test.
bvanessen Aug 1, 2024
e84fb10
Fix the status capture.
bvanessen Aug 1, 2024
f8d37fd
Fixed logic bug in bash.
bvanessen Aug 1, 2024
adc008f
Fixed the include path to Half and disabled FFT
bvanessen Aug 1, 2024
a11bd66
Fixed the failed test reporting and that distconv and half don't play…
bvanessen Aug 1, 2024
ba04786
Extend the mpi catch tests time limit.
bvanessen Aug 1, 2024
c019345
Added optimization flags for DHA
bvanessen Aug 2, 2024
1bc3cd4
Temporarily force rebuild of dependencies.
bvanessen Aug 2, 2024
2ee089d
Fixed typo
bvanessen Aug 2, 2024
77fd1a4
Added Corona to new CI.
bvanessen Aug 2, 2024
47d9973
Added config for Lassen.
bvanessen Aug 6, 2024
84de254
Fixed how the lapack argument is passed to Hydrogen
bvanessen Aug 6, 2024
ab6e9d2
Fixed flag for LBANN BLA.
bvanessen Aug 6, 2024
a7219fa
Added scripts to install core dependencies for lassen.
bvanessen Aug 6, 2024
4a10d9a
Added Lassen CI.
bvanessen Aug 6, 2024
51f7f98
Adding in some help for extra rpaths.
bvanessen Aug 6, 2024
6737e4a
Force LBANN to RPATH DHA libraries inside of the project.
bvanessen Aug 6, 2024
9b5cd55
Improve the reporting of the MPI catch tests. Consolidated all of the
bvanessen Aug 6, 2024
e6904b0
Force rebuild again.
bvanessen Aug 6, 2024
7e49f85
Updated Lassen to use a newer python. Tweaking how rpath's are set.
bvanessen Aug 7, 2024
0b99f25
Fixed quoting on RPATH
bvanessen Aug 7, 2024
2f40595
Fixed the path for the catch tests.
bvanessen Aug 7, 2024
c10ab02
Fixed up a few shell details to make switching PEs simpler.
bvanessen Aug 13, 2024
21aaba7
Building for Mi300A as well as 250.
bvanessen Aug 13, 2024
2ca99eb
Stop hardcoding the CRAY_MPICH_VERSION
bvanessen Aug 13, 2024
696544f
Added the ability to export the AWS_OFI_RCCL plugin to the
bvanessen Aug 13, 2024
cd4cec4
Tweak the Tioga build environment.
bvanessen Aug 13, 2024
60ff967
Work on building the dependencies on PrgEnv-cray.
bvanessen Aug 13, 2024
1f6fee9
Fixed accidental debugging code.
bvanessen Aug 13, 2024
d762531
Added DiHydrogen cache check. Only add Half prefix path when asked for.
bvanessen Aug 14, 2024
edca795
Add the hash for H2.
bvanessen Aug 15, 2024
7d561fc
Ensure that for AMD/HIP/ROCm systems all three fields GPU_TARGETS,
bvanessen Aug 15, 2024
77c3b4f
Disable FFT on Lassen
bvanessen Aug 16, 2024
75b42ff
Disable installing torch.
bvanessen Aug 16, 2024
3366b27
Disable FFT on lassen right now.
bvanessen Aug 16, 2024
16777c5
Set proper AMD architectures.
bvanessen Aug 16, 2024
4e581ea
Use a special PR for 6.2.0
bvanessen Aug 17, 2024
e347368
Explicitly turned on the half feature, which is not properly disabled…
bvanessen Aug 19, 2024
0547765
When not using a flag, set it to a NULL string, not 0.
bvanessen Aug 19, 2024
71a5eeb
Reporting the state of the build script DHA features.
bvanessen Aug 20, 2024
70b0645
Set flag to ON not 1
bvanessen Aug 20, 2024
84d1de7
Fix when local 6.2.0 MIOpen library is linked in.
bvanessen Aug 20, 2024
a6c96d1
Auto-detect the CUDA version and compiler version.
bvanessen Aug 21, 2024
b445d53
Working to consolidate how the core dependencies are built to use the
bvanessen Aug 21, 2024
4edfe32
Cleaning up Power and HIP specific flags.
bvanessen Aug 21, 2024
54a0fd5
Added support for creating a Python virtual environment in the CI stack.
bvanessen Aug 22, 2024
b7e09bd
Removed older core platform specific dependency scripts.
bvanessen Aug 22, 2024
74ea19e
Update python/lbann/contrib/lc/launcher.py
bvanessen Aug 22, 2024
df40eeb
Add pytest to the venv. Cleaned up.
bvanessen Aug 22, 2024
55431d0
Added code to build OpenBLAS on Power and then install standard libra…
bvanessen Aug 23, 2024
cafa97c
Only create the virtual environment if it doesn't exist.
bvanessen Aug 23, 2024
cf616f7
Fix typo.
bvanessen Aug 23, 2024
c310bdf
Changed to installing all of the PIP installs in the virtual env dire…
bvanessen Aug 23, 2024
2a6fa09
Cleanup.
bvanessen Aug 23, 2024
77ff196
Apply suggestions from code review
bvanessen Aug 24, 2024
a6dd4d5
Renamed variable AWS_OFI_RCCL_LIBRARY to AWS_OFI_RCCL_LIBDIR.
bvanessen Aug 24, 2024
0bc591c
Gather the build logs for the DHA dependencies and keep them as artif…
bvanessen Aug 24, 2024
f3880b1
Added some cmake logic to capture the path to the python venv used du…
bvanessen Aug 27, 2024
6a62aa5
Removed bad debug statement.
bvanessen Aug 27, 2024
c47b429
If a python virtual environment was defined and used during the build
bvanessen Aug 27, 2024
bf3f0f3
Trying to fix a bug where lbann_pfe.sh isn't found after loading the
bvanessen Aug 28, 2024
901dcae
Temporarily remove the lua code to activate the virtual environment.
bvanessen Aug 28, 2024
10434e7
Debugging modules.
bvanessen Aug 28, 2024
6068ad1
Disabled always rebuilding the dependencies. Added a check to
bvanessen Aug 28, 2024
53fdf98
Removed debugging code.
bvanessen Aug 28, 2024
b2d573e
Updated the Tioga tests to use ROCm 6.2.1beta1 and craycc.
bvanessen Aug 28, 2024
309af7c
Rewound the Tioga ROCm versions.
bvanessen Aug 28, 2024
105 changes: 18 additions & 87 deletions .gitlab-ci.yml
@@ -28,110 +28,41 @@
 # clusters. To run testing locally, consult the README in the ci_test
 # directory.

-variables:
-  FF_USE_NEW_BASH_EVAL_STRATEGY: 'true'
-  FF_ENABLE_BASH_EXIT_CODE_CHECK: 1
-  LBANN_CI_CLEAN_BUILD: 'true'
+include:
+  - project: 'lc-templates/id_tokens'
+    file: 'id_tokens.yml'

 stages:
   - run-all-clusters

-corona testing:
-  stage: run-all-clusters
-  variables:
-    WITH_WEEKLY: "${LBANN_CI_RUN_WEEKLY}"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
-  trigger:
-    strategy: depend
-    include: .gitlab/corona/pipeline.yml
-
-corona distconv testing:
-  stage: run-all-clusters
-  variables:
-    JOB_NAME_SUFFIX: _distconv
-    SPACK_ENV_BASE_NAME_MODIFIER: "-distconv"
-    SPACK_SPECS: "+rocm +distconv"
-    WITH_WEEKLY: "${LBANN_CI_RUN_WEEKLY}"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
-    TEST_FLAG: "test_*_distconv.py"
-  trigger:
-    strategy: depend
-    include: .gitlab/corona/pipeline.yml
-
-lassen testing:
-  stage: run-all-clusters
-  variables:
-    WITH_WEEKLY: "${LBANN_CI_RUN_WEEKLY}"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
-  trigger:
-    strategy: depend
-    include: .gitlab/lassen/pipeline.yml
-
-lassen distconv testing:
+tioga testing:
   stage: run-all-clusters
-  variables:
-    JOB_NAME_SUFFIX: _distconv
-    SPACK_ENV_BASE_NAME_MODIFIER: "-multi-stage-distconv"
-    SPACK_SPECS: "+cuda +distconv +fft"
-    # SPACK_SPECS: "+cuda +distconv +nvshmem +fft"
-    WITH_WEEKLY: "${LBANN_CI_RUN_WEEKLY}"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
-    TEST_FLAG: "test_*_distconv.py"
   trigger:
     strategy: depend
-    include: .gitlab/lassen/multi_stage_pipeline.yml
+    include: '.gitlab/build-and-test-tioga.yml'
+    forward:
+      pipeline_variables: true

 pascal testing:
   stage: run-all-clusters
-  variables:
-    WITH_WEEKLY: "${LBANN_CI_RUN_WEEKLY}"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
   trigger:
     strategy: depend
-    include: .gitlab/pascal/pipeline.yml
+    include: '.gitlab/build-and-test-pascal.yml'
+    forward:
+      pipeline_variables: true

-pascal compiler testing:
-  stage: run-all-clusters
-  variables:
-    SPACK_SPECS: "%[email protected] +cuda +half +fft"
-    BUILD_SCRIPT_OPTIONS: "--no-default-mirrors"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
-  trigger:
-    strategy: depend
-    include: .gitlab/pascal/pipeline_compiler_tests.yml
-
-pascal distconv testing:
-  stage: run-all-clusters
-  variables:
-    JOB_NAME_SUFFIX: _distconv
-    SPACK_SPECS: "%[email protected] +cuda +distconv +fft"
-    BUILD_SCRIPT_OPTIONS: "--no-default-mirrors"
-    WITH_WEEKLY: "${LBANN_CI_RUN_WEEKLY}"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
-    TEST_FLAG: "test_*_distconv.py"
-  trigger:
-    strategy: depend
-    include: .gitlab/pascal/pipeline.yml
-
-tioga testing:
+corona testing:
   stage: run-all-clusters
-  variables:
-    # FF_USE_NEW_BASH_EVAL_STRATEGY: 1
-    WITH_WEEKLY: "${LBANN_CI_RUN_WEEKLY}"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
   trigger:
     strategy: depend
-    include: .gitlab/tioga/pipeline.yml
+    include: '.gitlab/build-and-test-corona.yml'
+    forward:
+      pipeline_variables: true

-tioga distconv testing:
+lassen testing:
   stage: run-all-clusters
-  variables:
-    JOB_NAME_SUFFIX: _distconv
-    SPACK_ENV_BASE_NAME_MODIFIER: "-distconv"
-    SPACK_SPECS: "+rocm +distconv"
-    WITH_WEEKLY: "${LBANN_CI_RUN_WEEKLY}"
-    WITH_CLEAN_BUILD: "${LBANN_CI_CLEAN_BUILD}"
-    TEST_FLAG: "test_*_distconv.py"
   trigger:
     strategy: depend
-    include: .gitlab/tioga/pipeline.yml
+    include: '.gitlab/build-and-test-lassen.yml'
+    forward:
+      pipeline_variables: true
56 changes: 56 additions & 0 deletions .gitlab/build-and-test-common.yml
@@ -0,0 +1,56 @@
################################################################################
## Copyright (c) 2014-2024, Lawrence Livermore National Security, LLC.
## Produced at the Lawrence Livermore National Laboratory.
## Written by the LBANN Research Team (B. Van Essen, et al.) listed in
## the CONTRIBUTORS file. <[email protected]>
##
## LLNL-CODE-697807.
## All rights reserved.
##
## This file is part of LBANN: Livermore Big Artificial Neural Network
## Toolkit. For details, see http://software.llnl.gov/LBANN or
## https://github.com/LLNL/LBANN.
##
## Licensed under the Apache License, Version 2.0 (the "License"); you
## may not use this file except in compliance with the License. You may
## obtain a copy of the License at:
##
## http://www.apache.org/licenses/LICENSE-2.0
##
## Unless required by applicable law or agreed to in writing, software
## distributed under the License is distributed on an "AS IS" BASIS,
## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
## implied. See the License for the specific language governing
## permissions and limitations under the license.
################################################################################

.build-and-test-base:
  variables:
    LLNL_SERVICE_USER: lbannusr
    LLNL_SLURM_SCHEDULER_PARAMETERS: "-N2 -t 90"
    LLNL_FLUX_SCHEDULER_PARAMETERS: "-N2 -t 120m"
    LLNL_LSF_SCHEDULER_PARAMETERS: "-q pbatch -nnodes 2 -W 60"
    GIT_SUBMODULE_STRATEGY: none
    GIT_DEPTH: 5
  script:
    - printenv > ${CI_PROJECT_DIR}/ci_environment.log
    - ${CI_PROJECT_DIR}/.gitlab/build-and-test.sh
  cache:
    key: $CI_JOB_NAME_SLUG
    paths:
      - install-deps-${CI_JOB_NAME_SLUG}
  timeout: 6h

.build-and-test:
  artifacts:
    when: always
    paths:
      - "${CI_PROJECT_DIR}/*junit.*xml"
      - "${CI_PROJECT_DIR}/ci_environment.log"
      - "${CI_PROJECT_DIR}/build-${CI_JOB_ID}/build-lbann/build.ninja"
      - "${CI_PROJECT_DIR}/build-${CI_JOB_ID}/build-lbann/CMakeFiles/rules.ninja"
      - "${CI_PROJECT_DIR}/build-${CI_JOB_ID}/build-deps/all_build_files.tar.gz"
      - "${CI_PROJECT_DIR}/build-${CI_JOB_ID}/build-deps/all_output_logs.tar.gz"
    reports:
      junit: "${CI_PROJECT_DIR}/*junit.*xml"
  extends: .build-and-test-base
54 changes: 54 additions & 0 deletions .gitlab/build-and-test-corona.yml
@@ -0,0 +1,54 @@
################################################################################
## Copyright (c) 2014-2024, Lawrence Livermore National Security, LLC.
## Produced at the Lawrence Livermore National Laboratory.
## Written by the LBANN Research Team (B. Van Essen, et al.) listed in
## the CONTRIBUTORS file. <[email protected]>
##
## LLNL-CODE-697807.
## All rights reserved.
##
## This file is part of LBANN: Livermore Big Artificial Neural Network
## Toolkit. For details, see http://software.llnl.gov/LBANN or
## https://github.com/LLNL/LBANN.
##
## Licensed under the Apache License, Version 2.0 (the "License"); you
## may not use this file except in compliance with the License. You may
## obtain a copy of the License at:
##
## http://www.apache.org/licenses/LICENSE-2.0
##
## Unless required by applicable law or agreed to in writing, software
## distributed under the License is distributed on an "AS IS" BASIS,
## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
## implied. See the License for the specific language governing
## permissions and limitations under the license.
################################################################################

default:
  id_tokens:
    SITE_ID_TOKEN:
      aud: https://lc.llnl.gov/gitlab

stages:
  - build

include:
  local: "/.gitlab/build-and-test-common.yml"

rocm-5-7-1-corona:
  variables:
    COMPILER_FAMILY: amdclang
    MODULES: "rocm/5.7.1 clang/14.0.6-magic openmpi/4.1.2"
  extends: .build-and-test-on-corona

rocm-5-7-1-distconv-corona:
  variables:
    COMPILER_FAMILY: amdclang
    MODULES: "rocm/5.7.1 clang/14.0.6-magic openmpi/4.1.2"
    WITH_DISTCONV: "ON"
  extends: .build-and-test-on-corona

.build-and-test-on-corona:
  stage: build
  tags: [corona, batch]
  extends: .build-and-test
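
For readers tracing the `extends` chain by hand, the following is a sketch of what `rocm-5-7-1-distconv-corona` would look like once expanded. It assumes GitLab's documented `extends` behavior (hash keys such as `variables` are merged key by key, with the job's own values taking precedence) and is hand-derived from `build-and-test-common.yml` and `build-and-test-corona.yml` in this diff, not output captured from GitLab.

```yaml
# Hand-expanded sketch; not generated by GitLab.
# Chain: job -> .build-and-test-on-corona -> .build-and-test -> .build-and-test-base
rocm-5-7-1-distconv-corona:
  stage: build                 # from .build-and-test-on-corona
  tags: [corona, batch]        # from .build-and-test-on-corona
  variables:
    # from .build-and-test-base
    LLNL_SERVICE_USER: lbannusr
    LLNL_SLURM_SCHEDULER_PARAMETERS: "-N2 -t 90"
    LLNL_FLUX_SCHEDULER_PARAMETERS: "-N2 -t 120m"
    LLNL_LSF_SCHEDULER_PARAMETERS: "-q pbatch -nnodes 2 -W 60"
    GIT_SUBMODULE_STRATEGY: none
    GIT_DEPTH: 5
    # from the job itself
    COMPILER_FAMILY: amdclang
    MODULES: "rocm/5.7.1 clang/14.0.6-magic openmpi/4.1.2"
    WITH_DISTCONV: "ON"
  script:                      # from .build-and-test-base
    - printenv > ${CI_PROJECT_DIR}/ci_environment.log
    - ${CI_PROJECT_DIR}/.gitlab/build-and-test.sh
  cache:                       # from .build-and-test-base
    key: $CI_JOB_NAME_SLUG
    paths:
      - install-deps-${CI_JOB_NAME_SLUG}
  timeout: 6h                  # from .build-and-test-base
  artifacts:                   # from .build-and-test
    when: always
    paths:
      - "${CI_PROJECT_DIR}/*junit.*xml"
      - "${CI_PROJECT_DIR}/ci_environment.log"
      - "${CI_PROJECT_DIR}/build-${CI_JOB_ID}/build-lbann/build.ninja"
      - "${CI_PROJECT_DIR}/build-${CI_JOB_ID}/build-lbann/CMakeFiles/rules.ninja"
      - "${CI_PROJECT_DIR}/build-${CI_JOB_ID}/build-deps/all_build_files.tar.gz"
      - "${CI_PROJECT_DIR}/build-${CI_JOB_ID}/build-deps/all_output_logs.tar.gz"
    reports:
      junit: "${CI_PROJECT_DIR}/*junit.*xml"
```

The non-distconv job `rocm-5-7-1-corona` would expand the same way, minus the `WITH_DISTCONV` entry.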
55 changes: 55 additions & 0 deletions .gitlab/build-and-test-lassen.yml
@@ -0,0 +1,55 @@
################################################################################
## Copyright (c) 2014-2024, Lawrence Livermore National Security, LLC.
## Produced at the Lawrence Livermore National Laboratory.
## Written by the LBANN Research Team (B. Van Essen, et al.) listed in
## the CONTRIBUTORS file. <[email protected]>
##
## LLNL-CODE-697807.
## All rights reserved.
##
## This file is part of LBANN: Livermore Big Artificial Neural Network
## Toolkit. For details, see http://software.llnl.gov/LBANN or
## https://github.com/LLNL/LBANN.
##
## Licensed under the Apache License, Version 2.0 (the "License"); you
## may not use this file except in compliance with the License. You may
## obtain a copy of the License at:
##
## http://www.apache.org/licenses/LICENSE-2.0
##
## Unless required by applicable law or agreed to in writing, software
## distributed under the License is distributed on an "AS IS" BASIS,
## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
## implied. See the License for the specific language governing
## permissions and limitations under the license.
################################################################################

default:
  id_tokens:
    SITE_ID_TOKEN:
      aud: https://lc.llnl.gov/gitlab

stages:
  - build

include:
  local: "/.gitlab/build-and-test-common.yml"

# fftw/3.3.10-gcc-11.2.1
clang-16-0-6-gcc-11-2-1-cuda-12-2-2-lassen:
  variables:
    COMPILER_FAMILY: clang
    MODULES: "clang/16.0.6-gcc-11.2.1 spectrum-mpi/rolling-release cuda/12.2.2 cmake/3.29.2 python/3.11.5"
  extends: .build-and-test-on-lassen

clang-16-0-6-gcc-11-2-1-cuda-12-2-2-distconv-lassen:
  variables:
    COMPILER_FAMILY: clang
    MODULES: "clang/16.0.6-gcc-11.2.1 spectrum-mpi/rolling-release cuda/12.2.2 cmake/3.29.2 python/3.11.5"
    WITH_DISTCONV: "ON"
  extends: .build-and-test-on-lassen

.build-and-test-on-lassen:
  stage: build
  tags: [lassen, batch]
  extends: .build-and-test
54 changes: 54 additions & 0 deletions .gitlab/build-and-test-pascal.yml
@@ -0,0 +1,54 @@
################################################################################
## Copyright (c) 2014-2024, Lawrence Livermore National Security, LLC.
## Produced at the Lawrence Livermore National Laboratory.
## Written by the LBANN Research Team (B. Van Essen, et al.) listed in
## the CONTRIBUTORS file. <[email protected]>
##
## LLNL-CODE-697807.
## All rights reserved.
##
## This file is part of LBANN: Livermore Big Artificial Neural Network
## Toolkit. For details, see http://software.llnl.gov/LBANN or
## https://github.com/LLNL/LBANN.
##
## Licensed under the Apache License, Version 2.0 (the "License"); you
## may not use this file except in compliance with the License. You may
## obtain a copy of the License at:
##
## http://www.apache.org/licenses/LICENSE-2.0
##
## Unless required by applicable law or agreed to in writing, software
## distributed under the License is distributed on an "AS IS" BASIS,
## WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
## implied. See the License for the specific language governing
## permissions and limitations under the license.
################################################################################

default:
  id_tokens:
    SITE_ID_TOKEN:
      aud: https://lc.llnl.gov/gitlab

stages:
  - build

include:
  local: "/.gitlab/build-and-test-common.yml"

clang-14-0-6-cuda-11-8-0-pascal:
  variables:
    COMPILER_FAMILY: clang
    MODULES: "clang/14.0.6-magic openmpi/4.1.2 cuda/11.8.0 ninja/1.11.1"
    WITH_HALF: "ON"
  extends: [.build-and-test-on-pascal, .build-and-test]

clang-14-0-6-cuda-11-8-0-distconv-pascal:
  variables:
    COMPILER_FAMILY: clang
    MODULES: "clang/14.0.6-magic openmpi/4.1.2 cuda/11.8.0 ninja/1.11.1"
    WITH_DISTCONV: "ON"
  extends: [.build-and-test-on-pascal, .build-and-test]

.build-and-test-on-pascal:
  stage: build
  tags: [pascal, batch]
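
The Pascal jobs name both templates in a list-form `extends`, whereas the Corona and Lassen files have the cluster template itself extend `.build-and-test`. As a sketch of the alternative, assuming GitLab's documented multi-level `extends` merging and given that the two templates define no overlapping keys, the chained form below should resolve to the same job. It is illustrative only and not part of this PR.

```yaml
# Illustrative alternative, not part of the PR: chain the templates the way
# the Corona and Lassen files do instead of listing both on each job.
clang-14-0-6-cuda-11-8-0-pascal:
  variables:
    COMPILER_FAMILY: clang
    MODULES: "clang/14.0.6-magic openmpi/4.1.2 cuda/11.8.0 ninja/1.11.1"
    WITH_HALF: "ON"
  extends: .build-and-test-on-pascal

.build-and-test-on-pascal:
  stage: build
  tags: [pascal, batch]
  extends: .build-and-test   # defined in /.gitlab/build-and-test-common.yml
```

Either form keeps the per-job block limited to the compiler family, module list, and feature flags, which is the part that differs between jobs.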