
Streamline dependencies of docker CI images #951

Merged: 11 commits merged into develop on Mar 5, 2024

Conversation

ahnaf-tahmid-chowdhury
Member

Description

This PR aims to optimize the dependencies of the docker CI images by:

  1. Removing unused dependencies.
  2. Updating Geant4 to version 11.2.1.
  3. Changing the download method of HDF5 and Geant4 to use git clone from the source.
  4. Switching solely to git clone to download dependencies, removing the dependency on wget.
  5. Creating a new stage to store only binary files, thereby reducing the docker image size.
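As an illustration of points 3 and 4, the wget-based downloads can be replaced with shallow, tag-pinned git clones. A hypothetical Dockerfile fragment (the repository URLs and tags below are assumptions, not necessarily the exact ones used in the PR):

```dockerfile
# Hypothetical fragment: fetch sources with shallow, tag-pinned git clones
# instead of wget tarballs. URLs and tags are illustrative assumptions.
RUN git clone --depth 1 --branch hdf5-1_14_3 https://github.com/HDFGroup/hdf5.git
RUN git clone --depth 1 --branch v11.2.1 https://github.com/Geant4/geant4.git
```

Using `--depth 1` keeps each clone small, which complements the size reduction from the binary-only final stage.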

Impact

These changes are expected to streamline the dependencies of the docker CI images, reducing the image size and potentially improving performance.

Related Issues

Closes #750.

@ahnaf-tahmid-chowdhury ahnaf-tahmid-chowdhury linked an issue Mar 2, 2024 that may be closed by this pull request
@ahnaf-tahmid-chowdhury
Member Author

Is it necessary to create an extra stage like HDF5 and MOAB, given that these are also external dependencies? I think we can remove these and instead update our workflow to create a binary/final stage to store only the binaries with minimal size, and we may choose the tag as dagmc.

Member

@gonuke left a comment


One question about installing things in the Dockerfile...


ARG INSTALL_DIR

COPY --from=dagmc_test ${INSTALL_DIR} ${INSTALL_DIR}
Member


What files are you copying here and why? If they aren't installed by make install then they probably aren't necessary.

Member Author


In this step, I am using an empty image and copying all the installed files from the DAGMC stage to the same locations. As a result, this image contains only the files under /opt/*. There are no extra files or dependencies in the new image, which keeps its size minimal, under 90 MB.
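A minimal sketch of the stage being described, assuming the build stage is named dagmc_test and installs under /opt (the base image and stage names here are assumptions, not necessarily the exact ones in the PR):

```dockerfile
# Hypothetical final stage: a small base image that receives only the
# installed tree from the build stage, leaving sources and toolchains behind.
FROM ubuntu:22.04 AS dagmc

ARG INSTALL_DIR=/opt
COPY --from=dagmc_test ${INSTALL_DIR} ${INSTALL_DIR}
```

Because nothing else is added to this stage, the resulting image carries only the installed artifacts, which is what keeps it under the ~90 MB figure mentioned above.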

Member Author


In this stage we have DAGMC, MOAB, HDF5, Double Down, Embree, and Geant4. However, we could make it even more minimal by copying only the /opt/dagmc folder. I chose to copy the full folder because I am not sure whether the others are runtime dependencies of DAGMC or only needed to build it, since the binaries we have appear to work without them.

Member Author


By applying this step, we can update our docker_publish workflow to add a new tag, dagmc:latest, on PR merge, and dagmc:version/stable on each new release. This could help streamline other projects as well, such as PyNE and NukeLab. We could also update our documentation so that general users can use the Docker version of DAGMC. Let me know if you find the idea useful.
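The tagging scheme proposed here might look like the following in a publish step (registry, image name, and the version tag are hypothetical placeholders, not values from the PR):

```shell
# Hypothetical docker_publish sketch: build only the binary stage, then
# tag it as latest on merge and as a stable/version tag on release.
docker build --target dagmc -t ghcr.io/example/dagmc:latest .
docker tag ghcr.io/example/dagmc:latest ghcr.io/example/dagmc:stable
docker push ghcr.io/example/dagmc:latest
docker push ghcr.io/example/dagmc:stable
```

The `--target` flag stops the multi-stage build at the named stage, so only the binary-only image is published.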

Member Author


Ah! You've already merged this PR; I was waiting for your response. The binary stage won't run until we update the docker_publish file, and there are currently no instructions for it. I think I should create a new PR for that.

@gonuke
Member

gonuke commented Mar 5, 2024

Is it necessary to create an extra stage like HDF5 and MOAB, given that these are also external dependencies? I think we can remove these and instead update our workflow to create a binary/final stage to store only the binaries with minimal size, and we may choose the tag as dagmc.

I think these stages mostly exist for historical reasons, to limit how often images must be rebuilt. The philosophy is that things that occur earlier in the Dockerfile change least often and are least under our control. Thus, installing them early in separate stages can mean fewer rebuilds of those sections of the file. Geant4 is the most resource-intensive to build, so it is one of the main things to avoid rebuilding.
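Under this philosophy, the stage ordering might be sketched as follows (stage names and chaining are illustrative assumptions, not the actual Dockerfile):

```dockerfile
# Hypothetical ordering: least-frequently-changing layers first, so edits
# near the bottom do not invalidate the expensive cached stages above.
FROM ubuntu:22.04 AS base     # system packages: rarely change
FROM base AS hdf5             # external dependency
FROM hdf5 AS moab             # external dependency
FROM moab AS geant4           # most expensive build; keep its cache warm
FROM geant4 AS dagmc_test     # our own code: changes most often
```

A change in the dagmc_test stage reuses every cached stage above it, which is exactly the rebuild-avoidance being described.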

@ahnaf-tahmid-chowdhury
Member Author

I think these stages mostly exist for historical reasons, to limit how often images must be rebuilt. The philosophy is that things that occur earlier in the Dockerfile change least often and are least under our control. Thus, installing them early in separate stages can mean fewer rebuilds of those sections of the file. Geant4 is the most resource-intensive to build, so it is one of the main things to avoid rebuilding.

I understand. We've included these stages to minimize the need to rebuild HDF5 and MOAB whenever we change the Dockerfile. For instance, if we modify the MOAB part, Docker will reuse the cached HDF5 stage and then rebuild MOAB. The GitHub workflow currently keeps a full cache log, so only the steps we change in the Dockerfile are rebuilt.

@gonuke gonuke merged commit 594f7f7 into develop Mar 5, 2024
1 check passed
cd build && \
../hdf5-${HDF5_VERSION}/configure --enable-shared \
--prefix=${hdf5_install_dir} \
../hdf5/configure --enable-shared \
Member


@ahnaf-tahmid-chowdhury - are you familiar with configure and autotools? The tarball that we retrieve with wget has the configure file already generated. If we pull from the repo, we need to do those steps ourselves. Do they support a CMake based build? That might be more generalizable.
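For context on the configure question: a release tarball ships a pre-generated ./configure script, while a git clone does not, so autotools must generate it first. A hypothetical build-from-clone sequence (the tag and install path are assumptions for illustration):

```shell
# Hypothetical: building HDF5 from a git clone instead of a release tarball.
git clone --depth 1 --branch hdf5-1_14_3 https://github.com/HDFGroup/hdf5.git
cd hdf5
./autogen.sh                  # generates ./configure (needs autoconf, automake, libtool)
mkdir build && cd build
../configure --enable-shared --prefix="${hdf5_install_dir}"
make -j"$(nproc)" && make install
```

The extra autogen step is the cost of switching from the tarball to the repository; a CMake-based build, if supported, would avoid it.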

Member Author


Not as much as I know CMake. I think autotools is based on autoconf and is similar to CMake; it handles some things in an easier way than CMake does.

Member Author


I noticed we already install autoconf and libtool, so I haven't tried CMake for this step. I think I can give it a try to simplify things.

Successfully merging this pull request may close these issues:

  - Streamline dependencies of docker CI images
  - System HDF5 on CI