Build and test Docker image using `bioconductor/bioconductor_docker:3.19` and `renv` #3

jaclyn-taroni · 2024-08-04T12:33:29Z

Hi @maud-p! I am filing this today because I expect you might look at it before my workday starts tomorrow. 😊

If you were to merge this, it would update your branch in AlexsLemonade#681

You do not necessarily need to merge this, but we’ve found that sometimes it’s easier to express a set of interconnected ideas with a pull request to a pull request instead of individual comments on files. So, even if we end up picking and choosing, I think a pull request is still helpful!

I’ll summarize some main points here and then add comments to individual lines and files with more explanation when necessary.

Using bioconductor/bioconductor_docker:3.19 as the base image for (🤞) greater platform compatibility
- More of an explanation here (I have since been able to test this!): https://github.com/AlexsLemonade/OpenScPCA-analysis/pull/681/files#r1702065259
Using renv and committing all of the files renv::init() produces to the repository for more flexibility (i.e., one could develop outside of the container just using renv in addition to using the Docker container)
- I had a problem with Bioconductor versions like you (https://github.com/AlexsLemonade/OpenScPCA-analysis/pull/681/files#r1701991128), but once I ran renv::update() and took a snapshot, I was okay: jaclyn-taroni@46c7e8b
- I got started with adding packages that I expect you’ll need from Start wilms analysis AlexsLemonade/OpenScPCA-analysis#681
Removing the files specific to how your lab uses Docker and adding them to .gitignore
- I don’t think we want to add these to the AlexsLemonade version of the repository because we don’t want to make assumptions about podman installation or the YAML parsing.
- However, because this is ignored, you could add these files locally back and still use them for your development.
- Adding a section to the README that explains how to work with the Docker image using docker run instead because it is consistent with the project docs.
  - I have been able to test this on various systems to the best of my ability, but it would be wonderful if you could see if it works for you! 😄
Adding GitHub Action workflows
- I think it’s time to start testing if the Docker image can be built and pushing it to the Elastic Container Registry (.github/workflows/docker_cell-type-wilms-tumor-06.yml)
  - I expect this addition will help speed up subsequent review because we won’t have to focus on whether or not the image builds!
- I also added essentially a stub for testing running your module’s code on test data, but it’s not doing much yet. We can start using it in 01_clustering AlexsLemonade/OpenScPCA-analysis#680.

jaclyn-taroni

Returning some more info 😄

jaclyn-taroni · 2024-08-04T12:34:27Z

.github/workflows/docker_cell-type-wilms-tumor-06.yml

This is just using the template for building and pushing Docker images but tailored to your module: https://github.com/AlexsLemonade/OpenScPCA-analysis/blob/main/templates/workflows/docker_module.yml

jaclyn-taroni · 2024-08-04T12:35:34Z

.github/workflows/docker_cell-type-wilms-tumor-06.yml

+  pull_request:
+    branches:
+      - main
+    paths:
+      - "analyses/cell-type-wilms-tumor-06/Dockerfile"
+      - "analyses/cell-type-wilms-tumor-06/.dockerignore"
+      - "analyses/cell-type-wilms-tumor-06/renv.lock"
+      - "analyses/cell-type-wilms-tumor-06/conda-lock.yml"


I've uncommented these lines, so if relevant environment files are altered in a pull request (to AlexsLemonade:main), it will test if the Docker image can be built. I think we're ready to start testing that!

jaclyn-taroni · 2024-08-04T12:36:33Z

.github/workflows/docker_all-modules.yml

@@ -38,6 +38,7 @@ jobs:
          - simulate-sce
          - cell-type-ewings
          - doublet-detection
+          - cell-type-wilms-tumor-06


We're going to periodically (monthly, to be exact!) test that all of the images can be built. This is adding your Docker image to the list that gets tested.

jaclyn-taroni · 2024-08-04T12:37:05Z

.github/workflows/run_cell-type-wilms-tumor-06.yml

This is using the template: https://github.com/AlexsLemonade/OpenScPCA-analysis/blob/main/templates/workflows/run_renv-module.yml

jaclyn-taroni · 2024-08-04T12:37:56Z

.github/workflows/run_cell-type-wilms-tumor-06.yml

+  # workflow_call:
+  # pull_request:
+  #   branches:
+  #     - main
+  #   paths:
+  #     - "analyses/cell-type-wilms-tumor-06/**"


This won't get used on pull requests to AlexsLemonade:main until we uncomment those lines. It doesn't matter for AlexsLemonade#681, but we'll probably want to uncomment them in AlexsLemonade#680.

jaclyn-taroni · 2024-08-04T12:56:07Z

analyses/cell-type-wilms-tumor-06/components/dependencies.R

+# Tidyverse
+library(tidyverse)


As a substitute for using the bioconductor:tidyverse:3.19 image, we can pin this dependency as described here: https://openscpca.readthedocs.io/en/latest/ensuring-repro/managing-software/using-renv/#pinning-dependencies-that-are-not-captured-automatically

jaclyn-taroni · 2024-08-04T12:57:02Z

analyses/cell-type-wilms-tumor-06/components/dependencies.R

+library(Seurat)  # remotes::install_github("satijalab/[email protected]")
+library(presto)  # remotes::install_github("immunogenomics/presto")
+library(Azimuth)  # remotes::install_github("satijalab/azimuth")
+library(SCpubr)


For this package, I could pin the dependency this way and have it installed via typical renv means.

jaclyn-taroni · 2024-08-04T12:58:22Z

analyses/cell-type-wilms-tumor-06/renv/activate.R

This file is added when you run renv::init(). Including it here means you could use renv outside of Docker if you wanted to! (Same for the renv/.gitignore and renv/settings.json files.)

jaclyn-taroni · 2024-08-04T12:59:31Z

analyses/cell-type-wilms-tumor-06/README.md

+  --mount type=bind,target=/home/rstudio/OpenScPCA-analysis,source=$PWD \
+  -e PASSWORD={PASSWORD} \
+  -p 8787:8787 \
+  public.ecr.aws/openscpca/cell-type-wilms-tumor-06:latest


When you test, you'd want this to be the copy you build locally instead: openscpca/cell-type-wilms-tumor-06 since it won't be on ECR yet!

.github/workflows/run_cell-type-wilms-tumor-06.yml

maud-p · 2024-08-04T22:13:55Z

Dear @jaclyn-taroni ,

Thank you very much for your efforts and help!! I am still in the process of understanding/testing your suggestion but wanted to give you some feedback.

Hi @maud-p! I am filing this today because I expect you might look at it before my workday starts tomorrow. 😊

If you were to merge this, it would update your branch in AlexsLemonade#681

I merged all your changes in maud-p:start-wilms-analysis and take it now as a refenrence!

You do not necessarily need to merge this, but we’ve found that sometimes it’s easier to express a set of interconnected ideas with a pull request to a pull request instead of individual comments on files. So, even if we end up picking and choosing, I think a pull request is still helpful!

I’ll summarize some main points here and then add comments to individual lines and files with more explanation when necessary.

Using bioconductor/bioconductor_docker:3.19 as the base image for (🤞) greater platform compatibility

More of an explanation here (I have since been able to test this!): https://github.com/AlexsLemonade/OpenScPCA-analysis/pull/681/files#r1702065259

Using renv and committing all of the files renv::init() produces to the repository for more flexibility (i.e., one could develop outside of the container just using renv in addition to using the Docker container)

I had a problem with Bioconductor versions like you (https://github.com/AlexsLemonade/OpenScPCA-analysis/pull/681/files#r1701991128), but once I ran renv::update() and took a snapshot, I was okay: jaclyn-taroni@46c7e8b

good to know, thank you!

I got started with adding packages that I expect you’ll need from Start wilms analysis AlexsLemonade/OpenScPCA-analysis#681

Thanks!

Removing the files specific to how your lab uses Docker and adding them to .gitignore

I don’t think we want to add these to the AlexsLemonade version of the repository because we don’t want to make assumptions about podman installation or the YAML parsing.

However, because this is ignored, you could add these files locally back and still use them for your development.

Adding a section to the README that explains how to work with the Docker image using docker run instead because it is consistent with the project docs.

I have been able to test this on various systems to the best of my ability, but it would be wonderful if you could see if it works for you! 😄

I have been able to build the docker image using the following

podman buildx build . -t openscpca/cell-type-wilms-tumor-06:latest --platform linux/amd64

For some permission issues, we have to use podman in our group server, afaik.

However, running the image does not work for me using this

podman run \
  --mount type=bind,target=/home/rstudio/OpenScPCA-analysis,source=$PWD \
  -e PASSWORD=wordpass \
  -p 8787:8787 \
  openscpca/cell-type-wilms-tumor-06:latest

Might be some issue with the mounting path, I need to check. When opening the http://localhost:8787/, I reach the RStudio connection page but cannot connect. On the Terminal, these are the message I get

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01_set_env: executing...
skipping /var/run/s6/container_environment/HOME
skipping /var/run/s6/container_environment/PASSWORD
skipping /var/run/s6/container_environment/RSTUDIO_VERSION
[cont-init.d] 01_set_env: exited 0.
[cont-init.d] 02_userconf: executing...
Assuming the container runs under rootless mode
Under rootless mode,
 - You will log in using 'root' as user
 - You will have root privileges within the container (e.g. apt)
 - The files you create as root on mounted volumes will appear at the host as owned by the user who started the container
 - You can't modify host files you don't have permission to
 - You should NOT run in RUNROOTLESS=true if you are using the container with privileges (e.g. sudo docker run... or sudo podman run...)
setting minimum authorised user to 0 (RUNROOTLESS=true)
deleting the default user (rstudio) since it is not needed.
No sudoers changes needed when running rootless
[cont-init.d] 02_userconf: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.

I can however run the docker image using our podman / config yaml, so I would be able to continue the analysis on the same docker image :)

Adding GitHub Action workflows

I think it’s time to start testing if the Docker image can be built and pushing it to the Elastic Container Registry (.github/workflows/docker_cell-type-wilms-tumor-06.yml)

I'll need to read a bit more about that :)

* I expect this addition will help speed up subsequent review because we won’t have to focus on whether or not the image builds!
I also added essentially a stub for testing running your module’s code on test data, but it’s not doing much yet. We can start using it in 01_clustering AlexsLemonade/OpenScPCA-analysis#680.

I was thinking to close the PR AlexsLemonade#680 (AlexsLemonade#680) and reopen one with the analysis re-ran with the new settings / Docker image. Might be easier to track the few changes on the renv.lock file and READ.ME.

Thank you again for your help setting up all of this! I am super happy/grateful, I learned a lot the past week :) I am looking forward running the analysis :)

maud-p · 2024-08-05T07:54:02Z

Good morning :)
So I figured out that I have to add few user definition lines to the execute command lines to be able to connect to the RStudio session. But this is specific to our server.
The first two lines map the different users that exist in a container to a range of special IDs (reserved for our linux user) on the host system. The third line makes sure that our user's group memberships are preserved for the container users.

--uidmap $uid:0:1 --uidmap 0:1:$uid --uidmap $(($uid+1)):$(($uid+1)):$(($subuidSize-$uid)) \
--gidmap $gid:0:1 --gidmap 0:1:$gid --gidmap $(($gid+1)):$(($gid+1)):$(($subgidSize-$gid)) \
--group-add=keep-groups \

So I can now run the image like this on my system:

podman run \
  --name maudp_ScPCA_wilms \
  -e RUNROOTLESS=false \
  --uidmap $uid:0:1 --uidmap 0:1:$uid --uidmap $(($uid+1)):$(($uid+1)):$(($subuidSize-$uid)) \
  --gidmap $gid:0:1 --gidmap 0:1:$gid --gidmap $(($gid+1)):$(($gid+1)):$(($subgidSize-$gid)) \
  --group-add=keep-groups \
  --volume=$PWD/:/home/rstudio \
  -e PASSWORD=wordpass \
  -p 8787:8787 \
  -e TZ=Europe/Vienna \
  openscpca/cell-type-wilms-tumor-06:latest

would it be OK to continue with this for the analysis?
alternative would be for me to work on my virtual machine that allow using docker, but my VM has less memory and is not as fast.

jaclyn-taroni added 15 commits August 3, 2024 11:53

Remove lock file (potentially temporarily!)

9a2f1cf

Base image with RStudio that's compatible with ARM + renv

6862fcc

Commit output from renv::init()

83a6fe5

Add an empty dependencies.R file

c4ec00a

Add tidyverse to dependencies

4973f16

Update lockfile after library(tidyverse)

230bf62

Uncomment renv steps from Dockerfile

b4eebec

Snapshot with some single-cell packages

c4e64c1

renv::update()

46c7e8b

Stub in run workflow; don't uncomment on PR

ebd2bf1

Add Docker build and push for Wilms cell typing module

41bf440

Add to list of modules with Docker images

afaff45

Document experience with Docker and renv

d3bce0e

Remove files specific to Halbritter lab setup and stop tracking

dd17e27

Add newline

650df7b

jaclyn-taroni commented Aug 4, 2024

View reviewed changes

maud-p approved these changes Aug 4, 2024

View reviewed changes

maud-p merged commit 47a84bc into maud-p:start-wilms-analysis Aug 4, 2024

jaclyn-taroni mentioned this pull request Aug 5, 2024

Start wilms analysis AlexsLemonade/OpenScPCA-analysis#681

Merged

7 tasks

jaclyn-taroni deleted the jaclyn-taroni/wilms-06-build-test branch August 5, 2024 12:01

jaclyn-taroni mentioned this pull request Aug 5, 2024

Add push trigger to Wilms tumor cell typing Docker build/push workflow & proposed change to template AlexsLemonade/OpenScPCA-analysis#688

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Build and test Docker image using `bioconductor/bioconductor_docker:3.19` and `renv` #3

Build and test Docker image using `bioconductor/bioconductor_docker:3.19` and `renv` #3

jaclyn-taroni commented Aug 4, 2024

jaclyn-taroni left a comment

jaclyn-taroni Aug 4, 2024

jaclyn-taroni Aug 4, 2024

jaclyn-taroni Aug 4, 2024

jaclyn-taroni Aug 4, 2024

jaclyn-taroni Aug 4, 2024

jaclyn-taroni Aug 4, 2024

jaclyn-taroni Aug 4, 2024

jaclyn-taroni Aug 4, 2024

jaclyn-taroni Aug 4, 2024

maud-p commented Aug 4, 2024

maud-p commented Aug 5, 2024

		# Tidyverse
		library(tidyverse)

Build and test Docker image using bioconductor/bioconductor_docker:3.19 and renv #3

Build and test Docker image using bioconductor/bioconductor_docker:3.19 and renv #3

Conversation

jaclyn-taroni commented Aug 4, 2024

jaclyn-taroni left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

maud-p commented Aug 4, 2024

maud-p commented Aug 5, 2024

Build and test Docker image using `bioconductor/bioconductor_docker:3.19` and `renv` #3

Build and test Docker image using `bioconductor/bioconductor_docker:3.19` and `renv` #3