Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build and test Docker image using bioconductor/bioconductor_docker:3.19 and renv #3

Conversation

jaclyn-taroni
Copy link

Hi @maud-p! I am filing this today because I expect you might look at it before my workday starts tomorrow. 😊

If you were to merge this, it would update your branch in AlexsLemonade#681

You do not necessarily need to merge this, but we’ve found that sometimes it’s easier to express a set of interconnected ideas with a pull request to a pull request instead of individual comments on files. So, even if we end up picking and choosing, I think a pull request is still helpful!

I’ll summarize some main points here and then add comments to individual lines and files with more explanation when necessary.

  • Using bioconductor/bioconductor_docker:3.19 as the base image for (🤞) greater platform compatibility
  • Using renv and committing all of the files renv::init() produces to the repository for more flexibility (i.e., one could develop outside of the container just using renv in addition to using the Docker container)
  • Removing the files specific to how your lab uses Docker and adding them to .gitignore
    • I don’t think we want to add these to the AlexsLemonade version of the repository because we don’t want to make assumptions about podman installation or the YAML parsing.
    • However, because this is ignored, you could add these files locally back and still use them for your development.
    • Adding a section to the README that explains how to work with the Docker image using docker run instead because it is consistent with the project docs.
      • I have been able to test this on various systems to the best of my ability, but it would be wonderful if you could see if it works for you! 😄
  • Adding GitHub Action workflows
    • I think it’s time to start testing if the Docker image can be built and pushing it to the Elastic Container Registry (.github/workflows/docker_cell-type-wilms-tumor-06.yml)
      • I expect this addition will help speed up subsequent review because we won’t have to focus on whether or not the image builds!
    • I also added essentially a stub for testing running your module’s code on test data, but it’s not doing much yet. We can start using it in 01_clustering AlexsLemonade/OpenScPCA-analysis#680.

Copy link
Author

@jaclyn-taroni jaclyn-taroni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Returning some more info 😄

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just using the template for building and pushing Docker images but tailored to your module: https://github.com/AlexsLemonade/OpenScPCA-analysis/blob/main/templates/workflows/docker_module.yml

Comment on lines +16 to +23
pull_request:
branches:
- main
paths:
- "analyses/cell-type-wilms-tumor-06/Dockerfile"
- "analyses/cell-type-wilms-tumor-06/.dockerignore"
- "analyses/cell-type-wilms-tumor-06/renv.lock"
- "analyses/cell-type-wilms-tumor-06/conda-lock.yml"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've uncommented these lines, so if relevant environment files are altered in a pull request (to AlexsLemonade:main), it will test if the Docker image can be built. I think we're ready to start testing that!

@@ -38,6 +38,7 @@ jobs:
- simulate-sce
- cell-type-ewings
- doublet-detection
- cell-type-wilms-tumor-06
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're going to periodically (monthly, to be exact!) test that all of the images can be built. This is adding your Docker image to the list that gets tested.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +22 to +27
# workflow_call:
# pull_request:
# branches:
# - main
# paths:
# - "analyses/cell-type-wilms-tumor-06/**"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't get used on pull requests to AlexsLemonade:main until we uncomment those lines. It doesn't matter for AlexsLemonade#681, but we'll probably want to uncomment them in AlexsLemonade#680.

Comment on lines +1 to +2
# Tidyverse
library(tidyverse)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As a substitute for using the bioconductor:tidyverse:3.19 image, we can pin this dependency as described here: https://openscpca.readthedocs.io/en/latest/ensuring-repro/managing-software/using-renv/#pinning-dependencies-that-are-not-captured-automatically

library(Seurat) # remotes::install_github("satijalab/[email protected]")
library(presto) # remotes::install_github("immunogenomics/presto")
library(Azimuth) # remotes::install_github("satijalab/azimuth")
library(SCpubr)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this package, I could pin the dependency this way and have it installed via typical renv means.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is added when you run renv::init(). Including it here means you could use renv outside of Docker if you wanted to! (Same for the renv/.gitignore and renv/settings.json files.)

--mount type=bind,target=/home/rstudio/OpenScPCA-analysis,source=$PWD \
-e PASSWORD={PASSWORD} \
-p 8787:8787 \
public.ecr.aws/openscpca/cell-type-wilms-tumor-06:latest
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you test, you'd want this to be the copy you build locally instead: openscpca/cell-type-wilms-tumor-06 since it won't be on ECR yet!

@maud-p maud-p merged commit 47a84bc into maud-p:start-wilms-analysis Aug 4, 2024
@maud-p
Copy link
Owner

maud-p commented Aug 4, 2024

Dear @jaclyn-taroni ,

Thank you very much for your efforts and help!! I am still in the process of understanding/testing your suggestion but wanted to give you some feedback.

Hi @maud-p! I am filing this today because I expect you might look at it before my workday starts tomorrow. 😊

If you were to merge this, it would update your branch in AlexsLemonade#681

I merged all your changes in maud-p:start-wilms-analysis and take it now as a refenrence!

You do not necessarily need to merge this, but we’ve found that sometimes it’s easier to express a set of interconnected ideas with a pull request to a pull request instead of individual comments on files. So, even if we end up picking and choosing, I think a pull request is still helpful!

I’ll summarize some main points here and then add comments to individual lines and files with more explanation when necessary.

good to know, thank you!

Thanks!

  • Removing the files specific to how your lab uses Docker and adding them to .gitignore

    • I don’t think we want to add these to the AlexsLemonade version of the repository because we don’t want to make assumptions about podman installation or the YAML parsing.

    • However, because this is ignored, you could add these files locally back and still use them for your development.

    • Adding a section to the README that explains how to work with the Docker image using docker run instead because it is consistent with the project docs.

      • I have been able to test this on various systems to the best of my ability, but it would be wonderful if you could see if it works for you! 😄

I have been able to build the docker image using the following

podman buildx build . -t openscpca/cell-type-wilms-tumor-06:latest --platform linux/amd64

For some permission issues, we have to use podman in our group server, afaik.

However, running the image does not work for me using this

podman run \
  --mount type=bind,target=/home/rstudio/OpenScPCA-analysis,source=$PWD \
  -e PASSWORD=wordpass \
  -p 8787:8787 \
  openscpca/cell-type-wilms-tumor-06:latest

Might be some issue with the mounting path, I need to check. When opening the http://localhost:8787/, I reach the RStudio connection page but cannot connect. On the Terminal, these are the message I get

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...exited 0.
[fix-attrs.d] applying ownership & permissions fixes...
[fix-attrs.d] done.
[cont-init.d] executing container initialization scripts...
[cont-init.d] 01_set_env: executing...
skipping /var/run/s6/container_environment/HOME
skipping /var/run/s6/container_environment/PASSWORD
skipping /var/run/s6/container_environment/RSTUDIO_VERSION
[cont-init.d] 01_set_env: exited 0.
[cont-init.d] 02_userconf: executing...
Assuming the container runs under rootless mode
Under rootless mode,
 - You will log in using 'root' as user
 - You will have root privileges within the container (e.g. apt)
 - The files you create as root on mounted volumes will appear at the host as owned by the user who started the container
 - You can't modify host files you don't have permission to
 - You should NOT run in RUNROOTLESS=true if you are using the container with privileges (e.g. sudo docker run... or sudo podman run...)
setting minimum authorised user to 0 (RUNROOTLESS=true)
deleting the default user (rstudio) since it is not needed.
No sudoers changes needed when running rootless
[cont-init.d] 02_userconf: exited 0.
[cont-init.d] done.
[services.d] starting services
[services.d] done.
[cont-finish.d] executing container finish scripts...
[cont-finish.d] done.

I can however run the docker image using our podman / config yaml, so I would be able to continue the analysis on the same docker image :)

  • Adding GitHub Action workflows

    • I think it’s time to start testing if the Docker image can be built and pushing it to the Elastic Container Registry (.github/workflows/docker_cell-type-wilms-tumor-06.yml)

I'll need to read a bit more about that :)

* I expect this addition will help speed up subsequent review because we won’t have to focus on whether or not the image builds!

I was thinking to close the PR AlexsLemonade#680 (AlexsLemonade#680) and reopen one with the analysis re-ran with the new settings / Docker image. Might be easier to track the few changes on the renv.lock file and READ.ME.

Thank you again for your help setting up all of this! I am super happy/grateful, I learned a lot the past week :) I am looking forward running the analysis :)

@maud-p
Copy link
Owner

maud-p commented Aug 5, 2024

Good morning :)
So I figured out that I have to add few user definition lines to the execute command lines to be able to connect to the RStudio session. But this is specific to our server.
The first two lines map the different users that exist in a container to a range of special IDs (reserved for our linux user) on the host system. The third line makes sure that our user's group memberships are preserved for the container users.

--uidmap $uid:0:1 --uidmap 0:1:$uid --uidmap $(($uid+1)):$(($uid+1)):$(($subuidSize-$uid)) \
--gidmap $gid:0:1 --gidmap 0:1:$gid --gidmap $(($gid+1)):$(($gid+1)):$(($subgidSize-$gid)) \
--group-add=keep-groups \

So I can now run the image like this on my system:

podman run \
  --name maudp_ScPCA_wilms \
  -e RUNROOTLESS=false \
  --uidmap $uid:0:1 --uidmap 0:1:$uid --uidmap $(($uid+1)):$(($uid+1)):$(($subuidSize-$uid)) \
  --gidmap $gid:0:1 --gidmap 0:1:$gid --gidmap $(($gid+1)):$(($gid+1)):$(($subgidSize-$gid)) \
  --group-add=keep-groups \
  --volume=$PWD/:/home/rstudio \
  -e PASSWORD=wordpass \
  -p 8787:8787 \
  -e TZ=Europe/Vienna \
  openscpca/cell-type-wilms-tumor-06:latest

would it be OK to continue with this for the analysis?
alternative would be for me to work on my virtual machine that allow using docker, but my VM has less memory and is not as fast.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants