From 6863704eefea2ea141dadc3f9242f876104b8570 Mon Sep 17 00:00:00 2001 From: Clemens Tolboom Date: Thu, 5 Oct 2023 14:11:30 +0200 Subject: [PATCH 1/4] Initial restructure by role. --- README.md | 420 +--------------------------------------- docs/README.md | 14 +- docs/_sidebar.md | 28 ++- docs/faq.md | 105 ++++++++++ docs/ops/configuring.md | 28 +++ docs/ops/installing.md | 56 ++++++ docs/upgrade-2-3.md | 175 +++++++++++++++++ 7 files changed, 392 insertions(+), 434 deletions(-) create mode 100644 docs/faq.md create mode 100644 docs/ops/configuring.md create mode 100644 docs/ops/installing.md create mode 100644 docs/upgrade-2-3.md diff --git a/README.md b/README.md index 3e7c82377..6b0ff7886 100644 --- a/README.md +++ b/README.md @@ -18,422 +18,8 @@ service provides the following features: ![DataSHIELD overview](https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/docs/img/overview-datashield.png) -# Armadillo installation -Armadillo requires Java to run, Docker to access the DataSHIELD profiles, and OIDC for authentication (not needed for local tests). Below instructions how to run Armadillo directly from Java, as a Docker container, as a service on Ubuntu or from source code. -Note that for production you should add a https proxy for essential security. And you might need to enable 'Docker socket' on your docker service. +## Getting started -### Run Armadillo using java commandline -Software developers often run Armadillo as java jar file: +For installing and using Armadillo see https://molgenis.github.io/molgenis-service-armadillo/#/ -1. Install Java and Docker (for the DataSHIELD profiles) -2. Download Armadillo jar file from [releases](https://github.com/molgenis/molgenis-service-armadillo/releases), for example: -[molgenis-armadillo-3.3.0.jar](https://github.com/molgenis/molgenis-service-armadillo/releases/download/V3.3.0/) -3. Run armadillo using ```java -jar molgenis-armadillo-3.3.0.jar``` -4. 
Go to http://localhost:8080 to see your Armadillo running. - -Default Armadillo will start with only 'basic-auth' and user 'admin' with password 'admin'. You can enable 'oidc' for connecting more users. You can change -by providing and editing [application.yaml](application.template.yml) file -in your working directory and then run command above again. - -### Run Armadillo via docker compose -For testing without having to installing Java you can run using docker: - -1. Install [docker-compose](https://docs.docker.com/compose/install/) -2. Download this [docker-compose.yml](docker-compose.yml). -3. Execute ```docker-compose up``` -4. Once it says 'Started' go to http://localhost:8080 to see your Armadillo running. - -The command must run in same directory as downloaded docker file. We made docker available via 'docker.sock' so we can start/stop DataSHIELD profiles. Alternatively you must include the datashield profiles into this docker-compose. You can override all application.yaml settings via environment variables -(see commented code in docker-compose file). - -### Run Armadillo as service on Ubuntu -We run Armadillo in production as a Linux service on Ubuntu, ensuring it gets restarted when the server is rebooted. You might be able to reproduce also on -CentOS (using yum instead of apt). - -#### 1. Install necessary software -``` -apt update -apt install openjdk-19-jre-headless -apt install docker.io -``` -Note: you might need 'sudo' - -#### 2. 
Run installation script -This step will install most recent [release](https://github.com/molgenis/molgenis-service-armadillo/releases): -``` -wget https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/scripts/install/armadillo-setup.sh -bash armadillo-setup.sh \ - --admin-user admin \ - --admin-password xxxxx - --domain my.server.com \ - --oidc \ - --oidc_url https://lifecycle-auth.molgenis.org \ - --oidc_clientid clientid \ - --oidc_clientsecret secret \ - --cleanup -``` -Note: adapt install command to suit your situation. Use --help to see the options. https://lifecycle-auth.molgenis.org is MOLGENIS provided OIDC service but -you can also use your own, see FAQ below. - -### Running Armadillo from source code - -You can run from source code as follows: -1. Install Java and Docker -2. Checkout the source using ```git clone https://github.com/molgenis/molgenis-service-armadillo.git``` -3. Optionally copy ```application.template.yml``` to ```application.yml``` to change settings (will be .gitignored) -4. Compile and execute the code using ```./gradlew run``` - -Note: contact MOLGENIS team if you want to contribute and need a testing OIDC config that you can run against localhost. - -# Using Armadillo -Armadillo has three main screens to manage projects, user access and DataSHIELD profiles: - -### Create data access projects -Data stewards can use the Armadillo web user interface or [MolgenisArmadillo R client](https://molgenis.github.io/molgenis-r-armadillo) -to create 'projects' and upload their data into those. Data tables need to be in parquet format that supports fast selections of the columns -(variables) you need for analysis. Other files can be configured as 'resources'. - -### Manage user access -Data stewards can use the permission screen to give email adresses access to the data. 
Everybody signs in via single sign on using an OIDC central -authentication server such as KeyCloack or Fusion auth that federates to authentication systems of -connected institutions, ideally using a federated AAI such as LifeScience AAI. - -### Configure DataSHIELD profiles -To analyse data, users must choose a datashield profile. Armadillo owners can use the web user interface to configure new profiles. Assuming you -installed docker you can also start/stop these images. Alternatively you can use docker-compose for that. - -There are DataSHIELD packages for [standard statistical analysis](https://github.com/datashield/dsBaseClient) -, [exposome studies](https://github.com/isglobal-brge/dsExposomeClient) -, [survival studies](https://github.com/neelsoumya/dsSurvivalClient) -, [microbiome studies](https://github.com/StuartWheater/dsMicrobiomeClient) -and [analysis tools for studies that are using large genetic datasets](https://github.com/isglobal-brge/dsomicsclient) -. These packages can all be installed in the Armadillo suite. - -### End users can use Armadillo as any other DataSHIELD server -A researcher connects from an [R client](https://molgenis.github.io/molgenis-r-datashield) to one or multiple Armadillo servers. The data is -loaded into an R session on the Armadillo server specifically created for the researcher. Analysis requests are sent to the R session on each Armadillo server. -There the analysis is performed and aggregated results are sent back to the client. - -# Developing Armadillo - -We use gradle to build: -* run using ```./gradlew run``` -* run tests using ```./gradlew test``` - -We use intellij to develop -* To run or debug in intellij, right click on armadillo/src/main/java/org.molgenis.armdadillo/ArmadilloServiceAppliction and choose 'Run/Debug Armadillo...' 
-* To run using oidc, create a copy of [application.yml](application.template.yml) in root of your project - -We have a swagger-ui to quickly see and test available web services at http://localhost:8080/swagger-ui/ - -# Developing DataSHIELD packages in Armadillo -As package developer will want to push your new packages into a DataSHIELD profile - -* You can start Armadillo with defaults as described above; then use admin/admin as authentication -* to see what profile are available and has been selected: -``` -curl -u admin:admin http://localhost:8080/profiles -``` -* to change selected profile 'my-profile': -``` -curl -X POST http://localhost:8080/select-profile \ - -H 'Content-Type: application/json' \ - -d 'default' -``` -* to install-packages in DataSHIELD current using admin user: -``` -curl -u admin:admin -v \ --H 'Content-Type: multipart/form-data' \ --F "file=@dsBase_6.3.0.tar.gz" \ --X POST http://localhost:8080/install-package -``` -* to update whitelist of your current profile: -``` -curl -u admin:admin -X POST http://localhost:8080/whitelist/dsBase -``` -* to get whitelist of current profile: -``` -curl -u admin:admin http://localhost:8080/whitelist -``` - -# Frequently asked questions - -### Docker gives a 'java.socket' error -You might need to enable Docker socket. On Docker desktop you can find that under 'settings' and 'advanced'. - -### Can I use docker compose to start profiles? -Instead of making Armadillo start/stop DataSHIELD profiles you can also use docker compose. -See commented section in docker-compose.yml file. - -### Can I pass environment or commandline variables instead of application.yml? - -Yes, it is standard spring. - -### Can I run Armadillo with oauth2 config offline? -Yes, you can run in 'offline' profile -``` -./gradlew run -Dspring.profiles.active=offline -``` - -### How to import data from Armadillo 2? -To export data from and Armadillo 2 server take the following steps: - -#### 1. 
Check if there's enough space left on the server -``` -du -h -``` -Compare to: -``` -du -h /var/lib/minio -``` -Available space should be at least twice the size of the MinIO folder. - -#### 2. Backup Armadillo 2 settings -``` -mkdir armadillo2-backup -rsync -avr /usr/share/armadillo armadillo2-backup -cp /etc/armadillo/application.yml armadillo2-backup/application-armadillo2.yml -``` -N.B.change /usr/share to path matching your local config. - -#### 3. Install helper software -Login to your server as root, using ssh. -``` -apt update -apt install pip -pip install minio -pip install fusionauth-client -pip install simple_term_menu -``` -If you get a purple message asking to update, accept and install everything. -Restart of server is recommended after this. - -N.B. Note that the commands in this manual are for Ubuntu, on other linux systems, -the `apt` command needs to be replaced with another one. - -#### 4. Stop all docker images for Armadillo 2 -List all docker images -```docker ps -a``` - -Stop and remove all Armadillo 2 related images (except for MinIO), e.g. -``` -docker rm armadillo_auth_1 armadillo_console_1 armadillo_rserver-default_1 armadillo_rserver-mediation_1 armadillo_rserver-exposome_1 armadillo_rserver-omics_1 armadillo_armadillo_1 -f -``` -Check with `docker ps -a` if there are still containers running, if so remove these (except for the MinIO) in the same way as the others. - -#### 5. 
Install armadillo -``` -apt update -apt install openjdk-19-jre-headless -apt install docker.io -``` -The docker.io step might fail because containerd already exists, if that's the case, remove containerd and try again: -``` -apt remove containerd -apt install docker.io -``` - -Get armadillo: -``` -wget https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/scripts/install/armadillo-setup.sh -bash armadillo-setup.sh \ - --admin-user admin \ - --admin-password xxxxx - --domain my.server.com \ - --oidc \ - --oidc_url https://lifecycle-auth.molgenis.org \ - --oidc_clientid clientid \ - --oidc_clientsecret secret \ - --cleanup \ -``` -Don't forget to set a proper admin password (use a generator), domain, clientid and clientsecret. The client id and -secret can be found on the lifecycle auth server in the configuration for your server. If you don't have permissions to -receive this, you can ask the support team to get it for you. - -Open armadillo in the browser and try to login using basicauth to check if the server is running properly. If it's not -running at all, try: -``` -systemctl start armadillo -``` - -#### 6. Export data from Armadillo 2 into armadillo 3 -Look up the user/password in the application.yml of the old armadillo. They're called MinIO access key and minio -secret key. -``` -cat /root/armadillo2-backup/application-armadillo2.yml -``` -Do the following step in a separate screen. On ubuntu use: -``` -screen -``` -Navigate to the armadillo folder: -``` -cd /usr/share/armadillo -``` -This step will copy Armadillo 2 data from minio into the folder matching of an Armadillo 3 data folder: -``` -mkdir data -wget https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/scripts/migrate-minio.py -python3 migrate-minio.py --minio http://localhost:9000 --target /usr/share/armadillo/data -``` - -This might take a couple of minutes. You can detach the screen using `ctrl+a` followed by `d` and reattach it using -`screen -r`. 
- -#### 7. Run Armadillo 3 using exported data -Make sure to move the exported data into the new 'data' folder. Optionally you might need to fix user permissions, e.g.: -``` -chown armadillo:armadillo -R data -``` -Check if armadillo is running by going to the URL of your server in the browser, login and navigate to the projects tab. - -#### 8. Optionally, acquire a permission set from MOLGENIS team -If you previously run central authorisation server with MOLGENIS team, they can provide you with procedure to load -pre-existing permissions. They will use: -``` -wget https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/scripts/migrate-auth.py -python3 migrate-auth.py --fusion-auth https://lifecycle-auth.molgenis.org --armadillo https://thearmadillourl.net -``` -Now check if all users and data are properly migrated. - -#### 9. Cleanup ngnix config - -Change `/etc/ngninx/sites-available/armadillo.conf` to: -``` -server { - listen 80; - server_name urlofyourserver.org - include /etc/nginx/global.d/*.conf; - location / { - proxy_pass http://localhost:8080; - client_max_body_size 0; - proxy_read_timeout 600s; - proxy_redirect http://localhost:8080/ $scheme://$host/; - proxy_set_header Host $host; - proxy_http_version 1.1; - } -} -``` -Note that the `https://` is missing in the server_name part. - -Remove the console and storage file from: `/etc/ngninx/sites-enabled/`. - -``` -system restart ngninx -``` - -#### 10. Fix application.yml -Make sure the following is added: -``` -server: -forward-headers-strategy: framework -``` - -#### 11. Fix URLs in the lifecycle FusionAuth -Add the following to the config of your server: -`https://yourserver.com/login/oauth2/code/molgenis` - -#### 12. Set up profiles -Login to armadillo in the browser. Navigate to the "Profiles" tab. 
Add a new profile with the following properties: - -Name: `xenon` -Image: `datashield/armadillo-rserver_caravan-xenon:latest` -Package whitelist: `dsBase`, `resourcer`, `dsMediation`, `dsMTLBase`, `dsSurvival`, `dsExposome` - -Assign a random 9-number seed and create and start the container. - -#### 13. Remove old MinIO data -First remove the MinIO docker container. First check the name of the container using `docker ps -a`, then: -``` -docker rm containername -f -``` -After that remove the data: -```rm -Rf /var/lib/minio/ ``` - - -### How to run previous armadillo 2? - -For armadillo 2.x you can follow instructions at -* for testing we use docker compose at https://github.com/molgenis/molgenis-service-armadillo/tree/armadillo-service-2.2.3 -* for production we are using Ansible at https://galaxy.ansible.com/molgenis/armadillo` - -### How to run Armadillo as developer? - -We develop Armadillo using IntelliJ. - -#### To build Armadillo -To build run following command in the github root: -```./gradlew build``` - -To execute in 'dev' run following command in the github root: -```./gradlew run``` - -#### Setting up development tools - -This repository uses `pre-commit` to manage commit hooks. An installation guide can be found -[here](https://pre-commit.com/index.html#1-install-pre-commit). To install the hooks, run `pre-commit install` once in the root folder of this repository. Now -your code will be automatically formatted whenever you commit. - -#### How to change data directory - -Data is automatically stored in the `data` folder in this repository. You can choose another location -in `application.yml` by changing the `storage.root-dir` -setting. - -> **_Note_**: When you run Armadillo locally for the first time, the `lifecycle` project has not been -> added to the system metadata yet. To add it automatically, see [Application properties](#application-properties). 
-> Or you can add it manually: -> - Go to the Swagger UI (`http://localhost:8080/swagger-ui/index.html`) -> - Go to the `PUT /access/projects` endpoint -> - Add the project `lifecycle` -> -> Now you're all set! - -#### Working with resources in development mode -When developing locally, docker has trouble connecting to localhost. This problem becomes clear when working with -resources. Luckily there's a quick fix for the problem. Instead of defining a resource as for example -`http://localhost:8080/storage/projects/omics/objects/test%2Fgse66351_1.rda`, rewrite it to: -`http://host.docker.internal:8080/storage/projects/omics/objects/test%2Fgse66351_1.rda`. Here's some example R code -for uploading resources: -```R -## Uploading resources -library(MolgenisArmadillo) -library(resourcer) - -token <- armadillo.get_token("http://localhost:8080/") - -resGSE1 <- resourcer::newResource( - name = "GSE66351_1", - secret = token, - url = "http://host.docker.internal:8080/storage/projects/omics/objects/test%2Fgse66351_1.rda", - format = "ExpressionSet" -) - -armadillo.login("http://localhost:8080/") -armadillo.upload_resource(project="omics", folder="ewas", resource = resGSE1, name = "GSE66351_1") -``` -And for using them: -```R -library(DSMolgenisArmadillo) -library(dsBaseClient) - -token <- armadillo.get_token("http://localhost:8080/") - -builder <- DSI::newDSLoginBuilder() -builder$append( - server = "local", - url = "http://localhost:8080/", - token = token, - driver = "ArmadilloDriver", - profile = "uniform", - resource = "omics/ewas/GSE66351_1" -) - -login_data <- builder$build() -conns <- DSI::datashield.login(logins = login_data, assign = TRUE) - -datashield.resources(conns = conns) -datashield.assign.resource(conns, resource="omics/ewas/GSE66351_1", symbol="eSet_0y_EUR") -ds.class('eSet_0y_EUR', datasources = conns) -datashield.assign.expr(conns, symbol = "methy_0y_EUR",expr = quote(as.resource.object(eSet_0y_EUR))) -``` +For developing and contributing see 
[Contributing](./CONTRIBUTING.md). diff --git a/docs/README.md b/docs/README.md index 621b105dc..c6d0b840a 100644 --- a/docs/README.md +++ b/docs/README.md @@ -1 +1,13 @@ -See /README.md \ No newline at end of file +# Overview + +Use MOLGENIS/Armadillo to make data available for privacy-protecting federated analysis using the [DataSHIELD](https://datashield.org) protocol. The Armadillo +service provides the following features: +* **manage data projects**. Projects can hold either tabular data in the efficient 'parquet' format or any other file via the DataSHIELD + 'resources' framework. +* **grant users access permission**. We use a central OIDC service like Keycloak or FusionAuth in combination with a trusted identity provider like + Life Sciences AAI to authenticate users. +* **configure DataSHIELD analysis profiles**. [DataSHIELD analysis profiles](https://www.datashield.org/help/standard-profiles-and-plaforms) are + Docker images that contain a collection of multiple [DataSHIELD analysis packages](https://www.datashield.org/help/community-packages). 
+ +![DataSHIELD overview](https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/docs/img/overview-datashield.png) + diff --git a/docs/_sidebar.md b/docs/_sidebar.md index 0a5ab0aba..9870f3a3a 100644 --- a/docs/_sidebar.md +++ b/docs/_sidebar.md @@ -1,17 +1,13 @@ - [Armadillo suite](/ "Armadillo suite") - -- [Armadillo 3 UI](/ui.md#armadillo-user-interface "Armadillo 3 UI") - -- [DsMolgenisArmadillo](https://github.com/molgenis/molgenis-r-datashield/blob/master/README.md) - -- [MolgenisArmadillo](https://molgenis.github.io/molgenis-r-armadillo/) - -- [molgenis-r-auth](https://molgenis.github.io/molgenis-r-auth/) - -- [molgenis-r-client](https://github.com/molgenis/molgenis-r-client/blob/master/README.md) - -- [ds-upload](https://lifecycle-project.github.io/ds-upload/) - -- [ds-dictionaries](https://github.com/lifecycle-project/ds-dictionaries/blob/master/README.md) - -- [Armadillo 2 to 3 migration guide](/migration-guide-2-to-3.md#instructions-for-migrating-a-2x-armadillo-to-3x "Armadillo 2 to 3 migration guide") \ No newline at end of file + - [Install](/ops/installing.md) + - [Configure](/ops/configuring.md) + - [UI](/ui.md#armadillo-user-interface "Armadillo 3 UI") + - [FAQ](/faq.md) + - [Armadillo 2 to 3 migration guide](/migration-guide-2-to-3.md#instructions-for-migrating-a-2x-armadillo-to-3x "Armadillo 2 to 3 migration guide") +- External sites + - [DsMolgenisArmadillo](https://github.com/molgenis/molgenis-r-datashield/blob/master/README.md) + - [MolgenisArmadillo](https://molgenis.github.io/molgenis-r-armadillo/) + - [molgenis-r-auth](https://molgenis.github.io/molgenis-r-auth/) + - [molgenis-r-client](https://github.com/molgenis/molgenis-r-client/blob/master/README.md) + - [ds-upload](https://lifecycle-project.github.io/ds-upload/) + - [ds-dictionaries](https://github.com/lifecycle-project/ds-dictionaries/blob/master/README.md) diff --git a/docs/faq.md b/docs/faq.md new file mode 100644 index 000000000..bcde8f572 --- /dev/null +++ 
b/docs/faq.md @@ -0,0 +1,105 @@ +# Frequently asked questions + +## Docker gives a 'java.socket' error +You might need to enable the Docker socket. In Docker Desktop you can find that under 'settings' and 'advanced'. + +## Can I use docker compose to start profiles? +Instead of letting Armadillo start/stop DataSHIELD profiles you can also use docker compose. +See the commented section in the docker-compose.yml file. + +## Can I pass environment or commandline variables instead of application.yml? + +Yes, this is standard Spring behaviour. + +## Can I run Armadillo with oauth2 config offline? +Yes, you can run with the 'offline' profile: +``` +./gradlew run -Dspring.profiles.active=offline +``` + +## How to run previous armadillo 2? + +For Armadillo 2.x you can follow the instructions at: +* for testing we use docker compose at https://github.com/molgenis/molgenis-service-armadillo/tree/armadillo-service-2.2.3 +* for production we use Ansible at https://galaxy.ansible.com/molgenis/armadillo + +## How to run Armadillo as developer? + +We develop Armadillo using IntelliJ. + +### To build Armadillo +To build, run the following command in the repository root: +```./gradlew build``` + +To execute in 'dev', run the following command in the repository root: +```./gradlew run``` + +### Setting up development tools + +This repository uses `pre-commit` to manage commit hooks. An installation guide can be found +[here](https://pre-commit.com/index.html#1-install-pre-commit). To install the hooks, run `pre-commit install` once in the root folder of this repository. Now +your code will be automatically formatted whenever you commit. + +### How to change data directory + +Data is automatically stored in the `data` folder in this repository. You can choose another location +in `application.yml` by changing the `storage.root-dir` +setting. + +> **_Note_**: When you run Armadillo locally for the first time, the `lifecycle` project has not been +> added to the system metadata yet. 
To add it automatically, see [Application properties](#application-properties). +> Or you can add it manually: +> - Go to the Swagger UI (`http://localhost:8080/swagger-ui/index.html`) +> - Go to the `PUT /access/projects` endpoint +> - Add the project `lifecycle` +> +> Now you're all set! + +### Working with resources in development mode +When developing locally, docker has trouble connecting to localhost. This problem becomes clear when working with +resources. Luckily there's a quick fix for the problem. Instead of defining a resource as for example +`http://localhost:8080/storage/projects/omics/objects/test%2Fgse66351_1.rda`, rewrite it to: +`http://host.docker.internal:8080/storage/projects/omics/objects/test%2Fgse66351_1.rda`. Here's some example R code +for uploading resources: +```R +## Uploading resources +library(MolgenisArmadillo) +library(resourcer) + +token <- armadillo.get_token("http://localhost:8080/") + +resGSE1 <- resourcer::newResource( + name = "GSE66351_1", + secret = token, + url = "http://host.docker.internal:8080/storage/projects/omics/objects/test%2Fgse66351_1.rda", + format = "ExpressionSet" +) + +armadillo.login("http://localhost:8080/") +armadillo.upload_resource(project="omics", folder="ewas", resource = resGSE1, name = "GSE66351_1") +``` +And for using them: +```R +library(DSMolgenisArmadillo) +library(dsBaseClient) + +token <- armadillo.get_token("http://localhost:8080/") + +builder <- DSI::newDSLoginBuilder() +builder$append( + server = "local", + url = "http://localhost:8080/", + token = token, + driver = "ArmadilloDriver", + profile = "uniform", + resource = "omics/ewas/GSE66351_1" +) + +login_data <- builder$build() +conns <- DSI::datashield.login(logins = login_data, assign = TRUE) + +datashield.resources(conns = conns) +datashield.assign.resource(conns, resource="omics/ewas/GSE66351_1", symbol="eSet_0y_EUR") +ds.class('eSet_0y_EUR', datasources = conns) +datashield.assign.expr(conns, symbol = "methy_0y_EUR",expr = 
quote(as.resource.object(eSet_0y_EUR))) +``` diff --git a/docs/ops/configuring.md b/docs/ops/configuring.md new file mode 100644 index 000000000..331669e03 --- /dev/null +++ b/docs/ops/configuring.md @@ -0,0 +1,28 @@ +# Using Armadillo +Armadillo has three main screens to manage projects, user access and DataSHIELD profiles: + +### Create data access projects +Data stewards can use the Armadillo web user interface or [MolgenisArmadillo R client](https://molgenis.github.io/molgenis-r-armadillo) +to create 'projects' and upload their data into those. Data tables need to be in parquet format, which supports fast selection of the columns +(variables) you need for analysis. Other files can be configured as 'resources'. + +### Manage user access +Data stewards can use the permission screen to give email addresses access to the data. Everybody signs in via single sign-on using a central OIDC +authentication server such as Keycloak or FusionAuth that federates to authentication systems of +connected institutions, ideally using a federated AAI such as LifeScience AAI. + +### Configure DataSHIELD profiles +To analyse data, users must choose a DataSHIELD profile. Armadillo owners can use the web user interface to configure new profiles. Assuming you +installed Docker you can also start/stop these images. Alternatively you can use docker-compose for that. + +There are DataSHIELD packages for [standard statistical analysis](https://github.com/datashield/dsBaseClient) +, [exposome studies](https://github.com/isglobal-brge/dsExposomeClient) +, [survival studies](https://github.com/neelsoumya/dsSurvivalClient) +, [microbiome studies](https://github.com/StuartWheater/dsMicrobiomeClient) +and [analysis tools for studies that are using large genetic datasets](https://github.com/isglobal-brge/dsomicsclient) +. These packages can all be installed in the Armadillo suite. 
+ +### End users can use Armadillo as any other DataSHIELD server +A researcher connects from an [R client](https://molgenis.github.io/molgenis-r-datashield) to one or multiple Armadillo servers. The data is +loaded into an R session on the Armadillo server specifically created for the researcher. Analysis requests are sent to the R session on each Armadillo server. +There the analysis is performed and aggregated results are sent back to the client. diff --git a/docs/ops/installing.md b/docs/ops/installing.md new file mode 100644 index 000000000..1e492a19d --- /dev/null +++ b/docs/ops/installing.md @@ -0,0 +1,56 @@ +# Armadillo installation +Armadillo requires Java to run, Docker to access the DataSHIELD profiles, and OIDC for authentication (not needed for local tests). Below are instructions for running Armadillo directly from Java, as a Docker container, as a service on Ubuntu or from source code. +Note that for production you should add an HTTPS proxy for essential security. You might also need to enable the 'Docker socket' on your Docker service. + +## Run Armadillo using java commandline +Software developers often run Armadillo as a Java jar file: + +1. Install Java and Docker (for the DataSHIELD profiles) +2. Download the Armadillo jar file from [releases](https://github.com/molgenis/molgenis-service-armadillo/releases), for example: +[molgenis-armadillo-3.3.0.jar](https://github.com/molgenis/molgenis-service-armadillo/releases/download/V3.3.0/) +3. Run Armadillo using ```java -jar molgenis-armadillo-3.3.0.jar``` +4. Go to http://localhost:8080 to see your Armadillo running. + +By default, Armadillo starts with only 'basic-auth' and user 'admin' with password 'admin'. You can enable 'oidc' for connecting more users. You can change these settings +by providing and editing an [application.yaml](application.template.yml) file +in your working directory and then running the command above again. 
+ +## Run Armadillo via docker compose +For testing without having to install Java you can run Armadillo using Docker: + +1. Install [docker-compose](https://docs.docker.com/compose/install/) +2. Download this [docker-compose.yml](docker-compose.yml). +3. Execute ```docker-compose up``` +4. Once it says 'Started' go to http://localhost:8080 to see your Armadillo running. + +The command must run in the same directory as the downloaded docker-compose file. We made Docker available via 'docker.sock' so Armadillo can start/stop DataSHIELD profiles. Alternatively you must include the DataSHIELD profiles in this docker-compose file. You can override all application.yaml settings via environment variables +(see commented code in docker-compose file). + +## Run Armadillo as service on Ubuntu +We run Armadillo in production as a Linux service on Ubuntu, ensuring it gets restarted when the server is rebooted. You might be able to reproduce this on +CentOS (using yum instead of apt). + +### 1. Install necessary software +``` +apt update +apt install openjdk-19-jre-headless +apt install docker.io +``` +Note: you might need 'sudo' + +### 2. Run installation script +This step will install the most recent [release](https://github.com/molgenis/molgenis-service-armadillo/releases): +``` +wget https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/scripts/install/armadillo-setup.sh +bash armadillo-setup.sh \ + --admin-user admin \ + --admin-password xxxxx \ + --domain my.server.com \ + --oidc \ + --oidc_url https://lifecycle-auth.molgenis.org \ + --oidc_clientid clientid \ + --oidc_clientsecret secret \ + --cleanup +``` +Note: adapt the install command to suit your situation. Use --help to see the options. https://lifecycle-auth.molgenis.org is the OIDC service provided by MOLGENIS, but +you can also use your own, see the FAQ. 
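Reviewer note on the prerequisites in `docs/ops/installing.md`: the page states that Armadillo needs Java to run and Docker for the DataSHIELD profiles. A quick pre-flight check along the following lines could confirm both are on the `PATH` before any of the install routes is tried. This is only a minimal sketch; the function name `check_prereqs` is hypothetical and not part of Armadillo or its setup script.

```shell
# Hypothetical helper (not shipped with Armadillo): report which of the
# named commands cannot be found on PATH.
check_prereqs() {
  missing=""
  for cmd in "$@"; do
    # 'command -v' is the POSIX way to test whether a command is available
    command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
  done
  if [ -n "$missing" ]; then
    echo "missing:$missing"
    return 1
  fi
  echo "ok"
}

# Armadillo needs Java to run and Docker for the DataSHIELD profiles.
check_prereqs java docker || echo "Install the missing tools before running Armadillo."
```

The function only inspects `PATH`; it does not verify the Java version or whether the Docker socket is enabled, so those still need a manual check.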
diff --git a/docs/upgrade-2-3.md b/docs/upgrade-2-3.md new file mode 100644 index 000000000..81df3ad7c --- /dev/null +++ b/docs/upgrade-2-3.md @@ -0,0 +1,175 @@ +### Migrate Armadillo 2 to Armadillo 3 + +To export data from an Armadillo 2 server take the following steps: + +#### 1. Check if there's enough space left on the server +``` +df -h +``` +Compare to: +``` +du -h /var/lib/minio +``` +Available space should be at least twice the size of the MinIO folder. + +#### 2. Backup Armadillo 2 settings +``` +mkdir armadillo2-backup +rsync -avr /usr/share/armadillo armadillo2-backup +cp /etc/armadillo/application.yml armadillo2-backup/application-armadillo2.yml +``` +N.B. change /usr/share to the path matching your local config. + +#### 3. Install helper software +Log in to your server as root, using ssh. +``` +apt update +apt install pip +pip install minio +pip install fusionauth-client +pip install simple_term_menu +``` +If you get a purple message asking to update, accept and install everything. +A restart of the server is recommended after this. + +N.B. The commands in this manual are for Ubuntu; on other Linux systems +the `apt` command needs to be replaced with the system's own package manager. + +#### 4. Stop all docker containers for Armadillo 2 +List all docker containers: +```docker ps -a``` + +Stop and remove all Armadillo 2 related containers (except for MinIO), e.g. +``` +docker rm armadillo_auth_1 armadillo_console_1 armadillo_rserver-default_1 armadillo_rserver-mediation_1 armadillo_rserver-exposome_1 armadillo_rserver-omics_1 armadillo_armadillo_1 -f +``` +Check with `docker ps -a` if there are still containers running; if so, remove these (except for MinIO) in the same way as the others. + +#### 5. 
Install armadillo
+```
+apt update
+apt install openjdk-19-jre-headless
+apt install docker.io
+```
+The docker.io step might fail because containerd already exists. If that's the case, remove containerd and try again:
+```
+apt remove containerd
+apt install docker.io
+```
+
+Get armadillo:
+```
+wget https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/scripts/install/armadillo-setup.sh
+bash armadillo-setup.sh \
+    --admin-user admin \
+    --admin-password xxxxx \
+    --domain my.server.com \
+    --oidc \
+    --oidc_url https://lifecycle-auth.molgenis.org \
+    --oidc_clientid clientid \
+    --oidc_clientsecret secret \
+    --cleanup
+```
+Don't forget to set a proper admin password (use a generator), domain, clientid and clientsecret. The client id and
+secret can be found on the lifecycle auth server in the configuration for your server. If you don't have permissions to
+receive this, you can ask the support team to get it for you.
+
+Open armadillo in the browser and try to log in using basicauth to check if the server is running properly. If it's not
+running at all, try:
+```
+systemctl start armadillo
+```
+
+#### 6. Export data from Armadillo 2 into Armadillo 3
+Look up the user/password in the application.yml of the old armadillo. They're called MinIO access key and MinIO
+secret key.
+```
+cat /root/armadillo2-backup/application-armadillo2.yml
+```
+Do the following step in a separate screen. On Ubuntu use:
+```
+screen
+```
+Navigate to the armadillo folder:
+```
+cd /usr/share/armadillo
+```
+This step will copy the Armadillo 2 data from MinIO into a folder laid out as an Armadillo 3 data folder:
+```
+mkdir data
+wget https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/scripts/migrate-minio.py
+python3 migrate-minio.py --minio http://localhost:9000 --target /usr/share/armadillo/data
+```
+
+This might take a couple of minutes. You can detach the screen using `ctrl+a` followed by `d` and reattach it using
+`screen -r`.
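Because the copy in step 6 duplicates the MinIO data on disk, the rule from step 1 (free space at least twice the size of the MinIO folder) can be checked in a script before starting the copy. This is a minimal sketch, assuming the MinIO data lives in `/var/lib/minio` and the target is `/usr/share/armadillo` as in the steps above; the function names are hypothetical and not part of the Armadillo scripts.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the Armadillo scripts.

# Succeeds when the available space is at least twice the MinIO folder size.
# Both arguments are sizes in KiB.
enough_space() {
  avail_kb="$1"
  minio_kb="$2"
  [ "$avail_kb" -ge "$((minio_kb * 2))" ]
}

# Reads the real numbers from the system and applies the rule.
check_migration_space() {
  target_dir="${1:-/usr/share/armadillo}"
  minio_dir="${2:-/var/lib/minio}"
  # df -Pk prints POSIX-format output in KiB; column 4 of line 2 is the available space.
  avail="$(df -Pk "$target_dir" | awk 'NR == 2 { print $4 }')"
  # du -sk prints the total size in KiB followed by the path.
  used="$(du -sk "$minio_dir" | awk '{ print $1 }')"
  enough_space "$avail" "$used"
}
```

For example, `check_migration_space && echo "enough space"` could be run before `python3 migrate-minio.py`, so the copy is only started when the check passes.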
+
+#### 7. Run Armadillo 3 using exported data
+Make sure to move the exported data into the new 'data' folder. You might also need to fix user permissions, e.g.:
+```
+chown armadillo:armadillo -R data
+```
+Check if armadillo is running by going to the URL of your server in the browser, log in and navigate to the projects tab.
+
+#### 8. Optionally, acquire a permission set from MOLGENIS team
+If you previously ran a central authorisation server with the MOLGENIS team, they can provide you with a procedure to load
+pre-existing permissions. They will use:
+```
+wget https://raw.githubusercontent.com/molgenis/molgenis-service-armadillo/master/scripts/migrate-auth.py
+python3 migrate-auth.py --fusion-auth https://lifecycle-auth.molgenis.org --armadillo https://thearmadillourl.net
+```
+Now check if all users and data are properly migrated.
+
+#### 9. Clean up nginx config
+
+Change `/etc/nginx/sites-available/armadillo.conf` to:
+```
+server {
+  listen 80;
+  server_name urlofyourserver.org;
+  include /etc/nginx/global.d/*.conf;
+  location / {
+    proxy_pass http://localhost:8080;
+    client_max_body_size 0;
+    proxy_read_timeout 600s;
+    proxy_redirect http://localhost:8080/ $scheme://$host/;
+    proxy_set_header Host $host;
+    proxy_http_version 1.1;
+  }
+}
+```
+Note that the server_name value does not include the `https://` prefix.
+
+Remove the console and storage file from: `/etc/nginx/sites-enabled/`.
+
+```
+systemctl restart nginx
+```
+
+#### 10. Fix application.yml
+Make sure the following is added:
+```
+server:
+  forward-headers-strategy: framework
+```
+
+#### 11. Fix URLs in the lifecycle FusionAuth
+Add the following to the config of your server:
+`https://yourserver.com/login/oauth2/code/molgenis`
+
+#### 12. Set up profiles
+Log in to armadillo in the browser. Navigate to the "Profiles" tab. 
Add a new profile with the following properties:
+
+Name: `xenon`
+Image: `datashield/armadillo-rserver_caravan-xenon:latest`
+Package whitelist: `dsBase`, `resourcer`, `dsMediation`, `dsMTLBase`, `dsSurvival`, `dsExposome`
+
+Assign a random 9-digit seed, then create and start the container.
+
+#### 13. Remove old MinIO data
+First remove the MinIO docker container. Check the name of the container using `docker ps -a`, then:
+```
+docker rm containername -f
+```
+After that remove the data:
+```rm -Rf /var/lib/minio/ ```

From dfb2a9f71f02e820d4cc2e84780080a6a05808ad Mon Sep 17 00:00:00 2001
From: Clemens Tolboom
Date: Thu, 5 Oct 2023 14:39:10 +0200
Subject: [PATCH 2/4] Add missing file.

---
 CONTRIBUTING.md | 54 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)
 create mode 100644 CONTRIBUTING.md

diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
new file mode 100644
index 000000000..4e22316c5
--- /dev/null
+++ b/CONTRIBUTING.md
@@ -0,0 +1,54 @@
+# Contributing to Armadillo
+
+## Running Armadillo from source code
+
+You can run from source code as follows:
+
+1. Install Java and Docker
+2. Check out the source using ```git clone https://github.com/molgenis/molgenis-service-armadillo.git```
+3. Optionally copy ```application.template.yml``` to ```application.yml``` to change settings (will be .gitignored)
+4. Compile and execute the code using ```./gradlew run```
+
+Note: contact the MOLGENIS team if you want to contribute and need a testing OIDC config that you can run against localhost.
+
+# Developing Armadillo
+
+We use gradle to build:
+* run using ```./gradlew run```
+* run tests using ```./gradlew test```
+
+We use intellij to develop
+* To run or debug in intellij, right click on armadillo/src/main/java/org.molgenis.armadillo/ArmadilloServiceApplication and choose 'Run/Debug Armadillo...'
+* To run using oidc, create a copy of [application.yml](application.template.yml) in the root of your project
+
+We have a swagger-ui to quickly see and test available web services at http://localhost:8080/swagger-ui/
+
+# Developing DataSHIELD packages in Armadillo
+As package developer will want to push your new packages into a DataSHIELD profile
+
+* You can start Armadillo with defaults as described above; then use admin/admin as authentication
+* to see which profiles are available and which one has been selected:
+```
+curl -u admin:admin http://localhost:8080/profiles
+```
+* to change the selected profile (here to 'default'):
+```
+curl -X POST http://localhost:8080/select-profile \
+  -H 'Content-Type: application/json' \
+  -d 'default'
+```
+* to install a package in the current DataSHIELD profile as the admin user:
+```
+curl -u admin:admin -v \
+-H 'Content-Type: multipart/form-data' \
+-F "file=@dsBase_6.3.0.tar.gz" \
+-X POST http://localhost:8080/install-package
+```
+* to add a package to the whitelist of your current profile:
+```
+curl -u admin:admin -X POST http://localhost:8080/whitelist/dsBase
+```
+* to get the whitelist of the current profile:
+```
+curl -u admin:admin http://localhost:8080/whitelist
+```

From 45f97f2b6238d6f18e44ed209867b03715759a2a Mon Sep 17 00:00:00 2001
From: marikaris
Date: Fri, 3 Nov 2023 14:13:05 +0100
Subject: [PATCH 3/4] chore: fix typo

---
 docs/upgrade-2-3.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/upgrade-2-3.md b/docs/upgrade-2-3.md
index 81df3ad7c..66810469c 100644
--- a/docs/upgrade-2-3.md
+++ b/docs/upgrade-2-3.md
@@ -1,4 +1,4 @@
-### Migrate Armadillo 2 to Armadille 3
+### Migrate Armadillo 2 to Armadillo 3
 
 To export data from and Armadillo 2 server take the following steps:
 

From 57905c2aa8439e97c34931a7768ec09ed219dd2f Mon Sep 17 00:00:00 2001
From: marikaris
Date: Fri, 3 Nov 2023 14:48:22 +0100
Subject: [PATCH 4/4] chore: move docker.host.internal docs to contributing

---
 CONTRIBUTING.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff 
--git a/CONTRIBUTING.md b/CONTRIBUTING.md
index 4e22316c5..ebee4b918 100644
--- a/CONTRIBUTING.md
+++ b/CONTRIBUTING.md
@@ -23,6 +23,15 @@ We use intellij to develop
 
 We have a swagger-ui to quickly see and test available web services at http://localhost:8080/swagger-ui/
 
+## Profile xenon with resourcer whitelisted returns a host.docker.internal error
+When developing locally, you may come across the container error: `Could not resolve host: host.docker.internal`,
+especially when developing on a non-supported operating system with a profile that has resourcer whitelisted (such as xenon).
+Sadly, the only way around this error is to edit the Java source code of Armadillo so that the container is started with an extra host.
+To do this, edit the private method `installImage` of [DockerService.java](https://github.com/molgenis/molgenis-service-armadillo/blob/master/armadillo/src/main/java/org/molgenis/armadillo/profile/DockerService.java) and change `CreateContainerCmd cmd` from `.withHostConfig(new HostConfig().withPortBindings(portBindings))` to `.withHostConfig(new HostConfig().withPortBindings(portBindings).withExtraHosts("host.docker.internal:host-gateway"))`.
+
+Please note that in order for this change to work, you must use Intellij to run Armadillo or compile the new source code.
+Also, if you already have a xenon container built and running, stop and remove that container.
+
 # Developing DataSHIELD packages in Armadillo
 As package developer will want to push your new packages into a DataSHIELD profile
 
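After rebuilding Armadillo with the `withExtraHosts` change from the `host.docker.internal` section and recreating the profile container, it can help to confirm that the container really received the extra host entry. The following is a sketch under assumptions: the container name `xenon` is assumed to match the profile name, and the helper names are hypothetical. It relies only on `docker inspect` with a Go template over `HostConfig.ExtraHosts`.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when an ExtraHosts listing contains the
# host.docker.internal -> host-gateway mapping added by the code change.
has_host_gateway() {
  case "$1" in
    *host.docker.internal:host-gateway*) return 0 ;;
    *) return 1 ;;
  esac
}

# Inspects a container by name; the default name "xenon" is an assumption.
check_container_hosts() {
  container="${1:-xenon}"
  has_host_gateway "$(docker inspect --format '{{.HostConfig.ExtraHosts}}' "$container")"
}
```

Running `check_container_hosts xenon && echo "extra host present"` should then only print the message for a container created after the change; a container created before it has to be removed and recreated first.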