From c8289a39de60b995077bf3750ec67977f6f459d1 Mon Sep 17 00:00:00 2001
From: Louise Deconinck
Date: Wed, 4 Sep 2024 14:54:30 +0200
Subject: [PATCH 1/2] Package based interoperability

---
 slides/slides.qmd | 54 +++++++++++++++++++++++++++++++++++++++++------
 1 file changed, 48 insertions(+), 6 deletions(-)

diff --git a/slides/slides.qmd b/slides/slides.qmd
index b4b6405..deb0b75 100644
--- a/slides/slides.qmd
+++ b/slides/slides.qmd
@@ -24,12 +24,49 @@ exectute:
 
 # Introduction
 
-# File formats
+1. How do you interact with a package in another language?
+2. How do you make your package useable for developers in other languages?
+
+We will be focusing on R & Python
+
+# How do you interact with a package in another language?
+
+1. File format based interoperability
+2. In-memory interoperability
+
+# How do you make your package useable for developers in other languages?
+
+1. Package-based interoperability
+2. Best practices
+
+## Package-based interoperability
+or: the question of reimplementation.
+
+Consider the pros:
+- Discoverability
+- Can your package be useful in other domains?
+- Very user friendly
+
+Consider the cons:
+- Think twice: is it worth it?
+- It's a lot of work
+- How will you keep it up to date?
+- How will you ensure parity?
+
+## Best practices
+1. Work with the standards
+2. Work with matrices, arrays and dataframes
+3. Provide vignettes on interoperability
+
+# File format based interoperability
+
 
 # Calling Python from R and vice versa
 or: in-memory interoperability
 
 ## Overview
+
+
 reticulate:
 
 1. Call Python in R
@@ -37,7 +74,6 @@ reticulate:
 
 basilisk allows managing Python environments within the BioConductor ecosystem
 
-
 rpy2:
 
 1. Call R in Python
@@ -53,16 +89,23 @@ rpy2:
 - ensure that the method accepts this
 - you need to be familiar with using & managing both environments
 - data duplication
+- you need to manage the environments
 
 ## accessing R from Python
 
-rpy2
+rpy2: an interface to R running embedded in a Python process
+
+
+
 
 Jupyter notebooks:
 - Use IPython magic interface
 - most useful for matrices & arrays
 
-e.g. `%%R -i input -o output`
+e.g. `%%R -i input -o output` as the first line of the cell
+
+## accessing R from Python
+rpy2
 
 - use anndata2ri: converts anndata objects to SingleCellExperiment
 
@@ -70,8 +113,7 @@ e.g. `%%R -i input -o output`
 reticulate
 basilisk
 
-# Package-based interoperability
-or: the question of reimplementation
+
 
 # Workflows
 

From e395e043f3e38dd766ab659d669bb81530514199 Mon Sep 17 00:00:00 2001
From: Louise Deconinck
Date: Wed, 4 Sep 2024 16:33:13 +0200
Subject: [PATCH 2/2] rpy2 part

---
 slides/slides.qmd | 68 ++++++++++++++++++++++++++++++-----------------
 1 file changed, 44 insertions(+), 24 deletions(-)

diff --git a/slides/slides.qmd b/slides/slides.qmd
index deb0b75..586e46c 100644
--- a/slides/slides.qmd
+++ b/slides/slides.qmd
@@ -22,6 +22,9 @@ exectute:
   echo: true
 ---
 
+Todo: refer to sc best practices
+Todo: paper Lior Pachter differences R & Python
+
 # Introduction
 
 1. How do you interact with a package in another language?
@@ -60,24 +63,26 @@ Consider the cons:
 
 # File format based interoperability
 
+# In-memory interoperability
+Calling Python in an R environment and vice versa.
+- No need to write out datasets.
+- Best suited to calling functions
 
-# Calling Python from R and vice versa
-or: in-memory interoperability
+rpy2 and reticulate
 
 ## Overview
+advantages & disadvantages
 
 
-reticulate:
-
-1. Call Python in R
-2. embed a Python session within your R session
-
-basilisk allows managing Python environments within the BioConductor ecosystem
-
-rpy2:
+rpy2
+1. overview
+2. usage
+3. pitfalls
 
-1. Call R in Python
-2. run R in a Python process
+reticulate:
+1. overview
+2. usage
+3. pitfalls
 
 ## in-memory interoperability advantages
 - no need to write & read results
@@ -91,28 +96,43 @@ rpy2:
 - data duplication
 - you need to manage the environments
 
-## accessing R from Python
+## rpy2
+Accessing R from Python
 
-rpy2: an interface to R running embedded in a Python process
+Example: code block
+`rpy2.rinterface`, the low-level interface
+`rpy2.robjects`, the high-level interface
+Example for calling R functions
+Example for conversion of arrays
 
 
 
 
-Jupyter notebooks:
-- Use IPython magic interface
-- most useful for matrices & arrays
+## rpy2
+Conversion:
+numpy & pandas
 
-e.g. `%%R -i input -o output` as the first line of the cell
+Example: code block
 
-## accessing R from Python
-rpy2
+sparse matrices: anndata2ri
+
+## rpy2
+
+Jupyter(like) notebooks:
+make use of the Magic command interface
+
+`%load_ext rpy2.ipython`
+`%R -i input -o output`
+
+example
 
-- use anndata2ri: converts anndata objects to SingleCellExperiment
+## rpy2
 
-## accessing Python from R
-reticulate
-basilisk
+1. let your method be run with matrices and arrays as input
+2. anndata2ri
+?
 
+## reticulate
 
 
 # Workflows
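
Note on the "File format based interoperability" slide, which the series leaves empty: the sketch below shows the write-then-read round trip that this route implies. It is only an illustration under stated assumptions. The slides do not name an exchange format, so plain CSV is assumed here, and the helper names `write_matrix_csv` and `read_matrix_csv`, the file name `counts.csv`, and the toy data are all hypothetical.

```python
import csv
import tempfile
from pathlib import Path


def write_matrix_csv(path, column_names, rows):
    """Write a header row followed by numeric rows to a CSV file."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(column_names)
        writer.writerows(rows)


def read_matrix_csv(path):
    """Read the file back as (column_names, rows of floats)."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        column_names = next(reader)
        rows = [[float(value) for value in row] for row in reader]
    return column_names, rows


# Hypothetical toy data: two cells, three genes.
columns = ["gene_a", "gene_b", "gene_c"]
counts = [[1.0, 0.0, 3.0], [2.0, 5.0, 0.0]]

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "counts.csv"
    write_matrix_csv(csv_path, columns, counts)
    # On the R side, read.csv("counts.csv") would load the same table.
    loaded_columns, loaded_counts = read_matrix_csv(csv_path)
```

The extra write and read step is the cost that the in-memory slides list as avoided ("no need to write & read results"). A plain table like this also fits the best-practice item "Work with matrices, arrays and dataframes".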