Commit: Slides

LouiseDck committed Sep 11, 2024
1 parent e6403b8 commit 9f070a4

Showing 3 changed files with 169 additions and 83 deletions.
24 changes: 4 additions & 20 deletions book/in_memory/reticulate.qmd
@@ -55,27 +55,11 @@ result_r <- py_to_r(result)
r_to_py(result_r)
```


# Interactive sessions
One of the most useful ways to take advantage of in-memory interoperability is to use it in interactive sessions, where you're exploring the data and want to try out some functions non-native to your language of choice.

Jupyter notebooks (and some other notebook environments) make this possible from the Python side: using IPython line and cell magics together with rpy2, you can easily run R code in a cell of your notebook.

```{python show_magic, eval=FALSE}
%load_ext rpy2.ipython  # line magic that loads the rpy2 IPython extension,
# which enables the following cell magic

%%R -i input -o output  # cell magic, placed on the first line of a cell:
# -i specifies inputs (which will be converted to R objects) and
# -o outputs (which will be converted back to Python objects);
# the rest of the cell is run as R code
```

# Interactivity
You can easily include Python chunks in Rmarkdown notebooks using the Python engine in `knitr`.

# Use case
We will not showcase the usefulness of reticulate by using the DE analysis: it would involve loading `pandas` to create a Python dataframe, adding row names and column names and then grouping them, but that is easier to do natively in R.

A more interesting thing you can do using `reticulate` is interacting with anndata-based Python packages, such as `scanpy`!

@@ -104,7 +88,7 @@ adata

We can't easily show the result of the plot in this Quarto notebook, but we can save it and show it:

```{r scanpy_plot, warning=TRUE}
path <- "umap.png"
sc$pl$umap(adata, color="leiden_res1", save=path)
```
17 changes: 17 additions & 0 deletions book/in_memory/rpy2.qmd
@@ -86,6 +86,23 @@ with anndata2ri.converter.context():
ad2 = anndata2ri.rpy2py(sce)
```

## Interactive sessions
One of the most useful ways to take advantage of in-memory interoperability is to use it in interactive sessions, where you're exploring the data and want to try out some functions non-native to your language of choice.

Jupyter notebooks (and some other notebook environments) make this possible from the Python side: using IPython line and cell magics together with rpy2, you can easily run R code in a cell of your notebook.

```{python show_magic, eval=FALSE}
%load_ext rpy2.ipython  # line magic that loads the rpy2 IPython extension,
# which enables the following cell magic

%%R -i input -o output  # cell magic, placed on the first line of a cell:
# -i specifies inputs (which will be converted to R objects) and
# -o outputs (which will be converted back to Python objects);
# the rest of the cell is run as R code
```

## Use case: run in Python

We will perform the Compute DE step not in R, but in Python.
211 changes: 148 additions & 63 deletions slides/slides.qmd
@@ -22,10 +22,6 @@ execute:
echo: true
---


# Introduction

1. How do you interact with a package in another language?
@@ -43,97 +39,186 @@ We will be focusing on R & Python
1. Package-based interoperability
2. Best practices

# Package-based interoperability
or: the question of reimplementation.

- Consider the pros:

1. Discoverability
2. Can your package be useful in other domains?
3. Very user friendly

- Consider the cons:

1. Think twice: is it worth it?
2. **It's a lot of work**
3. How will you keep it up to date?
4. How will you ensure parity?

# Package-based interoperability

Please learn both R & Python

# Best practices
1. Work with the standards
2. Work with matrices, arrays and dataframes
3. Provide vignettes on interoperability

# In-memory interoperability
![](../book/in_memory/images/imm_overview.png)

# Overview

1. Advantages & disadvantages
2. Pitfalls when using Python & R
3. Rpy2
4. Reticulate

# in-memory interoperability advantages
- no need to write & read results
- useful when you need a limited amount of functions in another language

# in-memory interoperability drawbacks
- not always access to all classes
- data duplication
- you need to manage the environments

# Pitfalls when using Python and R
**Column major vs row major matrices**
In R: every dense matrix is stored as column major

![](../book/in_memory/images/inmemorymatrix.png)
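The layout difference can be sketched from the Python side with numpy, which supports both orders (`'C'` is row major, numpy's default; `'F'` is Fortran order, column major, like R). This is an illustrative sketch, not part of the slide deck's own code:

```python
import numpy as np

# the same 2x3 matrix, flattened in each memory order
m = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

row_major = m.flatten(order='C')  # row by row, numpy's default
col_major = m.flatten(order='F')  # column by column, as R stores it

print(row_major)  # [0 1 2 3 4 5]
print(col_major)  # [0 3 1 4 2 5]
```

A layout mismatch does not raise an error, the values simply come out reordered, so it is worth checking a small matrix like this when moving data across languages.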

# Pitfalls when using Python and R
**Indexing**

![](../book/in_memory/images/indexing.png)
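A minimal sketch of the convention gap, written in Python, with the equivalent R results shown as comments for comparison only (the R lines are not executed here):

```python
x = [10, 20, 30]

print(x[0])    # 10, Python indexing starts at 0
print(x[1:3])  # [20, 30], a Python slice excludes its end point

# the same elements in R, where indexing starts at 1
# and ranges are inclusive:
#   x <- c(10, 20, 30)
#   x[1]    # 10
#   x[2:3]  # 20 30
```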

# Pitfalls when using Python and R
**dots and underscores**

R names may contain dots, which are not valid in Python identifiers, so rpy2 lets you supply a mapping:

```python
from rpy2.robjects.packages import importr

d = {'package.dependencies': 'package_dot_dependencies',
     'package_dependencies': 'package_uscore_dependencies'}
tools = importr('tools', robject_translations = d)
```

# Pitfalls when using Python and R
**Integers**

```r
library(reticulate)
bi <- reticulate::import_builtins()

bi$list(bi$range(0, 5))
# TypeError: 'float' object cannot be interpreted as an integer
```

```r
library(reticulate)
bi <- reticulate::import_builtins()

bi$list(bi$range(0L, 5L))
# [1] 0 1 2 3 4
```
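The error comes from Python itself rather than from reticulate: an unsuffixed R number like `0` is a double, and Python's `range` only accepts integers, which is why the `L` suffix is needed. The same failure can be reproduced in plain Python:

```python
# range() rejects floats, which is what an unsuffixed R number becomes
try:
    range(0.0, 5.0)
    message = None
except TypeError as err:
    message = str(err)

print(message)  # 'float' object cannot be interpreted as an integer

# with proper integers it works, matching the 0L/5L call above
print(list(range(0, 5)))  # [0, 1, 2, 3, 4]
```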

# Rpy2: basics
- Accessing R from Python
- `rpy2.rinterface`, the low-level interface
- `rpy2.robjects`, the high-level interface

```python
import rpy2
import rpy2.robjects as robjects

vector = robjects.IntVector([1,2,3])
rsum = robjects.r['sum']

rsum(vector)
```

# Rpy2: basics

```python
str_vector = robjects.StrVector(['abc', 'def', 'ghi'])
flt_vector = robjects.FloatVector([0.3, 0.8, 0.7])
int_vector = robjects.IntVector([1, 2, 3])
mtx = robjects.r.matrix(robjects.IntVector(range(10)), nrow=5)
```

# Rpy2: numpy

```python
import numpy as np

from rpy2.robjects import numpy2ri
from rpy2.robjects import default_converter

rd_m = np.random.random((10, 7))

with (default_converter + numpy2ri.converter).context():
    mtx2 = robjects.r.matrix(rd_m, nrow = 10)
```

# Rpy2: pandas
```python
import pandas as pd

from rpy2.robjects import pandas2ri

pd_df = pd.DataFrame({'int_values': [1, 2, 3],
                      'str_values': ['abc', 'def', 'ghi']})

with (default_converter + pandas2ri.converter).context():
    pd_df_r = robjects.DataFrame(pd_df)
```

# Rpy2: sparse matrices

```python
import scipy as sp

from anndata2ri import scipy2ri

sparse_matrix = sp.sparse.csc_matrix(rd_m)

with (default_converter + scipy2ri.converter).context():
    sp_r = scipy2ri.py2rpy(sparse_matrix)
```

# Rpy2: anndata

```python
import anndata as ad
import scanpy.datasets as scd

import anndata2ri

adata_paul = scd.paul15()

with anndata2ri.converter.context():
    sce = anndata2ri.py2rpy(adata_paul)
    ad2 = anndata2ri.rpy2py(sce)
```

# Rpy2: interactivity

```python
%load_ext rpy2.ipython  # line magic that loads the rpy2 ipython extension.
# this extension allows the use of the following cell magic

%%R -i input -o output  # this line allows you to specify inputs
# (which will be converted to R objects) and outputs
# (which will be converted back to Python objects)
# this line is put at the start of a cell
# the rest of the cell will be run as R code
```

# Reticulate


# Workflows