Commit: Slides

LouiseDck committed Sep 11, 2024
1 parent e6403b8 commit 9f070a4

Showing 3 changed files with 169 additions and 83 deletions.
24 changes: 4 additions & 20 deletions book/in_memory/reticulate.qmd
@@ -55,27 +55,11 @@ result_r <- py_to_r(result)
r_to_py(result_r)
```


# Interactive sessions
One of the most useful ways to take advantage of in-memory interoperability is to use it in interactive sessions, where you're exploring the data and want to try out some functions non-native to your language of choice.

Jupyter notebooks (and some other notebook environments) make this possible from the Python side: using IPython line and cell magics together with rpy2, you can easily run R code in a cell of your notebook.

```{python show_magic, eval=FALSE}
%load_ext rpy2.ipython  # line magic that loads the rpy2 IPython extension,
# which enables the following cell magic

%%R -i input -o output  # cell magic, placed on the first line of a cell:
# -i specifies inputs (which will be converted to R objects) and
# -o outputs (which will be converted back to Python objects);
# the rest of the cell is run as R code
```

# Interactivity
You can easily include Python chunks in Rmarkdown notebooks using the Python engine in `knitr`.

# Use case
We will not showcase the usefulness of reticulate by using the DE analysis: it would involve loading `pandas` to create a Python dataframe, adding row names and column names and then grouping them, but that is easier to do natively in R.

A more interesting thing you can do using `reticulate` is interacting with anndata-based Python packages, such as `scanpy`!

@@ -104,7 +88,7 @@ adata

We can't easily show the result of the plot in this Quarto notebook, but we can save it and show it:

```{r scanpy_plot, warning=TRUE}
path <- "umap.png"
sc$pl$umap(adata, color="leiden_res1", save=path)
```
17 changes: 17 additions & 0 deletions book/in_memory/rpy2.qmd
@@ -86,6 +86,23 @@ with anndata2ri.converter.context():
ad2 = anndata2ri.rpy2py(sce)
```

## Interactive sessions
One of the most useful ways to take advantage of in-memory interoperability is to use it in interactive sessions, where you're exploring the data and want to try out some functions non-native to your language of choice.

Jupyter notebooks (and some other notebook environments) make this possible from the Python side: using IPython line and cell magics together with rpy2, you can easily run R code in a cell of your notebook.

```{python show_magic, eval=FALSE}
%load_ext rpy2.ipython  # line magic that loads the rpy2 IPython extension,
# which enables the following cell magic

%%R -i input -o output  # cell magic, placed on the first line of a cell:
# -i specifies inputs (which will be converted to R objects) and
# -o outputs (which will be converted back to Python objects);
# the rest of the cell is run as R code
```

## Use case: run in Python

We will perform the Compute DE step not in R, but in Python.
211 changes: 148 additions & 63 deletions slides/slides.qmd
@@ -22,10 +22,6 @@ execute:
echo: true
---


# Introduction

1. How do you interact with a package in another language?
@@ -43,97 +39,186 @@ We will be focusing on R & Python
1. Package-based interoperability
2. Best practices

# Package-based interoperability
or: the question of reimplementation.

- Consider the pros:

1. Discoverability
2. Can your package be useful in other domains?
3. Very user friendly

- Consider the cons:

1. Think twice: is it worth it?
2. **It's a lot of work**
3. How will you keep it up to date?
4. How will you ensure parity?

# Package-based interoperability

Please learn both R & Python

# Best practices
1. Work with the standards
2. Work with matrices, arrays and dataframes
3. Provide vignettes on interoperability

# In-memory interoperability
![](../book/in_memory/images/imm_overview.png)

# Overview

1. Advantages & disadvantages
2. Pitfalls when using Python & R
3. Rpy2
4. Reticulate

# in-memory interoperability advantages
- no need to write & read results
- useful when you need a limited amount of functions in another language

# in-memory interoperability drawbacks
- not always access to all classes
- data duplication
- you need to manage the environments

# Pitfalls when using Python and R
**Column major vs row major matrices**
In R: every dense matrix is stored as column major

![](../book/in_memory/images/inmemorymatrix.png)
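The layout difference can be sketched from the Python side with numpy, which supports both orders (`'C'` is row major, numpy's default; `'F'` is Fortran order, column major, like R). This is an illustrative sketch, not part of the slide deck's own code:

```python
import numpy as np

# the same 2x3 matrix, flattened in each memory order
m = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

row_major = m.flatten(order='C')  # row by row, numpy's default
col_major = m.flatten(order='F')  # column by column, as R stores it

print(row_major)  # [0 1 2 3 4 5]
print(col_major)  # [0 3 1 4 2 5]
```

A layout mismatch does not raise an error, the values simply come out reordered, so it is worth checking a small matrix like this when moving data across languages.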

# Pitfalls when using Python and R
**Indexing**

![](../book/in_memory/images/indexing.png)
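A minimal sketch of the convention gap, written in Python, with the equivalent R results shown as comments for comparison only (the R lines are not executed here):

```python
x = [10, 20, 30]

print(x[0])    # 10, Python indexing starts at 0
print(x[1:3])  # [20, 30], a Python slice excludes its end point

# the same elements in R, where indexing starts at 1
# and ranges are inclusive:
#   x <- c(10, 20, 30)
#   x[1]    # 10
#   x[2:3]  # 20 30
```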

# Pitfalls when using Python and R
**dots and underscores**

R names may contain dots, which are not valid in Python identifiers, so rpy2 lets you supply a mapping:

```python
from rpy2.robjects.packages import importr

d = {'package.dependencies': 'package_dot_dependencies',
     'package_dependencies': 'package_uscore_dependencies'}
tools = importr('tools', robject_translations = d)
```

# Pitfalls when using Python and R
**Integers**

```r
library(reticulate)
bi <- reticulate::import_builtins()

bi$list(bi$range(0, 5))
# TypeError: 'float' object cannot be interpreted as an integer
```

```r
library(reticulate)
bi <- reticulate::import_builtins()

bi$list(bi$range(0L, 5L))
# [1] 0 1 2 3 4
```
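The error comes from Python itself rather than from reticulate: an unsuffixed R number like `0` is a double, and Python's `range` only accepts integers, which is why the `L` suffix is needed. The same failure can be reproduced in plain Python:

```python
# range() rejects floats, which is what an unsuffixed R number becomes
try:
    range(0.0, 5.0)
    message = None
except TypeError as err:
    message = str(err)

print(message)  # 'float' object cannot be interpreted as an integer

# with proper integers it works, matching the 0L/5L call above
print(list(range(0, 5)))  # [0, 1, 2, 3, 4]
```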

# Rpy2: basics
- Accessing R from Python
- `rpy2.rinterface`, the low-level interface
- `rpy2.robjects`, the high-level interface

```python
import rpy2
import rpy2.robjects as robjects

vector = robjects.IntVector([1,2,3])
rsum = robjects.r['sum']

rsum(vector)
```

# Rpy2: basics

```python
str_vector = robjects.StrVector(['abc', 'def', 'ghi'])
flt_vector = robjects.FloatVector([0.3, 0.8, 0.7])
int_vector = robjects.IntVector([1, 2, 3])
mtx = robjects.r.matrix(robjects.IntVector(range(10)), nrow=5)
```

# Rpy2: numpy

```python
import numpy as np

from rpy2.robjects import numpy2ri
from rpy2.robjects import default_converter

rd_m = np.random.random((10, 7))

with (default_converter + numpy2ri.converter).context():
    mtx2 = robjects.r.matrix(rd_m, nrow = 10)
```

# Rpy2: pandas
```python
import pandas as pd

from rpy2.robjects import pandas2ri

pd_df = pd.DataFrame({'int_values': [1, 2, 3],
                      'str_values': ['abc', 'def', 'ghi']})

with (default_converter + pandas2ri.converter).context():
    pd_df_r = robjects.DataFrame(pd_df)
```

# Rpy2: sparse matrices

```python
import scipy as sp

from anndata2ri import scipy2ri

sparse_matrix = sp.sparse.csc_matrix(rd_m)

with (default_converter + scipy2ri.converter).context():
    sp_r = scipy2ri.py2rpy(sparse_matrix)
```

# Rpy2: anndata

```python
import anndata as ad
import scanpy.datasets as scd

import anndata2ri

adata_paul = scd.paul15()

with anndata2ri.converter.context():
    sce = anndata2ri.py2rpy(adata_paul)
    ad2 = anndata2ri.rpy2py(sce)
```

# Rpy2: interactivity

```python
%load_ext rpy2.ipython  # line magic that loads the rpy2 ipython extension.
# this extension allows the use of the following cell magic

%%R -i input -o output  # this line allows you to specify inputs
# (which will be converted to R objects) and outputs
# (which will be converted back to Python objects)
# this line is put at the start of a cell
# the rest of the cell will be run as R code
```

# Reticulate


# Workflows