
Commit

Deploying to gh-pages from @ d880a17 🚀
berombau committed Sep 8, 2024
1 parent 57b09b8 commit 74aa0a4
Showing 4 changed files with 28 additions and 29 deletions.
15 changes: 7 additions & 8 deletions book/in_memory_interoperability.html
@@ -233,7 +233,7 @@ <h1 class="title"><span class="chapter-number">3</span>&nbsp; <span class="chapt
</header>


<p>One aproach to interoperability is to work on in-memory representations of one object, and convert these in memory between different programming languages. This does not require you to write out your datasets and read them in in the different programming enivronment, but it does require you to set up an environment in both languages, which can be cumbersome. One language will act as the main language, and you will intereact with the other language using an FFI (foreign function interface). When evaluating R code within a Python program, we will make use of rpy2 to accomplish this. When evaluating Python code within an R program, we will make use of reticulate.</p>
<p>One approach to interoperability is to work on in-memory representations of one object, and convert these in memory between different programming languages. This does not require you to write out your datasets and read them in in the different programming environment, but it does require you to set up an environment in both languages, which can be cumbersome. One language will act as the main language, and you will interact with the other language using an FFI (foreign function interface). When evaluating R code within a Python program, we will make use of rpy2 to accomplish this. When evaluating Python code within an R program, we will make use of reticulate.</p>
<section id="rpy2-basic-functionality" class="level2" data-number="3.1">
<h2 data-number="3.1" class="anchored" data-anchor-id="rpy2-basic-functionality"><span class="header-section-number">3.1</span> Rpy2: basic functionality</h2>
<p>Rpy2 is a foreign function interface to R. It can be used in the following way:</p>
@@ -318,18 +318,17 @@ <h2 data-number="3.1" class="anchored" data-anchor-id="rpy2-basic-functionality"
<div class="cell-output cell-output-stdout">
<pre><code>
0%| | 0.00/9.82M [00:00&lt;?, ?B/s]
0%| | 8.00k/9.82M [00:00&lt;02:10, 79.0kB/s]
0%| | 8.00k/9.82M [00:00&lt;02:10, 78.8kB/s]
0%| | 32.0k/9.82M [00:00&lt;01:01, 167kB/s]
1%| | 96.0k/9.82M [00:00&lt;00:27, 367kB/s]
2%|1 | 200k/9.82M [00:00&lt;00:16, 607kB/s]
2%|1 | 200k/9.82M [00:00&lt;00:16, 609kB/s]
4%|4 | 408k/9.82M [00:00&lt;00:09, 1.09MB/s]
8%|8 | 840k/9.82M [00:00&lt;00:04, 2.10MB/s]
17%|#6 | 1.66M/9.82M [00:00&lt;00:02, 4.04MB/s]
26%|##5 | 2.54M/9.82M [00:00&lt;00:01, 5.02MB/s]
56%|#####5 | 5.45M/9.82M [00:01&lt;00:00, 11.7MB/s]
73%|#######2 | 7.16M/9.82M [00:01&lt;00:00, 13.1MB/s]
91%|#########1| 8.98M/9.82M [00:01&lt;00:00, 14.4MB/s]
100%|##########| 9.82M/9.82M [00:01&lt;00:00, 8.26MB/s]</code></pre>
34%|###3 | 3.33M/9.82M [00:00&lt;00:00, 7.88MB/s]
53%|#####3 | 5.21M/9.82M [00:00&lt;00:00, 10.4MB/s]
83%|########3 | 8.16M/9.82M [00:01&lt;00:00, 15.4MB/s]
100%|##########| 9.82M/9.82M [00:01&lt;00:00, 8.71MB/s]</code></pre>
</div>
<div class="sourceCode cell-code" id="cb10"><pre class="sourceCode python code-with-copy"><code class="sourceCode python"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a><span class="cf">with</span> anndata2ri.converter.context():</span>
12 changes: 6 additions & 6 deletions book/introduction.html
@@ -188,9 +188,9 @@ <h2 id="toc-title">Table of contents</h2>

<ul>
<li><a href="#code-porting" id="toc-code-porting" class="nav-link active" data-scroll-target="#code-porting"><span class="header-section-number">1.1</span> Code porting</a></li>
<li><a href="#in-memory-interoperability" id="toc-in-memory-interoperability" class="nav-link" data-scroll-target="#in-memory-interoperability"><span class="header-section-number">1.2</span> In-memory Interoperability</a></li>
<li><a href="#disk-based-interoperability" id="toc-disk-based-interoperability" class="nav-link" data-scroll-target="#disk-based-interoperability"><span class="header-section-number">1.3</span> Disk-based Interoperability</a></li>
<li><a href="#workflow-frameworks" id="toc-workflow-frameworks" class="nav-link" data-scroll-target="#workflow-frameworks"><span class="header-section-number">1.4</span> Workflow Frameworks</a></li>
<li><a href="#in-memory-interoperability" id="toc-in-memory-interoperability" class="nav-link" data-scroll-target="#in-memory-interoperability"><span class="header-section-number">1.2</span> In-memory interoperability</a></li>
<li><a href="#disk-based-interoperability" id="toc-disk-based-interoperability" class="nav-link" data-scroll-target="#disk-based-interoperability"><span class="header-section-number">1.3</span> Disk-based interoperability</a></li>
<li><a href="#workflow-frameworks" id="toc-workflow-frameworks" class="nav-link" data-scroll-target="#workflow-frameworks"><span class="header-section-number">1.4</span> Workflow frameworks</a></li>
</ul>
<div class="toc-actions"><ul><li><a href="https://github.com/saeyslab/polygloty/edit/main/book/introduction.qmd" class="toc-action"><i class="bi bi-github"></i>Edit this page</a></li><li><a href="https://github.com/saeyslab/polygloty/issues/new" class="toc-action"><i class="bi empty"></i>Report an issue</a></li><li><a href="https://github.com/saeyslab/polygloty/blob/main/book/introduction.qmd" class="toc-action"><i class="bi empty"></i>View source</a></li></ul></div></nav>
</div>
@@ -225,15 +225,15 @@ <h2 data-number="1.1" class="anchored" data-anchor-id="code-porting"><span class
<p>Furthermore, work is not done after the initial port – in order for the researcher’s work to be useful to others, the ported code must be maintained and kept up-to-date with the original implementation. For this reason, we don’t consider reimplementation a viable option for most use-cases and will not discuss it further in this book.</p>
</section>
<section id="in-memory-interoperability" class="level2" data-number="1.2">
<h2 data-number="1.2" class="anchored" data-anchor-id="in-memory-interoperability"><span class="header-section-number">1.2</span> In-memory Interoperability</h2>
<h2 data-number="1.2" class="anchored" data-anchor-id="in-memory-interoperability"><span class="header-section-number">1.2</span> In-memory interoperability</h2>
<p>Tools like rpy2 and reticulate allow for direct communication between languages within a single analysis session. This approach provides flexibility and avoids intermediate file I/O, but can introduce complexity in managing dependencies and environments.</p>
</section>
<section id="disk-based-interoperability" class="level2" data-number="1.3">
<h2 data-number="1.3" class="anchored" data-anchor-id="disk-based-interoperability"><span class="header-section-number">1.3</span> Disk-based Interoperability</h2>
<h2 data-number="1.3" class="anchored" data-anchor-id="disk-based-interoperability"><span class="header-section-number">1.3</span> Disk-based interoperability</h2>
<p>Storing intermediate results to disk in standardized, language-agnostic file formats (e.g., HDF5, Parquet) allows for sequential execution of scripts written in different languages. This approach is relatively simple but can lead to increased storage requirements and I/O overhead.</p>
</section>
<section id="workflow-frameworks" class="level2" data-number="1.4">
<h2 data-number="1.4" class="anchored" data-anchor-id="workflow-frameworks"><span class="header-section-number">1.4</span> Workflow Frameworks</h2>
<h2 data-number="1.4" class="anchored" data-anchor-id="workflow-frameworks"><span class="header-section-number">1.4</span> Workflow frameworks</h2>
<p>Workflow management systems (e.g., Nextflow, Snakemake) provide a structured approach to orchestrate complex, multi-language pipelines, enhancing reproducibility and automation. However, they may require a learning curve and additional configuration.</p>
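To make the orchestration idea concrete, a Snakefile-style sketch (hypothetical rule, script, and file names) chaining an R step and a Python step through files on disk might look like:

```
# Hypothetical Snakemake sketch: each rule runs in its own language and the
# steps communicate only through the declared input/output files.
rule all:
    input: "results/plot.png"

rule analyse_in_r:
    input: "data/counts.parquet"
    output: "results/stats.csv"
    shell: "Rscript scripts/analyse.R {input} {output}"

rule plot_in_python:
    input: "results/stats.csv"
    output: "results/plot.png"
    shell: "python scripts/plot.py {input} {output}"
```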


14 changes: 7 additions & 7 deletions search.json
@@ -23,8 +23,8 @@
"objectID": "book/introduction.html#in-memory-interoperability",
"href": "book/introduction.html#in-memory-interoperability",
"title": "1  Introduction",
"section": "1.2 In-memory Interoperability",
"text": "1.2 In-memory Interoperability\nTools like rpy2 and reticulate allow for direct communication between languages within a single analysis session. This approach provides flexibility and avoids intermediate file I/O, but can introduce complexity in managing dependencies and environments.",
"section": "1.2 In-memory interoperability",
"text": "1.2 In-memory interoperability\nTools like rpy2 and reticulate allow for direct communication between languages within a single analysis session. This approach provides flexibility and avoids intermediate file I/O, but can introduce complexity in managing dependencies and environments.",
"crumbs": [
"<span class='chapter-number'>1</span>  <span class='chapter-title'>Introduction</span>"
]
@@ -33,8 +33,8 @@
"objectID": "book/introduction.html#disk-based-interoperability",
"href": "book/introduction.html#disk-based-interoperability",
"title": "1  Introduction",
"section": "1.3 Disk-based Interoperability",
"text": "1.3 Disk-based Interoperability\nStoring intermediate results to disk in standardized, language-agnostic file formats (e.g., HDF5, Parquet) allows for sequential execution of scripts written in different languages. This approach is relatively simple but can lead to increased storage requirements and I/O overhead.",
"section": "1.3 Disk-based interoperability",
"text": "1.3 Disk-based interoperability\nStoring intermediate results to disk in standardized, language-agnostic file formats (e.g., HDF5, Parquet) allows for sequential execution of scripts written in different languages. This approach is relatively simple but can lead to increased storage requirements and I/O overhead.",
"crumbs": [
"<span class='chapter-number'>1</span>  <span class='chapter-title'>Introduction</span>"
]
@@ -43,8 +43,8 @@
"objectID": "book/introduction.html#workflow-frameworks",
"href": "book/introduction.html#workflow-frameworks",
"title": "1  Introduction",
"section": "1.4 Workflow Frameworks",
"text": "1.4 Workflow Frameworks\nWorkflow management systems (e.g., Nextflow, Snakemake) provide a structured approach to orchestrate complex, multi-language pipelines, enhancing reproducibility and automation. However, they may require a learning curve and additional configuration.\n\n\n\n\nHeumos, Lukas, Anna C. Schaar, Christopher Lance, Anastasia Litinetskaya, Felix Drost, Luke Zappia, Malte D. Lücken, et al. 2023. “Best Practices for Single-Cell Analysis Across Modalities.” Nature Reviews Genetics 24 (8): 550–72. https://doi.org/10.1038/s41576-023-00586-w.\n\n\nZappia, Luke, and Fabian J. Theis. 2021. “Over 1000 Tools Reveal Trends in the Single-Cell RNA-Seq Analysis Landscape.” Genome Biology 22 (1). https://doi.org/10.1186/s13059-021-02519-4.",
"section": "1.4 Workflow frameworks",
"text": "1.4 Workflow frameworks\nWorkflow management systems (e.g., Nextflow, Snakemake) provide a structured approach to orchestrate complex, multi-language pipelines, enhancing reproducibility and automation. However, they may require a learning curve and additional configuration.\n\n\n\n\nHeumos, Lukas, Anna C. Schaar, Christopher Lance, Anastasia Litinetskaya, Felix Drost, Luke Zappia, Malte D. Lücken, et al. 2023. “Best Practices for Single-Cell Analysis Across Modalities.” Nature Reviews Genetics 24 (8): 550–72. https://doi.org/10.1038/s41576-023-00586-w.\n\n\nZappia, Luke, and Fabian J. Theis. 2021. “Over 1000 Tools Reveal Trends in the Single-Cell RNA-Seq Analysis Landscape.” Genome Biology 22 (1). https://doi.org/10.1186/s13059-021-02519-4.",
"crumbs": [
"<span class='chapter-number'>1</span>  <span class='chapter-title'>Introduction</span>"
]
@@ -104,7 +104,7 @@
"href": "book/in_memory_interoperability.html",
"title": "3  In-memory interoperability",
"section": "",
"text": "3.1 Rpy2: basic functionality\nRpy2 is a foreign function interface to R. It can be used in the following way:\nimport rpy2\nimport rpy2.robjects as robjects\n\n/home/runner/work/polygloty/polygloty/renv/python/virtualenvs/renv-python-3.12/lib/python3.12/site-packages/rpy2/rinterface_lib/embedded.py:276: UserWarning: R was initialized outside of rpy2 (R_NilValue != NULL). Trying to use it nevertheless.\n warnings.warn(msg)\nR was initialized outside of rpy2 (R_NilValue != NULL). Trying to use it nevertheless.\n\nvector = robjects.IntVector([1,2,3])\nrsum = robjects.r['sum']\n\nrsum(vector)\n\n\n IntVector with 1 elements.\n \n\n\n\n6\nLuckily, we’re not restricted to just calling R functions and creating R objects. The real power of this in-memory interoperability lies in the conversion of Python objects to R objects to call R functions on, and then to the conversion of the results back to Python objects.\nRpy2 requires specific conversion rules for different Python objects. It is straightforward to create R vectors from corresponding Python lists:\nstr_vector = robjects.StrVector(['abc', 'def', 'ghi'])\nflt_vector = robjects.FloatVector([0.3, 0.8, 0.7])\nint_vector = robjects.IntVector([1, 2, 3])\nmtx = robjects.r.matrix(robjects.IntVector(range(10)), nrow=5)\nHowever, for single cell biology, the objects that are most interesting to convert are (count) matrices, arrays and dataframes. 
In order to do this, you need to import the corresponding rpy2 modules and specify the conversion context.\nimport numpy as np\n\nfrom rpy2.robjects import numpy2ri\nfrom rpy2.robjects import default_converter\n\nrd_m = np.random.random((10, 7))\n\nwith (default_converter + numpy2ri.converter).context():\n mtx2 = robjects.r.matrix(rd_m, nrow = 10)\nimport pandas as pd\n\nfrom rpy2.robjects import pandas2ri\n\npd_df = pd.DataFrame({'int_values': [1,2,3],\n 'str_values': ['abc', 'def', 'ghi']})\n\nwith (default_converter + pandas2ri.converter).context():\n pd_df_r = robjects.DataFrame(pd_df)\nOne big limitation of rpy2 is the inability to convert sparse matrices: there is no built-in conversion module for scipy. The anndata2ri package provides, apart from functionality to convert SingleCellExperiment objects to an anndata objects, functions to convert sparse matrices.\nTODO: how to subscript sparse matrix? Is it possible?\nimport scipy as sp\n\nfrom anndata2ri import scipy2ri\n\nsparse_matrix = sp.sparse.csc_matrix(rd_m)\n\nwith (default_converter + scipy2ri.converter).context():\n sp_r = scipy2ri.py2rpy(sparse_matrix)\nWe will showcase how to use anndata2ri to convert an anndata object to a SingleCellExperiment object and vice versa as well:\nimport anndata as ad\nimport scanpy.datasets as scd\n\nimport anndata2ri\n\nadata_paul = scd.paul15()\n\n\n 0%| | 0.00/9.82M [00:00&lt;?, ?B/s]\n 0%| | 8.00k/9.82M [00:00&lt;02:10, 79.0kB/s]\n 0%| | 32.0k/9.82M [00:00&lt;01:01, 167kB/s] \n 1%| | 96.0k/9.82M [00:00&lt;00:27, 367kB/s]\n 2%|1 | 200k/9.82M [00:00&lt;00:16, 607kB/s] \n 4%|4 | 408k/9.82M [00:00&lt;00:09, 1.09MB/s]\n 8%|8 | 840k/9.82M [00:00&lt;00:04, 2.10MB/s]\n 17%|#6 | 1.66M/9.82M [00:00&lt;00:02, 4.04MB/s]\n 26%|##5 | 2.54M/9.82M [00:00&lt;00:01, 5.02MB/s]\n 56%|#####5 | 5.45M/9.82M [00:01&lt;00:00, 11.7MB/s]\n 73%|#######2 | 7.16M/9.82M [00:01&lt;00:00, 13.1MB/s]\n 91%|#########1| 8.98M/9.82M [00:01&lt;00:00, 14.4MB/s]\n100%|##########| 9.82M/9.82M 
[00:01&lt;00:00, 8.26MB/s]\n\n\nwith anndata2ri.converter.context():\n sce = anndata2ri.py2rpy(adata_paul)\n ad2 = anndata2ri.rpy2py(sce)",
"text": "3.1 Rpy2: basic functionality\nRpy2 is a foreign function interface to R. It can be used in the following way:\nimport rpy2\nimport rpy2.robjects as robjects\n\n/home/runner/work/polygloty/polygloty/renv/python/virtualenvs/renv-python-3.12/lib/python3.12/site-packages/rpy2/rinterface_lib/embedded.py:276: UserWarning: R was initialized outside of rpy2 (R_NilValue != NULL). Trying to use it nevertheless.\n warnings.warn(msg)\nR was initialized outside of rpy2 (R_NilValue != NULL). Trying to use it nevertheless.\n\nvector = robjects.IntVector([1,2,3])\nrsum = robjects.r['sum']\n\nrsum(vector)\n\n\n IntVector with 1 elements.\n \n\n\n\n6\nLuckily, we’re not restricted to just calling R functions and creating R objects. The real power of this in-memory interoperability lies in the conversion of Python objects to R objects to call R functions on, and then to the conversion of the results back to Python objects.\nRpy2 requires specific conversion rules for different Python objects. It is straightforward to create R vectors from corresponding Python lists:\nstr_vector = robjects.StrVector(['abc', 'def', 'ghi'])\nflt_vector = robjects.FloatVector([0.3, 0.8, 0.7])\nint_vector = robjects.IntVector([1, 2, 3])\nmtx = robjects.r.matrix(robjects.IntVector(range(10)), nrow=5)\nHowever, for single cell biology, the objects that are most interesting to convert are (count) matrices, arrays and dataframes. 
In order to do this, you need to import the corresponding rpy2 modules and specify the conversion context.\nimport numpy as np\n\nfrom rpy2.robjects import numpy2ri\nfrom rpy2.robjects import default_converter\n\nrd_m = np.random.random((10, 7))\n\nwith (default_converter + numpy2ri.converter).context():\n mtx2 = robjects.r.matrix(rd_m, nrow = 10)\nimport pandas as pd\n\nfrom rpy2.robjects import pandas2ri\n\npd_df = pd.DataFrame({'int_values': [1,2,3],\n 'str_values': ['abc', 'def', 'ghi']})\n\nwith (default_converter + pandas2ri.converter).context():\n pd_df_r = robjects.DataFrame(pd_df)\nOne big limitation of rpy2 is the inability to convert sparse matrices: there is no built-in conversion module for scipy. The anndata2ri package provides, apart from functionality to convert SingleCellExperiment objects to an anndata objects, functions to convert sparse matrices.\nTODO: how to subscript sparse matrix? Is it possible?\nimport scipy as sp\n\nfrom anndata2ri import scipy2ri\n\nsparse_matrix = sp.sparse.csc_matrix(rd_m)\n\nwith (default_converter + scipy2ri.converter).context():\n sp_r = scipy2ri.py2rpy(sparse_matrix)\nWe will showcase how to use anndata2ri to convert an anndata object to a SingleCellExperiment object and vice versa as well:\nimport anndata as ad\nimport scanpy.datasets as scd\n\nimport anndata2ri\n\nadata_paul = scd.paul15()\n\n\n 0%| | 0.00/9.82M [00:00&lt;?, ?B/s]\n 0%| | 8.00k/9.82M [00:00&lt;02:10, 78.8kB/s]\n 0%| | 32.0k/9.82M [00:00&lt;01:01, 167kB/s] \n 1%| | 96.0k/9.82M [00:00&lt;00:27, 367kB/s]\n 2%|1 | 200k/9.82M [00:00&lt;00:16, 609kB/s] \n 4%|4 | 408k/9.82M [00:00&lt;00:09, 1.09MB/s]\n 8%|8 | 840k/9.82M [00:00&lt;00:04, 2.10MB/s]\n 17%|#6 | 1.66M/9.82M [00:00&lt;00:02, 4.04MB/s]\n 34%|###3 | 3.33M/9.82M [00:00&lt;00:00, 7.88MB/s]\n 53%|#####3 | 5.21M/9.82M [00:00&lt;00:00, 10.4MB/s]\n 83%|########3 | 8.16M/9.82M [00:01&lt;00:00, 15.4MB/s]\n100%|##########| 9.82M/9.82M [00:01&lt;00:00, 8.71MB/s]\n\n\nwith 
anndata2ri.converter.context():\n sce = anndata2ri.py2rpy(adata_paul)\n ad2 = anndata2ri.rpy2py(sce)",
"crumbs": [
"<span class='chapter-number'>3</span>  <span class='chapter-title'>In-memory interoperability</span>"
]