work on intro + segmentation

SydneyBioX · Nov 30, 2023 · a85d2d4
1 parent 0f60c28
commit a85d2d4
Showing 1 changed file with 118 additions and 32 deletions.
diff --git a/vignettes/workshop_material.Rmd b/vignettes/workshop_material.Rmd
@@ -54,34 +54,32 @@ cellular heterogeneity in a tissue environment.
 
 ### Description
 
-In this tutorial we will introduce an analytical framework for analysing
-data from high dimensional spatial omics technologies such as, CODEX,
-CycIF, IMC and High Definition Spatial Transcriptomics. This framework
-makes use of functionality from our Bioconductor packages simpleSeg,
-FuseSOM, scClassify, scHot, spicyR, listClust, statial, scFeatures and
-ClassifyR. By the end of this tutorial attendees will be able to
-implement and assess some of the key steps of a spatial analysis
-pipeline including cell segmentation, feature normalisation, cell type
-identification, microenvironment and cell-state characterisation,
-spatial hypothesis testing and patient classification. Understanding
-these key steps will provide attendees with the core skills needed to
-interrogate the comprehensive spatial information generated by these
-exciting new technologies.
+In this workshop we will introduce some of the key analytical concepts
+needed to analyse data from high dimensional spatial omics technologies
+such as PhenoCycler, IMC, Xenium and MERFISH. We will show how
+functionality from our Bioconductor packages simpleSeg, FuseSOM,
+scClassify, scHot, spicyR, lisaClust, statial, scFeatures and ClassifyR
+can be used to address various biological hypotheses. By the end of this
+workshop attendees will be able to implement and assess some of the key
+steps of a spatial analysis pipeline including cell segmentation,
+feature normalisation, cell type identification, microenvironment and
+cell-state characterisation, spatial hypothesis testing and patient
+classification. Understanding these key steps will provide attendees
+with the core skills needed to interrogate the comprehensive spatial
+information generated by these exciting new technologies.
 
 ### Pre-requisites
 
 It is expected that students will have:
 
 -   basic knowledge of R syntax,
--   familiarity with SingleCellExperiment and/or SpatialExperiment
-    objects, and
 -   this workshop will not provide an in-depth description of
     cell-resolution spatial omics technologies.
 
 ### *R* / *Bioconductor* packages used
 
 Several single cell R packages will be used from the scdney package, for
-more information visit: <https://sydneybiox.github.io/scdney/>
+more information visit: <https://sydneybiox.github.io/scdney/>.
 
 ### Time outline
 
@@ -163,34 +161,43 @@ options("restore_SingleCellExperiment_show" = TRUE)
 
 ## The data
 
-In this workshop, we will be working through two datasets to explore how
+In this workshop, we will be working with two datasets to explore how
 biological phenotypes, cellular interactions, and patterns of gene
-expression are correlated with disease.
+expression are correlated with disease. The two datasets will be used
+in different contexts, which we hope are representative of scenarios
+you will encounter in your own data.
 
 We will use two motivating datasets:
 
 -   [Keren et al,
     2018](https://www.cell.com/fulltext/S0092-8674(18)31100-0): A
     multiplexed ion beam imaging by time-of-flight (MIBI-TOF) dataset
-    profilining tissue from triple-negative breast cancer patients. Can
-    we predict risk of cancer recurrence and overall survival time based
-    on imaging data?
+    profiling tissue from triple-negative breast cancer patients. The
+    primary question we will address with this dataset is whether we
+    can predict risk of cancer recurrence and overall survival time
+    based on imaging data.
 -   [Lohoff et al,
     2022](https://www.nature.com/articles/s41587-021-01006-2): A seqFISH
     study of early mouse organogenesis. We will use a subset of data
-    that is made available from the STExampleData package. Can we find
-    key transcriptomic drivers of the developing brain?
+    that is made available from the STexampleData package. The primary
+    question we will address with this dataset is whether we can
+    identify key transcriptomic drivers of the developing brain.
 
 ## Data visualisation and exploration
 
-Here we will download the datasets, examine the structure, visualise the
-data and perform some exploratory analyses.
+The purpose of this section is primarily to introduce the
+`SpatialExperiment` class, which is used to store information from
+imaging experiments in R. The goal is to get comfortable enough
+manipulating and exploring these objects that you can progress through
+the remainder of the workshop with ease. Here we will download a
+dataset stored in the `STexampleData` R package, examine its structure,
+visualise the data and perform some exploratory analyses.
 
 ### SeqFISH mouse embryo
 
 Here we download the seqFISH mouse embryo data. This comes in the format
-of a `SpatialExperiment` object, where all the data from an IMC dataset
-can be compiled and accessed with relative ease.
+of a `SpatialExperiment` object, where summarized information from an
+imaging dataset can be compiled and accessed with relative ease.
 
 ```{r seqFISHData}
 spe <- STexampleData::seqFISH_mouseEmbryo()
@@ -325,7 +332,7 @@ Try starting off your exploration by answering the below questions.
 
 ```{r kerenQ1}
 # try to answer the above question using the imc object. 
-# you may want to check the SingleCellExperiment vignette.
+# you may want to check the SpatialExperiment vignette.
 # https://www.bioconductor.org/packages/release/bioc/vignettes/SpatialExperiment/inst/doc/SpatialExperiment.html
 
 ```
@@ -359,8 +366,11 @@ To load in our images we use the `loadImages` function from
 example.
 
 ```{r loadImage5}
+
+imageLocation <- system.file("extdata", "kerenPatient5.tiff", package = "ScdneySpatial")
 image5 = cytomapper::loadImages(
-  x = system.file("extdata", "kerenPatient5.tiff", package = "ScdneySpatial")
+  x = imageLocation,
+  as.is = TRUE  # Needed because the TIFF is an 8-bit image
 )
 
 mcols(image5) = data.frame(list("imageID" = "kerenPatient5"))
@@ -371,7 +381,83 @@ channelNames(image5) = c("Au", "Background", "Beta catenin", "Ca", "CD11b", "CD1
 
 ```
 
-### How do I perform segmentation in R?
+::: question
+**Questions**
+
+1.  What class is image5? Hint: class()
+2.  How many images and markers are in image5?
+3.  Challenge: What is the dimension of the image5 image?
+:::
+
+### Visualise an image
+
+We can visualise this image to see what we have read in. Let's highlight
+four markers.
+
+```{r plotImage}
+# Display four markers from the first image as a colour composite.
+cytomapper::plotPixels(
+  image = image5[1],
+  colour_by = c("CD45", "Pan-Keratin", "SMA", "dsDNA"),
+  colour = list(
+    CD45 = c("black", "blue"),
+    `Pan-Keratin` = c("black", "yellow"),
+    SMA = c("black", "green"),
+    dsDNA = c("black", "red")
+  )
+)
+```
+
+We can manipulate the brightness, contrast and gamma levels as follows.
+See if you can do a better job.
+
+```{r plotImage2}
+# Plot the same markers again, this time adjusting brightness, contrast and gamma.
+cytomapper::plotPixels(
+  image = image5[1],
+  colour_by = c("CD45", "Pan-Keratin", "SMA", "dsDNA"),
+  display = "single",
+  colour = list(
+    CD45 = c("black", "red"),
+    `Pan-Keratin` = c("black", "yellow"),
+    SMA = c("black", "green"),
+    dsDNA = c("black", "blue")
+  ),
+  # Adjust the brightness, contrast and gamma of each channel.
+  bcg = list(
+    CD45 = c(0, 4, 1),
+    `Pan-Keratin` = c(0, 3, 1),
+    SMA = c(0, 2, 1),
+    dsDNA = c(0, 2, 1)
+  ),
+  legend = NULL
+)
+```
+
+### Can we identify the cells in the image?
+
+The `EBImage` package on Bioconductor provides a lot of useful functions
+for manipulating imaging data in R. This includes functionality for
+finding cells, a process called cell segmentation. Let's work through an
+example from their vignette. This will use some functionality that
+complements what you've already learnt.
+
+We start by loading the images of nuclei and cell bodies. To visualize
+the cells we overlay these images as the green and the blue channel of a
+false-color image. Notice that with `display` you can zoom!
+
+```{r readEBImage}
+nuc = readImage(system.file('images', 'nuclei.tif', package='EBImage')) 
+cel = readImage(system.file('images', 'cells.tif', package='EBImage'))  
+cells = rgbImage(green=1.5*cel, blue=nuc) 
+display(cells, all = TRUE)
+```
+
+We will next create a nuclei mask. The `nuc` channel contains
+fluorescent intensities of a protein expressed in the nuclei of cells.
+The nuclei mask will threshold this channel to separate signal from
+noise and then clean this with som
 
 Images stored in a `list` or `CytoImageList` can be segmented using
 `simpleSeg`. Below `simpleSeg` will identify the nuclei in the image
@@ -1827,7 +1913,8 @@ kerenCV_recurrence = crossValidate(
 )
 ```
 
-Again, using `performancePlot`, this time for recurrence, we found better performance in select spatial metrics.
+Again, using `performancePlot`, this time for recurrence, we found
+better performance in select spatial metrics.
 
 ```{r perfPlot-recurrence}
 performancePlot(kerenCV_recurrence,
@@ -1842,4 +1929,3 @@ performancePlot(kerenCV_recurrence,
 ```{r sessionInfo}
 sessionInfo()
 ```
-
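
Note on the segmentation hunk: the added text describes building a nuclei mask by thresholding the nuclear channel to separate signal from noise. The sketch below illustrates only that thresholding idea in base R on a synthetic matrix. It is not the vignette's `EBImage` code; the toy image, the blob positions and the cut-off of 0.5 are all assumptions made for illustration, and the clean-up step is not shown.

```r
# Synthetic "nuclear channel": dim background noise plus two bright square blobs.
set.seed(1)
img <- matrix(runif(20 * 20, min = 0, max = 0.2), nrow = 20)
img[3:7, 3:7]     <- img[3:7, 3:7] + 0.7      # first toy nucleus (5 x 5 pixels)
img[12:17, 11:16] <- img[12:17, 11:16] + 0.7  # second toy nucleus (6 x 6 pixels)

# Global threshold: pixels brighter than the cut-off are called foreground.
# The value 0.5 is an arbitrary choice that suits this toy image only.
cutoff <- 0.5
mask <- img > cutoff

# Number and fraction of pixels assigned to the nuclei mask.
n_foreground <- sum(mask)
frac_foreground <- mean(mask)
```

On real images a single global cut-off is rarely sufficient, which is why the workflow in the diff follows the threshold with a cleaning step.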