work on intro + segmenation
ellispatrick committed Nov 30, 2023
1 parent 0f60c28 commit a85d2d4
Showing 1 changed file with 118 additions and 32 deletions.
150 changes: 118 additions & 32 deletions vignettes/workshop_material.Rmd
@@ -54,34 +54,32 @@ cellular heterogeneity in a tissue environment.

### Description

In this tutorial we will introduce an analytical framework for analysing
data from high dimensional spatial omics technologies such as, CODEX,
CycIF, IMC and High Definition Spatial Transcriptomics. This framework
makes use of functionality from our Bioconductor packages simpleSeg,
FuseSOM, scClassify, scHot, spicyR, listClust, statial, scFeatures and
ClassifyR. By the end of this tutorial attendees will be able to
implement and assess some of the key steps of a spatial analysis
pipeline including cell segmentation, feature normalisation, cell type
identification, microenvironment and cell-state characterisation,
spatial hypothesis testing and patient classification. Understanding
these key steps will provide attendees with the core skills needed to
interrogate the comprehensive spatial information generated by these
exciting new technologies.
In this workshop we will introduce some of the key analytical concepts
needed to analyse data from high-dimensional spatial omics technologies
such as PhenoCycler, IMC, Xenium and MERFISH. We will show how
functionality from our Bioconductor packages simpleSeg, FuseSOM,
scClassify, scHOT, spicyR, lisaClust, Statial, scFeatures and ClassifyR
can be used to address various biological hypotheses. By the end of this
workshop attendees will be able to implement and assess some of the key
steps of a spatial analysis pipeline, including cell segmentation,
feature normalisation, cell type identification, microenvironment and
cell-state characterisation, spatial hypothesis testing and patient
classification. Understanding these key steps will provide attendees
with the core skills needed to interrogate the comprehensive spatial
information generated by these exciting new technologies.

### Pre-requisites

It is expected that students will have:

- basic knowledge of R syntax, and
- familiarity with SingleCellExperiment and/or SpatialExperiment
objects.

This workshop will not provide an in-depth description of
cell-resolution spatial omics technologies.

### *R* / *Bioconductor* packages used

Several single cell R packages will be used from the scdney package, for
more information visit: <https://sydneybiox.github.io/scdney/>
more information visit: <https://sydneybiox.github.io/scdney/>.

### Time outline

@@ -163,34 +161,43 @@ options("restore_SingleCellExperiment_show" = TRUE)

## The data

In this workshop, we will be working through two datasets to explore how
In this workshop, we will be working with two datasets to explore how
biological phenotypes, cellular interactions, and patterns of gene
expression are correlated with disease.
expression are correlated with disease. The two datasets will be used
in different contexts, which we hope are representative of scenarios
you will encounter in your own datasets.

We will use two motivating datasets:

- [Keren et al,
2018](https://www.cell.com/fulltext/S0092-8674(18)31100-0): A
multiplexed ion beam imaging by time-of-flight (MIBI-TOF) dataset
profilining tissue from triple-negative breast cancer patients. Can
we predict risk of cancer recurrence and overall survival time based
on imaging data?
profiling tissue from triple-negative breast cancer patients. The
primary question we will address with this dataset is whether we can
predict risk of cancer recurrence and overall survival time based on
imaging data.
- [Lohoff et al,
2022](https://www.nature.com/articles/s41587-021-01006-2): A seqFISH
study of early mouse organogenesis. We will use a subset of data
that is made available from the STExampleData package. Can we find
key transcriptomic drivers of the developing brain?
that is made available from the STexampleData package. The primary
question we will address with this dataset is whether we can identify
key transcriptomic drivers of the developing brain.

## Data visualisation and exploration

Here we will download the datasets, examine the structure, visualise the
data and perform some exploratory analyses.
The purpose of this section is primarily to introduce the
`SpatialExperiment` class, which is used to store information from
imaging experiments in R. The goal is to become comfortable enough
manipulating and exploring these objects that you can progress
through the remainder of the workshop with ease. Here we will download
a dataset stored in the `STexampleData` R package, examine its
structure, visualise the data and perform some exploratory analyses.
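
As a warm-up, the basic anatomy of a `SpatialExperiment` can be sketched with a small simulated object. Everything in this chunk (the counts, the `cellType` labels, the coordinates and the name `toy_spe`) is made up for illustration and is not part of the workshop data; it only assumes that the `SpatialExperiment` and `S4Vectors` packages are installed.

```r
library(SpatialExperiment)

set.seed(1)
n_genes <- 5
n_cells <- 20
cell_names <- paste0("cell", seq_len(n_cells))

# Hypothetical toy data: random counts with genes in rows and cells in columns.
counts <- matrix(
  rpois(n_genes * n_cells, lambda = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)), cell_names)
)

# Per-cell annotation (made-up cell types) and made-up x/y positions.
cell_info <- S4Vectors::DataFrame(
  cellType = rep(c("A", "B"), each = n_cells / 2),
  row.names = cell_names
)
coords <- cbind(x = runif(n_cells), y = runif(n_cells))
rownames(coords) <- cell_names

toy_spe <- SpatialExperiment(
  assays = list(counts = counts),
  colData = cell_info,
  spatialCoords = coords
)

dim(toy_spe)                  # features x cells
assayNames(toy_spe)           # which expression matrices are stored
head(spatialCoords(toy_spe))  # x/y location of each cell
table(toy_spe$cellType)       # colData columns are accessible with $

# Subsetting columns keeps assays, colData and coordinates in sync.
toy_A <- toy_spe[, toy_spe$cellType == "A"]
```

The same accessors (`dim`, `assayNames`, `spatialCoords`, `colData` and `$`) should apply to the real objects used in the rest of this section.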

### SeqFISH mouse embryo

Here we download the seqFISH mouse embryo data. This comes in the format
of a `SpatialExperiment` object, where all the data from an IMC dataset
can be compiled and accessed with relative ease.
of a `SpatialExperiment` object, where summarized information from an
imaging dataset can be compiled and accessed with relative ease.

```{r seqFISHData}
spe <- STexampleData::seqFISH_mouseEmbryo()
@@ -325,7 +332,7 @@ Try starting off your exploration by answering the below questions.

```{r kerenQ1}
# try to answer the above question using the imc object.
# you may want to check the SingleCellExperiment vignette.
# you may want to check the SpatialExperiment vignette.
# https://www.bioconductor.org/packages/release/bioc/vignettes/SpatialExperiment/inst/doc/SpatialExperiment.html
```
@@ -359,8 +366,11 @@ To load in our images we use the `loadImages` function from
example.

```{r loadImage5}
imageLocation <- system.file("extdata", "kerenPatient5.tiff", package = "ScdneySpatial")
image5 = cytomapper::loadImages(
x = system.file("extdata", "kerenPatient5.tiff", package = "ScdneySpatial")
x = imageLocation,
as.is = TRUE #Needed as 8-bit image
)
mcols(image5) = data.frame(list("imageID" = "kerenPatient5"))
@@ -371,7 +381,83 @@ channelNames(image5) = c("Au", "Background", "Beta catenin", "Ca", "CD11b", "CD1
```

### How do I perform segmentation in R?
::: question
**Questions**

1. What class is image5? Hint: class()
2. How many images and markers are in image5?
3. Challenge: What is the dimension of the image5 image?
:::

### Visualise an image

We can visualise this image to see what we have read in. Let's
highlight four markers.

```{r plotImage}
# Plot four marker channels of the first image as a colour composite.
cytomapper::plotPixels(
  image = image5[1],
  colour_by = c("CD45", "Pan-Keratin", "SMA", "dsDNA"),
  # Each channel is shown on a gradient from black to the named colour.
  colour = list(
    CD45 = c("black", "blue"),
    `Pan-Keratin` = c("black", "yellow"),
    SMA = c("black", "green"),
    dsDNA = c("black", "red")
  )
)
```

We can manipulate the brightness, contrast and gamma levels as follows.
See if you can do a better job.

```{r plotImage2}
# Plot the same four markers with different colours and intensity scaling.
cytomapper::plotPixels(
  image = image5[1],
  colour_by = c("CD45", "Pan-Keratin", "SMA", "dsDNA"),
  display = "single",
  colour = list(
    CD45 = c("black", "red"),
    `Pan-Keratin` = c("black", "yellow"),
    SMA = c("black", "green"),
    dsDNA = c("black", "blue")
  ),
  # Adjust the brightness, contrast and gamma of each channel.
  bcg = list(
    CD45 = c(0, 4, 1),
    `Pan-Keratin` = c(0, 3, 1),
    SMA = c(0, 2, 1),
    dsDNA = c(0, 2, 1)
  ),
  legend = NULL
)
```
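
To build some intuition for what the three numbers in each `bcg` entry do, here is a rough base R sketch. The helper `adjust_bcg` is hypothetical and is not a cytomapper function; it assumes the adjustment adds the brightness value, multiplies by the contrast value and then raises the result to the power of the gamma value, so check the `plotPixels` documentation for the exact behaviour.

```r
# Hypothetical helper illustrating a brightness/contrast/gamma adjustment.
# Assumed order of operations: add brightness, multiply by contrast,
# then raise to the power gamma.
adjust_bcg <- function(x, bcg = c(0, 1, 1)) {
  ((x + bcg[1]) * bcg[2])^bcg[3]
}

# A few made-up pixel intensities on a 0-1 scale.
pixels <- c(0, 0.1, 0.25, 0.5)

unchanged <- adjust_bcg(pixels)                 # c(0, 1, 1) leaves values as they are
stretched <- adjust_bcg(pixels, c(0, 4, 1))     # contrast of 4 multiplies intensities by 4
compressed <- adjust_bcg(pixels, c(0, 1, 0.5))  # gamma below 1 lifts dim pixels

# Assumption: values outside 0-1 saturate when drawn, so clip for display.
displayed <- pmin(pmax(stretched, 0), 1)
```

Under these assumptions a larger contrast value makes dim markers easier to see but pushes bright regions into saturation, which is the trade-off to keep in mind when tuning the values above.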

### Can we identify the cells in the image?

The `EBImage` package on Bioconductor provides a lot of useful functions
for manipulating imaging data in R. This includes functionality for
finding cells, a process called cell segmentation. Let's work through an
example from their vignette. This will use some functionality that
complements what you've already learnt.

We start by loading the images of nuclei and cell bodies. To visualise
the cells we overlay these images as the green and the blue channels of
a false-colour image. Notice that with `display` you can zoom!

```{r readEBImage}
nuc = readImage(system.file('images', 'nuclei.tif', package='EBImage'))
cel = readImage(system.file('images', 'cells.tif', package='EBImage'))
cells = rgbImage(green=1.5*cel, blue=nuc)
display(cells, all = TRUE)
```

We will next create a nuclei mask. The `nuc` channel contains
fluorescent intensities of a protein expressed in the nuclei of cells.
The nuclei mask will threshold this channel to separate signal from
noise and then clean this with som
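
The idea of thresholding the nuclear channel and then cleaning the result can be sketched with `EBImage` as follows. This is a minimal illustration that works on the first frame of the example image only; the window size, offset and brush size are illustrative values rather than tuned settings, and the names `nuc1` and `nmask` are our own.

```r
library(EBImage)

# Read the example nuclei image shipped with EBImage and take its first frame.
nuc <- readImage(system.file("images", "nuclei.tif", package = "EBImage"))
nuc1 <- getFrame(nuc, 1)

# Adaptive threshold: keep pixels that are brighter than their local
# neighbourhood average by more than the offset.
nmask <- thresh(nuc1, w = 10, h = 10, offset = 0.05)

# Morphological opening with a small disc removes isolated specks.
nmask <- opening(nmask, makeBrush(5, shape = "disc"))

# Fill holes inside the remaining objects.
nmask <- fillHull(nmask)

# Give each connected object its own integer label.
nmask <- bwlabel(nmask)

# The largest label equals the number of objects found in this frame.
max(nmask)
```

The labelled mask has the same dimensions as the input frame, with 0 for background and a distinct positive integer for each candidate nucleus.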

Images stored in a `list` or `CytoImageList` can be segmented using
`simpleSeg`. Below `simpleSeg` will identify the nuclei in the image
@@ -1827,7 +1913,8 @@ kerenCV_recurrence = crossValidate(
)
```

Again, using `performancePlot`, this time for recurrence, we found better performance in select spatial metrics.
Again, using `performancePlot`, this time for recurrence, we found
better performance in select spatial metrics.

```{r perfPlot-recurrence}
performancePlot(kerenCV_recurrence,
@@ -1842,4 +1929,3 @@ performancePlot(kerenCV_recurrence,
```{r sessionInfo}
sessionInfo()
```
