CellCheck is a toolkit for validating single-cell analysis with visualization.
You can use CellCheck to verify malignancy annotation, cell type annotation, scoring, and deconvolution results across different conditions. For example, you can compare the performance of different scRNA-seq analysis tools or of improved versions of an algorithm. You can also use it to optimize the parameters of these tools.
It can also be applied to other uses. For example, it can be extended to find biomarkers or gene signatures in databases such as TCGA.
CellCheck runs in the R statistical computing environment. You will need R version 3.6.3 or later to have access to the latest features.
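A quick way to confirm the version requirement before installing anything is to compare the running R version against 3.6.3. This is a minimal sketch using base R only; the variable names and the warning text are illustrative and not part of CellCheck.

```r
# Compare the running R version against the minimum stated above.
min_version <- "3.6.3"
r_ok <- getRversion() >= min_version

if (!r_ok) {
  warning("CellCheck expects R >= ", min_version,
          "; found ", as.character(getRversion()))
}
```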
## Check whether the required packages are installed; install any that are missing
Package.set <- c("tidyverse","caret","cvms","DescTools","devtools")
for (i in seq_along(Package.set)) {
  if (!requireNamespace(Package.set[i], quietly = TRUE)) {
    install.packages(Package.set[i])
  }
}
## Load Packages
# library(Seurat)
lapply(Package.set, library, character.only = TRUE)
rm(Package.set,i)
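## Install CellCheck
# Detach CellCheck first if an older version is already attached
if ("package:CellCheck" %in% search()) {
  detach("package:CellCheck", unload = TRUE)
}
# Install the CellCheck package from GitHub
devtools::install_github("Charlene717/CellCheck")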
# Load CellCheck
library(CellCheck)
CellCheck works with three types of data: binary, discrete multi-class, and continuous.
To try every function in CellCheck, load the following demo RData file and run the demo script.
## Load the simulated data frames ##
load("Create_simulation_datafrme3.RData")
## Run the demo script ##
source("Demo_CellCheck.R", echo = TRUE, max.deparse.length = 10000, encoding = "utf-8",
       print.eval = TRUE)
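The calls in the following sections pass Save.Path and ProjectName. If the demo data does not already define them, they can be set by hand. The sketch below assumes that Save.Path is an output directory and that ProjectName is a label for the run; both the values and that interpretation are assumptions, not documented CellCheck behavior.

```r
# Assumption: Save.Path is a folder where CellCheck writes its plots and tables.
Save.Path <- file.path(tempdir(), "CellCheck_demo_output")
if (!dir.exists(Save.Path)) {
  dir.create(Save.Path, recursive = TRUE)
}

# Assumption: ProjectName is a free-text label for this run.
ProjectName <- "CellCheck_demo"
```

Replace tempdir() with a permanent location if the outputs should survive the R session.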
Binary input data can be numeric or character.
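To illustrate the two encodings, the sketch below builds the same toy binary table once with numeric labels and once with character labels. The column names mirror the Actual and Predict2 names used in this README, but the data, the Normal/Malignant labels, and the exact layout are hypothetical; the real Simu_Bi.df may be organized differently.

```r
# Hypothetical toy data: one column of true labels and two columns of predictions.
toy_num.df <- data.frame(
  Actual   = c(1, 0, 1, 1, 0, 0),
  Predict1 = c(1, 0, 0, 1, 0, 1),
  Predict2 = c(1, 1, 1, 1, 0, 0)
)

# The same table with character labels: 0 becomes "Normal", 1 becomes "Malignant".
labels <- c("Normal", "Malignant")
toy_chr.df <- as.data.frame(
  lapply(toy_num.df, function(x) labels[x + 1]),
  stringsAsFactors = FALSE
)
```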
Run the code:
## For one prediction
CMPredSet.lt <- list(Actual = "Actual", Predict = "Predict2")
cm_Bi.lt <- CellCheck_Bi(Simu_Bi.df, Simu_Anno.df, Mode = "One", CMPredSet.lt,
                         Save.Path = Save.Path, ProjectName = ProjectName)
## For multiple predictions
Sum_Bi.df <- CellCheck_Bi(Simu_Bi.df, Simu_Anno.df, Mode = "Multiple",
                          Save.Path = Save.Path, ProjectName = ProjectName)
The output for binary data includes a confusion matrix (CM), a bar plot, and a line plot.
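For readers unfamiliar with confusion matrices, the following base R sketch shows how one is built for binary labels and how common metrics follow from it. It uses made-up labels and does not call CellCheck; which metrics CellCheck actually reports is not specified here.

```r
# Made-up true and predicted labels (1 = positive, 0 = negative).
actual  <- factor(c(1, 0, 1, 1, 0, 0), levels = c(0, 1))
predict <- factor(c(1, 1, 1, 1, 0, 0), levels = c(0, 1))

# Rows are predicted labels, columns are true labels.
cm <- table(Predicted = predict, Actual = actual)

TP <- cm["1", "1"]  # predicted positive, truly positive
TN <- cm["0", "0"]  # predicted negative, truly negative
FP <- cm["1", "0"]  # predicted positive, truly negative
FN <- cm["0", "1"]  # predicted negative, truly positive

accuracy    <- (TP + TN) / sum(cm)
sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
```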
Discrete multi-class input data can be numeric or character.
Run the code:
## For one prediction
DisMultCM.lt <- list(Actual = "Actual", Predict = "Predict2")
cm_DisMult.lt <- CellCheck_DisMult(Simu_DisMult.df, Simu_Anno.df, Mode = "One", DisMultCM.lt,
                                   Save.Path = Save.Path, ProjectName = ProjectName)
## For multiple predictions
Sum_DisMult.df <- CellCheck_DisMult(Simu_DisMult.df, Simu_Anno.df, Mode = "Multiple",
                                    Save.Path = Save.Path, ProjectName = ProjectName)
The output for discrete multi-class data includes a confusion matrix (CM), a bar plot, and a line plot.
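With more than two classes, the confusion matrix gains one row and one column per class, and per-class metrics are read off the diagonal. The base R sketch below uses hypothetical cell type labels and does not call CellCheck.

```r
# Hypothetical cell type labels; fixing the levels keeps rows and columns aligned.
cell_types <- c("T cell", "B cell", "Macrophage")
actual  <- factor(c("T cell", "T cell", "B cell", "B cell", "Macrophage", "Macrophage"),
                  levels = cell_types)
predict <- factor(c("T cell", "B cell", "B cell", "B cell", "Macrophage", "T cell"),
                  levels = cell_types)

# Rows are predicted labels, columns are true labels.
cm <- table(Predicted = predict, Actual = actual)

# Per-class recall: correctly labeled cells divided by all cells truly in that class.
recall <- diag(cm) / colSums(cm)

# Overall accuracy: fraction of cells on the diagonal.
overall_accuracy <- sum(diag(cm)) / sum(cm)
```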
Continuous input data must be numeric.
Run the code:
## For one index
BarMetricSet.lt <- list(XValue = "Type", Metrics = "RMSE", Group = "Tool")
cm_Conti.lt <- CellCheck_Conti(Simu_Bi.df, Simu_Anno.df, Mode = "One", BarMetricSet.lt,
                               Save.Path = Save.Path, ProjectName = ProjectName)
## For multiple indices
Sum_Conti.df <- CellCheck_Conti(Simu_Bi.df, Simu_Anno.df, Mode = "Multiple", BarMetricSet.lt,
                                Save.Path = Save.Path, ProjectName = ProjectName)
The output for continuous data includes a bar plot and a line plot.
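The continuous example above selects RMSE as its metric. As a reminder of what that measures, here is a minimal base R sketch of root mean squared error applied to made-up values from two hypothetical tools. The rmse helper and the data are illustrative; CellCheck's internal computation may differ.

```r
# Root mean squared error between true and predicted values.
rmse <- function(actual, predicted) {
  sqrt(mean((predicted - actual)^2))
}

# Made-up true values and predictions from two hypothetical tools.
actual <- c(0.20, 0.50, 0.30)
tool_A <- c(0.25, 0.45, 0.30)
tool_B <- c(0.40, 0.30, 0.30)

# Lower RMSE means predictions sit closer to the true values.
rmse_by_tool <- c(Tool_A = rmse(actual, tool_A),
                  Tool_B = rmse(actual, tool_B))
```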