From 259deffc6d9812aec29dcec9417c7d1fbccbb97f Mon Sep 17 00:00:00 2001 From: vertesy Date: Fri, 24 Nov 2023 17:06:27 +0100 Subject: [PATCH] Update README.md --- README.md | 579 +++++++++++++++++++++++++++++------------------------- 1 file changed, 307 insertions(+), 272 deletions(-) diff --git a/README.md b/README.md index 6204463..0e359c0 100644 --- a/README.md +++ b/README.md @@ -101,567 +101,602 @@ source("https://raw.githubusercontent.com/vertesy/Rocinante/master/R/Rocinante.R # List of Functions -## Seurat.Utils.R -Updated: 2023/07/22 12:18 +## Seurat.Utils.R (127) +Updated: 2023/11/24 16:40 - #### 1 `parallel.computing.by.future()` - Run gc(), load multi-session computing and extend memory limits. + parallel.computing.by.future. Run gc(), load multi-session computing and extend memory limits. - #### 2 `IntersectWithExpressed()` + IntersectWithExpressed. Intersect a set of genes with genes found in the Seurat object. - #### 3 `SmallestNonAboveX()` - replace small values with the next smallest value found, which is >X. # + SmallestNonAboveX. Replace small values with the next smallest value found, which is >X. - #### 4 `AreTheseCellNamesTheSame()` - Compare two character vectors (e.g.: cell IDs) how much they overlap and plot a Venn Diagramm. + AreTheseCellNamesTheSame. Compare how much two character vectors (e.g.: cell IDs) overlap and plot a Venn Diagram. - #### 5 `getProject()` - Try to get the project name you are wokring on in Rstudio. + getProject. Try to get the project name you are working on in RStudio. -- #### 6 `PlotFilters()` +- #### 6 `Create.MiscSlot()` - Plot filtering threshold and distributions, using four panels to highlight the relation between Gene- and UMI-count, ribosomal- and mitochondrial-content. + Shorten Clustering Names. This function takes in a string representing a clustering name, and shortens it according to specific rules. It replaces "snn_res." 
with "", "best.matching.names" with "bmatch", "ordered" with "ord", "ManualNames" with "mNames", and ".long" at the end of the string with ".L". -- #### 7 `seu.PC.var.explained()` +- #### 7 `calc.q99.Expression.and.set.all.genes()` - Determine percent of variation associated with each PC. For normal prcomp objects, see: PCA.percent.var.explained(). + calc.q99.Expression.and.set.all.genes. Calculate the gene expression of the e.g.: 90th quantile (expression in the top 10% cells). # -- #### 8 `seu.plot.PC.var.explained()` +- #### 8 `PlotFilters()` - Plot the percent of variation associated with each PC. + PlotFilters. Plot filtering threshold and distributions, using four panels to highlight the relation between Gene- and UMI-count, ribosomal- and mitochondrial-content. -- #### 9 `Percent.in.Trome()` +- #### 9 `seu.PC.var.explained()` - Gene expression as fraction of all UMI's + PCA percent of variation associated with each PC. Determine percent of variation associated with each PC. For normal prcomp objects, see: PCA.percent.var.explained(). -- #### 10 `gene.expression.level.plots()` +- #### 10 `seu.plot.PC.var.explained()` - Histogram of gene expression levels. + seu.plot.PC.var.explained. Plot the percent of variation associated with each PC. -- #### 11 `PrctCellExpringGene()` +- #### 11 `Percent.in.Trome()` - Function to calculate the proportion of cells expressing a given set of genes. + Percent.in.Trome. Gene expression as fraction of all UMI's -- #### 12 `ww.calc_helper()` +- #### 12 `gene.expression.level.plots()` - Helper function for PrctCellExpringGene() to calculate the proportion of cells in a Seurat object that express a given gene. + gene.expression.level.plots. Histogram of gene expression levels. -- #### 13 `scBarplot.FractionAboveThr()` +- #### 13 `PrctCellExpringGene()` - Create a bar plot showing the fraction of cells, within each cluster, that exceed a certain threshold based on a metadata column. + PrctCellExpringGene. 
Function to calculate the proportion of cells expressing a given set of genes. -- #### 14 `scBarplot.FractionBelowThr()` +- #### 14 `ww.calc_helper()` - Create a bar plot showing the fraction of cells, within each cluster, that are below a certain threshold based on a metadata column. + ww.calc_helper. Helper function for PrctCellExpringGene() to calculate the proportion of cells in a Seurat object that express a given gene. -- #### 15 `getClusterNames()` +- #### 15 `scBarplot.FractionAboveThr()` - Rename clustering in a Seurat object. + scBarplot.FractionAboveThr. Create a bar plot showing the fraction of cells, within each cluster, that exceed a certain threshold based on a metadata column. -- #### 16 `GetClusteringRuns()` +- #### 16 `scBarplot.FractionBelowThr()` - Get Clustering Runs: metadata column names # + scBarplot.FractionBelowThr. Create a bar plot showing the fraction of cells, within each cluster, that are below a certain threshold based on a metadata column. -- #### 17 `GetNamedClusteringRuns()` +- #### 17 `getClusterNames()` - Get Clustering Runs: metadata column names # + RenameClustering. Rename clustering in a Seurat object. -- #### 18 `GetOrderedClusteringRuns()` +- #### 18 `GetClusteringRuns()` - Get Clustering Runs: metadata column names # + GetClusteringRuns. Get Clustering Runs: metadata column names # -- #### 19 `GetNumberOfClusters()` +- #### 19 `GetNamedClusteringRuns()` - Get Number Of Clusters # + GetNamedClusteringRuns. Get Clustering Runs: metadata column names # -- #### 20 `calc.cluster.averages()` +- #### 20 `GetOrderedClusteringRuns()` - Calculates the average of a metadata column (numeric) per cluster. + GetOrderedClusteringRuns. Get Clustering Runs: metadata column names # -- #### 21 `plot.expression.rank.q90()` +- #### 21 `GetNumberOfClusters()` - Plot gene expression based on the expression at the 90th quantile (so you will not lose genes expressed in few cells). + GetNumberOfClusters. 
Get Number Of Clusters # -- #### 22 `set.mm()` +- #### 22 `calc.cluster.averages()` - Helps to find metadata columns. It creates a list with the names of of 'obj@meta.data'. + calc.cluster.averages. Calculates the average of a metadata column (numeric) per cluster. -- #### 23 `recall.all.genes()` +- #### 23 `plot.expression.rank.q90()` - all.genes set by calc.q99.Expression.and.set.all.genes() # + plot.expression.rank.q90. Plot gene expression based on the expression at the 90th quantile (so you will not lose genes expressed in few cells). -- #### 24 `recall.meta.tags.n.datasets()` +- #### 24 `set.mm()` - Recall meta.tags from obj@misc to "meta.tags" in the global environment. + set.mm. Helps to find metadata columns. It creates a list with the names of 'obj@meta.data'. -- #### 25 `recall.parameters()` +- #### 25 `ww.get.1st.Seur.element()` - Recall parameters from obj@misc to "p" in the global environment. + Get the First Seurat Object from a List of Seurat Objects. If provided with a list of Seurat objects, this function returns the first Seurat object in the list. If the input is a single Seurat object, it returns the object itself. It is assumed that all elements of the list are Seurat objects if the input is a list. -- #### 26 `recall.genes.ls()` +- #### 26 `recall.all.genes()` - Recall genes.ls from obj@misc to "genes.ls" in the global environment. + recall.all.genes. all.genes set by calc.q99.Expression.and.set.all.genes() # -- #### 27 `save.parameters()` +- #### 27 `recall.meta.tags.n.datasets()` - Save parameters to obj@misc$p + recall.meta.tags.n.datasets. Recall meta.tags from obj@misc to "meta.tags" in the global environment. -- #### 28 `subsetSeuObj()` +- #### 28 `recall.parameters()` - Subset a compressed Seurat object and save it in the working directory. + recall.parameters. Recall parameters from obj@misc to "p" in the global environment. 
-- #### 29 `subsetSeuObj.and.Save()` +- #### 29 `recall.genes.ls()` - Subset a compressed Seurat Obj and save it in wd. # + recall.genes.ls. Recall genes.ls from obj@misc to "genes.ls" in the global environment. -- #### 30 `subsetSeuObj.ident.class()` +- #### 30 `save.parameters()` - Subset a Seurat Obj to a given column + save.parameters. Save parameters to obj@misc$p -- #### 31 `Downsample.Seurat.Objects()` +- #### 31 `create_scCombinedMeta()` - Downsample a list of Seurat objects + Create Single-Cell Metadata Object for a collection of Seurat Objects. This function creates a metadata object to correspond to a list of single-cell experiments, for storing parent level information. It initializes the object with the experiment and project name, and the creation date. The created object is of class 'scMetadata_class'. -- #### 32 `Downsample.Seurat.Objects.PC()` +- #### 32 `subsetSeuObj()` - Downsample a list of Seurat objects, by fraction + subsetSeuObj. Subset a compressed Seurat object and save it in the working directory. -- #### 33 `remove.residual.small.clusters()` +- #### 33 `subsetSeuObj.and.Save()` - E.g.: after subsetting often some residual cells remain in clusters originally defined in the full dataset. + subsetSeuObj.and.Save. Subset a compressed Seurat Obj and save it in wd. # -- #### 34 `drop.levels.Seurat()` +- #### 34 `subsetSeuObj.ident.class()` - Drop unused levels from factor variables in a Seurat object. + subsetSeuObj.ident.class. Subset a Seurat Obj to a given column -- #### 35 `remove_clusters_and_drop_levels()` +- #### 35 `Downsample.Seurat.Objects()` - This function removes residual small clusters from specified Seurat objects and drops levels in factor-like metadata. + Downsample.Seurat.Objects. Downsample a list of Seurat objects -- #### 36 `remove.cells.by.UMAP()` +- #### 36 `Downsample.Seurat.Objects.PC()` - This function applies a cutoff in the specified dimension of a given dimension reduction (UMAP, PCA, or t-SNE) to remove cells. 
+ Downsample.Seurat.Objects.PC. Downsample a list of Seurat objects, by fraction -- #### 37 `FlipReductionCoordinates()` +- #### 37 `remove.residual.small.clusters()` - Flip reduction coordinates (like UMAP upside down). + remove.residual.small.clusters. E.g.: after subsetting often some residual cells remain in clusters originally defined in the full dataset. -- #### 38 `AutoNumber.by.UMAP()` +- #### 38 `dropLevelsSeurat()` - Relabel cluster numbers along a UMAP (or tSNE) axis # + dropLevelsSeurat. Drop unused levels from factor variables in a Seurat object. -- #### 39 `AutoNumber.by.PrinCurve()` +- #### 39 `remove_clusters_and_drop_levels()` - Relabel cluster numbers along the principal curve of 2 UMAP (or tSNE) dimensions. # + Remove Clusters and Drop Levels. This function removes residual small clusters from specified Seurat objects and drops levels in factor-like metadata. -- #### 40 `Add.DE.combined.score()` +- #### 40 `remove.cells.by.UMAP()` - Add a combined score to differential expression (DE) results. The score is calculated as log-fold change (LFC) times negative logarithm of scaled p-value (LFC * -log10( p_cutoff / pval_scaling )). + Remove Cells by Dimension Reduction. This function applies a cutoff in the specified dimension of a given dimension reduction (UMAP, PCA, or t-SNE) to remove cells. -- #### 41 `StoreTop25Markers()` +- #### 41 `FlipReductionCoordinates()` - Save the top 25 makers based on `avg_log2FC` output table of `FindAllMarkers()` (df_markers) under `@misc$df.markers$res...`. By default, it rounds up insignificant digits up to 3. # + FlipReductionCoordinates. Flip reduction coordinates (like UMAP upside down). -- #### 42 `StoreAllMarkers()` +- #### 42 `AutoNumber.by.UMAP()` - Save the output table of `FindAllMarkers()` (df_markers) under `@misc$df.markers$res...`. By default, it rounds up insignificant digits up to 3. # + AutoNumber.by.UMAP. 
Relabel cluster numbers along a UMAP (or tSNE) axis # -- #### 43 `GetTopMarkersDF()` +- #### 43 `AutoNumber.by.PrinCurve()` - Get the vector of N most diff. exp. genes. # + AutoNumber.by.PrinCurve. Relabel cluster numbers along the principal curve of 2 UMAP (or tSNE) dimensions. # -- #### 44 `GetTopMarkers()` +- #### 44 `Add.DE.combined.score()` - Get the vector of N most diff. exp. genes. # + Add.DE.combined.score. Add a combined score to differential expression (DE) results. The score is calculated as log-fold change (LFC) times negative logarithm of scaled p-value (LFC * -log10( p_cutoff / pval_scaling )). -- #### 45 `AutoLabelTop.logFC()` +- #### 45 `StoreTop25Markers()` - Create a new "named identity" column in the metadata of a Seurat object, with `Ident` set to a clustering output matching the `res` parameter of the function. It requires the output table of `FindAllMarkers()`. If you used `StoreAllMarkers()` is stored under `@misc$df.markers$res...`, which location is assumed by default. # + StoreTop25Markers. Save the top 25 markers based on `avg_log2FC` output table of `FindAllMarkers()` (df_markers) under `@misc$df.markers$res...`. By default, it rounds up insignificant digits up to 3. # -- #### 46 `scEnhancedVolcano()` +- #### 46 `StoreAllMarkers()` - Creates a new "named identity" column in the metadata of a Seurat object, setting 'Ident' to a clustering output matching the 'res' parameter. This function requires the output table of `FindAllMarkers()`. If you used `StoreAllMarkers()`, the output is stored under `@misc$df.markers$res...`, which is the default location. + StoreAllMarkers. Save the output table of `FindAllMarkers()` (df_markers) under `@misc$df.markers$res...`. By default, it rounds up insignificant digits up to 3. -- #### 47 `BulkGEScatterPlot()` +- #### 47 `GetTopMarkersDF()` - Plots scatterplots of bulk gene expression to identify differentially expressed genes across conditions. + GetTopMarkersDF. Get the vector of N most diff. exp. 
genes. # -- #### 48 `get.clustercomposition()` +- #### 48 `GetTopMarkers()` - Get cluster composition: which datasets contribute to each cluster? + GetTopMarkers. Get the vector of N most diff. exp. genes. # -- #### 49 `scBarplot.CellFractions()` +- #### 49 `AutoLabelTop.logFC()` - Generates a bar plot of cell fractions per cluster. + AutoLabelTop.logFC. Create a new "named identity" column in the metadata of a Seurat object, with `Ident` set to a clustering output matching the `res` parameter of the function. It requires the output table of `FindAllMarkers()`. If you used `StoreAllMarkers()`, the output is stored under `@misc$df.markers$res...`, which is the location assumed by default. -- #### 50 ` scBarplot.CellsPerCluster()` +- #### 50 `AutoLabel.KnownMarkers()` - Barplot the Fraction of cells per cluster. (dupl?) + AutoLabel.KnownMarkers. Creates a new "named identity" column in the metadata of a Seurat object, setting 'Ident' to a clustering output matching the 'res' parameter. This function requires the output table of `FindAllMarkers()`. If you used `StoreAllMarkers()`, the output is stored under `@misc$df.markers$res...`, which is the default location. -- #### 51 `scBarplot.CellsPerObject()` +- #### 51 `scEnhancedVolcano()` - Creates a bar plot for the number of cells per object from a list of Seurat objects. + scEnhancedVolcano. This function creates an enhanced volcano plot. -- #### 52 `plot.clust.size.distr()` +- #### 52 `BulkGEScatterPlot()` - Creates a bar plot or histogram of the cluster size distribution from a given Seurat object. + BulkGEScatterPlot. Plots scatterplots of bulk gene expression to identify differentially expressed genes across conditions. -- #### 53 `gg_color_hue()` +- #### 53 `get.clustercomposition()` - Emulates the default color palette of ggplot2. Source: https://stackoverflow.com/questions/8197559/emulate-ggplot2-default-color-palette + get.clustercomposition. Get cluster composition: which datasets contribute to each cluster? 
-- #### 54 `getDiscretePalette()` +- #### 54 `scBarplot.CellFractions()` - Generate a discrete color palette. + Generate Barplot of Cell Fractions. This function generates a bar plot of cell fractions per cluster from a Seurat object. It offers the option to downsample data, which equalizes the number of cells in each group to the number in the smallest group. The plot's bars are grouped by one variable and filled by another. -- #### 55 `getClusterColors()` +- #### 55 ` scBarplot.CellsPerCluster()` - get Seurat's cluster colors. + scBarplot.CellsPerCluster. Barplot the Fraction of cells per cluster. (dupl?) -- #### 56 `SeuratColorVector()` +- #### 56 `scBarplot.CellsPerObject()` - Recall a Seurat color vector. + scBarplot.CellsPerObject. Creates a bar plot for the number of cells per object from a list of Seurat objects. -- #### 57 `plot.GeneExpHist()` +- #### 57 `plot.clust.size.distr()` - Generates and optionally saves a scatter plot of two features from a Seurat object. + plot.clust.size.distr. Creates a bar plot or histogram of the cluster size distribution from a given Seurat object. -- #### 58 `qUMAP()` +- #### 58 `gg_color_hue()` - The quickest way to draw a gene expression UMAP. + gg_color_hue. Emulates the default color palette of ggplot2. Source: https://stackoverflow.com/questions/8197559/emulate-ggplot2-default-color-palette -- #### 59 `clUMAP()` +- #### 59 `getDiscretePalette()` - The quickest way to draw a clustering result UMAP. + getDiscretePalette. Generate a discrete color palette. -- #### 60 `umapNamedClusters()` +- #### 60 `getClusterColors()` - Plot and save umap based on a metadata column. # + getClusterColors. get Seurat's cluster colors. -- #### 61 `umapHiLightSel()` +- #### 61 `SeuratColorVector()` - Generates a UMAP plot from a Seurat object with a subset of cells highlighted. + SeuratColorVector. Recall a Seurat color vector. 
-- #### 62 `multiFeaturePlot.A4()` +- #### 62 `plotGeneExpHist()` - Save multiple FeaturePlots, as jpeg, on A4 for each gene, which are stored as a list of gene names. + qFeatureScatter. Generates and optionally saves a scatter plot of two features from a Seurat object. -- #### 63 `multiFeatureHeatmap.A4()` +- #### 63 `qUMAP()` - Save multiple FeatureHeatmaps from a list of genes on A4 jpeg. + qUMAP. The quickest way to draw a gene expression UMAP. -- #### 64 `plot.UMAP.tSNE.sidebyside()` +- #### 64 `clUMAP()` - Plot a UMAP and tSNE side by side. + clUMAP. The quickest way to draw a clustering result UMAP. -- #### 65 `PlotTopGenesPerCluster()` +- #### 65 `umapNamedClusters()` - Plot the top N diff. exp. genes in each cluster. + umapNamedClusters. Plot and save umap based on a metadata column. # -- #### 66 `qQC.plots.BrainOrg()` +- #### 66 `umapHiLightSel()` - Quickly plot key QC markers in brain organoids + umapHiLightSel. Generates a UMAP plot from a Seurat object with a subset of cells highlighted. -- #### 67 `qMarkerCheck.BrainOrg()` +- #### 67 `multiFeaturePlot.A4()` - Quickly plot key markers in brain organoids + multiFeaturePlot.A4. Save multiple FeaturePlots, as jpeg, on A4 for each gene, which are stored as a list of gene names. -- #### 68 `PlotTopGenes()` +- #### 68 `multiFeatureHeatmap.A4()` - Plot the highest expressed genes on umaps, in a subfolder. Requires calling calc.q99.Expression.and.set.all.genes before. # + multiFeatureHeatmap.A4. Save multiple FeatureHeatmaps from a list of genes on A4 jpeg. -- #### 69 `DimPlot.ClusterNames()` +- #### 69 `plot.UMAP.tSNE.sidebyside()` - Plot UMAP with Cluster names. # + plot.UMAP.tSNE.sidebyside. Plot a UMAP and tSNE side by side. -- #### 70 `save2umaps.A4()` +- #### 70 `PlotTopGenesPerCluster()` - Save 2 umaps on 1 A4 + PlotTopGenesPerCluster. Plot the top N diff. exp. genes in each cluster. -- #### 71 `save4umaps.A4()` +- #### 71 `qQC.plots.BrainOrg()` - Save 4 umaps on 1 A4 + qQC.plots.BrainOrg. 
Quickly plot key QC markers in brain organoids -- #### 72 `qqSaveGridA4()` +- #### 72 `qMarkerCheck.BrainOrg()` - Saves a grid of 2 or 4 ggplot objects onto an A4 page. + qMarkerCheck.BrainOrg. Quickly plot key markers in brain organoids -- #### 73 `ww.check.if.3D.reduction.exist()` +- #### 73 `PlotTopGenes()` - ww.check.if.3D.reduction.exist in backup slot # + PlotTopGenes. Plot the highest expressed genes on umaps, in a subfolder. Requires calling calc.q99.Expression.and.set.all.genes before. # -- #### 74 `ww.check.quantile.cutoff.and.clip.outliers()` +- #### 74 `DimPlot.ClusterNames()` - Function to check a specified quantile cutoff and clip outliers from a given expression vector. + DimPlot.ClusterNames. Plot UMAP with Cluster names. # -- #### 75 `plot3D.umap.gene()` +- #### 75 `save2umaps.A4()` - Plot a 3D umap with gene expression. Uses plotly. Based on github.com/Dragonmasterx87. + save2umaps.A4. Save 2 umaps on 1 A4 -- #### 76 `plot3D.umap()` +- #### 76 `save4umaps.A4()` - Plot a 3D umap based on one of the metadata columns. Uses plotly. Based on github.com/Dragonmasterx87. + save4umaps.A4. Save 4 umaps on 1 A4 -- #### 77 `SavePlotlyAsHtml()` +- #### 77 `qqSaveGridA4()` - Save a Plotly 3D scatterplot as an HTML file. + qqSaveGridA4. Saves a grid of 2 or 4 ggplot objects onto an A4 page. -- #### 78 `BackupReduction()` +- #### 78 `ww.check.if.3D.reduction.exist()` - Backup UMAP to `obj@misc$reductions.backup` from `obj@reductions$umap`. # + ww.check.if.3D.reduction.exist. ww.check.if.3D.reduction.exist in backup slot # -- #### 79 `SetupReductionsNtoKdimensions()` +- #### 79 `ww.check.quantile.cutoff.and.clip.outliers()` - Function to compute dimensionality reductions for a given Seurat object and backup the computed reductions. + ww.check.quantile.cutoff.and.clip.outliers. Function to check a specified quantile cutoff and clip outliers from a given expression vector. 
-- #### 80 `RecallReduction()` +- #### 80 `plot3D.umap.gene()` - Set active UMAP to `obj@reductions$umap` from `obj@misc$reductions.backup`. # + plot3D.umap.gene. Plot a 3D umap with gene expression. Uses plotly. Based on github.com/Dragonmasterx87. -- #### 81 `Annotate4Plotly3D()` +- #### 81 `plot3D.umap()` - Create annotation labels for 3D plots. Source https://plot.ly/r/text-and-annotations/#3d-annotations. + plot3D.umap. Plot a 3D umap based on one of the metadata columns. Uses plotly. Based on github.com/Dragonmasterx87. -- #### 82 `Plot3D.ListOfGenes()` +- #### 82 `SavePlotlyAsHtml()` - Plot and save list of 3D UMAP or tSNE plots using plotly. + SavePlotlyAsHtml. Save a Plotly 3D scatterplot as an HTML file. -- #### 83 `Plot3D.ListOfCategories()` +- #### 83 `BackupReduction()` - This function plots and saves a list of 3D UMAP or tSNE plots using plotly. + BackupReduction. Backup UMAP to `obj@misc$reductions.backup` from `obj@reductions$umap`. # -- #### 84 `# sparse.cor4()` +- #### 84 `SetupReductionsNtoKdimensions()` - Calculate a sparse correlation matrix. + SetupReductionsNtoKdimensions. Function to compute dimensionality reductions for a given Seurat object and backup the computed reductions. -- #### 85 `Calc.Cor.Seurat()` +- #### 85 `RecallReduction()` - Calculate gene correlation on a Seurat object. + RecallReduction. Set active UMAP to `obj@reductions$umap` from `obj@misc$reductions.backup`. # -- #### 86 `plot.Gene.Cor.Heatmap()` +- #### 86 `Annotate4Plotly3D()` - Plot a gene correlation heatmap. + Annotate4Plotly3D. Create annotation labels for 3D plots. Source https://plot.ly/r/text-and-annotations/#3d-annotations. -- #### 87 `prefix_cells_seurat()` +- #### 87 `Plot3D.ListOfGenes()` - This function adds prefixes from 'obj_IDs' to cell names in Seurat S4 objects from 'ls_obj' + Plot3D.ListOfGenes. Plot and save list of 3D UMAP or tSNE plots using plotly. 
-- #### 88 `find_prefix_in_cell_IDs()` +- #### 88 `Plot3D.ListOfCategories()` - This function checks if a prefix has been added to the standard cell-IDs (16 characters of A,T,C,G) in a Seurat object. If so, it prints the number of unique prefixes found, issues a warning if more than one unique prefix is found, and returns the identified prefix(es). + Plot3D.ListOfCategories. This function plots and saves a list of 3D UMAP or tSNE plots using plotly. -- #### 89 `seu.Make.Cl.Label.per.cell()` +- #### 89 `# sparse.cor4()` - Take a named vector (of e.g. values ="gene names", names = clusterID), and a vector of cell-IDs and make a vector of "GeneName.ClusterID". + sparse.cor. Calculate a sparse correlation matrix. -- #### 90 `GetMostVarGenes()` +- #### 90 `Calc.Cor.Seurat()` - Get the N most variable Genes + Calc.Cor.Seurat. Calculate gene correlation on a Seurat object. -- #### 91 `gene.name.check()` +- #### 91 `plot.Gene.Cor.Heatmap()` - Check gene names in a seurat object, for naming conventions (e.g.: mitochondrial reads have - or .). Use for reading .mtx & writing .rds files. # + plot.Gene.Cor.Heatmap. Plot a gene correlation heatmap. -- #### 92 `check.genes()` +- #### 92 `prefix_cells_seurat()` - Check if a gene name exists in a Seurat object, or in HGNC? + prefix_cells_seurat. This function adds prefixes from 'obj_IDs' to cell names in Seurat S4 objects from 'ls_obj' -- #### 93 `fixZeroIndexing.seurat()` +- #### 93 `find_prefix_in_cell_IDs()` - Fix zero indexing in seurat clustering, to 1-based indexing. replace zero indexed clusternames. + Check Prefix in Seurat Object Cell IDs. This function checks if a prefix has been added to the standard cell-IDs (16 characters of A,T,C,G) in a Seurat object. If so, it prints the number of unique prefixes found, issues a warning if more than one unique prefix is found, and returns the identified prefix(es). 
-- #### 94 `CalculateFractionInTrome()` +- #### 94 `seu.Make.Cl.Label.per.cell()` - This function calculates the fraction of a set of genes within the full transcriptome of each cell. + seu.Make.Cl.Label.per.cell. Take a named vector (of e.g. values ="gene names", names = clusterID), and a vector of cell-IDs and make a vector of "GeneName.ClusterID". -- #### 95 `AddNewAnnotation()` +- #### 95 `GetMostVarGenes()` - This function creates a new metadata column based on an existing metadata column and a list of mappings (name <- IDs). + GetMostVarGenes. Get the N most variable Genes -- #### 96 `whitelist.subset.ls.Seurat()` +- #### 96 `gene.name.check()` - Subsets cells in a list of Seurat objects based on an externally provided list of cell IDs. + gene.name.check. Check gene names in a seurat object, for naming conventions (e.g.: mitochondrial reads have - or .). Use for reading .mtx & writing .rds files. # -- #### 97 `FindCorrelatedGenes()` +- #### 97 `check.genes()` - Find correlated genes in a Seurat object + check.genes. Check if a gene name exists in a Seurat object, or in HGNC? -- #### 98 `UpdateGenesSeurat()` +- #### 98 `fixZeroIndexing.seurat()` - Update genes symbols that are stored in a Seurat object. It returns a data frame. The last column are the updated gene names. + fixZeroIndexing.seurat. Fix zero indexing in seurat clustering, to 1-based indexing. replace zero indexed clusternames. -- #### 99 ` check_and_rename()` +- #### 99 `CalculateFractionInTrome()` - Replace gene names in different slots of a Seurat object. Run this before integration. Run this before integration. It only changes obj@assays$RNA@counts, @data and @scale.data. # + CalculateFractionInTranscriptome. This function calculates the fraction of a set of genes within the full transcriptome of each cell. -- #### 100 `RemoveGenesSeurat()` +- #### 100 `AddNewAnnotation()` - Replace gene names in different slots of a Seurat object. Run this before integration. Run this before integration. 
It only changes metadata; obj@assays$RNA@counts, @data and @scale.data. # + AddNewAnnotation. This function creates a new metadata column based on an existing metadata column and a list of mappings (name <- IDs). -- #### 101 `HGNC.EnforceUnique()` +- #### 101 `whitelist.subset.ls.Seurat()` - Enforce Unique names after HGNC symbol update. + whitelist.subset.ls.Seurat. Subsets cells in a list of Seurat objects based on an externally provided list of cell IDs. -- #### 102 `GetUpdateStats()` +- #### 102 `FindCorrelatedGenes()` - Plot the Symbol-update statistics. Works on the data frame returned by `UpdateGenesSeurat()`. # + FindCorrelatedGenes. Find correlated genes in a Seurat object -- #### 103 `PlotUpdateStats()` +- #### 103 `UpdateGenesSeurat()` - Creates a scatter plot of update statistics. + UpdateGenesSeurat. Update gene symbols that are stored in a Seurat object. It returns a data frame. The last column contains the updated gene names. -- #### 104 `calculate.observable.multiplet.rate.10X.LT()` +- #### 104 ` check_and_rename()` - Calculate the observable multiplet rate for 10X standard lane. + RenameGenesSeurat. Replace gene names in different slots of a Seurat object. Run this before integration. It only changes obj@assays$RNA@counts, @data and @scale.data. # -- #### 105 `SNP.demux.fix.GT.table()` +- #### 105 `RemoveGenesSeurat()` - This function cleans and standardizes a Genotype assignment table obtained from the SoupOrCell tool. + RemoveGenesSeurat. Replace gene names in different slots of a Seurat object. Run this before integration. It only changes metadata; obj@assays$RNA@counts, @data and @scale.data. # -- #### 106 `Convert10Xfolders()` +- #### 106 `HGNC.EnforceUnique()` - This function takes a parent directory with a number of subfolders, each containing the standard output of 10X Cell Ranger. 
It (1) loads the filtered data matrices, (2) converts them to Seurat objects, and (3) saves them as .RDS files. + HGNC.EnforceUnique. Enforce Unique names after HGNC symbol update. -- #### 107 `ConvertDropSeqfolders()` +- #### 107 `GetUpdateStats()` - This function takes a parent directory with a number of subfolders, each containing the standard output of 10X Cell Ranger. It (1) loads the filtered data matrices, (2) converts them to Seurat objects, and (3) saves them as .RDS files. + GetUpdateStats. Plot the Symbol-update statistics. Works on the data frame returned by `UpdateGenesSeurat()`. # -- #### 108 `LoadAllSeurats()` +- #### 108 `PlotUpdateStats()` - This function loads all Seurat objects found in a directory. It also works with symbolic links (but not with aliases). + PlotUpdateStats. Creates a scatter plot of update statistics. -- #### 109 `read10x()` +- #### 109 `calculate.observable.multiplet.rate.10X.LT()` - This function reads a 10X dataset from gzipped matrix.mtx, features.tsv and barcodes.tsv files. + calculate.observable.multiplet.rate.10X.LT. Calculate the observable multiplet rate for 10X standard lane. -- #### 110 `load10Xv3()` +- #### 110 `SNP.demux.fix.GT.table()` - Load 10X output folders. + SNP.demux.fix.GT.table. This function cleans and standardizes a Genotype assignment table obtained from the SoupOrCell tool. -- #### 111 `saveRDS.compress.in.BG()` +- #### 111 `Convert10Xfolders()` - Save and RDS object and compress resulting file in the background using system(gzip). OS X or unix. + Convert10Xfolders. This function takes a parent directory with a number of subfolders, each containing the standard output of 10X Cell Ranger. It (1) loads the filtered data matrices, (2) converts them to Seurat objects, and (3) saves them as .RDS files. -- #### 112 `isave.RDS()` +- #### 112 `ConvertDropSeqfolders()` - Save an RDS object, using a faster and efficient compression method that runs in the background. + ConvertDropSeqfolders. 
This function takes a parent directory with a number of subfolders, each containing the standard output of 10X Cell Ranger. It (1) loads the filtered data matrices, (2) converts them to Seurat objects, and (3) saves them as .RDS files. -- #### 113 `isave.image()` +- #### 113 `LoadAllSeurats()` - Save an image of the current workspace using a faster and efficient compression method that runs in the background. + LoadAllSeurats. This function loads all Seurat objects found in a directory. It also works with symbolic links (but not with aliases). -- #### 114 `qsave.image()` +- #### 114 `read10x()` - Faster saving of workspace, and compression outside R, when it can run in the background. Seemingly quite CPU hungry and not very efficient compression. # + read10x. This function reads a 10X dataset from gzipped matrix.mtx, features.tsv and barcodes.tsv files. -- #### 115 `clip10Xcellname()` +- #### 115 `load10Xv3()` - Clip all suffices after underscore (10X adds it per chip-lane, Seurat adds in during integration). # + load10Xv3. Load 10X output folders. -- #### 116 `make10Xcellname()` +- #### 116 `.saveRDS.compress.in.BG()` - Add a suffix to cell names, so that it mimics the lane-suffix, e.g.: "_1". # + .saveRDS.compress.in.BG. Save an RDS object and compress the resulting file in the background using system(gzip). OS X or unix. -- #### 117 `plotTheSoup()` +- #### 117 `xread()` - Plot stats about the ambient RNA content in a 10X experiment. + isave.RDS. Save an RDS object, using a faster and efficient compression method that runs in the background. -- #### 118 `jJaccardIndexVec()` +- #### 118 `isave.image()` - Calculate jaccard similarity for 2 vecotrs. Helper to jPairwiseJaccardIndexList. + isave.image. Save an image of the current workspace using a faster and efficient compression method that runs in the background. 
-- #### 119 `jPairwiseJaccardIndexList()` +- #### 119 `qsave.image()` - Create a pairwise jaccard similarity matrix across all combinations of columns in binary.presence.matrix. Modified from: https://www.displayr.com/how-to-calculate-jaccard-coefficients-in-displayr-using-r/ # + Save workspace - qsave.image. Faster saving of workspace, and compression outside R, when it can run in the background. Seemingly quite CPU hungry and not very efficient compression. # -- #### 120 `jPresenceMatrix()` +- #### 120 `clip10Xcellname()` - Make a binary presence matrix from a list. Source: https://stackoverflow.com/questions/56155707/r-how-to-create-a-binary-relation-matrix-from-a-list-of-strings # + clip10Xcellname. Clip all suffices after underscore (10X adds it per chip-lane, Seurat adds in during integration). # + +- #### 121 `make10Xcellname()` + + make10Xcellname. Add a suffix to cell names, so that it mimics the lane-suffix, e.g.: "_1". # + +- #### 122 `plotTheSoup()` + + plotTheSoup. Plot stats about the ambient RNA content in a 10X experiment. + +- #### 123 `jJaccardIndexVec()` + + jJaccardIndexVec. Calculate jaccard similarity for 2 vecotrs. Helper to jPairwiseJaccardIndexList. + +- #### 124 `jPairwiseJaccardIndexList()` + + jPairwiseJaccardIndexList. Create a pairwise jaccard similarity matrix across all combinations of columns in binary.presence.matrix. Modified from: https://www.displayr.com/how-to-calculate-jaccard-coefficients-in-displayr-using-r/ # + +- #### 125 `jPresenceMatrix()` + + jPresenceMatrix. Make a binary presence matrix from a list. Source: https://stackoverflow.com/questions/56155707/r-how-to-create-a-binary-relation-matrix-from-a-list-of-strings # + +- #### 126 `jJaccardIndexBinary()` + + jJaccardIndexBinary. Calculate Jaccard Index. Modified from: https://www.displayr.com/how-to-calculate-jaccard-coefficients-in-displayr-using-r/ # + +- #### 127 `jPairwiseJaccardIndex()` + + jPairwiseJaccardIndex. 
Create a pairwise jaccard similarity matrix across all combinations of columns in binary.presence.matrix. Modified from: https://www.displayr.com/how-to-calculate-jaccard-coefficients-in-displayr-using-r/ # -- #### 121 `jJaccardIndexBinary()` - Calculate Jaccard Index. Modified from: https://www.displayr.com/how-to-calculate-jaccard-coefficients-in-displayr-using-r/ # ------------------------------------------------------------------------------------------------------------ ## Seurat.Utils.Metadata.R -Updated: 2023/07/22 12:18 + - #### 1 `meta_col_exists()` - This function checks whether a given column exists in the meta.data of a Seurat object. + Check if a Column Exists in the Metadata of an S4 Object. This function checks whether a given column exists in the meta.data of a Seurat object. -- #### 2 `getMedianMetric()` +- #### 2 `getMetadataColumn()` - Get the median values of different columns in meta.data, can iterate over a list of Seurat objects. + getMetadataColumn. Retrieves a specified metadata column from a Seurat object and returns it as a named vector. -- #### 3 `add.meta.tags()` +- #### 3 `get_levels_seu()` - Add metadata tags to a Seurat object dataset. + Get Unique Levels of a Seurat Object Ident Slot. This function extracts the unique levels present in the 'ident' slot of a Seurat object. The function throws an error if the number of levels exceeds 'max_levels'. The function optionally prints the R code to recreate the 'Levels' vector using 'dput'. -- #### 4 `add.meta.fraction()` +- #### 4 `getMedianMetric()` - Add a new metadata column to a Seurat object, representing the fraction of a gene set in the transcriptome (expressed as a percentage). + getMedianMetric. Get the median values of different columns in meta.data, can iterate over a list of Seurat objects. -- #### 5 `seu.RemoveMetadata()` +- #### 5 `getCellIDs.from.meta()` - Remove specified metadata columns from a Seurat object. + getCellIDs.from.meta. 
Retrieves cell IDs from a specified metadata column of a Seurat object, where the cell ID matches a provided list of values. The matching operation uses the `%in%` operator. -- #### 6 `getMetadataColumn()` +- #### 6 `seu.add.meta.from.vector()` - Retrieves a specified metadata column from a Seurat object and returns it as a named vector. + seu.add.meta.from.vector. Adds a new metadata column to a Seurat object. - #### 7 `create.metadata.vector()` - Adds a new metadata column to a Seurat object. + Create a Metadata Vector. This function creates a metadata vector from an input vector and a Seurat object. The resulting vector contains values from 'vec' for the intersecting cell names between 'vec' and 'obj'. It also checks if the intersection between the cell names in 'vec' and 'obj' is more than a minimum intersection size. -- #### 8 `seu.map.and.add.new.ident.to.meta()` +- #### 8 `add.meta.fraction()` - Adds a new metadata column to a Seurat object based on an identity mapping table. + add.meta.fraction. Add a new metadata column to a Seurat object, representing the fraction of a gene set in the transcriptome (expressed as a percentage). -- #### 9 `getCellIDs.from.meta()` +- #### 9 `add.meta.tags()` - Retrieves cell IDs from a specified metadata column of a Seurat object, where the cell ID matches a provided list of values. The matching operation uses the `%in%` operator. + add.meta.tags. Add metadata tags to a Seurat object dataset. - #### 10 `seu.add.meta.from.table()` - Add multiple new metadata columns to a Seurat object from a table. # + seu.add.meta.from.table. Add multiple new metadata columns to a Seurat object from a table. # + +- #### 11 `seu.map.and.add.new.ident.to.meta()` + + seu.map.and.add.new.ident.to.meta. Adds a new metadata column to a Seurat object based on an identity mapping table. + +- #### 12 `fix.orig.ident()` + + fix.orig.ident. Remove the string "filtered_feature_bc_matrix." from "orig.ident". Helper function. 
-- #### 11 `sampleNpc()` +- #### 13 `seu.RemoveMetadata()` - This function samples a specified percentage of a dataframe (specifically a subset of the metadata of a Seurat object) and returns the corresponding cell IDs. + seu.RemoveMetadata. Remove specified metadata columns from a Seurat object. -- #### 12 `calc.q99.Expression.and.set.all.genes()` +- #### 14 `sampleNpc()` - Calculate the gene expression of the e.g.: 90th quantile (expression in the top 10% cells). # + Sample N % of a dataframe (obj@metadata), and return rownames (cell IDs).. This function samples a specified percentage of a dataframe (specifically a subset of the metadata of a Seurat object) and returns the corresponding cell IDs. -- #### 13 `fix.orig.ident()` +- #### 15 `set.all.genes()` - Remove the string "filtered_feature_bc_matrix." from "orig.ident". Helper function. + set.all.genes. It is just a reminder to use calc.q99.Expression.and.set.all.genes to create the all.genes variable -- #### 14 `Create.MiscSlot()` +- #### 16 `plotMetadataCorHeatmap()` - It is just a reminder to use calc.q99.Expression.and.set.all.genes to create the all.genes variable + Plot Metadata Correlation Heatmap. This function plots a heatmap of metadata correlation values. It accepts a Seurat object and a set of metadata columns to correlate. The correlations are calculated using either Pearson or Spearman methods, and the resulting heatmap can include the principal component (PCA) values and be saved with a specific suffix. -- #### 15 `transfer_labels_seurat()` +- #### 17 `heatmap_calc_clust_median()` - Function to transfer labels from a reference Seurat object to a query Seurat object using anchoring and transfer data methods from the Seurat package. It then visualizes the reference and the combined objects using Uniform Manifold Approximation and Projection (UMAP). + Calculate and plot heatmap of cluster medians. 
This function calculates the median of specified variables in a dataframe, grouped by a column ('ident'). The function also provides an option to scale the medians, subset the ident levels, and either return a matrix of median values or plot a heatmap. -- #### 16 `match_best_identity()` +- #### 18 `plotMetadataCategPie()` - This function matches the best identity from `ident_from` to `ident_to` in an object, updates the metadata of the object with this new identity and returns the updated object. Additionally, it generates a UMAP plot based on the new identity. The function replaces original categories with the most frequent ones, hence helps to filter out less important categories. + plotMetadataMedianFractionBarplot. Generates a barplot of metadata median values. -- #### 17 `replace_by_most_frequent_categories()` +- #### 19 `transfer_labels_seurat()` - This function replaces each category in a query column of a data frame with the most frequently corresponding category in a reference column. It calculates the assignment quality, reports it, and optionally plots it. + Transfer Labels in Seurat. Function to transfer labels from a reference Seurat object to a query Seurat object using anchoring and transfer data methods from the Seurat package. It then visualizes the reference and the combined objects using Uniform Manifold Approximation and Projection (UMAP). -- #### 18 `plot.Metadata.Cor.Heatmap()` +- #### 20 `match_best_identity()` - Plots a heatmap of metadata correlation values. + Match and Translate Best Identity. This function matches the best identity from `ident_from` to `ident_to` in an object, updates the metadata of the object with this new identity and returns the updated object. Additionally, it generates a UMAP plot based on the new identity. The function replaces original categories with the most frequent ones, hence helps to filter out less important categories.