diff --git a/docs/decoupler_api_doc.rst b/docs/decoupler_api_doc.rst
index baaa7fd..d690e1b 100644
--- a/docs/decoupler_api_doc.rst
+++ b/docs/decoupler_api_doc.rst
@@ -3,8 +3,8 @@ sphinx-quickstart on Wed Apr 27 09:20:15 2022.
 You can adapt this file completely to your liking, but it should at least
 contain the root `toctree` directive.
 
-gssnng using the decoupler-omnipath api
-=======================================
+The decoupler-omnipath api
+==========================
 
 Gene Set Scoring on the Nearest Neighbor Graph (gssnng) for Single Cell RNA-seq (scRNA-seq).
 
@@ -13,13 +13,8 @@ Gene Set Scoring on the Nearest Neighbor Graph (gssnng) for Single Cell RNA-seq
    :caption: Table of Contents
    :maxdepth: 1
 
-
-`**Notebook using gmt files** `_
-
 `**Notebook using Decoupler/Omnipath style API** `_
 
-`**Notebook for creating smoothed count matrices**`_
-
 `**See the paper** `_
 
 This package works with AnnData objects stored as h5ad files. Expression values are taken from adata.X.
@@ -102,23 +97,23 @@ Scoring Functions
 
 The list of scoring functions::
 
-    geneset_overlap: For each geneset, number (or fraction) of genes expressed past a given threshold.
+    **geneset_overlap**: For each geneset, number (or fraction) of genes expressed past a given threshold.
 
-    singscore: Normalised mean (median centered) ranks (requires ranked data)
+    **singscore**: Normalised mean (median centered) ranks (requires ranked data)
 
-    ssGSEA: Single sample GSEA based on ranked data.
+    **ssGSEA**: Single sample GSEA based on ranked data.
 
-    rank_biased_overlap: RBO, Weighted average of agreement between sorted ranks and gene set.
+    **rank_biased_overlap**: RBO, Weighted average of agreement between sorted ranks and gene set.
 
-    robust_std: Med(x-med / mad), median of robust standardized values (recommend unranked).
+    **robust_std**: Med(x-med / mad), median of robust standardized values (recommend unranked).
 
-    mean_z: Mean( (x - mean)/stddv ), average z score. (recommend unranked).
+    **mean_z**: Mean( (x - mean)/stddv ), average z score. (recommend unranked).
 
-    average_score: Mean ranks or counts
+    **average_score**: Mean ranks or counts
 
-    median_score: Median of counts or ranks
+    **median_score**: Median of counts or ranks
 
-    summed_up: Sum up the ranks or counts.
+    **summed_up**: Sum up the ranks or counts.
 
 
 
diff --git a/docs/gmt_files_doc.rst b/docs/gmt_files_doc.rst
index 3939ba7..1234177 100644
--- a/docs/gmt_files_doc.rst
+++ b/docs/gmt_files_doc.rst
@@ -3,7 +3,7 @@ sphinx-quickstart on Wed Apr 27 09:20:15 2022.
 You can adapt this file completely to your liking, but it should at least
 contain the root `toctree` directive.
 
-gssnng using gmt files
+Genesets as .gmt files
 ======================
 
 Gene Set Scoring on the Nearest Neighbor Graph (gssnng) for Single Cell RNA-seq (scRNA-seq).
@@ -16,10 +16,6 @@ Gene Set Scoring on the Nearest Neighbor Graph (gssnng) for Single Cell RNA-seq
 
 `**Notebook using gmt files** `_
 
-`**Notebook using Decoupler/Omnipath style API** `_
-
-`**Notebook for creating smoothed count matrices**`_
-
 `**See the paper** `_
 
 
@@ -57,7 +53,6 @@ Copy the script out from the cloned repo and run, check the paths if you get an
 
     python3.10 example_gmt_input.py
 
-
 Usage
 -----
 
@@ -90,25 +85,25 @@ See gssnng/notebooks for examples on all methods.
 Scoring Functions
 -----------------
 
-The list of scoring functions:
+The list of scoring functions::
 
-    geneset_overlap: For each geneset, number (or fraction) of genes expressed past a given threshold.
+    **geneset_overlap**: For each geneset, number (or fraction) of genes expressed past a given threshold.
 
-    singscore: Normalised mean (median centered) ranks (requires ranked data)
+    **singscore**: Normalised mean (median centered) ranks (requires ranked data)
 
-    ssGSEA: Single sample GSEA based on ranked data.
+    **ssGSEA**: Single sample GSEA based on ranked data.
 
-    rank_biased_overlap: RBO, Weighted average of agreement between sorted ranks and gene set.
+    **rank_biased_overlap**: RBO, Weighted average of agreement between sorted ranks and gene set.
 
-    robust_std: Med(x-med / mad), median of robust standardized values (recommend unranked).
+    **robust_std**: Med(x-med / mad), median of robust standardized values (recommend unranked).
 
-    mean_z: Mean( (x - mean)/stddv ), average z score. (recommend unranked).
+    **mean_z**: Mean( (x - mean)/stddv ), average z score. (recommend unranked).
 
-    average_score: Mean ranks or counts
+    **average_score**: Mean ranks or counts
 
-    median_score: Median of counts or ranks
+    **median_score**: Median of counts or ranks
 
-    summed_up: Sum up the ranks or counts.
+    **summed_up**: Sum up the ranks or counts.
 
 
 Parameters
@@ -176,7 +171,7 @@ Some methods have some additional options. They are passed as a dictionary, meth
 
     singscore: {'normalization', 'theoretical'}, {'normalization', 'standard'}
 
-The singscore manuscript describes the theoretical method of standarization which involves determining the theoretical max and minimum ranks for the given gene set.::
+The singscore manuscript describes the theoretical method of standardization which involves determining the theoretical max and minimum ranks for the given gene set.::
 
     rank_biased_overlap: {'rbo_depth', n} (n: int)
 
diff --git a/docs/index.rst b/docs/index.rst
index a71ad5c..276216d 100644
--- a/docs/index.rst
+++ b/docs/index.rst
@@ -13,9 +13,9 @@ Contents
 --------
 .. toctree::
 
-   gssnng with gene set gmt files
-   gssnng with a decoupler/omnipath style
-   gssnng to smooth count matrices
+   Using gene set gmt files
+   The decoupler/omnipath api
+   Smooth count matrices
 
 The problem: The sparsity of scRNA-seq data creates a poor overlap with many gene sets, which in turn makes gene set scoring difficult. The GSSNNG method is based on using the nearest neighbor graph of cells for data smoothing.
 This essentially creates mini-pseudobulk expression profiles for each cell, which can be scored by using single sample gene set scoring methods often associated with bulk RNA-seq.
@@ -23,9 +23,7 @@ Nearest neighbor graphs (NNG) are constructed based on user defined groups (see
 
 This package works with AnnData objects stored as h5ad files. Expression values are taken from adata.X. For creating groups, up to four categorical variables can be used, which are found in the adata.obs table. Gene sets can be provided by using .gmt files or through the OmniPath API (see below).
 
-Scoring functions work with ranked or unranked data ("your mileage may vary")
-
 .. note::
 
-    This project is under active development. Please consider using a named release.
+    This project is under active development. Please consider using a named release if you're concerned about reproducibility.
 
diff --git a/docs/smoothing_adatas_doc.rst b/docs/smoothing_adatas_doc.rst
index e483b8b..05cd34b 100644
--- a/docs/smoothing_adatas_doc.rst
+++ b/docs/smoothing_adatas_doc.rst
@@ -3,8 +3,8 @@ sphinx-quickstart on Wed Apr 27 09:20:15 2022.
 You can adapt this file completely to your liking, but it should at least
 contain the root `toctree` directive.
 
-gssnng to make smoothed count matrices
-======================================
+Smoothing count matrices
+========================
 
 Gene Set Scoring on the Nearest Neighbor Graph (gssnng) for Single Cell RNA-seq (scRNA-seq).
 
@@ -13,16 +13,8 @@ Gene Set Scoring on the Nearest Neighbor Graph (gssnng) for Single Cell RNA-seq
    :caption: Table of Contents
    :maxdepth: 1
 
-
-`**Notebook using gmt files** `_
-
-`**Notebook using Decoupler/Omnipath style API** `_
-
-`**Notebook for creating smoothed count matrices**`_
-
 `**See the paper** `_
 
-
 This package works with AnnData objects stored as h5ad files. Expression values are taken from adata.X.
 For creating groups, up to four categorical variables can be used, which are found in the adata.obs table.
 
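
Note on the scoring-function lists touched by this patch: the one-line formulas for ``mean_z`` ("Mean( (x - mean)/stddv )") and ``robust_std`` ("Med(x-med / mad)") are terse. The sketch below shows one plausible reading for a single cell's expression profile. It is not gssnng's implementation. The function names, the choice to standardize across all genes in the profile, the use of the population standard deviation, and the zero-scale fallback are all assumptions made for illustration.

```python
from statistics import mean, median, pstdev


def mean_z_sketch(values, gene_idx):
    """Average z score of the gene-set genes within one expression profile.

    Assumption: mean and standard deviation are taken over the whole profile.
    """
    mu = mean(values)
    sd = pstdev(values)  # population standard deviation (assumption)
    if sd == 0:
        return 0.0  # assumption: a constant profile scores zero
    return mean([(values[i] - mu) / sd for i in gene_idx])


def robust_std_sketch(values, gene_idx):
    """Median of robust standardized values, (x - median) / MAD.

    Assumption: median and MAD are taken over the whole profile.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])  # median absolute deviation
    if mad == 0:
        return 0.0  # assumption: no spread means a zero score
    return median([(values[i] - med) / mad for i in gene_idx])


# Hypothetical toy profile: five genes, gene set = the two highest-expressed.
profile = [1.0, 2.0, 3.0, 4.0, 5.0]
gene_set = [3, 4]
print(mean_z_sketch(profile, gene_set))
print(robust_std_sketch(profile, gene_set))
```

Both sketches take plain lists so they run with the standard library alone. In practice the profile would be a smoothed row of ``adata.X`` and ``gene_idx`` the column positions of the gene set's members.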