From 8af9cec41848b7ceed4dc221a1af922b141c97e5 Mon Sep 17 00:00:00 2001
From: skjandu <106275737+skjandu@users.noreply.github.com>
Date: Thu, 8 Feb 2024 11:57:41 -0800
Subject: [PATCH] Update index.md

---
 docs/v1/index.md | 101 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 71 insertions(+), 30 deletions(-)
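
Note on the Methodology text added below: the pairing it describes (one data point per k-mer, at PBM 1 e-score versus PBM 2 e-score) can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not the module's implementation: `read_escores` and `pair_escores` are hypothetical helper names, the inputs are assumed to be tab-separated as the `.tsv` extension suggests, and the plotting step is left out.

```python
import csv
import io


def read_escores(handle, header_present, kmer_col, escore_col):
    """Map each forward k-mer to its e-score; column numbers are 1-indexed."""
    rows = csv.reader(handle, delimiter="\t")
    if header_present:
        next(rows, None)  # skip the header row
    scores = {}
    for row in rows:
        if not row:
            continue  # ignore blank lines
        scores[row[kmer_col - 1]] = float(row[escore_col - 1])
    return scores


def pair_escores(first, second):
    """Return (k-mer, PBM 1 e-score, PBM 2 e-score) for k-mers found in both files."""
    return [(kmer, first[kmer], second[kmer]) for kmer in first if kmer in second]


# Two rows from each example table in docs/v1/index.md, written tab-separated.
pbm1 = io.StringIO(
    "8-mer\t8-mer\tE-score\tMedian\tZ-score\n"
    "AAAAAAAA\tTTTTTTTT\t0.29130\t2871.60\t3.5965\n"
    "AAAAAAAC\tTTTTTTTG\t0.10748\t2086.00\t0.3958\n"
)
pbm2 = io.StringIO(
    "8-mer\t8-mer\tE-score\tMedian\tZ-score\n"
    "AAAAAAAA\tTTTTTTTT\t0.04621\t1378.79\t0.0023\n"
    "AAAAAAAC\tTTTTTTTG\t0.05236\t1595.93\t1.2232\n"
)

points = pair_escores(
    read_escores(pbm1, header_present=True, kmer_col=1, escore_col=3),
    read_escores(pbm2, header_present=True, kmer_col=1, escore_col=3),
)
for kmer, x, y in points:
    print(kmer, x, y)
```

Each tuple in `points` corresponds to one point of the scatterplot; k-mers missing from either file are dropped in this sketch, which is a design choice of the example rather than documented behavior.
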
diff --git a/docs/v1/index.md b/docs/v1/index.md
index d8a6cd3..81df665 100644
--- a/docs/v1/index.md
+++ b/docs/v1/index.md
@@ -13,58 +13,99 @@
 
 ## Introduction
 
-tfsites.DifferentialBindingAnalysis  compares the e-scores between two PBM datasets.
+differentialBindingAnalysis plots the enrichment scores (e-scores) from two PBM datasets against each other, which makes it possible to assess whether the two transcription factors bind differentially.
 
-## Functionality
-
-TBD
 
 ## Methodology
 
-TBD
+The raw PBM datasets for two transcription factors are downloaded from UniPROBE. For each file, the user indicates which columns contain the forward k-mer sequences and the e-scores. Each k-mer's e-score from the first PBM file is then plotted against its e-score from the second PBM file, so every data point in the plot is a k-mer at the ordered pair (PBM 1 e-score, PBM 2 e-score). To help show whether differential binding occurs, the scatterplot includes either a trendline fitted to the data points or a reference line through (0,0) with a slope of 1.
 
 ## Parameters
 
-<span style="color: red;">*</span> indicates required parameter
+### Inputs and Outputs
 
-- **pbm data**<span style="color: red;">*</span>
-    - This is a [ state what the format and content is supposed to be] list of SNVs to be analyzed.
-- **header.present**<span style="color: red;">*</span>
-    -  TRUE/FALSE, genomic coordinates are 0-indexed 
-- **out filename**<span style="color: red;">*</span>
-    - Out file name for the annotated PBM data
-- **header.sequence.present**
-    - TRUE/FALSE, Is there a header sequence in the raw PBM file?.
-- **column.forward**
-    - Column of the forward DNA sequence in the pbm file (1-indexed).
-- **column.MFI**
-    - Column of the MFI in the pbm file (1-indexed).
-- **sequence**
-    - Sequence to be scanned.
-- **plot.resolution**
-    - Plot resolution in DPI.
-- **zoom**
-    - Zoom into the plot by the number of base pairs.
+<span style="color: red;">*</span> indicates required parameter
 
+- <span style="color: red;">*</span>**Raw PBM Input for First TF (.tsv)**
+    - Input file containing the raw PBM dataset, obtained from UniPROBE, for the first transcription factor of interest.
+- <span style="color: red;">*</span>**Raw PBM Input for Second TF (.tsv)**
+    - Input file containing the raw PBM dataset, obtained from UniPROBE, for the second transcription factor of interest.
+- <span style="color: red;">*</span>**Scatterplot of Enrichment Scores (.png)**
+    - Name of the output file containing a scatterplot of the enrichment scores (e-scores) from the first PBM dataset plotted against the e-scores from the second PBM dataset. 
+
+
+### Other Parameters
+- <span style="color: red;">*</span>**Header Present in First PBM File (boolean)**
+    - If `True`, a header exists in the first PBM data file. If `False`, no header exists.
+- <span style="color: red;">*</span>**Column Index of K-mers in First PBM File (integer)**
+    - Number of the column containing the forward DNA sequence in the first PBM file (1-indexed; 1 is the first column).
+- <span style="color: red;">*</span>**Column Index of E-Scores in First PBM File (integer)**
+    - Number of the column containing the e-score in the first PBM file (1-indexed; 1 is the first column).
+- <span style="color: red;">*</span>**Header Present in Second PBM File (boolean)**
+    - If `True`, a header exists in the second PBM data file. If `False`, no header exists.
+- <span style="color: red;">*</span>**Column Index of K-mers in Second PBM File (integer)**
+    - Number of the column containing the forward DNA sequence in the second PBM file (1-indexed; 1 is the first column).
+- <span style="color: red;">*</span>**Column Index of E-Scores in Second PBM File (integer)**
+    - Number of the column containing the e-score in the second PBM file (1-indexed; 1 is the first column).
+- **Label K-mers (comma-separated string)**
+    - `Default = None`
+    - List of k-mers to be labeled on the plot.
+- **Scatter Alpha Threshold (float)**
+    - `Default = 1`
+    - Alpha value that sets the transparency of the data points; lower values make points more transparent, which shows where most data points are concentrated.
+- **Trendline (boolean)**
+    - `Default = False`
+    - If `True`, plot a regression line through the data points. If `False`, plot a line through (0,0) with a slope of 1.
 
 ## Input Files
 
-1.  pbm data.   [ define format and contents in detail ] 
-    
-
+1.  Raw PBM Input for First TF (.tsv)
+    - Columns
+        - `8-mer`: every possible forward k-mer sequence (here, k = 8)
+        - `8-mer`: the reverse complement of the forward k-mer
+        - `E-score`: the enrichment score of the k-mer
+        - `Median`: the median fluorescence intensity of the k-mer
+        - `Z-score`: the z-score of the k-mer
+
+```
+8-mer        8-mer        E-score     Median      Z-score
+AAAAAAAA     TTTTTTTT     0.29130     2871.60     3.5965
+AAAAAAAC     TTTTTTTG     0.10748     2086.00     0.3958
+AAAAAAAG     TTTTTTTC     0.23656     2539.91     2.3673
+AAAAAAAT     TTTTTTTA     0.21760     2434.82     1.9442
+AAAAAACA     TTTTTTGT     0.19839     2407.46     1.8310
+```
+
+2.  Raw PBM Input for Second TF (.tsv)
+    - Columns
+        - `8-mer`: every possible forward k-mer sequence (here, k = 8)
+        - `8-mer`: the reverse complement of the forward k-mer
+        - `E-score`: the enrichment score of the k-mer
+        - `Median`: the median fluorescence intensity of the k-mer
+        - `Z-score`: the z-score of the k-mer
+
+```
+8-mer        8-mer        E-score     Median      Z-score
+AAAAAAAA     TTTTTTTT     0.04621     1378.79     0.0023
+AAAAAAAC     TTTTTTTG     0.05236     1595.93     1.2232
+AAAAAAAG     TTTTTTTC     0.11724     1515.64     0.7923
+AAAAAAAT     TTTTTTTA     0.04593     1390.77     0.0745
+AAAAAACA     TTTTTTGT     0.11884     1477.50     0.5795
+```
        
 ## Output Files
 
-  1.line plot: <output filename>.png.  [ describe the plot contennts here ]
+  1. Scatterplot of Enrichment Scores (.png)
+
+   <img src="./02-output-ets-gata4-diff-analysis.png"/>
     
   
 ## Example Data
 
 [Example input data is available on github](https://github.com/genepattern/tfsites.annotateTfSites/data)
     
-## References
-
     
 ## Version Comments
 
 - **1.0.0** (2023-11-28): Initial draft of document scaffold.
+- **1.0.1** (2024-02-02): Draft completed.