Update to version 1.5.0
susannasiebert committed Aug 7, 2019
1 parent 9c8e296 commit c79efab
Showing 610 changed files with 78,353 additions and 60,859 deletions.
4 changes: 2 additions & 2 deletions docs/conf.py
@@ -67,9 +67,9 @@
# built documents.
#
# The short X.Y version.
-version = '1.4'
+version = '1.5'
# The full version, including alpha/beta/rc tags.
-release = '1.4.5'
+release = '1.5.0'

# The language for content autogenerated by Sphinx. Refer to documentation
# for a list of supported languages.
Binary file added docs/images/pVACbind_logo_trans-bg_sm_v4b.png
Binary file added docs/images/pVACbind_logo_trans-bg_v4b.png
107 changes: 70 additions & 37 deletions docs/index.rst
@@ -48,48 +48,81 @@ tools:
mailing_list


New in release |release|
------------------------

This is a hotfix release. It fixes the following issues:

- In a previous version we implemented a faster method for reading data from
the database in pVACapi. However, this would fail if the postgres user is
not a superuser. This version fixes this issue by using the previous
database file read method in this situation.
- This version marks certain columns of the output reports as not visualizable
in pVACviz/pVACapi because they contain string content that cannot be
plotted in a scatterplot.

New in version |version|
------------------------

This version adds the following features:

- pVACvector now tests spacers iteratively. During the first iteration, the
first spacer in the list of ``--spacers`` gets tested. In the next
iteration, the next spacer in the list gets added to the pool of spacers to
test, and so on. If at any point a valid ordering is found, pVACvector will
finish its run and output the result. This might result in a slightly
less optimal (but still valid) ordering but improves runtime significantly.
- If, after testing all spacers, no valid ordering is found, pVACvector will
clip the beginning and/or end of problematic peptides by one amino acid.
The ordering search is then repeated on the updated list of
peptides. This process may be repeated up to a maximum set by the
``--max-clip-length`` parameter.
- This version adds a standalone command to create the pVACvector
visualizations that can be run by calling ``pvacvector visualize`` using a
pVACvector result file as the input.
- We removed the ``--additional-input-file-list`` option to pVACseq. Readcount and
expression information are now taken directly from the VCF annotations.
Instructions on how to add these annotations to your input VCF can be found
on the :ref:`prerequisites_label` page.
- We added support for variants to pVACseq that are only annotated as
``protein_altering_variant`` without a more specific consequence of
``missense_variant``, ``inframe_insertion``, ``inframe_deletion``, or ``frameshift_variant``.
- We resolved some syntax differences that prevented pVACtools from being run
under python 3.6 or python 3.7. pVACtools should now be compatible with all
python3 versions.
- This version introduces a new tool, ``pVACbind``, which can be used
to run our immunotherapy pipeline with a peptides
FASTA file as input. This new tool is similar to pVACseq but certain
options and filters are removed:

- All input sequences are interpreted in isolation so corresponding
wildtype sequence and score information are not assigned. As a consequence,
the filter threshold option on fold change is removed.
- Because the input format doesn't allow for association of readcount,
expression or transcript support level data, pVACbind doesn't run the coverage
filter or transcript support level filter.
- No condensed report is generated.

Please see the :ref:`pvacbind` documentation for more information.

- pVACfuse now supports annotated fusion files from `AGFusion <https://github.com/murphycj/AGFusion>`_ as input. The
:ref:`pvacfuse` documentation has been updated with instructions on how to
run AGFusion in the Prerequisites section.
- The top score filter has been updated to take into account alternative known
transcripts that might result in non-identical peptide sequences/epitopes.
The top score filter now picks the best epitope for every available transcript of a
variant. If the resulting epitopes for one variant are not identical,
the filter outputs all epitopes. If the resulting epitopes for one
variant are identical, the filter only outputs the epitope for the transcript with the highest
transcript expression value. If no expression data is available, or if
multiple transcripts remain, the filter outputs the epitope for the
transcript with the lowest Ensembl transcript ID.
- This version adds a few new options to the ``pvacseq
generate_protein_fasta`` command:

- The ``--mutant-only`` option can be used to only output mutant peptide
sequences instead of mutant and wildtype sequences.
- This command now has an option to provide a pVACseq all_epitopes or
filtered TSV file as an input (``--input-tsv``). This will limit the
output fasta to only sequences that originated from the variants in that file.

- This release adds a ``pvacfuse generate_protein_fasta`` command that works
similarly to the ``pvacseq generate_protein_fasta`` command but works with
Integrate-NEO or AGFusion input files.
- We removed the sorting of the all_epitopes result file in order to reduce
memory usage. Only the filtered files will be sorted. This version also updates the sorting algorithm of the
filtered files as follows:

- If the ``--top-score-metric`` is set to ``median`` the results are first
sorted by the ``Median MT Score``. If multiple epitopes have the same
``Median MT Score`` they are then sorted by the ``Corresponding Fold
Change``. The last sorting criterion is the ``Best MT Score``.
- If the ``--top-score-metric`` is set to ``lowest`` the results are first
sorted by the ``Best MT Score``. If multiple epitopes have the same
``Best MT Score`` they are then sorted by the ``Corresponding Fold
Change``. The last sorting criterion is the ``Median MT Score``.

- pVACseq, pVACfuse, and pVACbind now calculate manufacturability metrics
for the predicted epitopes. Manufacturability metrics are also
calculated for all protein sequences when running the ``pvacseq generate_protein_fasta``
and ``pvacfuse generate_protein_fasta`` commands. They are saved in a ``.manufacturability.tsv``
file alongside the result FASTA.
- The pVACseq score that gets calculated for epitopes in the condensed report
is now converted into a rank. This will hopefully remove any confusion about
whether the previous score could be treated as an absolute measure of
immunogenicity, which it was not intended for. Converting this score to a
rank ensures that it gets treated in isolation for only the epitopes in the
condensed file.
- The condensed report now also outputs the mutation position as well as the
full set of lowest and median wildtype and mutant scores.
- This version adds a clear cache function to pVACapi that can be called by
running ``pvacapi clear_cache``. Sometimes pVACapi can get into a state
where the cache file contains data that conflicts with the actual
process outputs, which results in errors. Running ``pvacapi clear_cache``
in that situation resolves these errors.
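
The sort order described above for the filtered files can be illustrated with a short sketch. This is not the pVACtools implementation. It assumes each result row is a plain dictionary keyed by the report column names, that lower ic50 scores sort first, and that larger fold changes sort first; the release notes do not state the sort directions, so both are assumptions. The function name ``sort_filtered`` is hypothetical.

```python
def sort_filtered(rows, top_score_metric="median"):
    """Sort result rows by the three criteria named in the release notes.

    Assumptions (not confirmed by the source): scores sort ascending,
    'Corresponding Fold Change' sorts descending.
    """
    if top_score_metric == "median":
        primary, last = "Median MT Score", "Best MT Score"
    elif top_score_metric == "lowest":
        primary, last = "Best MT Score", "Median MT Score"
    else:
        raise ValueError("top_score_metric must be 'median' or 'lowest'")

    return sorted(
        rows,
        key=lambda row: (
            row[primary],                       # first criterion
            -row["Corresponding Fold Change"],  # tie-breaker
            row[last],                          # last criterion
        ),
    )


rows = [
    {"id": "a", "Median MT Score": 100.0, "Corresponding Fold Change": 2.0, "Best MT Score": 50.0},
    {"id": "b", "Median MT Score": 100.0, "Corresponding Fold Change": 5.0, "Best MT Score": 80.0},
    {"id": "c", "Median MT Score": 90.0, "Corresponding Fold Change": 1.0, "Best MT Score": 95.0},
]
by_median = [row["id"] for row in sort_filtered(rows, "median")]
by_lowest = [row["id"] for row in sort_filtered(rows, "lowest")]
```

With ``median``, rows ``a`` and ``b`` tie on the first criterion and are separated by fold change; with ``lowest``, the ``Best MT Score`` alone decides the order.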

Past release notes can be found on our :ref:`releases` page.
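
The iterative spacer testing described in the pVACvector notes above can be sketched as follows. This is a simplified illustration only: the brute-force permutation search stands in for whatever search pVACvector actually performs, the clipping step is omitted, and ``junction_ok`` is a hypothetical callback that decides whether a peptide-spacer-peptide junction is acceptable.

```python
from itertools import permutations


def _search(peptides, pool, junction_ok):
    """Try every peptide order; return (order, spacers) for the first valid one."""
    for order in permutations(peptides):
        chosen = []
        for left, right in zip(order, order[1:]):
            # Pick the first spacer in the pool that makes this junction valid.
            spacer = next((s for s in pool if junction_ok(left, s, right)), None)
            if spacer is None:
                break  # this order cannot be completed with the current pool
            chosen.append(spacer)
        else:
            return list(order), chosen
    return None


def find_ordering(peptides, spacers, junction_ok):
    """Grow the spacer pool one spacer at a time and stop at the first valid ordering."""
    for pool_size in range(1, len(spacers) + 1):
        result = _search(peptides, spacers[:pool_size], junction_ok)
        if result is not None:
            return result
    return None


# Hypothetical junction rule: only the second spacer is ever acceptable.
result = find_ordering(["AAA", "CCC"], ["HH", "GG"], lambda left, s, right: s == "GG")
```

Because the search stops at the first valid ordering, a later spacer that would give a better result is never tried, which matches the trade-off between optimality and runtime mentioned in the notes.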

2 changes: 2 additions & 0 deletions docs/pvacbind.rst
@@ -2,6 +2,8 @@
:align: right
:alt: pVACbind logo

.. _pvacbind:

pVACbind
====================================

2 changes: 0 additions & 2 deletions docs/pvacbind/filter_commands.rst
@@ -2,8 +2,6 @@
:align: right
:alt: pVACbind logo

.. _filter_commands:

Filtering Commands
=============================

102 changes: 72 additions & 30 deletions docs/pvacbind/output_files.rst
@@ -15,33 +15,75 @@ which prediction algorithms were chosen:
Each folder will contain the same list of output files (listed in the order
created):

============================================= ===========
File Name                                     Description
============================================= ===========
``<sample_name>.tsv``                         An intermediate file with variant and transcript information parsed from the input file.
``<sample_name>.tsv_<chunks>`` (multiple)     The above file but split into smaller chunks for easier processing with IEDB.
``<sample_name>.all_epitopes.tsv``            A list of all predicted epitopes and their binding affinity scores, with additional variant information from the ``<sample_name>.tsv``.
``<sample_name>.filtered.tsv``                The above file after applying all filters.
============================================= ===========

Final Report Columns
--------------------

======================================================= ===========
Column Name                                             Description
======================================================= ===========
``Mutation``                                            The FASTA ID of the peptide sequence the epitope belongs to
``HLA Allele``                                          The HLA allele for this prediction
``Sub-peptide Position``                                The one-based position of the epitope in the protein sequence used to make the prediction
``Epitope Seq``                                         The epitope sequence
``Median Score``                                        Median ic50 binding affinity of the epitope of all prediction algorithms used
``Best Score``                                          Lowest ic50 binding affinity of all prediction algorithms used
``Best Score Method``                                   Prediction algorithm with the lowest ic50 binding affinity for this epitope
``Individual Prediction Algorithm Scores`` (multiple)   ic50 scores for the ``Epitope Seq`` for the individual prediction algorithms used
``Best Cleavage Position`` (optional)                   Position of the highest predicted cleavage score
``Best Cleavage Score`` (optional)                      Highest predicted cleavage score
``Cleavage Sites`` (optional)                           List of all cleavage positions and their cleavage score
``Predicted Stability Half Life`` (optional)            The stability half life of the ``MT Epitope Seq``
``Stability Rank`` (optional)                           The % rank stability of the ``MT Epitope Seq``
``NetMHCstab allele`` (optional)                        Nearest neighbor to the ``HLA Allele``. Used for NetMHCstab prediction
======================================================= ===========
.. list-table::
   :header-rows: 1

   * - File Name
     - Description
   * - ``<sample_name>.tsv``
     - An intermediate file with variant information parsed from the input files.
   * - ``<sample_name>.tsv_<chunks>`` (multiple)
     - The above file but split into smaller chunks for easier processing with IEDB.
   * - ``<sample_name>.all_epitopes.tsv``
     - A list of all predicted epitopes and their binding affinity scores, with
       additional variant information from the ``<sample_name>.tsv``.
   * - ``<sample_name>.filtered.tsv``
     - The above file after applying all filters, with cleavage site and stability
       predictions added.

all_epitopes.tsv and filtered.tsv Report Columns
------------------------------------------------

.. list-table::
   :header-rows: 1

   * - Column Name
     - Description
   * - ``Mutation``
     - The FASTA ID of the peptide sequence the epitope belongs to
   * - ``HLA Allele``
     - The HLA allele for this prediction
   * - ``Sub-peptide Position``
     - The one-based position of the epitope in the protein sequence used to make the prediction
   * - ``Epitope Seq``
     - The epitope sequence
   * - ``Median Score``
     - Median ic50 binding affinity of the epitope of all prediction algorithms used
   * - ``Best Score``
     - Lowest ic50 binding affinity of all prediction algorithms used
   * - ``Best Score Method``
     - Prediction algorithm with the lowest ic50 binding affinity for this epitope
   * - ``Individual Prediction Algorithm Scores`` (multiple)
     - ic50 scores for the ``Epitope Seq`` for the individual prediction algorithms used
   * - ``cterm_7mer_gravy_score``
     - Mean hydropathy of the last 7 residues on the C-terminus of the peptide
   * - ``max_7mer_gravy_score``
     - Max GRAVY score of any kmer in the amino acid sequence. Used to determine if there are any extremely
       hydrophobic regions within a longer amino acid sequence.
   * - ``difficult_n_terminal_residue`` (T/F)
     - Is the N-terminal amino acid a Glutamine, Glutamic acid, or Cysteine?
   * - ``c_terminal_cysteine`` (T/F)
     - Is the C-terminal amino acid a Cysteine?
   * - ``c_terminal_proline`` (T/F)
     - Is the C-terminal amino acid a Proline?
   * - ``cysteine_count``
     - Number of Cysteines in the amino acid sequence. Problematic because they can form disulfide bonds across
       distant parts of the peptide
   * - ``n_terminal_asparagine`` (T/F)
     - Is the N-terminal amino acid an Asparagine?
   * - ``asparagine_proline_bond_count``
     - Number of Asparagine-Proline bonds. Problematic because they can spontaneously cleave the peptide
   * - ``Best Cleavage Position`` (optional)
     - Position of the highest predicted cleavage score
   * - ``Best Cleavage Score`` (optional)
     - Highest predicted cleavage score
   * - ``Cleavage Sites`` (optional)
     - List of all cleavage positions and their cleavage score
   * - ``Predicted Stability`` (optional)
     - Stability of the pMHC-I complex
   * - ``Half Life`` (optional)
     - Half-life of the pMHC-I complex
   * - ``Stability Rank`` (optional)
     - The % rank stability of the pMHC-I complex
   * - ``NetMHCstab allele`` (optional)
     - Nearest neighbor to the ``HLA Allele``. Used for NetMHCstab prediction
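
To make the manufacturability columns above concrete, here is a minimal sketch that computes them from a peptide string. It follows the column descriptions in the table, not the pVACtools source code, so details may differ from what the tool actually does. It assumes GRAVY means the mean Kyte-Doolittle hydropathy of a window, that the "kmer" in ``max_7mer_gravy_score`` is a 7-residue window, and that the peptide is non-empty and uses only the 20 standard amino acids. The function names are hypothetical.

```python
# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Mean hydropathy of a sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def manufacturability(peptide, k=7):
    """Compute the metrics named in the table for one peptide."""
    # All k-residue windows; a peptide shorter than k yields itself as the only window.
    windows = [peptide[i:i + k] for i in range(max(len(peptide) - k + 1, 1))]
    return {
        "cterm_7mer_gravy_score": gravy(peptide[-k:]),
        "max_7mer_gravy_score": max(gravy(w) for w in windows),
        "difficult_n_terminal_residue": peptide[0] in "QEC",
        "c_terminal_cysteine": peptide[-1] == "C",
        "c_terminal_proline": peptide[-1] == "P",
        "cysteine_count": peptide.count("C"),
        "n_terminal_asparagine": peptide[0] == "N",
        "asparagine_proline_bond_count": peptide.count("NP"),
    }


metrics = manufacturability("NPCAAAAAAC")
```

Each boolean corresponds to one of the (T/F) columns; how pVACtools serializes them in the TSV is not shown here.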
2 changes: 2 additions & 0 deletions docs/pvacfuse.rst
@@ -2,6 +2,8 @@
:align: right
:alt: pVACfuse logo

.. _pvacfuse:

pVACfuse
====================================

7 changes: 7 additions & 0 deletions docs/pvacfuse/filter_commands.rst
@@ -61,6 +61,13 @@ This filter picks the top epitope for a variant. Epitopes with the same
Chromosome - Start - Stop - Reference - Variant are identified as coming from
the same variant.

In order to account for different splice sites among the transcripts of a
variant that would lead to different peptides, this filter also takes into
account the different transcripts returned by Integrate-Neo/AGFusion and will return
the top epitope for all transcripts if they are non-identical. If the
resulting list of top epitopes for the transcripts of a variant is identical,
the epitope for the transcript with the lowest Ensembl ID is returned.
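
The transcript handling described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the pVACfuse code: rows are plain dictionaries, the keys ``variant``, ``transcript`` and ``epitope`` are hypothetical names, a lower score is better, and "lowest Ensembl ID" is approximated by plain string sorting of the transcript IDs.

```python
from collections import defaultdict


def top_epitopes(rows, metric="Median MT Score"):
    """Pick the top epitope per transcript, then collapse identical results per variant."""
    best_per_transcript = defaultdict(dict)
    for row in rows:
        per_tx = best_per_transcript[row["variant"]]
        current = per_tx.get(row["transcript"])
        if current is None or row[metric] < current[metric]:
            per_tx[row["transcript"]] = row

    result = []
    for per_tx in best_per_transcript.values():
        tops = [per_tx[tx] for tx in sorted(per_tx)]  # lowest transcript ID first
        if len({top["epitope"] for top in tops}) == 1:
            result.append(tops[0])  # identical epitopes: keep the lowest transcript ID
        else:
            result.extend(tops)     # non-identical epitopes: keep one per transcript
    return result


rows = [
    {"variant": "v1", "transcript": "ENST2", "epitope": "AAA", "Median MT Score": 10.0},
    {"variant": "v1", "transcript": "ENST1", "epitope": "AAA", "Median MT Score": 20.0},
    {"variant": "v1", "transcript": "ENST1", "epitope": "BBB", "Median MT Score": 50.0},
    {"variant": "v2", "transcript": "ENST3", "epitope": "CCC", "Median MT Score": 5.0},
    {"variant": "v2", "transcript": "ENST4", "epitope": "DDD", "Median MT Score": 7.0},
]
picked = [(row["variant"], row["transcript"], row["epitope"]) for row in top_epitopes(rows)]
```

For ``v1`` both transcripts yield the same top epitope, so only the row for the lower transcript ID is kept; for ``v2`` the top epitopes differ, so both are returned.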

By default the
``--top-score-metric`` option is set to ``median`` which will apply this
filter to the ``Median MT Score`` column and pick the epitope with the lowest
19 changes: 18 additions & 1 deletion docs/pvacfuse/getting_started.rst
@@ -8,7 +8,12 @@ Getting Started
pVACfuse provides a set of example data to show the expected format of input and output files.
You can download the data set by running the ``pvacfuse download_example_data`` :ref:`command <pvacfuse_example_data>`.

The example data output can be reproduced by running the following command:
There are two options for running pVACfuse. It accepts either an
INTEGRATE-neo output bedpe file or an AGFusion output directory.

The following command is an example of how to run pVACfuse with an
INTEGRATE-neo bedpe file; it regenerates the
``results_from_integrate_neo`` example data:

.. code-block:: none
@@ -20,4 +25,16 @@ The example data output can be reproduced by running the following command:
<output_dir> \
-e 8,9,10
The ``results_from_agfusion`` example data can be regenerated like so:

.. code-block:: none

   pvacfuse run \
   <example_data_dir>/agfusion/ \
   Test \
   HLA-A*02:01,HLA-B*35:01,DRB1*11:01 \
   MHCflurry MHCnuggetsI MHCnuggetsII NNalign NetMHC PickPocket SMM SMMPMBEC SMMalign \
   <output_dir> \
   -e 8,9,10

A detailed description of all command options can be found on the :ref:`Usage <pvacfuse_run>` page.
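
If the run is scripted, the argument order shown in the commands above can be assembled programmatically. The helper below is a hypothetical convenience, not part of pVACtools; it only builds the argument list in the same order as the example (input, sample name, alleles, algorithms, output directory, ``-e`` lengths) and leaves the ``<...>`` placeholders for the caller to fill in.

```python
def build_pvacfuse_command(input_path, sample_name, alleles, algorithms,
                           output_dir, epitope_lengths):
    """Return the `pvacfuse run` argument list in the order used in the example."""
    return [
        "pvacfuse", "run",
        input_path,
        sample_name,
        ",".join(alleles),       # alleles are passed as one comma-separated argument
        *algorithms,             # each prediction algorithm is its own argument
        output_dir,
        "-e", ",".join(str(n) for n in epitope_lengths),
    ]


cmd = build_pvacfuse_command(
    "<example_data_dir>/agfusion/",
    "Test",
    ["HLA-A*02:01", "HLA-B*35:01", "DRB1*11:01"],
    ["MHCflurry", "NetMHC"],
    "<output_dir>",
    [8, 9, 10],
)
```

The resulting list can be handed to ``subprocess.run(cmd, check=True)`` once the placeholders are replaced with real paths; passing a list avoids shell interpretation of the ``*`` characters in the allele names.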