nemo-v2 <- nemo-vt #3

Closed
wants to merge 36 commits into from
36 commits
4084669
Cut branch r1.19.0
titu1994 May 17, 2023
1dc8b37
Fix a bug, use _ceil_to_nearest instead as _round_to_nearest is not d…
BestJuly May 19, 2023
0ca1dd3
Fix k2 installation in Docker with CUDA 12 (#6707)
artbataev May 23, 2023
db6e29b
Tutorial fixes (#6717)
titu1994 May 24, 2023
2e2df4a
VP Fixes for converter + Config management (#6698) (#6738)
titu1994 May 26, 2023
4df8f33
Fix fastpitch test nightly (#6742)
hsiehjackson May 26, 2023
e806e11
check for first or last stage (#6708)
ericharper May 26, 2023
dbd6a56
Bug fix to restore act ckpt (#6753)
markelsanz14 May 29, 2023
a0f757e
Bug fix to reset sequence parallelism (#6756)
markelsanz14 May 31, 2023
39dd654
Fix checkpointed forward and add test for full activation checkpointi…
aklife97 May 31, 2023
216bcab
Fix Links (#6777)
titu1994 May 31, 2023
4ecc769
add call to p2p overlap (#6779)
aklife97 Jun 1, 2023
1486b12
Fix get_parameters when using main params optimizer (#6764)
ericharper Jun 1, 2023
aff5217
Lddl bert (#6761)
wdykas Jun 1, 2023
4bbb3c6
Debug Transformer Engine FP8 support with Megatron-core infrastructur…
timmoon10 Jun 1, 2023
e4460d1
Tensor-parallel communication overlap with userbuffer backend (#6780)
erhoo82 Jun 1, 2023
9bd8ecd
Fix adapter tutorial r1.19.0 (#6776)
hsiehjackson Jun 2, 2023
913e5e5
Fix check (#6798)
MaximumEntropy Jun 2, 2023
a8aa8f1
Bug fix for reset_sequence_parallel_args (#6802)
markelsanz14 Jun 2, 2023
0e0253e
Add ub communicator initialization to validation step (#6807)
erhoo82 Jun 5, 2023
41bb941
update core version (#6817)
aklife97 Jun 6, 2023
45144f5
Add trainer.validate example for GPT (#6794)
ericharper Jun 6, 2023
dc52b94
fix notebook error (#6840)
yidong72 Jun 8, 2023
4239b80
fix (#6842)
yidong72 Jun 8, 2023
87e1b81
Add API docs for NeMo Megatron (#6850)
ericharper Jun 13, 2023
f875702
Apply garbage collection interval to validation steps (#6870)
erhoo82 Jun 14, 2023
2331b06
update mcore version (#6875)
ericharper Jun 15, 2023
9b1774e
fix chekpoint loading error
DevMehendale Jan 15, 2024
3bddb03
Add multi-softmax architecture for CTC, RNN-T and Hybrid models
kaushal-py Jan 15, 2024
1ed6d46
added some things
Jan 24, 2024
c6cb1c8
Fix inference bug
kaushal-py Feb 5, 2024
b6ca450
Update hybrid_rnnt_ctc_bpe_models.py
tahirjmakhdoomi Feb 6, 2024
452f4f4
Merge pull request #1 from AI4Bharat/multi-softmax
tahirjmakhdoomi Feb 6, 2024
29b9f7b
added decoding fixes
Feb 21, 2024
438aa07
fixed multisoftmax for single language
tahirjmakhdoomi Feb 21, 2024
fc968b0
fixed multisoftmax for single language
tahirjmakhdoomi Feb 21, 2024

341 changes: 176 additions & 165 deletions Jenkinsfile

Large diffs are not rendered by default.

62 changes: 31 additions & 31 deletions README.rst
@@ -5,9 +5,9 @@
:target: http://www.repostatus.org/#active
:alt: Project Status: Active – The project has reached a stable, usable state and is being actively developed.

.. |documentation| image:: https://readthedocs.com/projects/nvidia-nemo/badge/?version=main
.. |documentation| image:: https://readthedocs.com/projects/nvidia-nemo/badge/?version=r1.19.0
:alt: Documentation
:target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/
:target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/

.. |license| image:: https://img.shields.io/badge/License-Apache%202.0-brightgreen.svg
:target: https://github.com/NVIDIA/NeMo/blob/master/LICENSE
@@ -25,15 +25,15 @@
:target: https://pepy.tech/project/nemo-toolkit
:alt: PyPi total downloads

.. |codeql| image:: https://github.com/nvidia/nemo/actions/workflows/codeql.yml/badge.svg?branch=main&event=push
.. |codeql| image:: https://github.com/nvidia/nemo/actions/workflows/codeql.yml/badge.svg?branch=r1.19.0&event=push
:target: https://github.com/nvidia/nemo/actions/workflows/codeql.yml
:alt: CodeQL

.. |black| image:: https://img.shields.io/badge/code%20style-black-000000.svg
:target: https://github.com/psf/black
:alt: Code style: black

.. _main-readme:
.. _r1.19.0-readme:

**NVIDIA NeMo**
===============
@@ -61,7 +61,7 @@ We have extensive `tutorials <https://docs.nvidia.com/deeplearning/nemo/user-gui
can all be run on `Google Colab <https://colab.research.google.com>`_.

For advanced users that want to train NeMo models from scratch or finetune existing NeMo models
we have a full suite of `example scripts <https://github.com/NVIDIA/NeMo/tree/main/examples>`_ that support multi-GPU/multi-node training.
we have a full suite of `example scripts <https://github.com/NVIDIA/NeMo/tree/r1.19.0/examples>`_ that support multi-GPU/multi-node training.

For scaling NeMo LLM training on Slurm clusters or public clouds, please see the `NVIDIA NeMo Megatron Launcher <https://github.com/NVIDIA/NeMo-Megatron-Launcher>`_.
The NM launcher has extensive recipes, scripts, utilities, and documentation for training NeMo LLMs and also has an `Autoconfigurator <https://github.com/NVIDIA/NeMo-Megatron-Launcher#53-using-autoconfigurator-to-find-the-optimal-configuration>`_
@@ -74,7 +74,7 @@ Key Features

* Speech processing
* `HuggingFace Space for Audio Transcription (File, Microphone and YouTube) <https://huggingface.co/spaces/smajumdar/nemo_multilingual_language_id>`_
* `Automatic Speech Recognition (ASR) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/intro.html>`_
* `Automatic Speech Recognition (ASR) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/asr/intro.html>`_
* Supported ASR models: `<https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/models.html>`_
* Jasper, QuartzNet, CitriNet, ContextNet
* Conformer-CTC, Conformer-Transducer, FastConformer-CTC, FastConformer-Transducer
@@ -88,42 +88,42 @@ Key Features
* Streaming/Buffered ASR (CTC/Transducer) - `Chunked Inference Examples <https://github.com/NVIDIA/NeMo/tree/stable/examples/asr/asr_chunked_inference>`_
* Cache-aware Streaming Conformer - `<https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/models.html#cache-aware-streaming-conformer>`_
* Beam Search decoding
* `Language Modelling for ASR <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/asr_language_modeling.html>`_: N-gram LM in fusion with Beam Search decoding, Neural Rescoring with Transformer
* `Support of long audios for Conformer with memory efficient local attention <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/results.html#inference-on-long-audio>`_
* `Speech Classification, Speech Command Recognition and Language Identification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speech_classification/intro.html>`_: MatchboxNet (Command Recognition), AmberNet (LangID)
* `Language Modelling for ASR <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/asr/asr_language_modeling.html>`_: N-gram LM in fusion with Beam Search decoding, Neural Rescoring with Transformer
* `Support of long audios for Conformer with memory efficient local attention <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/asr/results.html#inference-on-long-audio>`_
* `Speech Classification, Speech Command Recognition and Language Identification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/asr/speech_classification/intro.html>`_: MatchboxNet (Command Recognition), AmberNet (LangID)
* `Voice activity Detection (VAD) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/asr/speech_classification/models.html#marblenet-vad>`_: MarbleNet
* ASR with VAD Inference - `Example <https://github.com/NVIDIA/NeMo/tree/stable/examples/asr/asr_vad>`_
* `Speaker Recognition <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speaker_recognition/intro.html>`_: TitaNet, ECAPA_TDNN, SpeakerNet
* `Speaker Diarization <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speaker_diarization/intro.html>`_
* `Speaker Recognition <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/asr/speaker_recognition/intro.html>`_: TitaNet, ECAPA_TDNN, SpeakerNet
* `Speaker Diarization <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/asr/speaker_diarization/intro.html>`_
* Clustering Diarizer: TitaNet, ECAPA_TDNN, SpeakerNet
* Neural Diarizer: MSDD (Multi-scale Diarization Decoder)
* `Speech Intent Detection and Slot Filling <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/asr/speech_intent_slot/intro.html>`_: Conformer-Transformer
* `Speech Intent Detection and Slot Filling <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/asr/speech_intent_slot/intro.html>`_: Conformer-Transformer
* `Pretrained models on different languages. <https://ngc.nvidia.com/catalog/collections/nvidia:nemo_asr>`_: English, Spanish, German, Russian, Chinese, French, Italian, Polish, ...
* `NGC collection of pre-trained speech processing models. <https://ngc.nvidia.com/catalog/collections/nvidia:nemo_asr>`_
* Natural Language Processing
* `NeMo Megatron pre-training of Large Language Models <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/nlp/nemo_megatron/intro.html>`_
* `Neural Machine Translation (NMT) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/machine_translation/machine_translation.html>`_
* `Punctuation and Capitalization <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/punctuation_and_capitalization.html>`_
* `Token classification (named entity recognition) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/token_classification.html>`_
* `Text classification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/text_classification.html>`_
* `Joint Intent and Slot Classification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/joint_intent_slot.html>`_
* `Question answering <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/question_answering.html>`_
* `GLUE benchmark <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/glue_benchmark.html>`_
* `Information retrieval <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/information_retrieval.html>`_
* `Entity Linking <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/entity_linking.html>`_
* `Dialogue State Tracking <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/sgd_qa.html>`_
* `Prompt Learning <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/nemo_megatron/prompt_learning.html>`_
* `Neural Machine Translation (NMT) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/machine_translation/machine_translation.html>`_
* `Punctuation and Capitalization <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/punctuation_and_capitalization.html>`_
* `Token classification (named entity recognition) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/token_classification.html>`_
* `Text classification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/text_classification.html>`_
* `Joint Intent and Slot Classification <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/joint_intent_slot.html>`_
* `Question answering <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/question_answering.html>`_
* `GLUE benchmark <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/glue_benchmark.html>`_
* `Information retrieval <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/information_retrieval.html>`_
* `Entity Linking <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/entity_linking.html>`_
* `Dialogue State Tracking <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/sgd_qa.html>`_
* `Prompt Learning <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/nemo_megatron/prompt_learning.html>`_
* `NGC collection of pre-trained NLP models. <https://ngc.nvidia.com/catalog/collections/nvidia:nemo_nlp>`_
* `Synthetic Tabular Data Generation <https://developer.nvidia.com/blog/generating-synthetic-data-with-transformers-a-solution-for-enterprise-data-challenges/>`_
* `Speech synthesis (TTS) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/tts/intro.html#>`_
* `Speech synthesis (TTS) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/tts/intro.html#>`_
* Spectrogram generation: Tacotron2, GlowTTS, TalkNet, FastPitch, FastSpeech2, Mixer-TTS, Mixer-TTS-X
* Vocoders: WaveGlow, SqueezeWave, UniGlow, MelGAN, HiFiGAN, UnivNet
* End-to-end speech generation: FastPitch_HifiGan_E2E, FastSpeech2_HifiGan_E2E, VITS
* `NGC collection of pre-trained TTS models. <https://ngc.nvidia.com/catalog/collections/nvidia:nemo_tts>`_
* `Tools <https://github.com/NVIDIA/NeMo/tree/stable/tools>`_
* `Text Processing (text normalization and inverse text normalization) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/nlp/text_normalization/intro.html>`_
* `CTC-Segmentation tool <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/tools/ctc_segmentation.html>`_
* `Speech Data Explorer <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/tools/speech_data_explorer.html>`_: a dash-based tool for interactive exploration of ASR/TTS datasets
* `Text Processing (text normalization and inverse text normalization) <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/nlp/text_normalization/intro.html>`_
* `CTC-Segmentation tool <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/tools/ctc_segmentation.html>`_
* `Speech Data Explorer <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/tools/speech_data_explorer.html>`_: a dash-based tool for interactive exploration of ASR/TTS datasets
* `Speech Data Processor <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/tools/speech_data_processor.html>`_


@@ -139,10 +139,10 @@ Requirements
Documentation
-------------

.. |main| image:: https://readthedocs.com/projects/nvidia-nemo/badge/?version=main
.. |r1.19.0| image:: https://readthedocs.com/projects/nvidia-nemo/badge/?version=r1.19.0
:alt: Documentation Status
:scale: 100%
:target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/
:target: https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/r1.19.0/

.. |stable| image:: https://readthedocs.com/projects/nvidia-nemo/badge/?version=stable
:alt: Documentation Status
@@ -152,7 +152,7 @@ Documentation
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Version | Status | Description |
+=========+=============+==========================================================================================================================================+
| Latest | |main| | `Documentation of the latest (i.e. main) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/>`_ |
| Latest | |r1.19.0| | `Documentation of the latest (i.e. main) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/main/>`_ |
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
| Stable | |stable| | `Documentation of the stable (i.e. most recent release) branch. <https://docs.nvidia.com/deeplearning/nemo/user-guide/docs/en/stable/>`_ |
+---------+-------------+------------------------------------------------------------------------------------------------------------------------------------------+
@@ -263,7 +263,7 @@ packaging is also needed:

.. code-block:: bash

pip install -y packaging
pip install packaging


Transformer Engine
2 changes: 1 addition & 1 deletion docs/source/_static/css/custom.css
@@ -255,7 +255,7 @@ article ul {
}
}

@media (min-width: 1400px) {
@media (min-width: none) {
body {
font-size: 18px;
}
5 changes: 3 additions & 2 deletions docs/source/conf.py
@@ -28,7 +28,6 @@

sys.path.insert(0, os.path.abspath("../.."))
sys.path.insert(0, os.path.abspath("../../nemo"))
sys.path.insert(0, os.path.abspath("../../nemo_text_processing"))

from package_info import __version__

@@ -47,18 +46,20 @@
'hydra', # hydra-core in requirements, hydra during import
'dateutil', # part of core python
'transformers.tokenization_bert', # has ., troublesome for this regex
'megatron', # megatron-lm in requirements, megatron in import
'sklearn', # scikit_learn in requirements, sklearn in import
'nemo_text_processing.inverse_text_normalization', # Not installed automatically
'nemo_text_processing.text_normalization', # Not installed automatically
'attr', # attrdict in requirements, attr in import
'torchmetrics', # inherited from PTL
'lightning_utilities', # inherited from PTL
'apex',
'megatron.core',
'transformer_engine',
'joblib', # inherited from optional code
'IPython',
'ipadic',
'psutil',
'regex',
]

_skipped_autodoc_mock_imports = ['wrapt', 'numpy']
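
For context, a list like the one in this hunk is usually handed to Sphinx through the ``autodoc_mock_imports`` setting, so that autodoc can import modules whose heavy dependencies are not installed on the docs builder. The sketch below is only an assumption about how a candidate list and a skip list might be combined; it is not the actual logic in ``docs/source/conf.py``. The helper name ``build_mock_imports`` and the sample ``candidates`` entries are hypothetical.

```python
# Hypothetical sketch: derive Sphinx's autodoc_mock_imports from a candidate
# list while honouring a skip list. The helper and sample data are
# illustrative only and are not taken from the NeMo repository.

def build_mock_imports(candidates, skipped):
    """Return candidates minus skipped entries, de-duplicated, order preserved."""
    skipped_set = set(skipped)
    seen = set()
    result = []
    for name in candidates:
        # Drop names that must stay real imports, and drop repeats.
        if name in skipped_set or name in seen:
            continue
        seen.add(name)
        result.append(name)
    return result


# Sample input (assumed): a few module names, one duplicate, one skipped.
candidates = ['apex', 'megatron.core', 'transformer_engine', 'numpy', 'regex', 'apex']
_skipped_autodoc_mock_imports = ['wrapt', 'numpy']

# autodoc_mock_imports is the Sphinx autodoc setting a conf.py would assign.
autodoc_mock_imports = build_mock_imports(candidates, _skipped_autodoc_mock_imports)
```

Keeping the skip list separate makes it easy to exempt packages such as ``numpy`` that must be imported for real while the docs are built.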