This project evaluates nGPT (NVIDIA's normalized Transformer) against two state-of-the-art abstractive text summarization models: PEGASUS (Google) and BART (Facebook). The focus is on testing the claims that nGPT trains 4-20x faster and more stably than comparable Large Language Models (LLMs), here in the context of abstractive text summarization.
- Assess nGPT's Efficiency: Validate nGPT's claims of faster training times and improved stability.
- Compare Performance: Evaluate the summarization quality of nGPT, PEGASUS, and BART.
- Stability Analysis: Investigate the stability of these models during training and inference.
- PEGASUS and BART: Train the base models on the standard CNN/Daily Mail dataset.
- nGPT: Adapt nGPT for abstractive summarization, optimizing its performance for the task.
- ROUGE Scores: Use ROUGE-1, ROUGE-2, and ROUGE-L to measure the quality of summaries.
- Training Speed: Measure the time taken for each epoch during training.
- Stability: Monitor loss curves, gradient norms, and other stability indicators.
- Summarization Quality: Compare the summaries generated by each model.
- Training Efficiency: Analyze the training times and resource utilization.
- Stability: Evaluate how stable each model is during training and inference.
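As a rough illustration of the stability indicators listed above, the sketch below flags loss spikes against a moving average and computes a global gradient norm. It is a minimal pure-Python example, not the code used in this repository: the function names `find_loss_spikes` and `global_grad_norm`, the `window` and `factor` parameters, and the spike threshold are all assumptions made for illustration.

```python
from collections import deque


def find_loss_spikes(losses, window=5, factor=2.0):
    """Return the step indices where the loss exceeds `factor` times
    the mean of the previous `window` losses (hypothetical criterion)."""
    recent = deque(maxlen=window)
    spikes = []
    for step, loss in enumerate(losses):
        # Only judge a step once a full window of history is available.
        if len(recent) == window and loss > factor * (sum(recent) / window):
            spikes.append(step)
        recent.append(loss)
    return spikes


def global_grad_norm(grads):
    """L2 norm over all gradient values, given one list of floats per parameter."""
    return sum(g * g for param in grads for g in param) ** 0.5
```

In a real training loop the per-step loss and gradient values would come from the framework in use; logging both per step makes it possible to compare how often each model spikes.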
Model | ROUGE-1 | ROUGE-2 | ROUGE-L |
---|---|---|---|
nGPT | 0.289 | 0.039 | 0.141 |
PEGASUS | 0.125 | 0.0125 | 0.095 |
BART | 0.122 | 0.013 | 0.101 |
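To make the three metrics in the table concrete, here is a simplified sketch of ROUGE-1, ROUGE-2, and ROUGE-L as F1 scores. It assumes lowercased whitespace tokenization and no stemming, so its numbers will not match an official ROUGE package exactly, and it is not necessarily how the scores above were produced. The helper names (`ngrams`, `f1`, `rouge_n`, `lcs_length`, `rouge_l`) are hypothetical.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n):
    """ROUGE-N F1 from clipped n-gram overlap on whitespace tokens."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    overlap = sum((cand & ref).values())  # & keeps the minimum count per n-gram
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 from the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))
```

For example, `rouge_n("the cat sat on the mat", "the cat lay on the mat", 2)` shares three of five bigrams with the reference and so scores 0.6.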
Model | Time per Epoch (min) | Total Training Time (h:mm) |
---|---|---|
nGPT | 2 | 1:11 |
PEGASUS | 26 | 5:32 |
BART | 20 | 4:42 |
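The speedups implied by the timing table can be worked out directly. The short sketch below converts the reported times to minutes and divides each baseline by nGPT; the `TIMINGS` dictionary and the `speedup` helper are illustrative names, and the figures are copied from the table above.

```python
# Reported timings from the table above, converted to minutes.
TIMINGS = {
    "nGPT": {"epoch_min": 2, "total_min": 1 * 60 + 11},
    "PEGASUS": {"epoch_min": 26, "total_min": 5 * 60 + 32},
    "BART": {"epoch_min": 20, "total_min": 4 * 60 + 42},
}


def speedup(baseline, target="nGPT", key="epoch_min"):
    """How many times faster `target` is than `baseline` for the given timing."""
    return TIMINGS[baseline][key] / TIMINGS[target][key]


for name in ("PEGASUS", "BART"):
    print(f"{name}: {speedup(name):.1f}x per epoch, "
          f"{speedup(name, key='total_min'):.1f}x total")
# PEGASUS: 13.0x per epoch, 4.7x total
# BART: 10.0x per epoch, 4.0x total
```

On these numbers the per-epoch speedup (10-13x) falls inside the claimed 4-20x range, while the total-time speedup sits near its lower end.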
# make sure git is installed
git clone https://github.com/NU-6120-24-SJSKP/nGPT-BART-PEGASUS-efficiency-study.git
cd nGPT-BART-PEGASUS-efficiency-study
git checkout main
# make sure you have Python 3.12 and python3-pip installed
python -m venv .env
source .env/bin/activate
pip install -r ngpt/requirements.txt
pip install -r bart/requirements.txt
pip install -r pegasus/requirements.txt
# help on using the script
python main.py -h
# run one model: ngpt, bart, or pegasus
python main.py --MODEL ngpt
# optionally pass a parameter file; use the one that matches the model
# python main.py --MODEL ngpt --PARAMS params/ngpt.json
- HuggingFace
- lucidrains, for nGPT-pytorch (original repository: https://github.com/lucidrains/nGPT-pytorch)