
Benchmarks - Add LLaMA-2 Models #668

Merged: 30 commits merged into microsoft:main on Nov 28, 2024

Conversation

@dpower4 (Contributor) commented Nov 19, 2024

Added a LLaMA benchmark (training and inference) following the existing PyTorch model implementations such as gpt2 and lstm; a minimal usage sketch follows the list below.

  • Added a LLaMA FP8 unit test for better code coverage while reducing the memory required.
  • Updated the transformers requirement to >= 4.28.0, which introduces LlamaConfig.
  • Pinned tokenizers to <= 0.20.3 to avoid issues with the 0.20.4 release on Python 3.8.
  • Added LLaMA-2 to the TensorRT inference benchmark.
  • LLaMA-2 tests were not added to test_tensorrt_inference_performance.py because of the large memory requirement on the worker GPU; the tests were validated separately on a GH200.
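
For orientation, here is a minimal sketch of the pattern these PyTorch model benchmarks follow: build a small model from LlamaConfig and run a training step. This is not the superbench implementation; all configuration values below are illustrative, chosen small to keep memory low.

```python
# Minimal sketch (not superbench's pytorch_llama.py): build a tiny LLaMA model
# from LlamaConfig (available in transformers >= 4.28.0) and run one training step.
import torch
from transformers import LlamaConfig, LlamaForCausalLM

config = LlamaConfig(
    vocab_size=32000,
    hidden_size=512,              # tiny, illustrative sizes to keep memory low
    num_hidden_layers=4,
    num_attention_heads=8,
    intermediate_size=1024,
    max_position_embeddings=512,
)
device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = LlamaForCausalLM(config).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# One training step on random token ids; labels == input_ids yields the causal-LM loss.
input_ids = torch.randint(0, config.vocab_size, (2, 128), device=device)
loss = model(input_ids=input_ids, labels=input_ids).loss
loss.backward()
optimizer.step()
print(f'loss: {loss.item():.4f}')
```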

@dpower4 requested review from cp5555 and a team as code owners, November 19, 2024 02:53
@abuccts (Member) left a comment

Please use `python3 setup.py lint` to check the format and run `python3 setup.py format` to format the code.

@abuccts changed the title from Feat/llama2 to Benchmarks - Add LLaMA-2 Models, Nov 19, 2024
@dpower4 (Contributor, Author) commented Nov 19, 2024

@abuccts, can I get access to the unit test logs?

codecov bot commented Nov 20, 2024

Codecov Report

Attention: Patch coverage is 87.70492% with 15 lines in your changes missing coverage. Please review.

Project coverage is 85.61%. Comparing base (4e6935a) to head (1570707).
Report is 1 commit behind head on main.

Files with missing lines                               Patch %   Lines
...bench/benchmarks/model_benchmarks/pytorch_llama.py  87.17%    15 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #668      +/-   ##
==========================================
+ Coverage   85.58%   85.61%   +0.03%     
==========================================
  Files          98       99       +1     
  Lines        7046     7165     +119     
==========================================
+ Hits         6030     6134     +104     
- Misses       1016     1031      +15     
Flag                       Coverage Δ
cpu-python3.10-unit-test   71.21% <35.53%> (-0.64%) ⬇️
cpu-python3.7-unit-test    71.18% <36.06%> (-0.63%) ⬇️
cpu-python3.8-unit-test    71.22% <36.13%> (-0.62%) ⬇️
cuda-unit-test             83.42% <85.95%> (+0.03%) ⬆️

Flags with carried forward coverage won't be shown.


@guoshzhao (Contributor) commented

LGTM, thanks! Please fix the UT failures with Python 3.10. And since the CUDA tests run on a K80, which is a very old GPU, we can skip the "cuda-unit-test" and just make sure "cpu-unit-test" passes.

/__w/1/s/.eggs/setuptools_scm-8.1.0-py3.10.egg/setuptools_scm/_integration/setuptools.py:92: UserWarning: version of superbench already set
  warnings.warn(f"version of {dist_name} already set")
running lint
tests/analyzer/test_summaryop.py:7: error: Module "numpy" has no attribute "NaN"  [attr-defined]
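
As an aside, this mypy error matches the known NumPy 2.0 change: the `np.NaN` alias was removed in favor of `np.nan`. Assuming that test line simply constructs a NaN value, the fix is a one-liner:

```python
import numpy as np

# np.NaN was removed in NumPy 2.0; np.nan works on both NumPy 1.x and 2.x.
value = np.nan
assert np.isnan(value)
```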

@dpower4 added the benchmarks (SuperBench Benchmarks) and micro-benchmarks (Micro Benchmark Test for SuperBench Benchmarks) labels Nov 22, 2024
@dpower4 (Contributor, Author) commented Nov 22, 2024

pytorch-llama validation on GH200:
[Two screenshots: pytorch-llama validation results on GH200, captured Nov 21, 2024]

@dpower4 added the model-benchmarks (Model Benchmark Test for SuperBench Benchmarks) label Nov 25, 2024
@dpower4 (Contributor, Author) commented Nov 27, 2024

tokenizers Rust/cargo build issue: huggingface/tokenizers#1691
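
For context, the version constraints described in this PR would look roughly like the following in setup.py. This is a hypothetical excerpt, not the actual file from the repo:

```python
# Hypothetical setup.py excerpt reflecting the pins described in this PR.
from setuptools import setup

setup(
    name='example-package',  # placeholder name, not superbench's real metadata
    install_requires=[
        'transformers>=4.28.0',  # LlamaConfig was introduced in transformers 4.28.0
        'tokenizers<=0.20.3',    # avoid the 0.20.4 Rust/cargo issue on Python 3.8
    ],
)
```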

Resolved review threads (outdated): setup.py, superbench/benchmarks/base.py
@abuccts abuccts enabled auto-merge (squash) November 28, 2024 00:56
@abuccts abuccts merged commit 249e21c into microsoft:main Nov 28, 2024
19 of 20 checks passed
Labels: benchmarks (SuperBench Benchmarks), micro-benchmarks (Micro Benchmark Test for SuperBench Benchmarks), model-benchmarks (Model Benchmark Test for SuperBench Benchmarks)