Gemlite fixes #1432

HDCharles · 2024-12-18T14:38:56Z

Summary:

Shapes need to be divisible by 128 or they will not work with gemlite. We also need fp32 accumulation for groupsize None on int4.
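As a rough illustration of the shape constraint, a guard along the following lines could decide whether a linear layer's weight is eligible for gemlite. The multiple of 128 comes from the summary above; the helper name, its signature, and the choice to test both dimensions are assumptions made for this sketch, not the code in this PR, and the summary does not say what happens to layers that fail the check.

```python
# Illustrative only: the multiple of 128 comes from the PR summary. The helper
# name and signature are hypothetical, not torchao or gemlite API.
GEMLITE_SHAPE_MULTIPLE = 128


def is_gemlite_compatible(
    out_features: int,
    in_features: int,
    multiple: int = GEMLITE_SHAPE_MULTIPLE,
) -> bool:
    """Return True when both weight dimensions are divisible by `multiple`."""
    return out_features % multiple == 0 and in_features % multiple == 0


if __name__ == "__main__":
    print(is_gemlite_compatible(4096, 4096))  # both dimensions divisible by 128
    print(is_gemlite_compatible(4096, 1000))  # 1000 is not a multiple of 128
```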

Test Plan:

python test_integration.py -k "test_gemlite" (new test for non divisible shape)

python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-8-4-None --write_result benchmark_results.txt

python generate.py --checkpoint_path $CHECKPOINT_PATH/$MODEL_REPO/model.pth --compile --precision float16 --quantization gemlite-32-4-None --write_result benchmark_results.txt

(previously these gave nonsense responses)
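For readers unfamiliar with the `--quantization` values used in the test plan, here is a minimal sketch of how a string such as `gemlite-8-4-None` might be split into its parts. The field order (packing bit width, weight bit width, group size) is an assumption inferred from the summary's mention of int4 and groupsize None; the `GemliteConfig` type and `parse_gemlite_flag` function are hypothetical and are not the parser used by `generate.py`.

```python
from typing import NamedTuple, Optional


class GemliteConfig(NamedTuple):
    """Hypothetical container for the three fields of the flag."""

    packing_bitwidth: int
    bit_width: int
    group_size: Optional[int]


def parse_gemlite_flag(flag: str) -> GemliteConfig:
    """Split 'gemlite-<packing>-<bits>-<groupsize>' into typed fields.

    The field order is an assumption made for this sketch.
    """
    parts = flag.split("-")
    if len(parts) != 4 or parts[0] != "gemlite":
        raise ValueError(f"unexpected quantization flag: {flag!r}")
    # The literal string "None" stands for "no group size".
    group_size = None if parts[3] == "None" else int(parts[3])
    return GemliteConfig(int(parts[1]), int(parts[2]), group_size)


if __name__ == "__main__":
    for flag in ("gemlite-8-4-None", "gemlite-32-4-None"):
        print(flag, "->", parse_gemlite_flag(flag))
```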



pytorch-bot · 2024-12-18T14:39:00Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1432

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

❌ 2 New Failures

As of commit 1797c75 with merge base 33d57af:

NEW FAILURES - The following jobs have failed:

Code Analysis with Ruff / build (3.9) (gh)
Process completed with exit code 1.
PR Label Check / Check PR Labels (gh)
Process completed with exit code 1.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

jerryzh168 · 2024-12-18T16:41:12Z

torchao/_models/llama/generate.py

@@ -41,6 +37,14 @@ def elapsed_time(self, other_event):
        return abs(other_event.event_time - self.event_time) * 1000


+def get_arch_name() -> str:


Why these changes? Is this a rebase issue?

@HDCharles

Summary: Resubmitting fixes from @HDCharles in pytorch#1432 since that seems to have issues with rebase. Test Plan: see pytorch#1432.

mobicham · 2024-12-18T17:11:27Z

torchao/dtypes/uintx/gemlite_layout.py

+        if _layout.group_size == None and _layout.bit_width == 4:
+                from gemlite.core import GEMLITE_ACC_DTYPE
+                from gemlite.dtypes import DType
+                GEMLITE_ACC_DTYPE[DType.FP16] = DType.FP32


This will only work when all the layers use the same group_size, which is OK for now.
The other option would be to use this https://github.com/mobiusml/gemlite/blob/master/gemlite/core.py#L87 but for now let's keep it like this.

HDCharles · 2024-12-19

I tested this manually; it works in all cases, even when there are different group sizes.

mobicham · 2024-12-21

I mean when different layers use different settings within the same model, but let's not worry about that!
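One way to read the concern in this thread is that the accumulation dtype lives in a module-level mapping, so changing it for one layer changes it for every layer configured afterwards. The sketch below reproduces that behaviour with stand-ins: `DType` and `ACC_DTYPE` only mimic the shape of `gemlite.dtypes.DType` and `gemlite.core.GEMLITE_ACC_DTYPE`, the starting value of the mapping is chosen for illustration, and `configure_layer` is a hypothetical helper rather than torchao code.

```python
from enum import Enum
from typing import Optional


class DType(Enum):
    """Stand-in for gemlite.dtypes.DType (illustration only)."""

    FP16 = "fp16"
    FP32 = "fp32"


# Stand-in for gemlite.core.GEMLITE_ACC_DTYPE; the default here is assumed.
ACC_DTYPE = {DType.FP16: DType.FP16}


def configure_layer(group_size: Optional[int], bit_width: int) -> DType:
    """Mirror the condition in the diff and return the accumulator now in effect."""
    if group_size is None and bit_width == 4:
        # Mutates shared module-level state, not a per-layer setting.
        ACC_DTYPE[DType.FP16] = DType.FP32
    return ACC_DTYPE[DType.FP16]


if __name__ == "__main__":
    first = configure_layer(group_size=None, bit_width=4)
    second = configure_layer(group_size=64, bit_width=4)
    # The second layer never asked for fp32 accumulation but inherits it.
    print(first, second)
```

In this sketch the override is sticky: once any layer triggers it, later layers with other settings see fp32 accumulation as well, which is harmless when every layer shares one configuration.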

Summary: Resubmitting pytorch#1432 since it has some rebase issues and we want to merge the fix asap. Test Plan: see pytorch#1432.

* [resubmit] Gemlite fix. Summary: Resubmitting #1432 since it has some rebase issues and we want to merge the fix asap. Test Plan: see #1432.
* ruff

jerryzh168 · 2024-12-19T00:42:51Z

landed in #1435, please feel free to submit any follow up fixes

HDCharles requested a review from jerryzh168 December 18, 2024 14:39

facebook-github-bot added the CLA Signed label Dec 18, 2024

HDCharles requested a review from mobicham December 18, 2024 14:39

jerryzh168 reviewed Dec 18, 2024


jerryzh168 added a commit to jerryzh168/ao that referenced this pull request Dec 18, 2024

Fix gemlite shape issues

20b816f


mobicham reviewed Dec 18, 2024


jerryzh168 added a commit to jerryzh168/ao that referenced this pull request Dec 18, 2024

[resubmit] Gemlite fix

967e35a


jerryzh168 mentioned this pull request Dec 18, 2024

[resubmit] Gemlite fix #1435

Merged


jerryzh168 Dec 18, 2024

mobicham Dec 18, 2024

HDCharles Dec 19, 2024

mobicham Dec 21, 2024
