[GPU] Implement per-token FC dyn-quan #27763

Open
wants to merge 18 commits into from

Contributor

@byungilm byungilm commented Nov 26, 2024

Details:

  • item1
  • ...

Tickets:

  • 158513

@github-actions github-actions bot added the category: GPU OpenVINO GPU plugin label Nov 26, 2024
@byungilm byungilm force-pushed the validate_per_token_dyn_quan branch 3 times, most recently from ca6918d to ae38233 Compare November 29, 2024 01:59
@byungilm byungilm changed the title [GPU][TEMP] Implement per-token FC dyn-quan [GPU] Implement per-token FC dyn-quan Nov 29, 2024
@byungilm byungilm force-pushed the validate_per_token_dyn_quan branch from 73dfeff to 2812544 Compare December 5, 2024 01:53
@byungilm byungilm marked this pull request as ready for review December 9, 2024 21:20
@byungilm byungilm requested review from a team as code owners December 9, 2024 21:20
@byungilm byungilm self-assigned this Dec 9, 2024
@byungilm byungilm force-pushed the validate_per_token_dyn_quan branch from c7b123f to d83debc Compare December 12, 2024 11:48
@byungilm
Contributor Author

Resolved some conflicts in fc_gpu_bf_tile.cpp.

@byungilm
Contributor Author

byungilm commented Dec 12, 2024

  • Reverted the data type change of 'quan_var'. (This assumes that the activation sum of a symmetric 8-bit model's input will not overflow the half range.)
  • Raised the important log message about the group size setting from TRACE_DETAIL to _LOG level.
  • Moved the scale calculation out of the inner loops when quantization is per-token.
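
For readers unfamiliar with the idea, here is a minimal host-side sketch of per-token symmetric int8 quantization for a fully connected layer. It is not the kernel code from this PR: `QuantizedToken`, `quantize_token`, and `fc_output` are hypothetical names, and the layout (one scale per token, int32 accumulation) is an assumption chosen for illustration. It shows why the scale can be applied once after the accumulation loop instead of inside it when there is a single scale per token.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical container (not an OpenVINO type): one token's activations
// quantized to int8 with a single scale for the whole token.
struct QuantizedToken {
    std::vector<int8_t> values;
    float scale;  // dequantized value = values[i] * scale
};

// Symmetric per-token quantization: the scale is derived once per token
// from the largest absolute activation in that token.
QuantizedToken quantize_token(const std::vector<float>& row) {
    float max_abs = 0.0f;
    for (float v : row) max_abs = std::max(max_abs, std::fabs(v));
    const float scale = max_abs > 0.0f ? max_abs / 127.0f : 1.0f;

    QuantizedToken q{std::vector<int8_t>(row.size()), scale};
    for (std::size_t i = 0; i < row.size(); ++i) {
        long r = std::lround(row[i] / scale);
        r = std::max(-127L, std::min(127L, r));  // keep within the symmetric int8 range
        q.values[i] = static_cast<int8_t>(r);
    }
    return q;
}

// One FC output element for one token. The inner loop does integer work only;
// the per-token scale (and an assumed per-output weight scale) is applied
// once, after the loop.
float fc_output(const QuantizedToken& q, const std::vector<int8_t>& weights, float weight_scale) {
    int32_t acc = 0;
    for (std::size_t i = 0; i < weights.size(); ++i) {
        acc += static_cast<int32_t>(q.values[i]) * static_cast<int32_t>(weights[i]);
    }
    return static_cast<float>(acc) * q.scale * weight_scale;
}
```

With per-group quantization the scale changes every group, so it has to be applied inside the accumulation over groups; with one scale per token that multiplication can be hoisted as above.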

+ Resolved accuracy issue
+ Cleared OOR error

Signed-off-by: Min, Byungil <[email protected]>
Signed-off-by: Min, Byungil <[email protected]>
Signed-off-by: Min, Byungil <[email protected]>
Signed-off-by: Min, Byungil <[email protected]>
+ Fixed CI issue
+ Added unit-tests

Signed-off-by: Min, Byungil <[email protected]>
Signed-off-by: Min, Byungil <[email protected]>
@byungilm byungilm force-pushed the validate_per_token_dyn_quan branch from 75e1dda to 5edca58 Compare December 24, 2024 05:52
@byungilm byungilm added this pull request to the merge queue Dec 26, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Dec 26, 2024
@byungilm byungilm enabled auto-merge January 1, 2025 14:34
@byungilm byungilm added this pull request to the merge queue Jan 1, 2025
2 participants