This repository has been archived by the owner on Aug 30, 2024. It is now read-only.

resubmit "Implement the YaRN rop scaling feature" #147

Merged (1 commit) on Mar 4, 2024

@xiguiw commented on Mar 1, 2024

Resubmit "Implement the YaRN rop scaling feature (#109)"

Fix Windows build error.

Type of Change

Implement the YaRN feature.

Add two new APIs for YaRN RoPE scaling:

NE_API struct ne_tensor* ne_rope_custom_inplace(struct ne_context* ctx, struct ne_tensor* a, int n_past, int n_dims, int mode,
int prompt_size, float freq_base, float freq_scale, int yarn_orig_ctx, float ext_factor, float attn_factor,
float beta_fast, float beta_slow);

NE_API struct ne_tensor* ne_rope_custom_shift_inplace(struct ne_context* ctx, struct ne_tensor* a, int n_shift, int n_dims,
int mode, int prompt_size, int n_keep, struct ne_tensor* cossin,
float freq_base, float freq_scale, int yarn_orig_ctx, float ext_factor, float attn_factor,
float beta_fast, float beta_slow);

Change the ne_layer internal API.
Original:
struct ne_tensor* ne_rope_impl(struct ne_context* ctx, struct ne_tensor* a, int n_past, int n_dims, int mode,
int prompt_size, bool inplace, int n_keep, struct ne_tensor* cossin, int* n_padding,
bool padding_left, float freq_base, float freq_scale)

New:
struct ne_tensor* ne_rope_impl(struct ne_context* ctx, struct ne_tensor* a, int n_past, int n_dims, int mode,
int prompt_size, bool inplace, int n_keep, struct ne_tensor* cossin, int* n_padding,
bool padding_left, float freq_base, float freq_scale,
int yarn_orig_ctx, float ext_factor, float attn_factor,
float beta_fast, float beta_slow)

Description

Expected Behavior & Potential Risk

No behavior changes for any model, since YaRN is not called.

How has this PR been tested?

Tested llama2 7B in fp32, q40, and q4j formats. No changes found so far.

Dependency Change?

No.

@VincyZhang requested a review from intellinjun on March 4, 2024, 01:30
@VincyZhang merged commit 6c36f54 into intel:main on Mar 4, 2024
11 checks passed
3 participants