Refactors #27

doomslide · 2024-10-07T08:27:12Z

Added full attention scores + some memory optimization for kvcache + other minor things

doomslide · 2024-10-07T14:43:55Z

Did a bit more cleanup. The biggest change is factoring generate out of main and initialize out of generate; they have different logic, so they are easier to debug separately.
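A rough sketch of the split described above, for readers who have not opened the diff. This is a toy illustration, not the code in this PR: `StepFn`, `GenState`, `toy_step`, and all function signatures are hypothetical stand-ins, and the "model" is a plain Python function so the example runs without JAX.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Set

# Hypothetical stand-in for the model: (token, position) -> next token id.
StepFn = Callable[[int, int], int]


@dataclass
class GenState:
    """What initialization hands over to the generation loop."""
    next_token: int
    cur_pos: int


def initialize(step: StepFn, prompt: List[int]) -> GenState:
    """Prefill: consume the whole prompt once and produce the first token."""
    if not prompt:
        raise ValueError("prompt must contain at least one token")
    token = prompt[0]
    for pos, tok in enumerate(prompt):
        token = step(tok, pos)
    return GenState(next_token=token, cur_pos=len(prompt))


def generate(step: StepFn, state: GenState, stop: Set[int], max_len: int) -> Iterator[int]:
    """Decode loop: one token per step until a stop token or max_len."""
    token, pos = state.next_token, state.cur_pos
    yield token
    while pos < max_len and token not in stop:
        token = step(token, pos)
        pos += 1
        yield token


def main() -> List[int]:
    """Only wires the pieces together; no model logic lives here."""
    toy_step: StepFn = lambda tok, pos: (tok + 1) % 10
    state = initialize(toy_step, [1, 2, 3])
    return list(generate(toy_step, state, stop={7}, max_len=16))
```

With this shape, a bug in prefill can be reproduced by calling `initialize` alone, and the decode loop can be tested from a hand-built `GenState`.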


qdbp · 2024-10-08T14:00:52Z

@hallucinomeny since this PR looks like it's ahead of my #40 in the queue and you also "lift" the generator function out of main, would you be open to moving to the dataclass-based impl I have there?

I think that provides better encapsulation and will also play more nicely with the "sampler interface standardization" I have in mind. Specifically:

from dataclasses import dataclass
from typing import Generator, Generic

import jax.numpy as jnp

# XfmrWeights, ModelParams, Tokenizer, KVCache, xfmr, build_attn_mask and
# precompute_freqs_cis come from the entropix codebase; EntropySampler,
# Cfg_contra and ST belong to #40.


@dataclass(kw_only=True)
class TokenGenerator(Generic[Cfg_contra, ST]):
  weights: XfmrWeights
  model_params: ModelParams
  tokenizer: Tokenizer
  sampler: EntropySampler[Cfg_contra, ST]
  sampler_cfg: Cfg_contra

  def generate_from_prompt(self, init_tokens) -> Generator[str, None, None]:
    # Prefill: run the whole prompt through the model once.
    cur_pos = 0
    tokens = jnp.array([init_tokens], jnp.int32)
    bsz, seqlen = tokens.shape
    attn_mask = build_attn_mask(seqlen, cur_pos)
    mp = self.model_params
    freqs_cis = precompute_freqs_cis(mp.head_dim, mp.max_seq_len, mp.rope_theta, mp.use_scaled_rope)
    kvcache = KVCache.new(mp.n_layers, bsz, mp.max_seq_len, mp.n_local_kv_heads, mp.head_dim)
    logits, kvcache, _, _ = xfmr(self.weights, mp, tokens, cur_pos, freqs_cis[:seqlen], kvcache, attn_mask=attn_mask)
    # The first generated token is picked greedily; the sampler takes over below.
    next_token = jnp.argmax(logits[:, -1], axis=-1, keepdims=True).astype(jnp.int32)
    gen_tokens = next_token

    yield self.tokenizer.decode([next_token.item()])

    cur_pos = seqlen
    # Stop-token ids; generation ends as soon as any of them is sampled.
    stop = jnp.array([128001, 128008, 128009])
    state: ST | None = None
    # Decode loop: one token per step, threading the sampler state through.
    while cur_pos < 8192:
      cur_pos += 1
      logits, kvcache, scores, _ = xfmr(
        self.weights, mp, next_token, cur_pos, freqs_cis[cur_pos : cur_pos + 1], kvcache
      )
      next_token, state = self.sampler(gen_tokens, logits, scores, cfg=self.sampler_cfg, state=state)
      gen_tokens = jnp.concatenate((gen_tokens, next_token))
      yield self.tokenizer.decode(next_token.tolist()[0])
      if jnp.isin(next_token, stop).any():
        break

Obviously this includes changes that are only part of my PR (e.g. the sampler and its state being arguments), but if this structure is used, those can be deconflicted easily later.
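To make the sampler contract implied by the `self.sampler(gen_tokens, logits, scores, cfg=..., state=...)` call concrete, here is a minimal sketch. It is an assumption inferred from that one call site, not the definition in #40: the `EntropySampler` protocol body, `GreedyConfig`, `StepCount`, `greedy_sampler`, and `run_sampler` are all hypothetical, and plain Python lists stand in for JAX arrays so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Any, List, Optional, Protocol, Sequence, Tuple, TypeVar

Cfg_contra = TypeVar("Cfg_contra", contravariant=True)
ST = TypeVar("ST")


class EntropySampler(Protocol[Cfg_contra, ST]):
    """Assumed shape: a callable that returns (next_token, new_state)."""

    def __call__(
        self, gen_tokens: Any, logits: Any, scores: Any,
        *, cfg: Cfg_contra, state: Optional[ST] = None,
    ) -> Tuple[Any, ST]: ...


@dataclass(frozen=True)
class GreedyConfig:
    banned_token: int = -1  # hypothetical knob: one token id to never emit


@dataclass
class StepCount:
    steps: int = 0  # hypothetical sampler state: number of calls so far


def greedy_sampler(gen_tokens, logits: Sequence[float], scores,
                   *, cfg: GreedyConfig, state: Optional[StepCount] = None):
    """Toy sampler: argmax over the logits, skipping cfg.banned_token."""
    if state is None:
        state = StepCount()
    candidates = [i for i in range(len(logits)) if i != cfg.banned_token]
    best = max(candidates, key=lambda i: logits[i])
    return best, StepCount(steps=state.steps + 1)


def run_sampler(sampler: EntropySampler[Cfg_contra, ST], cfg: Cfg_contra,
                logits_per_step: Sequence[Sequence[float]]):
    """Thread the state through successive calls, as the decode loop does."""
    gen_tokens: List[int] = []
    state: Optional[ST] = None
    for logits in logits_per_step:
        token, state = sampler(gen_tokens, logits, None, cfg=cfg, state=state)
        gen_tokens.append(token)
    return gen_tokens, state
```

Because the generator only depends on the call signature, any sampler with its own config and state types can be swapped in without touching the decode loop.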

doomslide added 30 commits October 5, 2024 08:54

attn_entropy visualization

426281c

Merge branch 'main' of https://github.com/xjdr-alt/entropix

966b1a6

factored stats out of attention

c080dde

factored stats out of attention

1773362

merged with origin

7bc0599

initial

167effe

lets goo

793feae

added stft

7d450f8

merged with origin

530e793

attn_entropy visualization

029d844

Merge branch 'main' of https://github.com/xjdr-alt/entropix

caec96f

repeated 'def main'

5e8a241

removed repeated 'def main'

6c6a516

merged main

65b7a6c

deleted tokenizer directory

8bbc014

renaming

982e247

implemented forensics

36d7711

forensics still not working

4c6b7ad

attn_entropy visualization

1abb5bc

factored stats out of attention

a26ded2

factored stats out of attention

f71de86

attn_entropy visualization

627b82b

Merge branch 'main' of https://github.com/xjdr-alt/entropix

06d3f2f

Merge branch 'shrek' into frog

99a3696

cleaned some dependencies

835e6a8

rope tests

a09a6da

moved shit around

4167572

minor

d900250

rewrite of xmfr

f621d9a

Merge branch 'main' into frog

a887232

doomslide added 21 commits October 7, 2024 01:44

still weird

e410d40

still off...

c729795

Merge remote-tracking branch 'origin' into forensics

cb33990

fixed params

80aa926

small fix

fa5d25f

Merge branch 'frog' into refactor

d114369

refactored calls to sampler

bc4d15c

refactored rope

bad1767

fixed score update

5a8d007

refactors

c6ac8aa

refactoring around attention stats

f612a75

Merge remote-tracking branch 'origin' into refactor

7456c88

refactors of kvcache and attn_stats

33d1a25

removed lm_state

04b7049

.

c72bce2

Merge remote-tracking branch 'origin' into forensics

15fa837

Merge branch 'refactor' into forensics

db77fe3

factored generation out of main and initialization out of generation

6736035

.

9f15bf0

.

c864f3b

cleaned imports

4e931e3

doomslide and others added 6 commits October 7, 2024 18:01

Delete entropix/lm_state.py

60499c8

redundant right now

added vanilla generator

7101fa0

Merge remote-tracking branch 'fork/refactor' into refactor

c979592

.

f4233ef

cleaned some accidental nonsense

5e1de5a

some scoring for ar generation

5e9e79c

xjdr-alt mentioned this pull request Oct 7, 2024

~nothing #9

Closed
