
fix: better "generate more" functionality #891

Open · codedealer wants to merge 6 commits into dev from fix/generate-more
Conversation

@codedealer (Contributor) commented Apr 7, 2024

Reasoning:
Empirically, continuation of already generated text is more consistent when the latest message is supplied as-is. With that in mind, this PR removes {{post}} from the template when the request is passed with kind `continue`.

Update
Our goal here is not to manipulate {{post}} per se, but to cut the prompt as close as possible to the last response in history, letting the model continue the generation.

New algorithm:

  1. Manipulate the AST so that history is the last node (if present).
  2. When rendering complex nodes (conditionals, iterables), cut strictly after the last message is rendered.

That means that anything written after history in validated/unvalidated presets, such as the ujb and modifiers, will be erased, which can affect generation results. However, I have tested with different prompt formats (albeit at sub-1 temperatures) and this approach beats the existing behavior every time. A sketch of the idea follows.
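A minimal TypeScript sketch of step 1, assuming a simplified node shape (`PNode` here is a stand-in for the parser's real node types, and `truncateAfterHistory` is a hypothetical helper, not code from this PR):

```ts
// Hypothetical sketch: make history the last node of the parsed template
// AST so the prompt ends on the message the model should continue.
type PNode = string | { kind: string; children?: PNode[] }

function truncateAfterHistory(ast: PNode[]): PNode[] {
  const idx = ast.findIndex((n) => typeof n !== 'string' && n.kind === 'history')
  // No history placeholder: leave the template untouched
  if (idx === -1) return ast
  // Drop everything rendered after history (ujb, modifiers, {{post}}, trailing text)
  return ast.slice(0, idx + 1)
}
```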

Known issues:
Currently, concatenating the existing message and the newly generated response inserts either a space or a newline between the chunks (which extends the existing behavior of always inserting a space). If the existing message chunk cuts off mid-word, this places whitespace in the middle of a word. However, the chance of that is small enough compared to the work needed to handle the case, so we ignore it for now.

Additional information
Closes: #869
Discord thread

common/prompt.ts (outdated)

```diff
@@ -503,7 +499,7 @@ function createPostPrompt(
   >
 ) {
   const post = []
-  post.push(`${opts.replyAs.name}:`)
+  opts.kind !== 'continue' && post.push(`${opts.replyAs.name}:`)
```
@sceuick (Member):

Use an `if` statement here instead of JSX-ish inlining.
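That is, something along these lines (a sketch of the reviewer's suggestion applied to the diff above, not a committed change):

```ts
if (opts.kind !== 'continue') {
  post.push(`${opts.replyAs.name}:`)
}
```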

@sceuick (Member) commented Apr 7, 2024

You probably just need to update the snapshots using `pnpm snapshot`.
I can't see anything obviously wrong with the snapshot failures. It'll be easier to see if they're okay once you push the updated snapshots.

```diff
@@ -139,6 +139,18 @@ export async function parseTemplate(
   }

   const ast = parser.parse(template, {}) as PNode[]
+
+  // hack: when we continue, remove the post along with the last newline from tree
+  if (opts.continue && ast.length > 1) {
```
@sceuick (Member):

In unvalidated prompts the {{post}} placeholder can technically be anywhere.
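For illustration, a hypothetical unvalidated prompt where {{post}} is not the final placeholder ({{system_prompt}} is an assumed placeholder name; {{post}}, {{history}}, and ujb come from this discussion):

```
{{system_prompt}}
{{history}}
{{post}}
{{ujb}}
```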

@codedealer (Author) commented Apr 8, 2024

That's a good point! I have re-examined my whole approach to this problem. Our goal here is not to manipulate {{post}} per se, but to cut the prompt as close as possible to the last response in history, letting the model continue the generation. I have rolled back my change to createPostPrompt.

New algorithm:

  1. Manipulate the AST so that history is the last node (if present).
  2. When rendering complex nodes (conditionals, iterables), cut strictly after the last message is rendered.

That means that anything written after history in validated/unvalidated presets, such as the ujb and modifiers, will be erased, which can affect generation results. However, I have tested with different prompt formats (albeit at sub-1 temperatures) and this approach beats the existing behavior every time.

@sceuick (Member):

We might be able to avoid modifying the template-parser if we send the correct history/lines from the frontend?

@sceuick (Member):

Yeah, realistically the parser shouldn't make any assumptions about the parts (including history) that it is passed, so it shouldn't be modifying them.

@codedealer (Author):

I don't think that's possible. It works without any modification to the template if {{history}} is used: then we can just cut the tree, and that is enough. The complexity arises from the use of {{each msg}}, because it allows users to modify how each individual line in history is rendered.

So the line we pass from the frontend may be perfectly fine (e.g. `Hello world`), but upon render it turns into `Hello world\n\n`. There could be any symbols at the end, not just newlines, and after rendering it is impossible to determine whether they came from the template or from the message. This is a major cause of interference when it comes to continuing the message.

The only way I found to ensure that the prompt ends on what was sent by the bot is to change how history props are handled in the template parser. See the illustration below.
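For example (an illustrative Agnai-style template; the exact each-block syntax is an assumption), a per-message block like this appends a separator after every rendered line, including the last one:

```
{{#each msg}}{{.name}}: {{.msg}}
---
{{/each}}
```

Here even the final bot message is followed by `\n---\n` that came from the template rather than the message, so the rendered prompt no longer ends on the bot's own text.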

```ts
if (opts.continue && i === 0 && isHistory) break children_loop
```

@codedealer (Author):

I guess an alternative would be to have a completely different "continue" template that is triggered in this particular case?

codedealer added 6 commits May 2, 2024 02:36
Note: this approach doesn't account for the case where the continuation starts mid-word (whitespace will be inserted anyway). However, statistically the chance of this happening is low.
@codedealer force-pushed the fix/generate-more branch from ce433fc to e54b72e on May 2, 2024 02:18
@codedealer (Author):

I tried another approach that doesn't require editing the template parser, although I think there was nothing wrong with the previous one, and it worked better because it was more granular.

The new method modifies the AST to always put {{history}} in the tree. Although this can interfere with the user's template in a substantial way, it ensures that generation always starts from where it left off.

Sentence concatenation also changed a bit, with more reliance on the LLM to compose it. Previously there was a discrepancy between the message on the client and the prompt, where the prompt was always trimmed; now the client message is also trimmed, so the LLM and the user see the same thing.
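A rough sketch of the concatenation described above (`appendContinuation` is a hypothetical name; the PR's actual implementation may differ):

```ts
// Hypothetical sketch: trim the stored chunk so the client message
// matches the trimmed prompt, then join it with the new generation.
function appendContinuation(existing: string, continuation: string): string {
  const base = existing.trimEnd()
  // A separator is still inserted between the chunks; if the original
  // text cut off mid-word this yields a space mid-word (the known issue)
  return base + ' ' + continuation.trimStart()
}
```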

@codedealer requested a review from sceuick on May 2, 2024 02:25
Labels: none yet
Projects: none yet

Successfully merging this pull request may close: The "Generate more" button does not work properly (#869)

2 participants: codedealer, sceuick