
Guarding against LLMs that would learn our repo "by heart" #119

Open
tischi opened this issue Sep 7, 2024 · 1 comment

Comments

tischi (Collaborator) commented Sep 7, 2024

Even though I think LLMs generally do not work like this, I still wonder whether we could guard against some otherwise super-dumb LLM simply learning our repo by heart and then achieving great results.

Given the discussions in #118, I wonder whether we could maintain a separate, secret branch where we ask the conceptually same questions, but with slight modifications.

Maybe:

  • changing the English in the prompt a bit
  • changing the actual values in the input data and the corresponding assertions
  • changing the order of the input and output arguments
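The value-perturbation idea from the list could be sketched roughly like this — a hypothetical helper (the names `perturb_case` and the example prompt are mine, not from the repo), assuming each benchmark case is a prompt plus concrete inputs and an assertion that can be recomputed:

```python
import random

def perturb_case(prompt, inputs, expected_fn, seed=0):
    """Create a secret-branch variant of a benchmark case: same conceptual
    question, different concrete values, assertion recomputed to match."""
    rng = random.Random(seed)  # fixed seed so the secret variant is reproducible
    new_inputs = [x + rng.randint(1, 9) for x in inputs]  # shift each value
    return prompt, new_inputs, expected_fn(new_inputs)

# Hypothetical example case: ask for the sum of a list.
prompt = "Write a function that returns the sum of a list."
original_inputs = [1, 2, 3]
_, variant_inputs, variant_expected = perturb_case(prompt, original_inputs, sum, seed=42)

assert variant_inputs != original_inputs        # values actually changed
assert variant_expected == sum(variant_inputs)  # assertion matches the new values
```

A memorized answer keyed to the public values would fail the variant, while a model that actually solves the task passes both.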

It would be a bit of work...but maybe worth it?

What do you think?

haesleinhuepf (Owner) commented

I'm currently working on training such an LLM, because I wanted to know how to achieve this:

"and then achieve great results."

When I'm done, I'll share it (+ training scripts) and we can develop a strategy against it.
