feat: Add prompt injection protection mechanism #28

Merged 11 commits on Aug 21, 2024

Conversation

@nextedoff commented Aug 16, 2024

Purpose

This PR adds a way to implement new protection methods for the query, starting with prompt injection protection. The injection protection model used is protectai/deberta-v3-base-prompt-injection.
New models and guard mechanisms can be added to promptprotection.py.
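
A minimal sketch of how such a pluggable guard could be organized, not the code from this PR. The class name PromptProtection and its protections dictionary are mentioned later in this conversation; everything else (register, violations, make_injection_check, load_hf_classifier, fake_classifier, the 0.5 threshold) is hypothetical. The "INJECTION" label is assumed from the model card and should be checked against the model's actual output.

```python
from typing import Callable, Dict, List

# A classifier maps a query to a result such as {"label": "INJECTION", "score": 0.99}.
Classifier = Callable[[str], Dict[str, object]]


class PromptProtection:
    """Hypothetical registry of named query checks (not the PR's actual class)."""

    def __init__(self) -> None:
        # Maps a protection name to a check that returns True when the query should be blocked.
        self.protections: Dict[str, Callable[[str], bool]] = {}

    def register(self, name: str, check: Callable[[str], bool]) -> None:
        self.protections[name] = check

    def violations(self, query: str) -> List[str]:
        """Return the names of all protections that flag the query."""
        return [name for name, check in self.protections.items() if check(query)]


def make_injection_check(classifier: Classifier, threshold: float = 0.5) -> Callable[[str], bool]:
    """Wrap a classifier so it flags queries labeled INJECTION at or above the threshold."""

    def check(query: str) -> bool:
        result = classifier(query)
        return result["label"] == "INJECTION" and float(result["score"]) >= threshold

    return check


def load_hf_classifier() -> Classifier:
    """Build a classifier from the Hugging Face model named in this PR (downloads the model)."""
    # Imported lazily so the rest of the sketch runs without transformers installed.
    from transformers import pipeline

    pipe = pipeline("text-classification", model="protectai/deberta-v3-base-prompt-injection")
    # For a single string, the pipeline returns a one-element list of {"label", "score"} dicts.
    return lambda query: pipe(query)[0]


def fake_classifier(query: str) -> Dict[str, object]:
    """Stand-in classifier for local experiments; it only matches one known phrase."""
    flagged = "ignore previous instructions" in query.lower()
    return {"label": "INJECTION" if flagged else "SAFE", "score": 0.99}


guard = PromptProtection()
guard.register("injection", make_injection_check(fake_classifier))
```

Swapping fake_classifier for load_hf_classifier() would route the same check through the real model. Keeping the classifier injectable lets tests run without downloading model weights.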

[Screenshots of the user interface setting]

Currently, injection protection can be turned on through the API or the user interface, as seen in the screenshot, or it can be set through an environment variable:
USE_INJECTION_PROTECTION="true"
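
A small sketch of how a backend might read this flag, assuming a per-request setting takes precedence over the environment variable. That precedence, the accepted truthy spellings, and the helper names env_flag and injection_protection_enabled are assumptions for illustration, not behavior confirmed by this PR.

```python
import os
from typing import Mapping, Optional

# Spellings treated as "enabled"; an assumption, the PR only shows "true".
TRUTHY = {"1", "true", "yes", "on"}


def env_flag(name: str, default: bool = False,
             environ: Optional[Mapping[str, str]] = None) -> bool:
    """Read a boolean flag from the environment, falling back to a default when unset."""
    env = os.environ if environ is None else environ
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in TRUTHY


def injection_protection_enabled(request_override: Optional[bool] = None,
                                 environ: Optional[Mapping[str, str]] = None) -> bool:
    """Hypothetical resolution order: explicit request setting first, then the environment."""
    if request_override is not None:
        return request_override
    return env_flag("USE_INJECTION_PROTECTION", environ=environ)
```

Passing environ explicitly keeps the helpers testable without mutating the process environment.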

Does this introduce a breaking change?

When developers merge from main and run the server, azd up, or azd deploy, will this produce an error?
If you're not sure, try it out on an old environment.

  • Yes
  • No

Does this require changes to learn.microsoft.com docs?

  • Yes
  • No

Type of change

  • Bugfix
  • Feature
  • Code style update (formatting, local variables)
  • Refactoring (no functional changes, no api changes)
  • Documentation content changes
  • Other... Please describe:

Code quality checklist

  • The current tests all pass (python -m pytest).
  • I added tests that prove my fix is effective or that my feature works
  • I ran python -m pytest --cov to verify 100% coverage of added lines
  • I ran python -m mypy to check for type errors
  • I either used the pre-commit hooks or ran ruff and black manually on my code.

@nextedoff added the enhancement (New feature or request) label Aug 16, 2024
@nextedoff requested a review from phoevos Aug 18, 2024 17:27
@nextedoff force-pushed the feature-injection-model branch from 36d880f to 7d2d98b Aug 18, 2024 18:17
@phoevos left a comment

Thanks for opening this!

Left some comments with suggestions, but I'm not finished yet. I'll continue tomorrow to check the core of the implementation and the frontend changes (which I'm duly excited about)!

infra/main.parameters.json (outdated, resolved)
app/backend/api_wrappers/openai.py (outdated, resolved)
app/backend/api_wrappers/hugging_face.py (outdated, resolved)
app/backend/app.py (outdated, resolved)
app/backend/app.py (outdated, resolved)
app/backend/core/promptprotection.py (outdated, resolved)
app/backend/error.py (outdated, resolved)
app/backend/error.py (outdated, resolved)
@phoevos changed the title from "feat: Injection protection model" to "feat: Add prompt injection protection mechanism" Aug 20, 2024
Fixed docstrings in promptprotection.py, slightly rephrasing some for
clarity and making sure that they're in proper markdown so that they
are rendered correctly in the documentation.
Renamed the 'config' dictionary of the 'PromptProtection' class to
'protections'.

Signed-off-by: Phoevos Kalemkeris <[email protected]>
Signed-off-by: Phoevos Kalemkeris <[email protected]>
@phoevos left a comment

I'll do some quick manual testing, but other than that LGTM!

app/backend/app.py (outdated, resolved)
@phoevos merged commit 7a1c2e1 into main Aug 21, 2024
11 checks passed
@phoevos deleted the feature-injection-model branch Aug 21, 2024 17:34