Add support for finetune guard classifier #325
base: main
Conversation
Signed-off-by: Vibhu Jawa <[email protected]>
- Demonstrated strong zero-shot detection capabilities on novel attacks
- Particularly effective at identifying trigger patterns in partially poisoned datasets

Dataset Format:
TODO: Emphasize more about English.
Added a few typo fixes, but it looks good so far!

When you get a chance, can you add it to the examples/ and nemo_curator/scripts/ folders? And to the documentation? You should be able to reference #361, which has all the files that should be created/updated.
class AegisModel(nn.Module):
    def __init__(
        self,
        pretrained_model_name_or_path: str,
        peft_model_name_or_path: str,
        dtype: torch.dtype,
-       token: str,
+       token: Optional[Union[str, bool]],
        add_fintune_gaurd: bool = False,
Suggested change:
-        add_fintune_gaurd: bool = False,
+        add_finetune_guard: bool = False,
    ):
        super().__init__()
        base_model = AutoModelForCausalLM.from_pretrained(
            pretrained_model_name_or_path, torch_dtype=dtype, token=token
        )
        self.model = PeftModel.from_pretrained(base_model, peft_model_name_or_path)
        self.autocast = autocast
        self.add_fintune_gaurd = add_fintune_gaurd
Suggested change:
-        self.add_fintune_gaurd = add_fintune_gaurd
+        self.add_finetune_guard = add_finetune_guard
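Taking the rename suggestions above together, here is a minimal self-contained sketch of how the constructor would read; the imports and the autocast parameter are assumptions, since they sit outside the quoted hunks:

from typing import Optional, Union

import torch
import torch.nn as nn
from peft import PeftModel
from transformers import AutoModelForCausalLM

class AegisModel(nn.Module):
    def __init__(
        self,
        pretrained_model_name_or_path: str,
        peft_model_name_or_path: str,
        dtype: torch.dtype,
        token: Optional[Union[str, bool]],
        add_finetune_guard: bool = False,
        autocast: bool = False,  # assumption: assigned below but declared outside the quoted hunk
    ):
        super().__init__()
        # Load the base causal LM, then wrap it with the Aegis PEFT adapter.
        base_model = AutoModelForCausalLM.from_pretrained(
            pretrained_model_name_or_path, torch_dtype=dtype, token=token
        )
        self.model = PeftModel.from_pretrained(base_model, peft_model_name_or_path)
        self.autocast = autocast
        self.add_finetune_guard = add_finetune_guard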
        if self.add_fintune_gaurd:
Suggested change:
-        if self.add_fintune_gaurd:
+        if self.add_finetune_guard:
            pad_token_id=0,
        )

    def _forward(self, batch):
        if self.add_fintune_gaurd:
Suggested change:
-        if self.add_fintune_gaurd:
+        if self.add_finetune_guard:
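The shape implied by the diff is a branch on the new flag inside _forward. Everything inside the branches below is a hypothetical placeholder (the guard head and the non-guard path are outside the quoted hunks):

    def _forward(self, batch):
        if self.add_finetune_guard:
            # Hypothetical: surface hidden states so a small guard head
            # (defined elsewhere in the PR) can classify the sequence.
            outputs = self.model(
                batch["input_ids"],
                attention_mask=batch["attention_mask"],
                output_hidden_states=True,
            )
            return self.finetune_guard_head(outputs.hidden_states[-1][:, -1, :])
        # Default Aegis path: a plain forward pass through the PEFT-wrapped model.
        return self.model(batch["input_ids"], attention_mask=batch["attention_mask"])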
        if self.config.add_finetune_guard:
            if self.config.finetune_guard_path is None:
                raise ValueError(
                    "finetune_guard_path must be provided if add_fine_guard is True"
"finetune_guard_path must be provided if add_fine_guard is True" | |
"finetune_guard_path must be provided if add_finetune_guard is True" |
            peft_model_name_or_path=self.config.peft_model_name_or_path,
            dtype=self.config.dtype,
            token=self.config.token,
            add_fintune_gaurd=self.config.add_finetune_guard,
Suggested change:
-            add_fintune_gaurd=self.config.add_finetune_guard,
+            add_finetune_guard=self.config.add_finetune_guard,
This PR adds support for the FineTuneGuard model.
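For readers following along, a hypothetical instantiation under the renamed keyword; the model and adapter paths are placeholders, not real checkpoint names:

import torch

model = AegisModel(
    pretrained_model_name_or_path="path/to/base-model",    # placeholder
    peft_model_name_or_path="path/to/aegis-peft-adapter",  # placeholder
    dtype=torch.bfloat16,
    token=True,  # reuse the locally cached Hugging Face token
    add_finetune_guard=True,
)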