
Integrate AutoModelForSequenceClassification through PytorchModel #339

Conversation

abarbosa94 (Collaborator)

Description

  • This PR adds an initial implementation of support for the HuggingFace SequenceClassification model. I also organized imports using isort and made a few adjustments to increase flake8 compliance.

This should be taken as an initial step toward: #238, #103, and #217

Implemented changes

  • Modify the PyTorchModel prediction method to accept HuggingFace models (see the sketch below)
  • Add new tests that cover the new feature
  • Add flake8 parametrization to tox
  • Reorder some imports according to isort
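
For orientation, the sketch below shows roughly how the prediction path branches on a HuggingFace model. This is a hedged illustration, not the committed code: the standalone function form, the Mapping check, and the final error message are assumptions; the isinstance branches, the Tokenizers-output requirement, and the `.logits` access are taken from the review discussion below.

# Hedged sketch of the branching this PR introduces; not the exact PyTorchModel code.
import torch
import torch.nn.functional as F
from collections.abc import Mapping
from torch import nn
from transformers import PreTrainedModel

def predict(model, x, device="cpu", softmax=False, **model_predict_kwargs):
    if isinstance(model, PreTrainedModel):
        # HuggingFace models expect the tokenizer output (a mapping of tensors) as `x`.
        if not isinstance(x, Mapping):
            raise ValueError(
                "When using HuggingFace pretrained models, please use Tokenizers output for `x`"
            )
        logits = model(**x, **model_predict_kwargs).logits
        return F.softmax(logits, dim=-1) if softmax else logits
    if isinstance(model, nn.Module):
        return model(torch.Tensor(x).to(device), **model_predict_kwargs)
    raise ValueError("Expected a HuggingFace PreTrainedModel or a torch.nn.Module.")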

Minimum acceptance criteria

  • @mentions of the person who is apt to review these changes, e.g., @annahedstroem

@abarbosa94 abarbosa94 self-assigned this Mar 4, 2024
@abarbosa94 abarbosa94 requested a review from aaarrti March 4, 2024 15:16
@abarbosa94 abarbosa94 added the enhancement New feature or request label Mar 4, 2024
@abarbosa94 abarbosa94 changed the title Adds Initial HuggingFace integration to PytorchModel Integrate AutoModelForSequenceClassification through PytorchModel Mar 4, 2024
raise ValueError(
"When using HuggingFace pretrained models, please use Tokenizers output for `x`"
)
pred = self.model(**x, **model_predict_kwargs).logits
Member

Should we also enable softmax here (after accessing the logits), so that we convert the pred with softmax if softmax=True? (We can add it as a class attribute above.)
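
A minimal sketch of that suggestion, assuming a boolean softmax attribute as proposed (not necessarily what was committed in 9a67c4c):

# Minimal sketch: apply softmax to the HuggingFace logits when requested.
# `to_prediction` and the `softmax` parameter are illustrative names.
import torch
import torch.nn.functional as F

def to_prediction(logits: torch.Tensor, softmax: bool) -> torch.Tensor:
    return F.softmax(logits, dim=-1) if softmax else logits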

Collaborator Author

I implemented it this way, please see if you agree: 9a67c4c

pyproject.toml Outdated
@@ -36,7 +36,9 @@ dependencies = [
"scipy>=1.7.3",
"tqdm>=4.62.3",
"matplotlib>=3.3.4",
"typing_extensions; python_version <= '3.8'"
"typing_extensions; python_version <= '3.8'",
"transformers<=4.30.2; python_version == '3.7'",
Member

I wonder if this should be part of the base dependencies. I think it would fit better under 'torch' (see lines 78/80 and below)?

Collaborator

I'd say those should go into

[project.optional-dependencies]
transformers = [...]



@pytest.fixture(scope="session", autouse=True)
def mock_hf_text():
Collaborator

contrary to the name, this is not a mock 🤷

Collaborator Author

Haha, good catch! Just renamed it



CIFAR_IMAGE_SIZE = 32
MNIST_IMAGE_SIZE = 28
BATCH_SIZE = 124
MINI_BATCH_SIZE = 8
RANDOM_SEED = 42

set_seed(42)
Collaborator

I can't believe we have forgotten to set the PRNG seed 🤦
Thanks for noticing!

Maybe, to ensure each test runs with the same PRNG state, we could do:

@pytest.fixture(scope='function', autouse=True)
def reset_prngs():
    # reseed every framework's PRNG before each test
    random.seed(RANDOM_SEED)
    np.random.seed(RANDOM_SEED)
    tf.random.set_seed(RANDOM_SEED)
    torch.manual_seed(RANDOM_SEED)

Collaborator Author

set_seed from huggingface ensures all of these (and some others as well), but using autouse is a clever idea :) just did it!
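
For reference, a sketch of what such an autouse fixture could look like with the HuggingFace helper (the actual conftest may differ):

# Sketch of a function-scoped, autouse fixture that reseeds the PRNGs before each test.
import pytest
from transformers import set_seed

RANDOM_SEED = 42  # matches the constant shown in the excerpt above

@pytest.fixture(scope="function", autouse=True)
def reset_prngs():
    # transformers.set_seed seeds Python's random, NumPy, and torch
    # (and TensorFlow when it is available).
    set_seed(RANDOM_SEED)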


return model


@pytest.fixture(scope="session", autouse=True)
Collaborator

Please remove autouse;
autouse=True will force the model to be loaded into memory every time any test is executed, even if the test does not use it.

Collaborator Author

Sorry, just did it

elif isinstance(self.model, nn.Module):
pred_model = self.get_softmax_arg_model()
pred = pred_model(torch.Tensor(x).to(self.device), **model_predict_kwargs)
return pred
Collaborator

Let's try not to return None; either return a tensor or raise an exception.
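
In other words, every path out of the method should either produce a tensor or raise, rather than falling through to a pred = None sentinel; a minimal sketch of that shape (the helper form and the error message are hypothetical):

# Minimal sketch: no `pred = None` sentinel, every path returns a tensor or raises.
import torch
from torch import nn

def predict(model, x, **kwargs) -> torch.Tensor:
    if isinstance(model, nn.Module):
        return model(torch.as_tensor(x, dtype=torch.float32), **kwargs)
    raise ValueError(f"Unsupported model type: {type(model)}")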

Collaborator Author

Done: 179da1e

raise ValueError(
"When using HuggingFace pretrained models, please use Tokenizers output for `x`"
)
pred = self.model(**x, **model_predict_kwargs).logits
Collaborator

just return self.model(**x, **model_predict_kwargs).logits

Collaborator Author (@abarbosa94, Mar 18, 2024)

I did it slightly differently (in 179da1e) to handle the softmax param and raise properly. Could you see if you agree? Thanks

Member

I think that looks great :D @abarbosa94

Collaborator

you're right, it looks a bit different now. Can we also remove pred = None at the top?

pyproject.toml Outdated
@@ -52,6 +52,7 @@ dynamic = ["version"]
#
[project.optional-dependencies]
tests = [
"cachetools>=5.3.3",
Collaborator

Why do we need cachetools for tests?
If it is used by the library, it must be in [project.dependencies]; otherwise users can face issues after installation.

Collaborator Author

You're right, removing it

Collaborator Author

Done: 179da1e

codecov-commenter commented Mar 15, 2024

Codecov Report

Attention: Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.

Project coverage is 91.20%. Comparing base (8d88cd7) to head (4afbfec).

Files Patch % Lines
quantus/helpers/model/pytorch_model.py 95.83% 1 Missing ⚠️


Additional details and impacted files
@@           Coverage Diff           @@
##             main     #339   +/-   ##
=======================================
  Coverage   91.19%   91.20%           
=======================================
  Files          66       66           
  Lines        3906     3921   +15     
=======================================
+ Hits         3562     3576   +14     
- Misses        344      345    +1     


@abarbosa94 abarbosa94 force-pushed the u/andrebarbosa/integration-v0-huggingface branch from aba53a7 to 179da1e on March 18, 2024 13:12
@annahedstroem (Member)

Based on the testing, it looks like we need to add transformers also to the tests extra and not only to full: https://github.com/understandable-machine-intelligence-lab/Quantus/actions/runs/8328433263/job/22788366589?pr=339

@annahedstroem annahedstroem merged commit b0b6cda into understandable-machine-intelligence-lab:main Mar 19, 2024
6 of 7 checks passed
@aaarrti aaarrti (Collaborator) left a comment

Please make sure Quantus is usable without transformers installed.

],
)
def test_huggingface_classifier_predict(hf_model, data, softmax, model_kwargs, expected):
model = PyTorchModel(model=hf_model, softmax=softmax, model_predict_kwargs=model_kwargs)
Collaborator

I thought softmax must be a bool, or?

return model


@pytest.fixture(scope="session", autouse=False)
Collaborator

autouse=False is the default

import torch
from torch import nn
from functools import lru_cache
from transformers import PreTrainedModel
Collaborator

This will cause a ModuleNotFoundError when a user tries to import Quantus without transformers installed.
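
One common way to avoid that is to guard the import so transformers is only loaded when it is installed; a sketch of the pattern, not necessarily the fix that landed:

# Sketch of an optional-import guard; `is_huggingface_model` is a hypothetical helper.
from importlib import util

if util.find_spec("transformers") is not None:
    from transformers import PreTrainedModel
else:
    PreTrainedModel = None

def is_huggingface_model(model) -> bool:
    # Returns False whenever transformers is not installed.
    return PreTrainedModel is not None and isinstance(model, PreTrainedModel)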

@@ -104,8 +81,39 @@ zennit = [
"quantus[torch]",
"zennit>=0.5.1"
]
transformers = [
"quantus[torch, tensorflow]",
Collaborator

quantus[torch] should be enough

@@ -85,7 +60,9 @@ torch = [
"torchvision<=0.12.0; python_version == '3.7'",
"torchvision>=0.15.1; sys_platform != 'linux' and python_version > '3.7'",
"torchvision>=0.14.0, <0.15.1; sys_platform == 'linux' and python_version > '3.7' and python_version <= '3.10'",
"torchvision>=0.15.1; sys_platform == 'linux' and python_version >= '3.11'"
"torchvision>=0.15.1; sys_platform == 'linux' and python_version >= '3.11'",
"transformers<=4.30.2; python_version == '3.7'",
Collaborator

Please remove transformers from the torch = [...] section.
