
Loading model using DGMR.from_pretrained("openclimatefix/dgmr") fails #68

Open
agijsberts opened this issue Apr 14, 2024 · 6 comments · Fixed by #69
Labels
bug Something isn't working

Comments

@agijsberts
Contributor

Describe the bug

When loading the pretrained model from Hugging Face as per the instructions in README.md, I get a KeyError: 'config' exception. The issue is that, when following the instructions, **model_kwargs is empty in NowcastingModelHubMixin._from_pretrained, yet it attempts to pass **model_kwargs['config'] to the class constructor in

model = cls(**model_kwargs["config"])

To Reproduce

>>> from dgmr import DGMR
>>> model = DGMR.from_pretrained("openclimatefix/dgmr")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/me/venv/dgmr/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 119, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/home/me/venv/dgmr/lib/python3.11/site-packages/huggingface_hub/hub_mixin.py", line 420, in from_pretrained
    instance = cls._from_pretrained(
               ^^^^^^^^^^^^^^^^^^^^^
  File "/home/me/venv/dgmr/git/skillful_nowcasting/dgmr/hub.py", line 154, in _from_pretrained
    model = cls(**model_kwargs["config"])
                  ~~~~~~~~~~~~^^^^^^^^^^
KeyError: 'config'

Expected behavior

I expect to get a variable model holding the pretrained DGMR model.

Additional context

One can trivially work around the problem by passing an empty config argument when loading the pretrained model:

model = DGMR.from_pretrained("openclimatefix/dgmr", config={})

so maybe this is just a matter of updating README.md.

Still, I'd argue that the cleaner fix would be to pass the entire **model_kwargs to the class constructor, thus

model = cls(**model_kwargs)

That would also match how Hugging Face themselves do it in https://github.com/huggingface/huggingface_hub/blob/855ee997202e003b80bafd3c02dac65146c89cc4/src/huggingface_hub/hub_mixin.py#L633.
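
For illustration, here is a minimal standalone sketch with a dummy class (not the actual DGMR constructor or the hub code; the parameter name forecast_steps is just illustrative) of why unpacking the whole dict is robust, whereas indexing a missing 'config' key raises:

class DummyModel:
    def __init__(self, forecast_steps=18, **kwargs):
        self.forecast_steps = forecast_steps

model_kwargs = {}                       # what _from_pretrained currently receives
# DummyModel(**model_kwargs["config"])  # raises KeyError: 'config'
DummyModel(**model_kwargs)              # works: falls back to the defaults
DummyModel(**{"forecast_steps": 4})     # works: config entries arrive as plain kwargs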

@agijsberts added the bug label on Apr 14, 2024
@agijsberts
Contributor Author

I was baffled that nobody seemed to have reported such a basic problem earlier, so I investigated the issue further and, not surprisingly, it is a bit more complicated than I initially thought.

It turns out the issue surfaced only recently due to changes introduced in huggingface_hub >= 0.22.0. So for the moment the best workaround is to downgrade huggingface_hub to version 0.21.4:

pip3 install huggingface-hub==0.21.4

Prior to huggingface/huggingface_hub@45147c5, config would be injected into model_kwargs if the model class accepted a **kwargs argument. From version 0.22.0 onwards, the logic for when config is injected has changed. Looking at the code and the docs at https://huggingface.co/docs/huggingface_hub/main/en/guides/integrations#a-more-complex-approach-class-inheritance, I believe that config is injected if (a) the model class expects a config parameter in its __init__, and/or (b) the model class expects a config parameter in its _from_pretrained.

More relevantly though, if the model class accepts a **kwargs argument in its __init__, then the contents of the config are injected one by one into **model_kwargs (as opposed to injecting the config as-is as **model_kwargs['config']). This setting actually seems more applicable to NowcastingModelHubMixin: wouldn't it be sufficient to replace model = cls(**model_kwargs['config']) with model = cls(**model_kwargs) in _from_pretrained? And wouldn't that also make it possible to simplify DGMR.__init__?
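
To make the distinction concrete, here is a toy sketch of the injection rules as I understand them (this is only an illustration, not the actual huggingface_hub code):

import inspect

def inject_config(cls, config, model_kwargs):
    # Toy version of the logic described above.
    params = inspect.signature(cls.__init__).parameters
    accepts_kwargs = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    if "config" in params:
        # case (a): the class expects a config parameter, so pass the dict as-is
        model_kwargs["config"] = config
    elif accepts_kwargs:
        # **kwargs case: expand the config entries one by one into model_kwargs
        model_kwargs.update(config)
    return model_kwargs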

@peterdudfield
Copy link
Contributor

Thanks @agijsberts for finding this. Should we pin huggingface-hub==0.21.4 in the requirements file? Would you be interested in making a PR for this?

Would you be able to expand further on the model_kwargs point? Perhaps even share some links to the code where it could be improved?

@agijsberts
Contributor Author

Pinning huggingface-hub==0.21.4 is a sensible stopgap in my opinion. I'll prepare a PR right away.

Regarding the model_kwargs, and without going into too much detail: a significant chunk of DGMR.__init__ is dedicated to dealing with a potential config parameter and, when it is passed, using it to override the normal arguments. If I understand the workings of ModelHubMixin correctly, it should no longer be necessary to rely on such a "magic" config argument, and removing that logic would make DGMR.__init__ agnostic to the mixin.
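
As a rough before/after sketch of what I mean (the parameter name is illustrative, not the actual DGMR signature):

# Before: __init__ unpacks a "magic" config dict itself.
class ModelBefore:
    def __init__(self, forecast_steps=18, **kwargs):
        config = kwargs.get("config", {})
        self.forecast_steps = config.get("forecast_steps", forecast_steps)

# After: the mixin expands the stored config into plain keyword arguments,
# so __init__ no longer needs to know about config at all.
class ModelAfter:
    def __init__(self, forecast_steps=18):
        self.forecast_steps = forecast_steps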

I volunteer to have a look at the best way to resolve this.

@peterdudfield
Contributor

Thanks @agijsberts, yeah, if you were able to have a look at DGMR.__init__ and at the best way to resolve this, that would be super.

@agijsberts
Contributor Author

I have just pushed a possible __init__ simplification to my fork: https://github.com/agijsberts/skillful_nowcasting/tree/refactor_huggingface-hub. It already includes tests to make sure that serialization-deserialization returns an identical object for DGMR, Discriminator, etc. Unfortunately that introduces a pip dependency on safetensors. In any case, after my holidays I want to check that predictions on some test data are still the same before turning this into a PR.
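
For reference, the round-trip tests are along these lines (a rough sketch rather than the exact test code from the fork; it assumes the default constructor arguments and the save_pretrained/from_pretrained methods provided by the mixin):

import tempfile
import torch
from dgmr import DGMR

def test_roundtrip_returns_identical_model():
    # Save the model to a local directory, reload it, and compare all weights.
    model = DGMR()  # default constructor arguments, for brevity
    with tempfile.TemporaryDirectory() as tmpdir:
        model.save_pretrained(tmpdir)
        reloaded = DGMR.from_pretrained(tmpdir)
    original_state = model.state_dict()
    reloaded_state = reloaded.state_dict()
    assert original_state.keys() == reloaded_state.keys()
    for name, tensor in original_state.items():
        assert torch.equal(tensor, reloaded_state[name])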

Perhaps someone can clarify why DGMR needs a custom NowcastingModelHubMixin instead of using huggingface-hub's PyTorchModelHubMixin (as used by Discriminator, Sampler, etc.). My impression is that both do the same thing (except that PyTorchModelHubMixin writes in the safetensors format), but I'm probably missing something.

@peterdudfield
Contributor

Thanks @agijsberts - have a good holiday, and I look forward to seeing the PR.

If I'm honest, I'm not totally sure why NowcastingModelHubMixin is used instead of PyTorchModelHubMixin, but I'll have to have a look at it.
