Can't retrieve model ChemGPT-1.2B from the store! #109

Open · 1 task done
hisplan opened this issue Sep 14, 2024 · 2 comments
Labels: bug (Something isn't working)

hisplan commented Sep 14, 2024

Is there an existing issue for this?

  • I have searched the existing issues and found nothing

Bug description

I've been trying to use ChemGPT-1.2B, but I'm getting this error: "Can't retrieve model ChemGPT-1.2B from the store !".

Just FYI, I have successfully used the following models, so the issue appears to be specific to ChemGPT-1.2B:

  • GPT2-Zinc480M-87M
  • ChemBERTa-77M-MLM
  • ChemGPT-19M

How to reproduce the bug

from molfeat.trans.pretrained.hf_transformers import PretrainedHFTransformer

transformer = PretrainedHFTransformer(kind='ChemGPT-1.2B', notation='selfies', dtype=float)
features = transformer(smiles)  # smiles: any list of SMILES strings
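
As a sanity check, the model card can also be queried directly before downloading. A sketch, assuming the ModelStore API that appears in the traceback (ModelStore() with default arguments pointing at the public molfeat bucket):

from molfeat.store.modelstore import ModelStore

store = ModelStore()
# search() is the same lookup the loader performs before downloading.
cards = store.search(name="ChemGPT-1.2B")
print(cards[0] if cards else "no model card found")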

Error messages and logs

  0%|          | 0.00/736 [00:00<?, ?B/s]
  0%|          | 0/7 [00:00<?, ?it/s]
---------------------------------------------------------------------------
ModelStoreError                           Traceback (most recent call last)
File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/store/loader.py:100, in PretrainedStoreModel._load_or_raise(cls, name, download_path, store, **kwargs)
     99     modelcard = store.search(name=name)[0]
--> 100     artifact_dir = store.download(modelcard, download_path, **kwargs)
    101 except Exception:

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/store/modelstore.py:239, in ModelStore.download(self, modelcard, output_dir, chunk_size, force)
    238     mapper.fs.delete(output_dir, recursive=True)
--> 239     raise ModelStoreError(
    240         f"""The destination artifact at {model_dest_path} has a different sha256sum ({cache_sha256sum}) """
    241         f"""than the Remote artifact sha256sum ({modelcard.sha256sum}). The destination artifact has been removed !"""
    242     )
    244 return output_dir

ModelStoreError: The destination artifact at /Users/chunj/Library/Caches/molfeat/ChemGPT-1.2B/model.save has a different sha256sum (4d8819f7c8c91ba94ba446d32f29342360d62971a9fa37c8cab2e31f9c3fc4c5) than the Remote artifact sha256sum (e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855). The destination artifact has been removed !

During handling of the above exception, another exception occurred:

ModelStoreError                           Traceback (most recent call last)
Cell In[6], line 1
----> 1 features = transformer(smiles)

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/base.py:384, in MoleculeTransformer.__call__(self, mols, enforce_dtype, ignore_errors, **kwargs)
    359 def __call__(
    360     self,
    361     mols: List[Union[dm.Mol, str]],
   (...)
    364     **kwargs,
    365 ):
    366     r"""
    367     Calculate features for molecules. Using __call__, instead of transform.
    368     If ignore_error is True, a list of features and valid ids are returned.
   (...)
    382 
    383     """
--> 384     features = self.transform(mols, ignore_errors=ignore_errors, enforce_dtype=False, **kwargs)
    385     ids = np.arange(len(features))
    386     if ignore_errors:

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/sklearn/utils/_set_output.py:316, in _wrap_method_output.<locals>.wrapped(self, X, *args, **kwargs)
    314 @wraps(f)
    315 def wrapped(self, X, *args, **kwargs):
--> 316     data_to_wrap = f(self, X, *args, **kwargs)
    317     if isinstance(data_to_wrap, tuple):
    318         # only wrap the first output for cross decomposition
    319         return_tuple = (
    320             _wrap_data_with_container(method, data_to_wrap[0], X, self),
    321             *data_to_wrap[1:],
    322         )

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/base.py:207, in PretrainedMolTransformer.transform(self, smiles, **kwargs)
    204 mols = [mols[i] for i in ind_to_compute]
    206 if len(mols) > 0:
--> 207     converted_mols = self._convert(mols, **kwargs)
    208     out = self._embed(converted_mols, **kwargs)
    210     if not isinstance(out, list):

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/hf_transformers.py:367, in PretrainedHFTransformer._convert(self, inputs, **kwargs)
    358 def _convert(self, inputs: list, **kwargs):
    359     """Convert the list of molecules to the right format for embedding
    360 
    361     Args:
   (...)
    365         processed: pre-processed input list
    366     """
--> 367     self._preload()
    369     if isinstance(inputs, (str, dm.Mol)):
    370         inputs = [inputs]

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/hf_transformers.py:326, in PretrainedHFTransformer._preload(self)
    324 def _preload(self):
    325     """Perform preloading of the model from the store"""
--> 326     super()._preload()
    327     self.featurizer.model.to(self.device)
    328     self.featurizer.max_length = self.max_length

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/base.py:90, in PretrainedMolTransformer._preload(self)
     88 """Preload the pretrained model for later queries"""
     89 if self.featurizer is not None and isinstance(self.featurizer, PretrainedModel):
---> 90     self.featurizer = self.featurizer.load()
     91     self.preload = True

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/trans/pretrained/hf_transformers.py:209, in HFModel.load(self)
    207 if self._model is not None:
    208     return self._model
--> 209 download_output_dir = self._artifact_load(
    210     name=self.name, download_path=self.cache_path, store=self.store
    211 )
    212 model_path = dm.fs.join(download_output_dir, self.store.MODEL_PATH_NAME)
    213 self._model = HFExperiment.load(model_path)

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/store/loader.py:81, in PretrainedStoreModel._artifact_load(cls, name, download_path, **kwargs)
     79 if not dm.fs.exists(download_path):
     80     cls._load_or_raise.cache_clear()
---> 81 return cls._load_or_raise(name, download_path, **kwargs)

File ~/miniconda3/envs/datamol/lib/python3.11/site-packages/molfeat/store/loader.py:103, in PretrainedStoreModel._load_or_raise(cls, name, download_path, store, **kwargs)
    101 except Exception:
    102     mess = f"Can't retrieve model {name} from the store !"
--> 103     raise ModelStoreError(mess)
    104 return artifact_dir

ModelStoreError: Can't retrieve model ChemGPT-1.2B from the store !

Environment

Current environment:

  • molfeat 0.10.1
  • pytorch 2.4.0
  • rdkit 2024.03.5
  • scikit-learn 1.5.2
  • macOS Ventura 13.6.7
  • molfeat installed via conda

Additional context

I'm using my local laptop + Jupyter Lab.

hisplan added the bug label Sep 14, 2024
maclandrol (Member) commented

Hi @hisplan, sorry for the late response.

It's likely that your internet connection dropped while downloading a previous version, leaving a corrupted artifact behind. Clearing the molfeat cache (/Users/chunj/Library/Caches/molfeat/ in your case) should help. Try removing the ChemGPT folders.

Otherwise, can you check whether the instructions in #29 or #84 solve your issue?
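
For reference, a minimal cleanup sketch (uses only the Python standard library; the cache path is the macOS default from your traceback):

import shutil
from pathlib import Path

# Default molfeat cache location on macOS, per the traceback above.
cache_dir = Path.home() / "Library" / "Caches" / "molfeat"

# Remove every cached ChemGPT artifact folder so it gets re-downloaded fresh.
for path in cache_dir.glob("ChemGPT*"):
    shutil.rmtree(path)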

hisplan commented Sep 18, 2024

I cleared the molfeat cache and tried again as follows, but it still didn't work. The error message was pretty much the same as before.

transformer = PretrainedHFTransformer(kind='ChemGPT-1.2B', notation='selfies', dtype=float)
features = transformer(smiles)

Looking at the error message more carefully, it looks like the SHA256 checksums didn't match. Here's the part of the error message I noticed:

ModelStoreError: The destination artifact at /Users/chunj/Library/Caches/molfeat/ChemGPT-1.2B/model.save
has a different sha256sum (4d8819f7c8c91ba94ba446d32f29342360d62971a9fa37c8cab2e31f9c3fc4c5)
than the Remote artifact sha256sum (e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855).
The destination artifact has been removed !

Any idea?
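
For what it's worth, the remote sha256sum in that message (e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855) is the well-known SHA-256 digest of empty input, which hints that the remote model card's checksum may be unset rather than the local download being corrupted. A minimal sketch to recompute the local checksum for comparison, assuming model.save is a regular file (adjust if it is a directory):

import hashlib

def sha256sum(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream a file through hashlib and return its sha256 hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Path taken from the error message above.
print(sha256sum("/Users/chunj/Library/Caches/molfeat/ChemGPT-1.2B/model.save"))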
