Merge pull request #61 from neeravkaushal/main
Update tokenizer.py
maclandrol authored Nov 24, 2024
2 parents 752b101 + 5f1169c commit 84a4697
Showing 2 changed files with 41 additions and 2 deletions.
19 changes: 19 additions & 0 deletions README.md
@@ -79,6 +79,25 @@ You can use conda/mamba:
mamba install -c conda-forge safe-mol
```

#### 2024/11/22
NOTE: Installation can cause issues such as GPUs not being detected (check with `torch.cuda.is_available()`) or a segmentation fault, caused by a mismatch between the installed CUDA version and the CUDA version supported by the driver. In that case, follow these steps:

Create a new environment using conda:

```bash
conda create -n env_safe python=3.12
conda activate env_safe
```

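Check the NVIDIA driver and CUDA versions on your machine: `nvidia-smi` reports the driver version and the highest CUDA version it supports, and `nvcc --version` reports the installed CUDA toolkit version.

Install PyTorch built for a compatible CUDA version (see `https://pytorch.org/get-started/locally/`), then install safe-mol. For example, for CUDA 12.1: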

```bash
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
conda install -c conda-forge safe-mol
```
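
After installation, a quick check along the following lines may help confirm that the GPU is visible. This is a minimal sketch, not part of safe-mol: the helper name `describe_cuda` is hypothetical, and it only uses `torch.__version__`, `torch.version.cuda`, `torch.cuda.is_available()` and `torch.cuda.get_device_name()`. It cannot catch a segmentation fault raised while importing PyTorch.

```python
def describe_cuda():
    """Return a small report on whether PyTorch can see a CUDA device.

    Hypothetical helper for illustration; not part of safe-mol.
    """
    try:
        import torch
    except (ImportError, OSError):
        # PyTorch is missing or its shared libraries failed to load.
        return {"torch_installed": False, "cuda_available": False}

    report = {
        "torch_installed": True,
        "torch_version": torch.__version__,
        # CUDA version PyTorch was built against (None for CPU-only builds).
        "built_with_cuda": torch.version.cuda,
        "cuda_available": bool(torch.cuda.is_available()),
    }
    if report["cuda_available"]:
        report["device_name"] = torch.cuda.get_device_name(0)
    return report


if __name__ == "__main__":
    for key, value in describe_cuda().items():
        print(f"{key}: {value}")
```

If `cuda_available` is `False` while `nvidia-smi` lists a GPU, comparing `built_with_cuda` against the CUDA version reported by `nvidia-smi` is a reasonable first step.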

### Datasets and Models

| Type | Name | Infos | Size | Comment |
24 changes: 22 additions & 2 deletions safe/tokenizer.py
@@ -136,14 +136,34 @@ def bos_token_id(self):

     @property
     def pad_token_id(self):
-        """Get the bos token id"""
+        """Get the pad token id"""
         return self.tokenizer.token_to_id(self.tokenizer.pad_token)

     @property
     def eos_token_id(self):
-        """Get the bos token id"""
+        """Get the eos token id"""
         return self.tokenizer.token_to_id(self.tokenizer.eos_token)

+    @property
+    def unk_token_id(self):
+        """Get the unk token id"""
+        return self.tokenizer.token_to_id(self.tokenizer.unk_token)
+
+    @property
+    def mask_token_id(self):
+        """Get the mask token id"""
+        return self.tokenizer.token_to_id(self.tokenizer.mask_token)
+
+    @property
+    def cls_token_id(self):
+        """Get the cls token id"""
+        return self.tokenizer.token_to_id(self.tokenizer.cls_token)
+
+    @property
+    def sep_token_id(self):
+        """Get the sep token id"""
+        return self.tokenizer.token_to_id(self.tokenizer.sep_token)
+
     @classmethod
     def set_special_tokens(
         cls,
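
The properties added in this hunk all follow one pattern: read a special-token string from the wrapped tokenizer, then look up its id with `token_to_id`. The sketch below illustrates that pattern in isolation. `ToyTokenizer` and `ToyWrapper` are hypothetical stand-ins, not safe-mol classes, and returning `None` for a token missing from the vocabulary is an assumption of this toy, not a statement about the library.

```python
class ToyTokenizer:
    """Hypothetical stand-in for the wrapped tokenizer object."""

    def __init__(self, vocab, unk_token=None, mask_token=None):
        self._vocab = dict(vocab)
        self.unk_token = unk_token
        self.mask_token = mask_token

    def token_to_id(self, token):
        # Assumption: unknown tokens map to None in this toy.
        return self._vocab.get(token)


class ToyWrapper:
    """Hypothetical wrapper exposing special-token ids as properties."""

    def __init__(self, tokenizer):
        self.tokenizer = tokenizer

    @property
    def unk_token_id(self):
        """Get the unk token id"""
        return self.tokenizer.token_to_id(self.tokenizer.unk_token)

    @property
    def mask_token_id(self):
        """Get the mask token id"""
        return self.tokenizer.token_to_id(self.tokenizer.mask_token)


vocab = {"[PAD]": 0, "[UNK]": 1, "[MASK]": 2}
wrapper = ToyWrapper(ToyTokenizer(vocab, unk_token="[UNK]", mask_token="[MASK]"))
print(wrapper.unk_token_id, wrapper.mask_token_id)  # prints: 1 2
```

Exposing the ids as read-only properties keeps them in sync with the tokenizer's current vocabulary, since each access performs a fresh lookup.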
