```python
!pip install sentencepiece
from codetf.models import load_model_pipeline
from codetf.data_utility.human_eval_dataset import HumanEvalDataset
from codetf.performance.model_evaluator import ModelEvaluator
import os

os.environ["HF_ALLOW_CODE_EVAL"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "true"

model_class = load_model_pipeline(model_name="causallm", task="pretrained",
                                  model_type="codegen-350M-mono", is_eval=True,
                                  load_in_8bit=True, weight_sharding=False)

dataset = HumanEvalDataset(tokenizer=model_class.get_tokenizer())
prompt_token_ids, prompt_attention_masks, references = dataset.load()
problems = TensorDataset(prompt_token_ids, prompt_attention_masks)

evaluator = ModelEvaluator(model_class)
avg_pass_at_k = evaluator.evaluate_pass_k(problems=problems, unit_tests=references)
print("Pass@k: ", avg_pass_at_k)
```
Above is the code that was used. During execution in Google Colab, I received the following error:
```
in <cell line: 15>:15

/usr/local/lib/python3.10/dist-packages/codetf/data_utility/human_eval_dataset.py:29 in load

  26 │ │ │   unit_test = re.sub(r'METADATA = {[^}]*}', '', unit_test, flags=re.MULTILINE)
  27 │ │ │   references.append(unit_test)
  28 │ │ │
❱ 29 │ │   prompt_token_ids, prompt_attention_masks = self.process_data(prompts, use_max_le
  30 │ │ │
  31 │ │   return prompt_token_ids, prompt_attention_masks, references
  32 │

TypeError: BaseDataset.process_data() got an unexpected keyword argument 'use_max_length'
```
After looking through the source code, I don't see this keyword argument anywhere, only max_length. Would anyone mind shedding some light on the issue?
After I removed the keyword, I got another error: NameError: name 'TensorDataset' is not defined
I think something is missing in the imports.
After I fixed everything mentioned above, it started working.
I also looked into the package (1.0.1.1) installed on my local server and found that the code in this version is not in sync with the main branch of the repo. The latest main branch seems to have already fixed this issue, so a workaround is to reinstall the package from the repo rather than from pip.
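For reference, here is a sketch of the snippet with the fixes described above applied: installing CodeTF from source instead of PyPI and adding the missing TensorDataset import. The repo URL is an assumption (the Salesforce CodeTF repository), and TensorDataset is assumed to be the one from torch.utils.data; everything else is unchanged from the original post.

```python
# Install sentencepiece as in the original snippet, and CodeTF from the
# main branch instead of PyPI (assumed repo URL).
!pip install sentencepiece
!pip install git+https://github.com/salesforce/CodeTF.git

import os
from torch.utils.data import TensorDataset  # missing import that caused the NameError

from codetf.models import load_model_pipeline
from codetf.data_utility.human_eval_dataset import HumanEvalDataset
from codetf.performance.model_evaluator import ModelEvaluator

os.environ["HF_ALLOW_CODE_EVAL"] = "1"
os.environ["TOKENIZERS_PARALLELISM"] = "true"

# Load the pretrained CodeGen-350M-mono model in 8-bit for evaluation.
model_class = load_model_pipeline(model_name="causallm", task="pretrained",
                                  model_type="codegen-350M-mono", is_eval=True,
                                  load_in_8bit=True, weight_sharding=False)

# Build the HumanEval prompts and wrap them in a TensorDataset.
dataset = HumanEvalDataset(tokenizer=model_class.get_tokenizer())
prompt_token_ids, prompt_attention_masks, references = dataset.load()
problems = TensorDataset(prompt_token_ids, prompt_attention_masks)

# Run pass@k evaluation against the HumanEval unit tests.
evaluator = ModelEvaluator(model_class)
avg_pass_at_k = evaluator.evaluate_pass_k(problems=problems, unit_tests=references)
print("Pass@k: ", avg_pass_at_k)
```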