Can't load an NER dataset into tner: I tried to use the ai4bharat/naamapadam dataset from Hugging Face, but it doesn't load with any of the tner methods. Any suggestions on loading this dataset? #52

Open
Ananthzeke opened this issue Mar 10, 2023 · 0 comments

Comments

@Ananthzeke

from tner import GridSearcher

searcher = GridSearcher(
    checkpoint_dir='./ckpt_xlmr_naamapadam_ta',
    dataset='ai4bharat/naamapadam',  # either `dataset` (Hugging Face dataset) or `local_dataset` (custom dataset) must be given
    dataset_name='ta',  # dataset configuration (Tamil subset)
    model='xlm-roberta-base',  # language model to fine-tune
    epoch=10,  # total number of epochs (`L` in the figure)
    epoch_partial=5,  # number of epochs in the 1st stage (`M` in the figure)
    n_max_config=1,  # number of models passed to the 2nd stage (`K` in the figure)
    batch_size=32,
    gradient_accumulation_steps=[2],
    crf=[True],
    lr=[1e-5],
    weight_decay=[None],
    random_seed=[42],
    lr_warmup_step_ratio=[0.1],
    max_grad_norm=[10]
)
searcher.train()

INFO:root:INITIALIZE GRID SEARCHER: 1 configs to try
INFO:root:## 1st RUN: Configuration 0/1 ##
INFO:root:hyperparameters
INFO:root: * dataset: ai4bharat/naamapadam
INFO:root: * dataset_split: train
INFO:root: * dataset_name: ta
INFO:root: * local_dataset: None
INFO:root: * model: xlm-roberta-base
INFO:root: * crf: True
INFO:root: * max_length: 128
INFO:root: * epoch: 10
INFO:root: * batch_size: 32
INFO:root: * lr: 1e-05
INFO:root: * random_seed: 42
INFO:root: * gradient_accumulation_steps: 2
INFO:root: * weight_decay: None
INFO:root: * lr_warmup_step_ratio: 0.1
INFO:root: * max_grad_norm: 10
WARNING:datasets.builder:Found cached dataset naamapadam (/root/.cache/huggingface/datasets/ai4bharat___naamapadam/ta/1.0.0/c1b045180d60b208d2468bdad897d04461f08c7137c04a85220697b1bef7df9a)


JSONDecodeError Traceback (most recent call last)

in
16 max_grad_norm=[10]
17 )
---> 18 searcher.train()

8 frames

/usr/lib/python3.9/json/decoder.py in raw_decode(self, s, idx)
353 obj, end = self.scan_once(s, idx)
354 except StopIteration as err:
--> 355 raise JSONDecodeError("Expecting value", s, err.value) from None
356 return obj, end

JSONDecodeError: Expecting value: line 1 column 1 (char 0)
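
For context on the error itself: `json.loads` raises `JSONDecodeError: Expecting value: line 1 column 1 (char 0)` whenever the text it receives is empty or does not begin with a JSON value (an HTML error page, for example). The sketch below reproduces that message with the standard library only. Which file or response tner tried to parse here is not visible in the truncated traceback, so the inputs are illustrative assumptions, and `decode_error_message` is a hypothetical helper, not part of tner.

```python
import json
from typing import Optional


def decode_error_message(text: str) -> Optional[str]:
    """Return the JSONDecodeError message for `text`, or None if it parses."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return str(err)
    return None


# Illustrative inputs (assumptions, not taken from the traceback above).
samples = {
    "empty body": "",
    "HTML page": "<html>Not Found</html>",
    "valid JSON": '{"O": 0, "B-PER": 1}',
}
for name, text in samples.items():
    print(f"{name}: {decode_error_message(text)}")
```

If the first two cases print the same message as the traceback, a reasonable next step is to find out which resource tner parses as JSON for this dataset and whether it exists, in the expected format, for `ai4bharat/naamapadam`.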
