Commit

Merge remote-tracking branch 'upstream/master' into add-readme-trainer
atnanahidiw committed Oct 3, 2020
2 parents b780179 + 1dcb1ae commit 11de51d
Showing 15 changed files with 193 additions and 572 deletions.
14 changes: 12 additions & 2 deletions README.md
@@ -3,9 +3,13 @@
IndoNLU is a collection of Natural Language Understanding (NLU) resources for Bahasa Indonesia.

## 12 Downstream Tasks
- Link [[Link]](https://github.com/indobenchmark/indonlu/tree/master/dataset)
- You can check [[Link]](https://github.com/indobenchmark/indonlu/tree/master/dataset)
- We provide train, validation, and test sets (the test set has masked labels, with no true labels). We are currently preparing a platform for auto-evaluation using Codalab. Please stay tuned!

## Examples
- A guide to loading the IndoBERT model and fine-tuning it on Sequence Classification and Sequence Tagging tasks.
- You can check [[Link]](https://github.com/indobenchmark/indonlu/tree/master/examples)

## Indo4B
- 23GB Indo4B Pretraining Dataset [[Link]](https://storage.googleapis.com/babert-pretraining/IndoNLU_finals/dataset/preprocessed/dataset_all_uncased_blankline.txt.xz)

@@ -24,8 +28,14 @@ IndoNLU is a collection of Natural Language Understanding (NLU) resources for Ba
- Phase 1 [[Link]](https://huggingface.co/indobenchmark/indobert-lite-large-p1)
- Phase 2 [[Link]](https://huggingface.co/indobenchmark/indobert-lite-large-p2)

## Leaderboard (Under Construction)
## Leaderboard
- Community Portal and Public Leaderboard [[Link]](https://www.indobenchmark.com/leaderboard.html)
- Submission Portal [[Link]](https://competitions.codalab.org/competitions/26537)

### Submission Format
Please check [[Link]](https://github.com/indobenchmark/indonlu/tree/master/submission_examples). Each task has a different format. Every submission file always starts with the `index` column (the id of the test sample, following the order of the masked test set).

To submit, first rename your prediction file to 'pred.txt', then zip the file.
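
The packaging step can be sketched in Python as follows. This is a minimal sketch, not the official tooling: the helper name `package_submission`, the comma-separated layout, the `label` column name, the zero-based `index`, and the archive name `submission.zip` are all assumptions made for illustration. Only the leading `index` column and the `pred.txt` file name come from the text above, so check the submission examples for the exact format of each task.

```python
import csv
import zipfile
from pathlib import Path


def package_submission(predictions, out_dir, label_column="label"):
    """Write predictions to pred.txt and zip it (hypothetical helper).

    `predictions` must follow the order of the masked test set.
    The column name and CSV layout are assumptions; adapt them per task.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # The prediction file must be named pred.txt and start with `index`.
    pred_path = out_dir / "pred.txt"
    with pred_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["index", label_column])
        for index, label in enumerate(predictions):  # assumed zero-based ids
            writer.writerow([index, label])

    # Zip the file so that pred.txt sits at the root of the archive.
    zip_path = out_dir / "submission.zip"
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.write(pred_path, arcname="pred.txt")
    return zip_path
```

Calling `package_submission(["positive", "negative"], "out")` would write `out/pred.txt` and `out/submission.zip`; the archive is what gets uploaded to the submission portal.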

## Quick Start
