Added Members, Refreshed Theme
w11wo committed Nov 28, 2023
1 parent a24018d commit a0b1d6b
Showing 4 changed files with 34 additions and 5 deletions.
26 changes: 26 additions & 0 deletions docs/index.md
@@ -45,6 +45,32 @@ Despite the richness and diversity of these regional languages, there has very b

Through this project, we hope to see the same successes occur in other languages of Indonesia. In the long run, we also hope that through these technological advancements, we will be able to prevent these languages from becoming endangered and in turn, spark innovative work around these languages.

## Members

LazarusNLP is driven with love by:

<div style="display: flex;">
<a href="https://github.com/anantoj">
<img src="https://github.com/anantoj.png" alt="GitHub Profile" style="border-radius: 50%;width: 64px;margin:0 4px;">
</a>

<a href="https://github.com/BrandonScottt">
<img src="https://github.com/BrandonScottt.png" alt="GitHub Profile" style="border-radius: 50%;width: 64px;margin:0 4px;">
</a>

<a href="https://github.com/DavidSamuell">
<img src="https://github.com/DavidSamuell.png" alt="GitHub Profile" style="border-radius: 50%;width: 64px;margin:0 4px;">
</a>

<a href="https://github.com/stevenlimcorn">
<img src="https://github.com/stevenlimcorn.png" alt="GitHub Profile" style="border-radius: 50%;width: 64px;margin:0 4px;">
</a>

<a href="https://github.com/w11wo">
<img src="https://github.com/w11wo.png" alt="GitHub Profile" style="border-radius: 50%;width: 64px;margin:0 4px;">
</a>
</div>

[^1]: Wikipedia contributors. (2022, May 20). Languages of Indonesia. In Wikipedia, The Free Encyclopedia. Retrieved 07:55, May 21, 2022, from [https://en.wikipedia.org/w/index.php?title=Languages_of_Indonesia&oldid=1088838885](https://en.wikipedia.org/w/index.php?title=Languages_of_Indonesia&oldid=1088838885).
[^2]: Abtahian, Maya & Cohn, Abigail. (2014). Can a language with millions of speakers be endangered?. Journal of the Southeast Asian Linguistics Society. 7. 64-75.
[^3]: Moseley, Christopher, ed. (2010). Atlas of the World’s Languages in Danger. Memory of Peoples (3rd ed.). Paris: UNESCO Publishing. ISBN 978-92-3-104096-2.
3 changes: 3 additions & 0 deletions docs/projects/indo-sentence-embeddings.md
@@ -44,6 +44,7 @@ Like SimCSE, [ConGen: Unsupervised Control and Generalization Distillation For S
| [ConGen-IndoBERT Lite Base](https://huggingface.co/LazarusNLP/congen-indobert-lite-base) | 79.97 | 12M | [IndoBERT Lite Base](https://huggingface.co/indobenchmark/indobert-lite-base-p1) | [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | [Wikipedia](https://huggingface.co/datasets/LazarusNLP/wikipedia_id_20230520) | |
| [ConGen-IndoBERT Base](https://huggingface.co/LazarusNLP/congen-indobert-base) | 80.47 | 125M | [IndoBERT Base](https://huggingface.co/indobenchmark/indobert-base-p1) | [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | [Wikipedia](https://huggingface.co/datasets/LazarusNLP/wikipedia_id_20230520) | |
| [ConGen-SimCSE-IndoBERT Base](https://huggingface.co/LazarusNLP/congen-simcse-indobert-base) | 81.16 | 125M | [SimCSE-IndoBERT Base](https://huggingface.co/LazarusNLP/simcse-indobert-base) | [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | [Wikipedia](https://huggingface.co/datasets/LazarusNLP/wikipedia_id_20230520) | |
| [S-IndoBERT Base mMARCO](https://huggingface.co/LazarusNLP/s-indobert-base-mmarco) | 72.95 | 125M | [IndoBERT Base](https://huggingface.co/indobenchmark/indobert-base-p1) | N/A | [mMARCO](https://huggingface.co/datasets/unicamp-dl/mmarco) ||
| [distiluse-base-multilingual-cased-v2](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v2) | 75.08 | 134M | [DistilBERT Base Multilingual](https://huggingface.co/distilbert-base-multilingual-cased) | mUSE | See: [SBERT](https://www.sbert.net/docs/pretrained_models.html#model-overview) ||
| [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 83.83 | 125M | [XLM-RoBERTa Base](https://huggingface.co/xlm-roberta-base) | [paraphrase-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-mpnet-base-v2) | See: [SBERT](https://www.sbert.net/docs/pretrained_models.html#model-overview) ||

@@ -57,6 +58,7 @@ Like SimCSE, [ConGen: Unsupervised Control and Generalization Distillation For S
| [ConGen-IndoBERT Lite Base](https://huggingface.co/LazarusNLP/congen-indobert-lite-base) | 58.18 | 58.84 |
| [ConGen-IndoBERT Base](https://huggingface.co/LazarusNLP/congen-indobert-base) | 57.04 | 57.06 |
| [ConGen-SimCSE-IndoBERT Base](https://huggingface.co/LazarusNLP/congen-simcse-indobert-base) | 59.54 | 60.37 |
| [S-IndoBERT Base mMARCO](https://huggingface.co/LazarusNLP/s-indobert-base-mmarco) | 48.86 | 47.92 |
| [distiluse-base-multilingual-cased-v2](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v2) | 63.63 | 64.13 |
| [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 63.18 | 63.78 |

@@ -70,6 +72,7 @@ Like SimCSE, [ConGen: Unsupervised Control and Generalization Distillation For S
| [ConGen-IndoBERT Lite Base](https://huggingface.co/LazarusNLP/congen-indobert-lite-base) | 81.2 | 75.59 |
| [ConGen-IndoBERT Base](https://huggingface.co/LazarusNLP/congen-indobert-base) | 85.4 | 82.12 |
| [ConGen-SimCSE-IndoBERT Base](https://huggingface.co/LazarusNLP/congen-simcse-indobert-base) | 83.0 | 78.74 |
| [S-IndoBERT Base mMARCO](https://huggingface.co/LazarusNLP/s-indobert-base-mmarco) | 80.2 | 75.73 |
| [distiluse-base-multilingual-cased-v2](https://huggingface.co/sentence-transformers/distiluse-base-multilingual-cased-v2) | 78.8 | 73.64 |
| [paraphrase-multilingual-mpnet-base-v2](https://huggingface.co/sentence-transformers/paraphrase-multilingual-mpnet-base-v2) | 89.6 | 86.56 |

4 changes: 2 additions & 2 deletions mkdocs.yml
@@ -15,8 +15,8 @@ theme:
         name: Switch to dark mode
     - media: "(prefers-color-scheme: dark)"
       scheme: slate
-      primary: red
-      accent: red
+      primary: black
+      accent: black
       toggle:
         icon: material/weather-sunny
         name: Switch to light mode
6 changes: 3 additions & 3 deletions requirements.txt
@@ -1,3 +1,3 @@
-mkdocs==1.4.2
-mkdocs-material==8.5.10
-mkdocs-material-extensions==1.1.1
+mkdocs==1.5.3
+mkdocs-material==9.4.14
+mkdocs-material-extensions==1.3.1
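
The pins in this hunk do not all move by the same amount, which is easy to miss when reading the raw version strings. The following is a minimal sketch, not part of this repository, of how one might classify each bump before upgrading. It assumes plain `name==X.Y.Z` pins with purely numeric components; the helper names `parse_pins`, `bump_kind`, and `classify_bumps` are hypothetical.

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]

# Old and new pins copied from the requirements.txt hunk in this commit.
OLD = """\
mkdocs==1.4.2
mkdocs-material==8.5.10
mkdocs-material-extensions==1.1.1
"""
NEW = """\
mkdocs==1.5.3
mkdocs-material==9.4.14
mkdocs-material-extensions==1.3.1
"""


def parse_pins(text: str) -> Dict[str, Version]:
    """Parse `name==X.Y.Z` lines into {name: (X, Y, Z)}, skipping blanks and comments."""
    pins: Dict[str, Version] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name] = tuple(int(part) for part in version.split("."))
    return pins


def bump_kind(old: Version, new: Version) -> str:
    """Name the most significant component that differs (assumes three-part versions)."""
    if new[0] != old[0]:
        return "major"
    if new[1] != old[1]:
        return "minor"
    if new != old:
        return "patch"
    return "unchanged"


def classify_bumps(old_text: str, new_text: str) -> Dict[str, str]:
    """Classify the bump for every package pinned in both requirement sets."""
    old, new = parse_pins(old_text), parse_pins(new_text)
    return {name: bump_kind(old[name], new[name]) for name in new if name in old}


if __name__ == "__main__":
    for name, kind in classify_bumps(OLD, NEW).items():
        print(f"{name}: {kind}")
```

Under these assumptions, `mkdocs-material` is the only pin that crosses a major version (8 to 9), so its changelog is the one most worth checking alongside the theme change in `mkdocs.yml`.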
