Added Demo to Launch Blog

LazarusNLP · Feb 19, 2024 · f10280b
1 parent fa574fb
Showing 1 changed file with 30 additions and 1 deletion.
diff --git a/docs/blogs/launch.md b/docs/blogs/launch.md
@@ -9,6 +9,19 @@ Today we are launching LazarusNLP, an independent research group dedicated to le
 
 This blog aims to discuss the gaps in NLP research and development for Indonesian languages and introduce our initial projects. We are excited to share our work and invite the community to join us in our mission!
 
+You can try out our projects in the following web app demo:
+
+<iframe
+	src="https://lazarusnlp-lazarusnlp.hf.space"
+	frameborder="0"
+	width="100%"
+	height="500"
+></iframe>
+
+!!! info
+
+    This web app is available at our [🤗 HuggingFace Space](https://huggingface.co/spaces/LazarusNLP/LazarusNLP).
+
 ## Background
 
 Indonesia's linguistic landscape is rich and varied, with languages evolving independently across different regions. Despite the prevalence of Indonesian (*Bahasa Indonesia*) as the national language, many of these regional languages face the threat of extinction. UNESCO has identified 137 Indonesian languages as vulnerable or endangered, highlighting the urgent need for action[^1].
@@ -25,9 +38,13 @@ While advancements in NLP have benefited major languages like Indonesian, there
 
 IndoT5 is a T5-based language model trained specifically for the Indonesian language. With just 8 hours of training on a limited budget, we developed a sequence-to-sequence, encoder-decoder model that can be fine-tuned for tasks such as summarization, chit-chat, and question answering. Despite the limited training budget, our model is competitive when evaluated on the [IndoNLG](https://github.com/IndoNLP/indonlg) (text generation) benchmark.
 
+<div class="grid cards" markdown>
+
 - [:material-github: GitHub Repository](https://github.com/LazarusNLP/IndoT5/)
 - [🤗 HuggingFace Collection](https://huggingface.co/collections/LazarusNLP/indonesian-t5-language-models-65c1b9a0f6342b3eb3d6d450)
 
+</div>
+
 ### Indonesian Sentence Embedding Models
 
 <div align="center">
@@ -36,23 +53,35 @@ IndoT5 is a T5-based language model trained specifically for the Indonesian lang
 
 We trained open-source sentence embedding models for Indonesian, enabling applications such as information retrieval (useful for retrieval-augmented generation!), semantic text similarity, and zero-shot text classification. We leverage existing pre-trained Indonesian language models like [IndoBERT](https://github.com/IndoNLP/indonlu), state-of-the-art unsupervised techniques, and established sentence embedding benchmarks.
 
+<div class="grid cards" markdown>
+
 - [:material-github: GitHub Repository](https://github.com/LazarusNLP/indonesian-sentence-embeddings)
 - [:material-web: Documentation](https://lazarusnlp.github.io/indonesian-sentence-embeddings/)
 - [🤗 HuggingFace Collection](https://huggingface.co/collections/LazarusNLP/indonesian-sentence-embedding-6541fce662e82d932ff360c5)
 
+</div>
+
 ### Indonesian Natural Language Inference (NLI) Models
 
 Our open-source, lightweight NLI models are competitive with larger models on the IndoNLI benchmark while using significantly fewer parameters. We applied knowledge distillation methods to small existing pre-trained language models like IndoBERT Lite. These models offer efficient solutions for tasks that require natural language inference, such as cross-encoder-based semantic search, while minimizing computational resources.
 
+<div class="grid cards" markdown>
+
 - [🤗 HuggingFace Collection](https://huggingface.co/collections/LazarusNLP/indonesian-natural-language-inference-65b9d95539ac63290a418d67)
 
+</div>
+
 ### Many-to-Many Multilingual Translation Models
 
 By adapting mT5 to 45 languages of Indonesia, we developed a robust baseline model for many-to-many translation across these languages. This facilitates further fine-tuning for niche domains and low-resource languages, contributing to greater linguistic inclusivity. Our models are competitive with existing multilingual translation models on the [NusaX](https://github.com/IndoNLP/nusax) benchmark.
 
+<div class="grid cards" markdown>
+
 - [:material-github: GitHub Repository](https://github.com/LazarusNLP/machine-translation)
 - [🤗 HuggingFace Collection](https://huggingface.co/collections/LazarusNLP/indot5-6541fbdfa385933e811c2e1f)
 
+</div>
+
 ## Future Plans
 
 Our journey has just begun. Looking ahead, we are committed to expanding our repository of open-source pre-trained language models, with a focus on Indonesia's languages, multilinguality, culture, and code-switching. By democratizing access to NLP tools for all Indonesian languages, we aim to catalyze a renaissance in linguistic diversity.
@@ -65,6 +94,6 @@ We are always open to collaboration and welcome contributions from the community
 
 ---
 
-_Written by David Samuel Setiawan, Steven Limcorn, and Wilson Wongso. Last updated 13 February 2024._
+_Written by David Samuel Setiawan, Steven Limcorn, and Wilson Wongso. Last updated 19 February 2024._
 
 [^1]: Moseley, Christopher, ed. (2010). Atlas of the World’s Languages in Danger. Memory of Peoples (3rd ed.). Paris: UNESCO Publishing. ISBN 978-92-3-104096-2.