Create dataset loader for Parallel Bible Corpus #703

SamuelCahyawijaya · 2024-07-30T15:39:27Z

Dataloader name: parallel_bible_mayer/parallel_bible_mayer.py
DataCatalogue: http://seacrowd.github.io/seacrowd-catalogue/card.html?parallel_bible_mayer

Dataset	parallel_bible_mayer
Description	The Parallel Bible Corpus is a project that collects Bible texts from the world's languages while preserving the inherent verse-level parallelism. The ultimate goal is to provide a large collection of language data from all continents and language families that can be used for text-level language comparison. The corpus contains 38 languages in total.
Subsets	Teduray, Yakan, Zyphe Chin
Languages	tiy, yka, zyp
Tasks	Language Modeling, Machine Translation
License	Unknown (unknown)
Homepage	https://github.com/cysouw/studentCheck/tree/master
HF URL	-
Paper URL	https://aclanthology.org/L14-1215/
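
For whoever picks this up, here is a minimal sketch of how the verse-level parallelism could be turned into sentence pairs for the Machine Translation task. It assumes each language file holds one verse per line as `verse_id<TAB>text`, with `#` lines carrying metadata; that layout is an assumption and should be checked against the actual files. The function names and the sample verses are hypothetical and are not part of the SEACrowd API or the corpus.

```python
from typing import Dict, Iterable, List, Tuple


def parse_verses(lines: Iterable[str]) -> Dict[str, str]:
    """Parse 'verse_id<TAB>text' lines into {verse_id: text} (assumed layout)."""
    verses: Dict[str, str] = {}
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines and '#' metadata lines (assumed convention).
        if not line.strip() or line.startswith("#"):
            continue
        verse_id, _, text = line.partition("\t")
        text = text.strip()
        # Drop verses that have an ID but no text in this language.
        if text:
            verses[verse_id.strip()] = text
    return verses


def align_verses(
    src: Dict[str, str], tgt: Dict[str, str]
) -> List[Tuple[str, str, str]]:
    """Pair up verses whose IDs occur in both languages, sorted by ID."""
    shared_ids = sorted(src.keys() & tgt.keys())
    return [(vid, src[vid], tgt[vid]) for vid in shared_ids]


# Made-up placeholder lines, not real corpus content.
src_lines = [
    "# language_name: hypothetical source",
    "40001001\tsource verse one",
    "40001002\tsource verse two",
]
tgt_lines = [
    "40001002\ttarget verse two",
    "40001003\ttarget verse three",
]

pairs = align_verses(parse_verses(src_lines), parse_verses(tgt_lines))
```

Only verse IDs present in both languages are kept, so translations that cover different portions of the text still yield clean pairs. For the Language Modeling task, the values of a single `parse_verses` result could be used directly.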


SamuelCahyawijaya added this to SEACrowd Data Hub Jul 30, 2024

SamuelCahyawijaya converted this from a draft issue Jul 30, 2024
