Create dataset loader for OpenSpeech Dataset V1 by Wang #714

SamuelCahyawijaya · 2024-07-30T15:43:44Z

Dataset	openspeech_v1
Description	The OpenSpeech Dataset V1 by Wang: Data Market is a collection of speech data for research and development in speech processing. It comprises 10 hours of audio recordings (8450 sentences) contributed by 1077 users. Contributors were encouraged to provide diverse speech samples, so the recordings cover a wide range of sentences, linguistic nuances, and acoustic environments, making the dataset suitable for tasks such as speech recognition, language modeling, and speaker identification. The dataset is free, but downloading it requires registration (email, password).
Subsets	-
Languages	tha
Tasks	Spoken Language Identification, Language Modeling
License	Creative Commons Attribution Share Alike 4.0 (cc-by-sa-4.0)
Homepage	https://www.wang.in.th/dataset/654dfdbb6147c33fbf172957
HF URL	-
Paper URL	-
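
As a rough sanity check for whoever implements the loader, the figures quoted in the description imply fairly short utterances and only a handful of sentences per contributor. The sketch below only does that arithmetic; the constant names are illustrative, and it assumes the "10 hours" figure is a rounded total, so the results are approximate.

```python
# Figures quoted in the dataset description; the names are illustrative only.
TOTAL_HOURS = 10          # assumed to be a rounded total duration
NUM_SENTENCES = 8450
NUM_CONTRIBUTORS = 1077

# Average utterance length in seconds, and average sentences per contributor.
avg_seconds_per_sentence = TOTAL_HOURS * 3600 / NUM_SENTENCES
avg_sentences_per_contributor = NUM_SENTENCES / NUM_CONTRIBUTORS

print(f"{avg_seconds_per_sentence:.2f} s per sentence")
print(f"{avg_sentences_per_contributor:.2f} sentences per contributor")
```

A loader whose decoded audio deviates far from these averages (for example, much longer clips or far fewer examples) may be reading the archive incorrectly.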

SamuelCahyawijaya added this to SEACrowd Data Hub Jul 30, 2024

SamuelCahyawijaya converted this from a draft issue Jul 30, 2024