GitHub - mkorvas/vystadial-asr: Vystadial 2013 ASR scripts & updates

Introduction

This is a preliminary version of the Vystadial 2013 acoustic data & scripts dataset. It is not an official release; it serves only to support the paper submission to `LREC 2014`_.

The abstract of the paper describing the data and related scripts is in the file abstract.pdf.

The data and scripts are found in directories as follows:

data_voip_cs/{train,dev,test}
    the Czech data

data_voip_en/{train,dev,test}
    the English data

htk
    scripts for HTK

kaldi
    scripts for Kaldi

Data statistics

Latest statistics for the data are as follows:

======= ===== ====== =======
dataset audio #sents #words
======= ===== ====== =======
English
train   19:52 22,933 103,136
dev     1:44  2,000  9,174
test    1:43  2,000  8,970
Czech
train   10:57 15,319 93,177
dev     1:24  2,000  11,854
test    1:24  2,000  11,841
======= ===== ====== =======
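
The per-language totals implied by the table can be derived with a short script. This is a minimal sketch and not part of the repository: the ``STATS`` dictionary and the helper names ``to_minutes`` and ``totals`` are illustrative, and it assumes that the audio column is given as hours:minutes, which the table itself does not state.

```python
# Per-split figures copied from the statistics table above:
# (audio, number of sentences, number of words).
# Assumption: the "audio" column is formatted as hours:minutes.
STATS = {
    "English": {
        "train": ("19:52", 22933, 103136),
        "dev": ("1:44", 2000, 9174),
        "test": ("1:43", 2000, 8970),
    },
    "Czech": {
        "train": ("10:57", 15319, 93177),
        "dev": ("1:24", 2000, 11854),
        "test": ("1:24", 2000, 11841),
    },
}


def to_minutes(hhmm: str) -> int:
    """Convert an 'H:MM' string to a number of minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def totals(language: str) -> dict:
    """Sum audio, sentences and words over train, dev and test."""
    splits = list(STATS[language].values())
    minutes = sum(to_minutes(audio) for audio, _, _ in splits)
    sents = sum(s for _, s, _ in splits)
    words = sum(w for _, _, w in splits)
    return {
        "audio": f"{minutes // 60}:{minutes % 60:02d}",
        "sents": sents,
        "words": words,
        "words_per_sent": round(words / sents, 2),
    }


if __name__ == "__main__":
    for language in STATS:
        print(language, totals(language))
```

Running the script prints one summary line per language, which makes it easy to compare the overall size of the English and Czech portions.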

Licence for the data

See LICENSE-CC-BY-SA-3.0.TXT.

Licence for the training scripts

See LICENSE-APACHE-2.0.TXT.

Authors

Matěj Korvas
Ondřej Plátek
Ondřej Dušek
Lukáš Žilka
Filip Jurčíček
