mkorvas/vystadial-asr

Introduction

This is a preliminary version of the Vystadial 2013 dataset of acoustic data and scripts. It is not an official release; it serves only as supporting material for the paper submitted to LREC 2014.

The abstract of the paper describing the data and related scripts is in the file abstract.pdf.

The data and scripts are organized in the following directories:

data_voip_cs/{train,dev,test}
    the Czech data
data_voip_en/{train,dev,test}
    the English data
htk
    scripts for HTK
kaldi
    scripts for Kaldi
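
As a minimal sketch of how the layout above could be traversed, the following Python snippet builds the expected path of each data split and reports whether it exists. The helper name ``split_dirs`` and the ``root`` argument are hypothetical and not part of the repository's scripts; only the directory names come from the list above.

```python
from pathlib import Path

# Language codes and split names taken from the directory listing above.
LANGUAGES = ("cs", "en")
SPLITS = ("train", "dev", "test")


def split_dirs(root="."):
    """Map (language, split) to the expected data directory under root.

    Hypothetical helper for illustration; root is assumed to be the
    top-level directory of a local checkout.
    """
    root = Path(root)
    return {
        (lang, split): root / f"data_voip_{lang}" / split
        for lang in LANGUAGES
        for split in SPLITS
    }


if __name__ == "__main__":
    # Report which of the six expected directories are present.
    for (lang, split), path in split_dirs().items():
        status = "found" if path.is_dir() else "missing"
        print(f"{lang} {split}: {path} ({status})")
```

Run from the top-level directory of a checkout, this prints one line per language and split, which is a quick way to check that all six data directories are in place.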

Data statistics

Latest statistics for the data are as follows:

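========  ============  ======  =======
dataset   audio (h:mm)  #sents  #words
========  ============  ======  =======
English
train     19:52         22,933  103,136
dev       1:44          2,000   9,174
test      1:43          2,000   8,970
Czech
train     10:57         15,319  93,177
dev       1:24          2,000   11,854
test      1:24          2,000   11,841
========  ============  ======  =======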
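
The figures above can be turned into a few derived quantities, such as total audio per language and average sentence length. The sketch below is illustrative only: it assumes the audio column is given in hours:minutes (the word counts rule out minutes:seconds), and the names ``STATS``, ``to_minutes``, ``total_minutes`` and ``words_per_sentence`` are hypothetical, not taken from the repository.

```python
# (language, split) -> (audio, sentences, words), copied from the table above.
# Assumption: the audio column is in hours:minutes.
STATS = {
    ("English", "train"): ("19:52", 22933, 103136),
    ("English", "dev"): ("1:44", 2000, 9174),
    ("English", "test"): ("1:43", 2000, 8970),
    ("Czech", "train"): ("10:57", 15319, 93177),
    ("Czech", "dev"): ("1:24", 2000, 11854),
    ("Czech", "test"): ("1:24", 2000, 11841),
}


def to_minutes(hmm):
    """Convert an 'h:mm' string to a whole number of minutes."""
    hours, minutes = hmm.split(":")
    return int(hours) * 60 + int(minutes)


def total_minutes(stats, language):
    """Sum the audio length over all splits of one language."""
    return sum(
        to_minutes(audio)
        for (lang, _), (audio, _, _) in stats.items()
        if lang == language
    )


def words_per_sentence(stats, language, split):
    """Average number of words per sentence in one split."""
    _, sents, words = stats[(language, split)]
    return words / sents


if __name__ == "__main__":
    for language in ("English", "Czech"):
        hours, minutes = divmod(total_minutes(STATS, language), 60)
        avg = words_per_sentence(STATS, language, "train")
        print(f"{language}: {hours}:{minutes:02d} total audio, "
              f"{avg:.2f} words per training sentence")
```

Under the hours:minutes assumption, the totals come to 23:19 for English and 13:45 for Czech, and the Czech training sentences are longer on average (about 6.1 words) than the English ones (about 4.5 words).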

Licence for the data

See LICENSE-CC-BY-SA-3.0.TXT.

Licence for the training scripts

See LICENSE-APACHE-2.0.TXT.

Authors

  • Matěj Korvas
  • Ondřej Plátek
  • Ondřej Dušek
  • Lukáš Žilka
  • Filip Jurčíček

About

Vystadial 2013 ASR scripts & updates