From f6d5cb879c43487ffed0726799cb270087e47e2e Mon Sep 17 00:00:00 2001
From: KaterinaRysova
Date: Wed, 20 Nov 2024 12:44:35 +0100
Subject: [PATCH] Update TECH_DOC.md

---
 data_preparation/70.releasing/TECH_DOC.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data_preparation/70.releasing/TECH_DOC.md b/data_preparation/70.releasing/TECH_DOC.md
index 7d3feb0..9a30bc6 100644
--- a/data_preparation/70.releasing/TECH_DOC.md
+++ b/data_preparation/70.releasing/TECH_DOC.md
@@ -1,4 +1,4 @@
-# Technical Documentation
+# Database of Spoken Czech as a Foreign Language (Permanent Residency in the Czech Republic): Technical Documentation
 
 The language corpus of spoken performances by non-native speakers of Czech, focused on the A2 language level (according to the CEFR), required for obtaining permanent residency in the Czech Republic, is the result of a project implemented at the Institute of Formal and Applied Linguistics of the Faculty of Mathematics and Physics, Charles University. The corpus contains recordings capturing the oral part of the [Czech Language Certificate Exam](https://ujop.cuni.cz/UJOPEN-70.html?ujopcmsid=12:czech-language-certificate-exam-cce) at the A2 level. The recordings include dialogues between the examiner (a native speaker) and the candidate (a non-native speaker). We have provided transcriptions of the recordings, enriched with extensive linguistic annotations. Some recordings are accompanied by multiple transcriptions from different annotators, allowing for comparisons of various transcriptions of the same recording and the assessment of the degree of agreement when converting spoken language into written text.