opusTCv20210807_transformer-big_2022-03-13.zip

dataset: opusTCv20210807
model: transformer-big
source language(s): ita
target language(s): bel bel_Latn orv_Cyrl rus ukr
raw source language(s): ita
raw target language(s): bel orv rus ukr
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
valid language labels:
download: opusTCv20210807_transformer-big_2022-03-13.zip
test set translations: opusTCv20210807_transformer-big_2022-03-13.test.txt
test set scores: opusTCv20210807_transformer-big_2022-03-13.eval.txt

testset	BLEU	chr-F	#sent	#words	BP
Tatoeba-test-v2021-08-07.ita-bel	32.5	0.52986	264	1513	1.000
Tatoeba-test-v2021-08-07.ita-bel_Latn	5.7	0.476	1	8	0.867
Tatoeba-test-v2021-08-07.ita-multi	45.9	0.65738	10000	60666	0.988
Tatoeba-test-v2021-08-07.ita-orv	1.7	0.14338	8	41	1.000
Tatoeba-test-v2021-08-07.ita-rus	44.8	0.64943	10045	65765	0.992
Tatoeba-test-v2021-08-07.ita-ukr	46.5	0.66326	5000	25294	1.000
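
The sentence-initial language token requirement can be sketched as a small pre-processing helper. This is a minimal illustration, not part of the released tooling: the function name `add_language_token` is hypothetical, and using the IDs from the "target language(s)" line as the set of valid labels is an assumption, since the "valid language labels" entry lists none.

```python
# Target language IDs copied from the "target language(s)" line.
# Assumption: these are the labels the model accepts inside >>id<<.
VALID_TARGETS = {"bel", "bel_Latn", "orv_Cyrl", "rus", "ukr"}


def add_language_token(sentence: str, target: str) -> str:
    """Prefix an Italian source sentence with the >>id<< target-language token."""
    if target not in VALID_TARGETS:
        raise ValueError(f"unknown target language ID: {target!r}")
    return f">>{target}<< {sentence.strip()}"


if __name__ == "__main__":
    # Request Ukrainian output for one Italian sentence.
    print(add_language_token("Buongiorno a tutti.", "ukr"))
```

The token is added to the raw source text, so in this sketch it would be applied before normalization and SentencePiece segmentation.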

opusTCv20210807_transformer-big_2022-03-19.zip

dataset: opusTCv20210807
model: transformer-big
source language(s): ita
target language(s): bel bel_Latn orv_Cyrl rus ukr
raw source language(s): ita
raw target language(s): bel orv rus ukr
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
valid language labels:
download: opusTCv20210807_transformer-big_2022-03-19.zip
test set translations: opusTCv20210807_transformer-big_2022-03-19.test.txt
test set scores: opusTCv20210807_transformer-big_2022-03-19.eval.txt

testset	BLEU	chr-F	#sent	#words	BP
Tatoeba-test-v2021-08-07.ita-bel	33.3	0.55214	264	1513	1.000
Tatoeba-test-v2021-08-07.ita-bel_Latn	5.7	0.460	1	8	0.867
Tatoeba-test-v2021-08-07.ita-multi	46.5	0.66062	10000	60666	0.987
Tatoeba-test-v2021-08-07.ita-orv	2.1	0.14404	8	41	1.000
Tatoeba-test-v2021-08-07.ita-rus	46.2	0.65834	10045	65765	0.987
Tatoeba-test-v2021-08-07.ita-ukr	48.1	0.67290	5000	25294	0.997

opusTCv20210807_transformer-big_2022-03-23.zip

dataset: opusTCv20210807
model: transformer-big
source language(s): ita
target language(s): bel bel_Latn orv_Cyrl rus ukr
raw source language(s): ita
raw target language(s): bel orv rus ukr
pre-processing: normalization + SentencePiece (spm32k,spm32k)
a sentence-initial language token is required in the form >>id<< (id = valid target language ID)
valid language labels:
download: opusTCv20210807_transformer-big_2022-03-23.zip
test set translations: opusTCv20210807_transformer-big_2022-03-23.test.txt
test set scores: opusTCv20210807_transformer-big_2022-03-23.eval.txt

testset	BLEU	chr-F	#sent	#words	BP
Tatoeba-test-v2021-08-07.ita-bel	33.3	0.55571	264	1513	0.993
Tatoeba-test-v2021-08-07.ita-bel_Latn	5.9	0.493	1	8	0.549
Tatoeba-test-v2021-08-07.ita-multi	46.6	0.66149	10000	60666	0.985
Tatoeba-test-v2021-08-07.ita-orv	2.1	0.14466	8	41	1.000
Tatoeba-test-v2021-08-07.ita-rus	46.2	0.65840	10045	65765	0.985
Tatoeba-test-v2021-08-07.ita-ukr	48.3	0.67483	5000	25294	0.995
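
The score tables are tab-separated, so releases can be compared with a few lines of Python. The sketch below is one possible approach: the rows are copied from the 2022-03-13 and 2022-03-23 tables above (bel, rus and ukr only), while the helper names `parse_scores` and `bleu_delta` are hypothetical and not part of any released tool.

```python
import csv
import io

# Rows copied from the 2022-03-13 table (tab-separated).
RELEASE_0313 = (
    "testset\tBLEU\tchr-F\t#sent\t#words\tBP\n"
    "Tatoeba-test-v2021-08-07.ita-bel\t32.5\t0.52986\t264\t1513\t1.000\n"
    "Tatoeba-test-v2021-08-07.ita-rus\t44.8\t0.64943\t10045\t65765\t0.992\n"
    "Tatoeba-test-v2021-08-07.ita-ukr\t46.5\t0.66326\t5000\t25294\t1.000\n"
)

# Rows copied from the 2022-03-23 table (tab-separated).
RELEASE_0323 = (
    "testset\tBLEU\tchr-F\t#sent\t#words\tBP\n"
    "Tatoeba-test-v2021-08-07.ita-bel\t33.3\t0.55571\t264\t1513\t0.993\n"
    "Tatoeba-test-v2021-08-07.ita-rus\t46.2\t0.65840\t10045\t65765\t0.985\n"
    "Tatoeba-test-v2021-08-07.ita-ukr\t48.3\t0.67483\t5000\t25294\t0.995\n"
)


def parse_scores(table: str) -> dict[str, float]:
    """Map each test set name to its BLEU score."""
    reader = csv.DictReader(io.StringIO(table), delimiter="\t")
    return {row["testset"]: float(row["BLEU"]) for row in reader}


def bleu_delta(old: dict[str, float], new: dict[str, float]) -> dict[str, float]:
    """BLEU change (new minus old) for test sets present in both releases."""
    return {name: round(new[name] - old[name], 1) for name in old if name in new}


if __name__ == "__main__":
    deltas = bleu_delta(parse_scores(RELEASE_0313), parse_scores(RELEASE_0323))
    for name, delta in deltas.items():
        print(f"{name}\t{delta:+.1f}")
```

The same functions would work on the full `.eval.txt` files if those use the column layout shown in the tables, which is an assumption here.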