- dataset: opus1m
- model: transformer
- source language(s): bel orv rue rus ukr
- target language(s): ast cat fra gcf glg ind ita jak lad min mol msa oci pob por ron spa zlm zsm
- pre-processing: normalization + SentencePiece (spm32k,spm32k)
- a sentence initial language token is required in the form of >>id<< (id = valid target language ID); see the usage sketch after this list
- valid language labels: >>spa<< >>fra<< >>pob<< >>ita<< >>ron<< >>por<< >>ind<< >>msa_Latn<< >>cat<< >>glg<< >>zlm_Latn<< >>oci<< >>jak_Latn<< >>ast<< >>zlm<< >>mol<< >>min<<
- download: opus1m-2021-02-17.zip
- test set translations: opus1m-2021-02-17.test.txt
- test set scores: opus1m-2021-02-17.eval.txt
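For concreteness, a minimal usage sketch showing how the sentence-initial target-language token is supplied at inference time. It assumes the checkpoint can be loaded through the Hugging Face `transformers` Marian integration; the model identifier below is a placeholder, not something stated in this card.

```python
# Minimal sketch only: assumes the checkpoint is available via the
# transformers Marian integration; the model ID below is a placeholder.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-zle-itc"  # hypothetical identifier
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# The sentence-initial token (>>spa<<, >>fra<<, ...) selects the target
# language, as required by this multilingual model.
src = [">>spa<< Привет, как дела?", ">>fra<< Це гарна книга."]
batch = tokenizer(src, return_tensors="pt", padding=True)
translated = model.generate(**batch)
print([tokenizer.decode(t, skip_special_tokens=True) for t in translated])
```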
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
Tatoeba-test.multi-multi | 44.9 | 0.642 | 10000 | 66633 | 0.973 |
- dataset: opus1m
- model: transformer
- source language(s): bel orv rue rus ukr
- target language(s): ast cat fra gcf glg ind ita jak lad min mol msa oci pob por ron spa zlm zsm
- pre-processing: normalization + SentencePiece (spm32k,spm32k); a minimal segmentation sketch follows this list
- a sentence initial language token is required in the form of >>id<< (id = valid target language ID)
- valid language labels: >>spa<< >>fra<< >>pob<< >>ita<< >>ron<< >>por<< >>ind<< >>msa_Latn<< >>cat<< >>glg<< >>zlm_Latn<< >>oci<< >>jak_Latn<< >>ast<< >>zlm<< >>mol<< >>min<<
- download: opus1m-2021-02-18.zip
- test set translations: opus1m-2021-02-18.test.txt
- test set scores: opus1m-2021-02-18.eval.txt
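As a rough illustration of the "normalization + SentencePiece (spm32k,spm32k)" pre-processing line above, here is a segmentation sketch using the `sentencepiece` library. The `source.spm` file name and the placement of the language token relative to segmentation are assumptions based on typical OPUS-MT packaging, not confirmed by this card.

```python
# Segmentation sketch for the stated spm32k pre-processing. Assumption:
# the downloaded zip ships a source-side SentencePiece model; "source.spm"
# is a guessed file name, not confirmed by this card.
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="source.spm")  # hypothetical path

line = "Привет, как дела?"
pieces = sp.encode(line, out_type=str)  # subword pieces as strings

# Prepend the target-language token after segmentation so it stays a single
# unit in the decoder input (ordering is an assumption, not from this card).
marian_input = ">>spa<< " + " ".join(pieces)
print(marian_input)
```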
testset | BLEU | chr-F | #sent | #words | BP |
---|---|---|---|---|---|
newstest2012.rus-fra | 21.5 | 0.509 | 3003 | 78011 | 0.998 |
newstest2012.rus-spa | 25.6 | 0.525 | 3003 | 79006 | 0.965 |
newstest2013.rus-fra | 24.9 | 0.528 | 3000 | 70037 | 0.992 |
newstest2013.rus-spa | 27.5 | 0.535 | 3000 | 70528 | 0.959 |
Tatoeba-test.bel-fra | 38.8 | 0.577 | 283 | 2005 | 0.997 |
Tatoeba-test.bel-ita | 39.0 | 0.580 | 264 | 1681 | 0.999 |
Tatoeba-test.bel-lad | 6.3 | 0.186 | 2 | 14 | 1.000 |
Tatoeba-test.bel-msa | 1.1 | 0.156 | 3 | 43 | 1.000 |
Tatoeba-test.bel-por | 19.3 | 0.444 | 3 | 21 | 1.000 |
Tatoeba-test.bel-spa | 39.8 | 0.604 | 205 | 1412 | 1.000 |
Tatoeba-test.multi-multi | 44.9 | 0.642 | 10000 | 66633 | 0.973 |
Tatoeba-test.orv-fra | 8.0 | 0.232 | 37 | 290 | 0.990 |
Tatoeba-test.orv-ita | 4.4 | 0.180 | 8 | 53 | 1.000 |
Tatoeba-test.orv-spa | 7.4 | 0.289 | 33 | 171 | 1.000 |
Tatoeba-test.rue-spa | 28.2 | 0.441 | 97 | 469 | 0.981 |
Tatoeba-test.rus-ast | 23.6 | 0.703 | 1 | 5 | 1.000 |
Tatoeba-test.rus-cat | 36.0 | 0.587 | 185 | 1342 | 0.977 |
Tatoeba-test.rus-fra | 49.7 | 0.660 | 10000 | 70132 | 0.980 |
Tatoeba-test.rus-gcf | 10.7 | 0.128 | 1 | 3 | 1.000 |
Tatoeba-test.rus-glg | 31.8 | 0.560 | 37 | 228 | 1.000 |
Tatoeba-test.rus-ita | 38.9 | 0.612 | 10000 | 71254 | 0.951 |
Tatoeba-test.rus-lad | 14.9 | 0.399 | 18 | 100 | 1.000 |
Tatoeba-test.rus-msa | 17.7 | 0.399 | 88 | 634 | 0.987 |
Tatoeba-test.rus-oci | 2.5 | 0.226 | 84 | 571 | 0.972 |
Tatoeba-test.rus-por | 37.7 | 0.600 | 10000 | 74713 | 0.957 |
Tatoeba-test.rus-ron | 35.9 | 0.595 | 782 | 4768 | 0.953 |
Tatoeba-test.rus-spa | 48.3 | 0.671 | 10000 | 71496 | 0.968 |
Tatoeba-test.ukr-cat | 39.6 | 0.598 | 455 | 2670 | 0.997 |
Tatoeba-test.ukr-fra | 47.3 | 0.644 | 10000 | 62877 | 0.998 |
Tatoeba-test.ukr-ita | 46.4 | 0.671 | 5000 | 27846 | 0.955 |
Tatoeba-test.ukr-lad | 12.6 | 0.320 | 20 | 108 | 1.000 |
Tatoeba-test.ukr-msa | 14.8 | 0.366 | 9 | 79 | 0.987 |
Tatoeba-test.ukr-por | 39.8 | 0.612 | 3372 | 21315 | 0.986 |
Tatoeba-test.ukr-spa | 47.6 | 0.662 | 10000 | 58486 | 0.979 |