# afa-afa

## opus-2020-07-06.zip

* dataset: opus
* model: transformer
* source language(s): apc ara arq arz heb kab mlt shy_Latn thv
* target language(s): apc ara arq arz heb kab mlt shy_Latn thv
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required in the form >>id<<, where id is a valid target language ID (see the sketch after this list)
* download: opus-2020-07-06.zip
* test set translations: opus-2020-07-06.test.txt
* test set scores: opus-2020-07-06.eval.txt
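
Because the model is multilingual on both sides, the decoder must be told which target language to produce via the >>id<< prefix. Below is a minimal sketch using the Hugging Face transformers Marian implementation; the Hub checkpoint name `Helsinki-NLP/opus-mt-afa-afa` is an assumption (only the zip download is listed above), but any Marian-compatible loading of the release works the same way.

```python
# A minimal sketch, assuming the release is also published on the Hugging Face
# Hub as "Helsinki-NLP/opus-mt-afa-afa" (an assumed identifier, not stated above).
# The key point is the sentence-initial >>id<< target-language token.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-afa-afa"  # assumed Hub identifier
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

# Prefix the source sentence with the target-language ID, here Hebrew (heb).
src = ">>heb<< مرحبا بالعالم"
batch = tokenizer([src], return_tensors="pt")
output_ids = model.generate(**batch)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Without the >>id<< prefix the model has no way to choose among the listed target languages, so output typically comes back in an arbitrary one.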

### Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.ara-ara.ara.ara | 0.9 | 0.064 |
| Tatoeba-test.ara-heb.ara.heb | 30.7 | 0.497 |
| Tatoeba-test.ara-kab.ara.kab | 0.2 | 0.108 |
| Tatoeba-test.ara-mlt.ara.mlt | 13.6 | 0.429 |
| Tatoeba-test.ara-shy.ara.shy | 1.1 | 0.042 |
| Tatoeba-test.heb-ara.heb.ara | 14.4 | 0.422 |
| Tatoeba-test.heb-kab.heb.kab | 2.2 | 0.127 |
| Tatoeba-test.kab-ara.kab.ara | 0.3 | 0.094 |
| Tatoeba-test.kab-heb.kab.heb | 7.0 | 0.083 |
| Tatoeba-test.kab-shy.kab.shy | 0.8 | 0.000 |
| Tatoeba-test.kab-tmh.kab.tmh | 4.1 | 0.000 |
| Tatoeba-test.mlt-ara.mlt.ara | 27.7 | 0.425 |
| Tatoeba-test.multi.multi | 19.6 | 0.404 |
| Tatoeba-test.shy-ara.shy.ara | 1.3 | 0.080 |
| Tatoeba-test.shy-kab.shy.kab | 1.3 | 0.013 |
| Tatoeba-test.tmh-kab.tmh.kab | 2.8 | 0.077 |
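
The BLEU and chr-F figures above can be recomputed from the released test-set translations. A hedged sketch with sacrebleu follows; it assumes the hypotheses and references have been extracted into plain one-sentence-per-line files (`hyp.txt` and `ref.txt` are hypothetical names, not files in the zip), and sacrebleu's default chrF settings may differ slightly from those used to produce the tables.

```python
# A minimal sketch of recomputing corpus BLEU and chr-F with sacrebleu,
# assuming one-sentence-per-line hypothesis/reference files ("hyp.txt" and
# "ref.txt" are hypothetical names used for illustration).
from sacrebleu.metrics import BLEU, CHRF

with open("hyp.txt", encoding="utf-8") as f:
    hyps = [line.rstrip("\n") for line in f]
with open("ref.txt", encoding="utf-8") as f:
    refs = [line.rstrip("\n") for line in f]

print(BLEU().corpus_score(hyps, [refs]))  # corpus BLEU, as in the BLEU column
print(CHRF().corpus_score(hyps, [refs]))  # chrF; defaults may differ from the table
```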

## opus-2020-07-26.zip

* dataset: opus
* model: transformer
* source language(s): apc ara arq arz heb kab mlt shy_Latn thv
* target language(s): apc ara arq arz heb kab mlt shy_Latn thv
* pre-processing: normalization + SentencePiece (spm32k,spm32k); see the sketch after this list
* a sentence-initial language token is required in the form >>id<<, where id is a valid target language ID
* download: opus-2020-07-26.zip
* test set translations: opus-2020-07-26.test.txt
* test set scores: opus-2020-07-26.eval.txt
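
The pre-processing bullet means input must be normalized and segmented with the released 32k SentencePiece model before translation. A minimal sketch follows, assuming the source-side model ships in the zip as `source.spm` (an assumed filename based on typical OPUS-MT packaging, not stated here).

```python
# A minimal sketch of the SentencePiece step named in the pre-processing
# bullet, assuming the 32k source model is shipped as "source.spm"
# (assumed filename; the zip contents are not listed above).
import sentencepiece as spm

sp = spm.SentencePieceProcessor(model_file="source.spm")
pieces = sp.encode("مرحبا بالعالم", out_type=str)
print(" ".join(pieces))  # space-joined subword pieces, ready for the Marian decoder
```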

### Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.ara-ara.ara.ara | 4.3 | 0.148 |
| Tatoeba-test.ara-heb.ara.heb | 31.9 | 0.525 |
| Tatoeba-test.ara-kab.ara.kab | 0.3 | 0.120 |
| Tatoeba-test.ara-mlt.ara.mlt | 14.0 | 0.428 |
| Tatoeba-test.ara-shy.ara.shy | 1.3 | 0.050 |
| Tatoeba-test.heb-ara.heb.ara | 17.0 | 0.464 |
| Tatoeba-test.heb-kab.heb.kab | 1.9 | 0.104 |
| Tatoeba-test.kab-ara.kab.ara | 0.3 | 0.044 |
| Tatoeba-test.kab-heb.kab.heb | 5.1 | 0.099 |
| Tatoeba-test.kab-shy.kab.shy | 2.2 | 0.009 |
| Tatoeba-test.kab-tmh.kab.tmh | 10.7 | 0.007 |
| Tatoeba-test.mlt-ara.mlt.ara | 29.1 | 0.498 |
| Tatoeba-test.multi.multi | 20.8 | 0.434 |
| Tatoeba-test.shy-ara.shy.ara | 1.2 | 0.053 |
| Tatoeba-test.shy-kab.shy.kab | 2.0 | 0.134 |
| Tatoeba-test.tmh-kab.tmh.kab | 0.0 | 0.047 |

## opus-2020-09-26.zip

* dataset: opus
* model: transformer
* source language(s): acm afb amh apc ara arq ary arz eng hau_Latn heb kab mlt phn_Phnx rif_Latn shy_Latn som syc_Syrc thv tir tmh tmr_Hebr
* target language(s): acm afb amh apc ara arq ary arz eng hau_Latn heb kab mlt phn_Phnx rif_Latn shy_Latn som syc_Syrc thv tir tmh tmr_Hebr
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required in the form >>id<<, where id is a valid target language ID
* download: opus-2020-09-26.zip
* test set translations: opus-2020-09-26.test.txt
* test set scores: opus-2020-09-26.eval.txt

### Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.amh-eng.amh.eng | 38.1 | 0.568 |
| Tatoeba-test.ara-ara.ara.ara | 6.7 | 0.223 |
| Tatoeba-test.ara-eng.ara.eng | 36.9 | 0.549 |
| Tatoeba-test.ara-heb.ara.heb | 34.2 | 0.542 |
| Tatoeba-test.ara-kab.ara.kab | 0.2 | 0.126 |
| Tatoeba-test.ara-mlt.ara.mlt | 14.2 | 0.566 |
| Tatoeba-test.ara-shy.ara.shy | 1.0 | 0.075 |
| Tatoeba-test.ara-tmr.ara.tmr | 2.6 | 0.014 |
| Tatoeba-test.eng-amh.eng.amh | 10.8 | 0.524 |
| Tatoeba-test.eng-ara.eng.ara | 12.0 | 0.403 |
| Tatoeba-test.eng-hau.eng.hau | 10.9 | 0.467 |
| Tatoeba-test.eng-heb.eng.heb | 30.1 | 0.529 |
| Tatoeba-test.eng-kab.eng.kab | 0.7 | 0.157 |
| Tatoeba-test.eng-mlt.eng.mlt | 16.8 | 0.551 |
| Tatoeba-test.eng-phn.eng.phn | 1.2 | 0.007 |
| Tatoeba-test.eng-rif.eng.rif | 2.2 | 0.112 |
| Tatoeba-test.eng-shy.eng.shy | 0.9 | 0.096 |
| Tatoeba-test.eng-som.eng.som | 16.0 | 0.211 |
| Tatoeba-test.eng-tir.eng.tir | 2.3 | 0.231 |
| Tatoeba-test.eng-tmr.eng.tmr | 0.7 | 0.007 |
| Tatoeba-test.hau-eng.hau.eng | 17.2 | 0.341 |
| Tatoeba-test.heb-ara.heb.ara | 17.9 | 0.472 |
| Tatoeba-test.heb-eng.heb.eng | 40.5 | 0.575 |
| Tatoeba-test.heb-kab.heb.kab | 4.2 | 0.121 |
| Tatoeba-test.heb-phn.heb.phn | 1.2 | 0.009 |
| Tatoeba-test.heb-syc.heb.syc | 2.8 | 0.000 |
| Tatoeba-test.heb-tmr.heb.tmr | 0.5 | 0.005 |
| Tatoeba-test.kab-ara.kab.ara | 0.2 | 0.079 |
| Tatoeba-test.kab-eng.kab.eng | 5.0 | 0.223 |
| Tatoeba-test.kab-heb.kab.heb | 0.0 | 0.108 |
| Tatoeba-test.kab-shy.kab.shy | 1.4 | 0.128 |
| Tatoeba-test.kab-tmh.kab.tmh | 5.5 | 0.102 |
| Tatoeba-test.kab-tmr.kab.tmr | 0.0 | 0.028 |
| Tatoeba-test.mlt-ara.mlt.ara | 17.2 | 0.376 |
| Tatoeba-test.mlt-eng.mlt.eng | 47.4 | 0.644 |
| Tatoeba-test.multi.multi | 20.0 | 0.409 |
| Tatoeba-test.phn-eng.phn.eng | 0.2 | 0.004 |
| Tatoeba-test.phn-heb.phn.heb | 0.9 | 0.010 |
| Tatoeba-test.phn-tmr.phn.tmr | 1.3 | 0.014 |
| Tatoeba-test.rif-eng.rif.eng | 2.3 | 0.146 |
| Tatoeba-test.shy-ara.shy.ara | 0.9 | 0.091 |
| Tatoeba-test.shy-eng.shy.eng | 1.8 | 0.145 |
| Tatoeba-test.shy-kab.shy.kab | 1.8 | 0.149 |
| Tatoeba-test.som-eng.som.eng | 10.7 | 0.069 |
| Tatoeba-test.syc-heb.syc.heb | 2.4 | 0.000 |
| Tatoeba-test.tir-eng.tir.eng | 9.7 | 0.296 |
| Tatoeba-test.tmh-kab.tmh.kab | 2.8 | 0.062 |
| Tatoeba-test.tmr-ara.tmr.ara | 1.8 | 0.098 |
| Tatoeba-test.tmr-eng.tmr.eng | 2.3 | 0.116 |
| Tatoeba-test.tmr-heb.tmr.heb | 1.8 | 0.129 |
| Tatoeba-test.tmr-kab.tmr.kab | 1.9 | 0.022 |
| Tatoeba-test.tmr-phn.tmr.phn | 8.1 | 0.014 |

## opus-2020-10-04.zip

* dataset: opus
* model: transformer
* source language(s): acm afb amh apc ara arq ary arz eng hau_Latn heb kab mlt phn_Phnx rif_Latn shy_Latn som syc_Syrc thv tir tmh tmr_Hebr
* target language(s): acm afb amh apc ara arq ary arz eng hau_Latn heb kab mlt phn_Phnx rif_Latn shy_Latn som syc_Syrc thv tir tmh tmr_Hebr
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required in the form >>id<<, where id is a valid target language ID
* download: opus-2020-10-04.zip
* test set translations: opus-2020-10-04.test.txt
* test set scores: opus-2020-10-04.eval.txt

### Benchmarks

| testset | BLEU | chr-F |
|---------|------|-------|
| Tatoeba-test.amh-eng.amh.eng | 38.9 | 0.566 |
| Tatoeba-test.ara-ara.ara.ara | 5.6 | 0.224 |
| Tatoeba-test.ara-eng.ara.eng | 37.6 | 0.556 |
| Tatoeba-test.ara-heb.ara.heb | 34.5 | 0.546 |
| Tatoeba-test.ara-kab.ara.kab | 0.4 | 0.131 |
| Tatoeba-test.ara-mlt.ara.mlt | 14.6 | 0.565 |
| Tatoeba-test.ara-shy.ara.shy | 0.8 | 0.078 |
| Tatoeba-test.ara-tmr.ara.tmr | 2.7 | 0.014 |
| Tatoeba-test.eng-amh.eng.amh | 10.5 | 0.519 |
| Tatoeba-test.eng-ara.eng.ara | 11.9 | 0.405 |
| Tatoeba-test.eng-hau.eng.hau | 11.7 | 0.447 |
| Tatoeba-test.eng-heb.eng.heb | 30.5 | 0.534 |
| Tatoeba-test.eng-kab.eng.kab | 0.8 | 0.162 |
| Tatoeba-test.eng-mlt.eng.mlt | 17.3 | 0.554 |
| Tatoeba-test.eng-phn.eng.phn | 1.1 | 0.007 |
| Tatoeba-test.eng-rif.eng.rif | 1.9 | 0.104 |
| Tatoeba-test.eng-shy.eng.shy | 0.8 | 0.089 |
| Tatoeba-test.eng-som.eng.som | 16.0 | 0.211 |
| Tatoeba-test.eng-tir.eng.tir | 2.5 | 0.232 |
| Tatoeba-test.eng-tmr.eng.tmr | 0.5 | 0.007 |
| Tatoeba-test.hau-eng.hau.eng | 12.5 | 0.317 |
| Tatoeba-test.heb-ara.heb.ara | 17.6 | 0.474 |
| Tatoeba-test.heb-eng.heb.eng | 40.9 | 0.578 |
| Tatoeba-test.heb-kab.heb.kab | 2.8 | 0.115 |
| Tatoeba-test.heb-phn.heb.phn | 1.1 | 0.009 |
| Tatoeba-test.heb-syc.heb.syc | 1.9 | 0.000 |
| Tatoeba-test.heb-tmr.heb.tmr | 0.7 | 0.005 |
| Tatoeba-test.kab-ara.kab.ara | 0.2 | 0.076 |
| Tatoeba-test.kab-eng.kab.eng | 4.7 | 0.221 |
| Tatoeba-test.kab-heb.kab.heb | 0.0 | 0.115 |
| Tatoeba-test.kab-shy.kab.shy | 1.3 | 0.104 |
| Tatoeba-test.kab-tmh.kab.tmh | 5.5 | 0.121 |
| Tatoeba-test.kab-tmr.kab.tmr | 0.0 | 0.028 |
| Tatoeba-test.mlt-ara.mlt.ara | 20.0 | 0.405 |
| Tatoeba-test.mlt-eng.mlt.eng | 45.9 | 0.625 |
| Tatoeba-test.multi.multi | 20.3 | 0.412 |
| Tatoeba-test.phn-eng.phn.eng | 1.0 | 0.146 |
| Tatoeba-test.phn-heb.phn.heb | 1.5 | 0.032 |
| Tatoeba-test.phn-tmr.phn.tmr | 1.6 | 0.008 |
| Tatoeba-test.rif-eng.rif.eng | 2.3 | 0.139 |
| Tatoeba-test.shy-ara.shy.ara | 1.1 | 0.086 |
| Tatoeba-test.shy-eng.shy.eng | 1.5 | 0.126 |
| Tatoeba-test.shy-kab.shy.kab | 2.1 | 0.139 |
| Tatoeba-test.som-eng.som.eng | 21.4 | 0.289 |
| Tatoeba-test.syc-heb.syc.heb | 1.4 | 0.000 |
| Tatoeba-test.tir-eng.tir.eng | 12.7 | 0.311 |
| Tatoeba-test.tmh-kab.tmh.kab | 3.4 | 0.093 |
| Tatoeba-test.tmr-ara.tmr.ara | 0.4 | 0.080 |
| Tatoeba-test.tmr-eng.tmr.eng | 2.4 | 0.136 |
| Tatoeba-test.tmr-heb.tmr.heb | 1.7 | 0.128 |
| Tatoeba-test.tmr-kab.tmr.kab | 1.9 | 0.022 |
| Tatoeba-test.tmr-phn.tmr.phn | 6.6 | 0.013 |

## opus-2021-02-23.zip

* dataset: opus
* model: transformer
* source language(s): afb apc ara arq arz heb jpa kab mlt oar phn shy syc thv tmh tmr
* target language(s): afb apc ara arq arz heb jpa kab mlt oar phn shy syc thv tmh tmr
* pre-processing: normalization + SentencePiece (spm32k,spm32k)
* a sentence-initial language token is required in the form >>id<<, where id is a valid target language ID
* valid language labels: >>eng<< >>ara<< >>heb<< >>mlt<< >>kab<< >>hau_Latn<< >>tir<< >>som<< >>amh<< >>arq<< >>arz<<
* download: opus-2021-02-23.zip
* test set translations: opus-2021-02-23.test.txt
* test set scores: opus-2021-02-23.eval.txt

### Benchmarks

| testset | BLEU | chr-F | #sent | #words | BP |
|---------|------|-------|-------|--------|-----|
| Tatoeba-test.ara-ara | 5.6 | 0.224 | 16 | 60 | 1.000 |
| Tatoeba-test.ara-heb | 34.5 | 0.546 | 1208 | 6801 | 0.992 |
| Tatoeba-test.ara-kab | 0.4 | 0.131 | 147 | 809 | 1.000 |
| Tatoeba-test.ara-mlt | 14.6 | 0.565 | 28 | 88 | 1.000 |
| Tatoeba-test.ara-shy | 0.8 | 0.078 | 15 | 58 | 1.000 |
| Tatoeba-test.ara-tmr | 2.7 | 0.014 | 8 | 28 | 1.000 |
| Tatoeba-test.heb-ara | 17.6 | 0.474 | 1208 | 6372 | 0.904 |
| Tatoeba-test.heb-jpa | 10.7 | 0.012 | 1 | 4 | 1.000 |
| Tatoeba-test.heb-kab | 2.8 | 0.115 | 3 | 11 | 1.000 |
| Tatoeba-test.heb-oar | 0.2 | 0.001 | 8 | 95 | 1.000 |
| Tatoeba-test.heb-phn | 1.1 | 0.009 | 9 | 47 | 1.000 |
| Tatoeba-test.heb-syc | 1.9 | 0.000 | 1 | 6 | 1.000 |
| Tatoeba-test.heb-tmr | 0.7 | 0.005 | 16 | 94 | 1.000 |
| Tatoeba-test.jpa-heb | 12.7 | 0.134 | 1 | 4 | 1.000 |
| Tatoeba-test.jpa-tmr | 6.6 | 0.014 | 1 | 4 | 1.000 |
| Tatoeba-test.kab-ara | 0.2 | 0.076 | 147 | 736 | 1.000 |
| Tatoeba-test.kab-heb | 0.0 | 0.115 | 3 | 10 | 0.779 |
| Tatoeba-test.kab-shy | 1.3 | 0.104 | 3 | 26 | 1.000 |
| Tatoeba-test.kab-tmh | 5.5 | 0.121 | 1 | 4 | 1.000 |
| Tatoeba-test.kab-tmr | 0.0 | 0.028 | 1 | 2 | 1.000 |
| Tatoeba-test.mlt-ara | 20.0 | 0.405 | 28 | 91 | 0.989 |
| Tatoeba-test.multi-multi | 21.0 | 0.429 | 2938 | 15794 | 1.000 |
| Tatoeba-test.oar-heb | 0.8 | 0.056 | 8 | 82 | 0.799 |
| Tatoeba-test.oar-syc | 3.1 | 0.005 | 1 | 6 | 1.000 |
| Tatoeba-test.phn-heb | 1.5 | 0.032 | 9 | 48 | 1.000 |
| Tatoeba-test.phn-tmr | 1.6 | 0.008 | 1 | 3 | 1.000 |
| Tatoeba-test.shy-ara | 1.1 | 0.086 | 15 | 59 | 1.000 |
| Tatoeba-test.shy-kab | 2.1 | 0.139 | 3 | 26 | 1.000 |
| Tatoeba-test.syc-heb | 1.4 | 0.000 | 1 | 6 | 1.000 |
| Tatoeba-test.syc-oar | 0.8 | 0.000 | 1 | 6 | 1.000 |
| Tatoeba-test.tmh-kab | 3.4 | 0.093 | 1 | 4 | 1.000 |
| Tatoeba-test.tmr-ara | 0.4 | 0.080 | 8 | 24 | 1.000 |
| Tatoeba-test.tmr-heb | 1.7 | 0.128 | 16 | 102 | 0.908 |
| Tatoeba-test.tmr-jpa | 8.1 | 0.013 | 1 | 4 | 1.000 |
| Tatoeba-test.tmr-kab | 1.9 | 0.022 | 1 | 2 | 1.000 |
| Tatoeba-test.tmr-phn | 6.6 | 0.013 | 1 | 4 | 1.000 |
| tico19-test.eng-amh | 3.6 | 0.220 | 2100 | 44943 | 0.784 |
| tico19-test.eng-ara | 14.2 | 0.458 | 2100 | 51336 | 0.958 |
| tico19-test.eng-som | 3.1 | 0.258 | 2100 | 63654 | 0.833 |
| tico19-test.eng-tir | 2.7 | 0.167 | 2100 | 46792 | 0.934 |
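
Unlike the earlier tables, this one also reports the sentence count (#sent), reference word count (#words), and BLEU's brevity penalty (BP). BP is 1 when the system output is at least as long as the reference and decays exponentially as output gets shorter, so values like 0.904 for heb-ara indicate noticeably short translations. A sketch of the standard definition, with illustrative lengths only:

```python
# The standard BLEU brevity penalty: 1 when the hypothesis corpus (length c,
# in words) is longer than the reference corpus (length r), exp(1 - r/c)
# otherwise.
import math

def brevity_penalty(c: int, r: int) -> float:
    return 1.0 if c > r else math.exp(1 - r / c)

# Illustrative lengths only (not taken from the table above):
print(brevity_penalty(90, 100))   # ~0.895: output 10% shorter than reference
print(brevity_penalty(110, 100))  # 1.0: no penalty for longer output
```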