Release 0.3.0
PonteIneptique committed Dec 14, 2020
1 parent 35198a4 commit d30fde4
Showing 11 changed files with 5,769 additions and 0 deletions.
108 changes: 108 additions & 0 deletions README.md
@@ -32,6 +32,114 @@ This model is trained on Pie by Enrique Manjavacas (@emanjavacas) and Mike Kest

## Information about the model

### Scores

<!-- Start Scores -->
More details:
- [lemma](lemma-pos.score.md)
- [POS](lemma-pos.score.md)
- [NOMB](morph/nomb.tar.score.md)
- [DEGRE](morph/degre.tar.score.md)
- [GENRE](morph/genre.tar.score.md)
- [CAS](morph/cas.tar.score.md)
- [TEMPS](morph/temps.tar.score.md)
- [PERS](morph/pers.tar.score.md)
- [MODE](morph/mode.tar.score.md)


#### lemma

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9747 | 0.7284 | 0.726 | 73507 |
| known-tokens | 0.9831 | 0.8969 | 0.8979 | 71726 |
| unknown-tokens | 0.6367 | 0.4327 | 0.4326 | 1781 |
| ambiguous-tokens | 0.9777 | 0.7723 | 0.7752 | 44987 |
| unknown-targets | 0.0751 | 0.0408 | 0.0408 | 253 |
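
As a sanity check on the lemma table, the `all` row can be recomputed from the `known-tokens` and `unknown-tokens` rows. This is a minimal sketch: it assumes (the commit does not state this) that the two subsets partition the test tokens and that accuracy is counted per token. The variable names are illustrative only.

```python
# (accuracy, support) copied from the lemma table above.
rows = {
    "known-tokens": (0.9831, 71726),
    "unknown-tokens": (0.6367, 1781),
}
reported_all = (0.9747, 73507)

# Assumption: known and unknown tokens together cover the whole test set.
total_support = sum(support for _, support in rows.values())

# Support-weighted mean of the per-subset accuracies.
combined_accuracy = (
    sum(acc * support for acc, support in rows.values()) / total_support
)

print("support matches: ", total_support == reported_all[1])
print("accuracy matches:", abs(combined_accuracy - reported_all[0]) < 5e-4)
```

The precision and recall columns do not recombine this way, which suggests they are averaged over classes rather than over tokens.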


#### POS

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9739 | 0.8129 | 0.787 | 73507 |
| known-tokens | 0.9775 | 0.8325 | 0.8098 | 71726 |
| unknown-tokens | 0.8293 | 0.528 | 0.5079 | 1781 |
| ambiguous-tokens | 0.9713 | 0.8126 | 0.8007 | 49942 |


#### NOMB

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.966 | 0.9589 | 0.9576 | 28824 |
| known-tokens | 0.9709 | 0.9635 | 0.9627 | 27536 |
| unknown-tokens | 0.8626 | 0.8505 | 0.7935 | 1288 |
| ambiguous-tokens | 0.9538 | 0.9473 | 0.9481 | 14032 |


#### DEGRE

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9842 | 0.8266 | 0.7396 | 28824 |
| known-tokens | 0.9865 | 0.8475 | 0.7554 | 27536 |
| unknown-tokens | 0.9356 | 0.5693 | 0.4783 | 1288 |
| ambiguous-tokens | 0.9524 | 0.8443 | 0.7695 | 5863 |


#### GENRE

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9608 | 0.9276 | 0.8738 | 28824 |
| known-tokens | 0.9671 | 0.9332 | 0.8815 | 27536 |
| unknown-tokens | 0.8245 | 0.6139 | 0.6159 | 1288 |
| ambiguous-tokens | 0.9428 | 0.9156 | 0.863 | 13608 |


#### CAS

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9457 | 0.9057 | 0.9052 | 28824 |
| known-tokens | 0.9514 | 0.9105 | 0.9106 | 27536 |
| unknown-tokens | 0.8238 | 0.6096 | 0.605 | 1288 |
| ambiguous-tokens | 0.9301 | 0.9036 | 0.9054 | 16647 |


#### TEMPS

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9854 | 0.945 | 0.9408 | 28824 |
| known-tokens | 0.9886 | 0.9584 | 0.9585 | 27536 |
| unknown-tokens | 0.9169 | 0.8372 | 0.8109 | 1288 |
| ambiguous-tokens | 0.9644 | 0.9331 | 0.9527 | 4638 |


#### PERS

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9872 | 0.8732 | 0.7848 | 28824 |
| known-tokens | 0.9897 | 0.8782 | 0.7906 | 27536 |
| unknown-tokens | 0.9325 | 0.8832 | 0.8721 | 1288 |
| ambiguous-tokens | 0.9756 | 0.8709 | 0.7839 | 8648 |


#### MODE

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9897 | 0.9086 | 0.8815 | 28824 |
| known-tokens | 0.9932 | 0.9284 | 0.9066 | 27536 |
| unknown-tokens | 0.9146 | 0.7783 | 0.7381 | 1288 |
| ambiguous-tokens | 0.9722 | 0.8739 | 0.9003 | 4603 |


<!-- End Scores -->

### Corpora

The model was trained on the following corpora:
5,331 changes: 5,331 additions & 0 deletions lemma-pos.score.md

Large diffs are not rendered by default.

37 changes: 37 additions & 0 deletions morph/cas.tar.score.md
@@ -0,0 +1,37 @@

## CAS

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9457 | 0.9057 | 0.9052 | 28824 |
| known-tokens | 0.9514 | 0.9105 | 0.9106 | 27536 |
| unknown-tokens | 0.8238 | 0.6096 | 0.605 | 1288 |
| ambiguous-tokens | 0.9301 | 0.9036 | 0.9054 | 16647 |


### CAS Classification report

| target | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| CAS=i | 0.82 | 0.83 | 0.83 | 588 |
| CAS=n | 0.90 | 0.88 | 0.89 | 4693 |
| CAS=r | 0.92 | 0.92 | 0.92 | 8310 |
| CAS=x | 0.98 | 0.98 | 0.98 | 15233 |
| avg / total | 0.91 | 0.91 | 0.91 | 28824 |

### CAS Confusion Matrix

| Expected | Total Errors | Predictions | Predicted times |
|----------|--------------|-------------|-----------------|
| CAS=r | 673 | CAS=n | 364 |
| | | CAS=x | 236 |
| | | CAS=i | 73 |
| CAS=n | 540 | CAS=r | 443 |
| | | CAS=x | 84 |
| | | CAS=i | 13 |
| CAS=x | 254 | CAS=r | 175 |
| | | CAS=n | 60 |
| | | CAS=i | 19 |
| CAS=i | 98 | CAS=r | 71 |
| | | CAS=n | 21 |
| | | CAS=x | 6 |
43 changes: 43 additions & 0 deletions morph/degre.tar.score.md
@@ -0,0 +1,43 @@

## DEGRE

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9842 | 0.8266 | 0.7396 | 28824 |
| known-tokens | 0.9865 | 0.8475 | 0.7554 | 27536 |
| unknown-tokens | 0.9356 | 0.5693 | 0.4783 | 1288 |
| ambiguous-tokens | 0.9524 | 0.8443 | 0.7695 | 5863 |


### DEGRE Classification report

| target | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| DEGRE=- | 0.89 | 0.93 | 0.91 | 1580 |
| DEGRE=c | 0.78 | 0.56 | 0.65 | 52 |
| DEGRE=p | 0.91 | 0.88 | 0.90 | 968 |
| DEGRE=s | 0.56 | 0.33 | 0.42 | 30 |
| DEGRE=x | 0.99 | 0.99 | 0.99 | 26194 |
| avg / total | 0.83 | 0.74 | 0.77 | 28824 |
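
The `avg / total` row sits well below the 0.9842 accuracy because it appears to be an unweighted (macro) mean over classes, so the rare `DEGRE=c` and `DEGRE=s` classes weigh as much as `DEGRE=x`. That reading is an inference from the numbers, not something the report states. The sketch below compares a macro mean with a support-weighted mean; the helper names are hypothetical.

```python
# (precision, recall, f1-score, support) per class, copied from the report above.
report = {
    "DEGRE=-": (0.89, 0.93, 0.91, 1580),
    "DEGRE=c": (0.78, 0.56, 0.65, 52),
    "DEGRE=p": (0.91, 0.88, 0.90, 968),
    "DEGRE=s": (0.56, 0.33, 0.42, 30),
    "DEGRE=x": (0.99, 0.99, 0.99, 26194),
}


def macro_mean(column: int) -> float:
    """Unweighted mean of one metric column over all classes."""
    return sum(row[column] for row in report.values()) / len(report)


def weighted_mean(column: int) -> float:
    """Mean of one metric column, weighted by each class's support."""
    total = sum(row[3] for row in report.values())
    return sum(row[column] * row[3] for row in report.values()) / total


for name, column in (("precision", 0), ("recall", 1), ("f1-score", 2)):
    print(
        f"{name:9s} macro={macro_mean(column):.2f} "
        f"weighted={weighted_mean(column):.2f}"
    )
```

The macro means round to the reported 0.83, 0.74 and 0.77, whereas the support-weighted means stay close to the `DEGRE=x` scores.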

### DEGRE Confusion Matrix

| Expected | Total Errors | Predictions | Predicted times |
|----------|--------------|-------------|-----------------|
| DEGRE=x | 192 | DEGRE=- | 120 |
| | | DEGRE=p | 69 |
| | | DEGRE=c | 2 |
| | | DEGRE=s | 1 |
| DEGRE=p | 114 | DEGRE=x | 85 |
| | | DEGRE=- | 28 |
| | | DEGRE=s | 1 |
| DEGRE=- | 107 | DEGRE=x | 83 |
| | | DEGRE=p | 13 |
| | | DEGRE=c | 6 |
| | | DEGRE=s | 5 |
| DEGRE=c | 23 | DEGRE=- | 16 |
| | | DEGRE=x | 4 |
| | | DEGRE=p | 2 |
| | | DEGRE=s | 1 |
| DEGRE=s | 20 | DEGRE=- | 18 |
| | | DEGRE=x | 2 |
37 changes: 37 additions & 0 deletions morph/genre.tar.score.md
@@ -0,0 +1,37 @@

## GENRE

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9608 | 0.9276 | 0.8738 | 28824 |
| known-tokens | 0.9671 | 0.9332 | 0.8815 | 27536 |
| unknown-tokens | 0.8245 | 0.6139 | 0.6159 | 1288 |
| ambiguous-tokens | 0.9428 | 0.9156 | 0.863 | 13608 |


### GENRE Classification report

| target | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| GENRE=f | 0.94 | 0.92 | 0.93 | 3696 |
| GENRE=m | 0.94 | 0.96 | 0.95 | 9413 |
| GENRE=n | 0.85 | 0.64 | 0.73 | 464 |
| GENRE=x | 0.98 | 0.98 | 0.98 | 15251 |
| avg / total | 0.93 | 0.87 | 0.90 | 28824 |

### GENRE Confusion Matrix

| Expected | Total Errors | Predictions | Predicted times |
|----------|--------------|-------------|-----------------|
| GENRE=m | 372 | GENRE=x | 201 |
| | | GENRE=f | 140 |
| | | GENRE=n | 31 |
| GENRE=f | 314 | GENRE=m | 245 |
| | | GENRE=x | 62 |
| | | GENRE=n | 7 |
| GENRE=x | 277 | GENRE=m | 187 |
| | | GENRE=f | 76 |
| | | GENRE=n | 14 |
| GENRE=n | 168 | GENRE=m | 111 |
| | | GENRE=x | 43 |
| | | GENRE=f | 14 |
40 changes: 40 additions & 0 deletions morph/mode.tar.score.md
@@ -0,0 +1,40 @@

## MODE

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9897 | 0.9086 | 0.8815 | 28824 |
| known-tokens | 0.9932 | 0.9284 | 0.9066 | 27536 |
| unknown-tokens | 0.9146 | 0.7783 | 0.7381 | 1288 |
| ambiguous-tokens | 0.9722 | 0.8739 | 0.9003 | 4603 |


### MODE Classification report

| target | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| MODE=con | 0.89 | 0.92 | 0.90 | 61 |
| MODE=imp | 0.82 | 0.70 | 0.75 | 120 |
| MODE=ind | 0.96 | 0.96 | 0.96 | 3320 |
| MODE=sub | 0.88 | 0.83 | 0.86 | 313 |
| MODE=x | 1.00 | 1.00 | 1.00 | 25010 |
| avg / total | 0.91 | 0.88 | 0.89 | 28824 |

### MODE Confusion Matrix

| Expected | Total Errors | Predictions | Predicted times |
|----------|--------------|-------------|-----------------|
| MODE=ind | 127 | MODE=x | 78 |
| | | MODE=sub | 29 |
| | | MODE=imp | 13 |
| | | MODE=con | 7 |
| MODE=x | 77 | MODE=ind | 71 |
| | | MODE=imp | 4 |
| | | MODE=sub | 2 |
| MODE=sub | 53 | MODE=ind | 42 |
| | | MODE=x | 9 |
| | | MODE=imp | 2 |
| MODE=imp | 36 | MODE=ind | 21 |
| | | MODE=x | 12 |
| | | MODE=sub | 3 |
| MODE=con | 5 | MODE=ind | 5 |
30 changes: 30 additions & 0 deletions morph/nomb.tar.score.md
@@ -0,0 +1,30 @@

## NOMB

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.966 | 0.9589 | 0.9576 | 28824 |
| known-tokens | 0.9709 | 0.9635 | 0.9627 | 27536 |
| unknown-tokens | 0.8626 | 0.8505 | 0.7935 | 1288 |
| ambiguous-tokens | 0.9538 | 0.9473 | 0.9481 | 14032 |


### NOMB Classification report

| target | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| NOMB.=p | 0.93 | 0.93 | 0.93 | 3953 |
| NOMB.=s | 0.97 | 0.97 | 0.97 | 13452 |
| NOMB.=x | 0.98 | 0.98 | 0.98 | 11419 |
| avg / total | 0.96 | 0.96 | 0.96 | 28824 |

### NOMB Confusion Matrix

| Expected | Total Errors | Predictions | Predicted times |
|----------|--------------|-------------|-----------------|
| NOMB.=s | 448 | NOMB.=x | 233 |
| | | NOMB.=p | 215 |
| NOMB.=p | 286 | NOMB.=s | 241 |
| | | NOMB.=x | 45 |
| NOMB.=x | 245 | NOMB.=s | 199 |
| | | NOMB.=p | 46 |
40 changes: 40 additions & 0 deletions morph/pers.tar.score.md
@@ -0,0 +1,40 @@

## PERS

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9872 | 0.8732 | 0.7848 | 28824 |
| known-tokens | 0.9897 | 0.8782 | 0.7906 | 27536 |
| unknown-tokens | 0.9325 | 0.8832 | 0.8721 | 1288 |
| ambiguous-tokens | 0.9756 | 0.8709 | 0.7839 | 8648 |


### PERS Classification report

| target | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| PERS.=0 | 0.50 | 0.07 | 0.12 | 28 |
| PERS.=1 | 0.95 | 0.94 | 0.94 | 916 |
| PERS.=2 | 0.96 | 0.94 | 0.95 | 770 |
| PERS.=3 | 0.97 | 0.98 | 0.97 | 4979 |
| PERS.=x | 0.99 | 0.99 | 0.99 | 22131 |
| avg / total | 0.87 | 0.78 | 0.80 | 28824 |
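
The f1-score column is the harmonic mean of precision and recall, which is why `PERS.=0` scores so low despite a precision of 0.50. A minimal sketch using the standard formula follows; the function name is illustrative, and values recomputed from the rounded table entries may differ from the report in the last digit.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) copied from the PERS classification report above.
rare_class = f1_score(0.50, 0.07)      # PERS.=0, support 28
frequent_class = f1_score(0.99, 0.99)  # PERS.=x, support 22131

print(f"PERS.=0 f1 {rare_class:.2f}")
print(f"PERS.=x f1 {frequent_class:.2f}")
```

Because the harmonic mean is dominated by the smaller of its two inputs, the 0.07 recall of `PERS.=0` keeps its f1-score near 0.12.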

### PERS Confusion Matrix

| Expected | Total Errors | Predictions | Predicted times |
|----------|--------------|-------------|-----------------|
| PERS.=x | 126 | PERS.=3 | 80 |
| | | PERS.=1 | 25 |
| | | PERS.=2 | 21 |
| PERS.=3 | 119 | PERS.=x | 96 |
| | | PERS.=1 | 14 |
| | | PERS.=2 | 7 |
| | | PERS.=0 | 2 |
| PERS.=1 | 52 | PERS.=x | 25 |
| | | PERS.=3 | 21 |
| | | PERS.=2 | 6 |
| PERS.=2 | 47 | PERS.=x | 24 |
| | | PERS.=3 | 12 |
| | | PERS.=1 | 11 |
| PERS.=0 | 26 | PERS.=3 | 26 |
44 changes: 44 additions & 0 deletions morph/temps.tar.score.md
@@ -0,0 +1,44 @@

## TEMPS

| | accuracy | precision | recall | support |
|------------------|----------|-----------|--------|---------|
| all | 0.9854 | 0.945 | 0.9408 | 28824 |
| known-tokens | 0.9886 | 0.9584 | 0.9585 | 27536 |
| unknown-tokens | 0.9169 | 0.8372 | 0.8109 | 1288 |
| ambiguous-tokens | 0.9644 | 0.9331 | 0.9527 | 4638 |


### TEMPS Classification report

| target | precision | recall | f1-score | support |
|-------------|-----------|--------|----------|---------|
| TEMPS=fut | 0.90 | 0.93 | 0.92 | 224 |
| TEMPS=ipf | 0.96 | 0.94 | 0.95 | 687 |
| TEMPS=psp | 0.94 | 0.94 | 0.94 | 1020 |
| TEMPS=pst | 0.92 | 0.90 | 0.91 | 1702 |
| TEMPS=x | 0.99 | 0.99 | 0.99 | 25191 |
| avg / total | 0.94 | 0.94 | 0.94 | 28824 |

### TEMPS Confusion Matrix

| Expected | Total Errors | Predictions | Predicted times |
|-----------|--------------|-------------|-----------------|
| TEMPS=pst | 174 | TEMPS=x | 136 |
| | | TEMPS=psp | 27 |
| | | TEMPS=ipf | 6 |
| | | TEMPS=fut | 5 |
| TEMPS=x | 128 | TEMPS=pst | 84 |
| | | TEMPS=psp | 29 |
| | | TEMPS=fut | 9 |
| | | TEMPS=ipf | 6 |
| TEMPS=psp | 66 | TEMPS=x | 26 |
| | | TEMPS=pst | 22 |
| | | TEMPS=ipf | 12 |
| | | TEMPS=fut | 6 |
| TEMPS=ipf | 39 | TEMPS=pst | 19 |
| | | TEMPS=x | 13 |
| | | TEMPS=psp | 5 |
| | | TEMPS=fut | 2 |
| TEMPS=fut | 15 | TEMPS=x | 14 |
| | | TEMPS=psp | 1 |