0.3.5
Added AGR constraints to a number of types to reduce overgeneration.
Added a small test suite for agreement, with 16 grammatical and 16 ungrammatical items. Overgeneration needs to be assessed in tsdb++ using the LKB as the parser. Accuracy is more reliably assessed with a pydelphin script (util/treebanking-scripts/report_stats.py): I can't figure out a reliable way to combine tsdb++ and fftb, and I think tsdb++ can report wrong numbers when dealing with fftb-treebanked corpora.
Other changes include:
- All participles now go through a vpart_ilr and then, if they are past participles, through a derivational ppart-lex-rule (of the appropriate kind).
- Ordinal numbers are now recognized as such
- Comparative adverbs now trigger not only adjectival lexical entries but also the adverbial ones
- Added a version of the colon that works like a copula (the type was already there but the lexical rule was not)
- Disabled some of the rules in srules.tdl that were unused, at least in the treebanks up to length 12. They are simply commented out and can be restored if needed (commit 7892ec3).
Overgeneration:
corpus | 0.3.4 | 0.3.5 |
---|---|---|
agreement | 0.75 | 0 |
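
For reference, here is a minimal sketch of how an overgeneration figure like the one above can be computed. It assumes that overgeneration means the fraction of ungrammatical items that still receive at least one parse; the release does not spell out the definition, so treat this as an assumption. The `Item` class, the `overgeneration` function, and the item data are hypothetical illustrations, not part of tsdb++ or of report_stats.py.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """One test-suite item (hypothetical structure, for illustration only)."""
    ident: str
    grammatical: bool  # gold judgement recorded in the test suite
    readings: int      # number of parses the grammar produced


def overgeneration(items):
    """Fraction of ungrammatical items that received at least one parse.

    Assumed definition; returns 0.0 if there are no ungrammatical items.
    """
    bad = [it for it in items if not it.grammatical]
    if not bad:
        return 0.0
    parsed = sum(1 for it in bad if it.readings > 0)
    return parsed / len(bad)


# Made-up data shaped like the agreement test suite:
# 16 grammatical items and 16 ungrammatical ones.
good = [Item(f"g{i}", True, 1) for i in range(16)]

# Hypothetical "before" state: 12 of the 16 ungrammatical items parse.
before = good + [Item(f"u{i}", False, 1 if i < 12 else 0) for i in range(16)]

# Hypothetical "after" state: none of the ungrammatical items parse.
after = good + [Item(f"u{i}", False, 0) for i in range(16)]

rate_before = overgeneration(before)
rate_after = overgeneration(after)
print(rate_before, rate_after)
```

Under this assumed definition, a value of 0.75 on 16 ungrammatical items would correspond to 12 of them being parsed, and 0 would mean none of them is.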
Accuracy:
corpus | 0.3.4 | 0.3.5 |
---|---|---|
mrs | 0.81 | 0.95 |
tbdb01 | 1.0 | 1.0 |
tbdb02 | 0.93 | 0.94 |
tbdb03 | 0.88 | 0.91 |
tbdb04 | 0.86 | 0.89 |
tbdb05 | 0.86 | 0.89 |
tbdb06 | 0.82 | 0.88 |
tbdb07 | 0.76 | 0.86 |
tbdb08 | 0.82 | 0.81 |
tbdb09 | 0.77 | 0.79 |
tbdb10 | 0.76 | 0.75 |
tbdb11 | 0.50 | 0.53 |
tbdb12 | 0.65 | 0.64 |
The problem with the treebanks of longer sentences is that they are less consistently verified: it is harder to establish that a structure is correct and easier to make a mistake. In many cases, the loss of an accepted/verified parse is in fact an improvement, because the previously accepted structure was not correct. On the other hand, I think there are a lot more correct structures in e.g. tbdb11 and tbdb12, but more time is needed to find them. (I'd expect their real accuracy to be closer to that of tbdb09 and tbdb10.)
Performance (assessed with tsdb++, not sure how reliably):
corpus | time compared to 0.3.4 | edges compared to 0.3.4 |
---|---|---|
mrs | -31% | -30% |
tbdb01 | -26% | -15% |
tbdb02 | -12% | -11% |
tbdb03 | -31% | -26% |
tbdb04 | -43% | -34% |
tbdb05 | -46% | -34% |
tbdb06 | -68% | -48% |
tbdb07 | -76% | -59% |
tbdb08 | -67% | -56% |
tbdb09 | -75% | -65% |
tbdb10 | -87% | -68% |
tbdb11 | -72% | -65% |
tbdb12 | -204% | -38% |