Suspicious casing while reproducing the conll14 results #6
Comments
Hm, I seem to remember that uppercasing the first letter of each sentence was part of the pipeline. @snukky is currently travelling, but will probably be able to take a look soon. In the meantime try to apply this script to your output:
Thanks for the prompt response. I tried restoring casing by aligning the output with the tokenized input, and adding detruecasing to the pipeline:
# restore casing and tokenization
run_cmd("cat {pfx}.out.tok" \
" | {scripts}/impose_case.perl {pfx}.in.tok {pfx}.out.tok.aln" \
" | {moses}/scripts/tokenizer/deescape-special-chars.perl" \
" | {scripts}/impose_tok.perl {pfx}.in" \
" | {moses}/scripts/recaser/detruecase.perl" \
" > {pfx}.out"
.format(pfx=prefix, scripts=args.scripts, moses=args.moses))
but the score is even worse:
By comparing the results against yours, the differences are still mostly about casing. While the original
was completely lowercased into:
in the
Thanks for your help anyway. Looking forward to getting some hints from @snukky.
It was done using that custom perl script, not Moses scripts, as we used lazy with LM for truecasing. @shiman I'm not sure where the differences come from, but the script I'll check again when I get back home.
Hi,
I want to reproduce the same (or at least very similar) m2 scores on the official conll14 test set. Following the README file, I successfully set up the environment and got some results with the following command:
The output file was supposed to be almost (if not exactly) the same as your submission, and so should the m2 scores be. However, I only got the following m2 scores:
while the reported F0.5 is 0.4893, which is what I was expecting.
I vimdiffed my output against yours, and found that my output contains a few casing mistakes while yours does not. For example, in the middle part of sentence 333, my output was:
The bolded tokens look suspicious. Their first letters are all capitalized, but the corresponding tokens in the original input are not. Your output looks fine here.
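To check how much of the difference between the two outputs is purely casing, a small script can be easier to read than vimdiff. This is only a rough sketch: `case_only_diffs` is a hypothetical helper of mine, not part of the repository, and it assumes both files are whitespace-tokenized with one sentence per line.

```python
def case_only_diffs(hyp_lines, ref_lines):
    """Return (sentence_no, token_no, hyp_token, ref_token) for every token
    that differs between the two outputs only in letter case."""
    diffs = []
    for sent_no, (hyp, ref) in enumerate(zip(hyp_lines, ref_lines), start=1):
        hyp_toks, ref_toks = hyp.split(), ref.split()
        if len(hyp_toks) != len(ref_toks):
            # Different token counts mean a real edit or tokenization
            # difference, not a pure casing difference, so skip the sentence.
            continue
        for tok_no, (h, r) in enumerate(zip(hyp_toks, ref_toks)):
            if h != r and h.lower() == r.lower():
                diffs.append((sent_no, tok_no, h, r))
    return diffs
```

Feeding it the lines of my output and of the submitted output lists the sentences and token positions where only the casing disagrees.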
I dug a little into the script models/run_gecsmt.py and realized that maybe something is wrong during the recasing phase, more specifically at line 78 (baselines-emnlp2016/models/run_gecsmt.py, lines 77 to 81 in fbdb0e7):
It looks like we are recasing the output (tokenized) using the raw input (untokenized) and the alignment file. I suspect this is incorrect because the alignment file is based on the tokenized files, and we should do something like this:
I did try doing so. While I got the correct casing for the example above, the first letter of every sentence is now lowercased too.
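To make the two steps concrete, here is a minimal sketch of what I understand the recasing to involve. It is not the repository's impose_case.perl; the function names are hypothetical, and I am assuming the alignment file holds Moses-style `i-j` pairs where `i` indexes the tokenized input and `j` the tokenized output.

```python
def parse_alignment(line):
    """Parse 'i-j' pairs (assumed format: input index, then output index)."""
    return [tuple(int(x) for x in pair.split("-")) for pair in line.split()]


def impose_case(src_tokens, out_tokens, alignment):
    """Project casing from the tokenized input onto the lowercased output."""
    result = list(out_tokens)
    for src_idx, out_idx in alignment:
        if src_idx >= len(src_tokens) or out_idx >= len(result):
            continue  # index outside the sentence, e.g. mismatched tokenization
        src, out = src_tokens[src_idx], result[out_idx]
        if src.lower() == out.lower():
            result[out_idx] = src  # unchanged word: copy the input casing
        elif src[:1].isupper():
            # corrected word: keep only the initial capital of the input token
            result[out_idx] = out[:1].upper() + out[1:]
    return result


def capitalize_first(tokens):
    """Uppercase the first letter of the sentence as a separate final step."""
    if not tokens:
        return tokens
    return [tokens[0][:1].upper() + tokens[0][1:]] + tokens[1:]
```

The indices only make sense if the input is tokenized the same way as the files the alignment was computed on, which is why recasing against the raw input looks wrong to me. Projection alone also leaves a lowercased sentence start untouched, so something like `capitalize_first` has to run afterwards.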
This got me totally confused. How can I get the expected results and scores? What seems to be the problem? Could you shed some light?
For your reference, I also attached my output and logs here.
run.log
conll.out.txt