forked from redpony/cdec
Decoder, aligner, and model optimizer for statistical machine translation and other structured prediction models based on (mostly) context-free formalisms
jweese/cdec
cdec is a fast decoder.

SPEED COMPARISON
------------------------------------------------------------------------------

Here is a comparison with a couple of other decoders doing SCFG decoding:

  Decoder   Lang.    BLEU    Run-Time        Memory
  cdec      C++      31.47   0.37 sec/sent   1.0-1.1GB
  Joshua    Java     31.55   2.34 sec/sent   4.0-4.8GB
  Hiero     Python   31.22   27.2 sec/sent   1.7-1.9GB

Settings: the maximum number of pops from the candidate heap at each node is
k=30, with no other pruning, a 3-gram LM, and a Chinese-English translation
task.

GETTING STARTED
------------------------------------------------------------------------------

See the BUILDING file for instructions on how to build the software.

To explore the decoder's features, the best way to get started is to look
at cdec's command line options or at the test cases in the
tests/system_tests/ directory. Each of these can be run with a command like

  ./cdec -c cdec.ini -i input.txt -w weights

The files should be self-explanatory.

EXTRACTING A SYNCHRONOUS GRAMMAR / PHRASE TABLE
------------------------------------------------------------------------------

cdec does not include code for generating grammars. To build these, you will
need to write your own software or use an existing package like Joshua, Hiero,
or Moses.

OPTIMIZING / TRAINING MODELS
------------------------------------------------------------------------------

cdec does include code for optimizing models according to a number of
training criteria, including training models as CRFs (with latent derivation
variables) and MERT (over hypergraphs) to optimize BLEU, TER, etc.

Eventually, I will provide documentation for this.

ALIGNMENT / SYNCHRONOUS PARSING / CONSTRAINED DECODING
------------------------------------------------------------------------------

cdec can be used as an aligner. For examples, see the test cases.
COPYRIGHT AND LICENSE
------------------------------------------------------------------------------

Copyright (c) 2009 by Chris Dyer <[email protected]>

See the file LICENSE.txt for the licensing terms that this software is
released under.

This software also includes the file m4/boost.m4, which is licensed under
the LGPL v3. For more information, refer to the comments in that file.