---
layout: docs
title: Examples
permalink: /examples/
icon: fa-cogs
menu: 4
---

## Examples

  • {% github_link "Basic example for training" marian-examples/training-basics %}: The scripts for training a Edinburgh's WMT16 system adapted from the Romanian-English sample from https://github.com/rsennrich/wmt16-scripts. The resulting system should be competitive or even slightly better than reported in that paper.

  • {% github_link "Training a transformer model" marian-examples/transformer %}: An example for training a Google-style transformer model introduced in Attention is all you need, Vaswani et al., 2017.

  • {% github_link "Training on raw texts with built-in SentencePiece" marian-examples/training-basics-sentencepiece %}: The example shows how to use Taku Kudo's SentencePiece and Matt Post's SacreBLEU to greatly simplify the training and evaluation process by providing ways to have reversible hidden preprocessing and repeatable evaluation.

  • {% github_link "Reconstructing Edinburgh's WMT17 English-German system" marian-examples/wmt2017-uedin %}: The scripts show how to train a complete WMT-grade system based on Edinburgh's WMT submission description for en-de.

  • {% github_link "Reconstructing top WMT17 system with Marian's Transformer model" marian-examples/wmt2017-transformer %}: The scripts show how to train a complete better than (!) WMT-grade system based on Google's Transformer model and Edinburgh's WMT submission description for en-de. This example is a combination of reproducing Edinburgh's WMT2017 system for en-de with Marian and the example for Transformer training.

  • {% github_link "Translating with Amun" marian-examples/translating-amun %}: The scripts demonstrate how to translate with Amun using Edinburgh's German-English WMT2016 single model and ensemble.

## Tutorials

### MT Marathon 2019 Efficiency

The Machine Translation Marathon 2019 Tutorial shows how to do efficient neural machine translation with the Marian toolkit by optimizing speed, accuracy, and resource usage for training and decoding of NMT models.
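
As a rough illustration of the kind of decoding-time tuning covered there, the sketch below shows a batched, small-beam translation run with marian-decoder. The model path, vocabulary files and flag values are assumptions for illustration, not settings taken from the tutorial.

```bash
# Hedged sketch: batched, small-beam decoding for speed with marian-decoder.
# model.npz and vocab.spm are placeholder paths, not from the tutorial.
cat input.txt | ./build/marian-decoder \
    --models model/model.npz \
    --vocabs model/vocab.spm model/vocab.spm \
    --beam-size 1 \
    --mini-batch 64 --maxi-batch 100 --maxi-batch-sort src \
    --workspace 4000 \
    --devices 0 \
    > output.txt
```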

### MT Marathon 2018 Intro

The Machine Translation Marathon 2018 Labs is a Marian tutorial that covers downloading and compiling Marian, translating with a pretrained model, preparing training data, and training a basic NMT model. It also contains a list of exercises introducing different features and model architectures available in Marian.
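
For reference, a typical source build of Marian looks roughly like the following; the exact CMake options depend on your compiler and CUDA/CPU setup, so treat this as a hedged sketch rather than the tutorial's exact steps.

```bash
# Hedged sketch of building Marian from source (details vary by system).
git clone https://github.com/marian-nmt/marian
cd marian
mkdir -p build && cd build
cmake .. -DCMAKE_BUILD_TYPE=Release   # add CPU/CUDA-related options as needed
make -j4
```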

### MT Marathon 2017 Tutorial

- Part 1: First steps with Marian: Downloading and compiling Marian. Translation with a pretrained model. Preparing a parallel corpus for training. Training a shallow encoder-decoder model with attention.
- Part 2: Complex models: Here we take a look at more complex models, for instance deeper models or multi-encoder models.
- Part 3: Coding tutorial: Code a custom model, here a simple Sutskever-style model without attention.

## Use cases