Releases · awslabs/sockeye
3.1.4
3.1.3
[3.1.3]
Added
- Added support for adding source prefixes to the input in JSON format during inference.
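For illustration, a minimal sketch of such a JSON input (the `source_prefix` key name is an assumption for this sketch; the release note does not spell out the exact schema):

```python
import json

# Hypothetical JSON-format input line for sockeye-translate with --json-input;
# the "source_prefix" key name is assumed here for illustration.
line = json.dumps({
    "text": "guten Morgen",
    "source_prefix": "<2en>",  # prefix tokens prepended to the source sentence
})
print(line)  # one such JSON object per input line
```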
[3.1.2]
Changed
- Optimized creation of the source length mask by using `expand` instead of `repeat_interleave`.
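A minimal sketch of the difference, with illustrative shapes: `repeat_interleave` materializes a real copy of the mask per attention head, while `expand` returns a broadcasted view over the same storage.

```python
import torch

batch, num_heads, max_len = 2, 8, 5
lengths = torch.tensor([3, 5])
# (batch, 1, max_len) boolean mask: True marks padded source positions.
mask = torch.arange(max_len)[None, None, :] >= lengths[:, None, None]

# repeat_interleave allocates a (batch * num_heads, 1, max_len) copy ...
copied = mask.repeat_interleave(num_heads, dim=0)

# ... while expand broadcasts over the head dimension without new storage.
viewed = mask.unsqueeze(1).expand(batch, num_heads, 1, max_len)

print(copied.data_ptr() == mask.data_ptr())  # False: new allocation
print(viewed.data_ptr() == mask.data_ptr())  # True: just a view
```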
[3.1.1]
Changed
- Updated torch dependency to 1.10.x (`torch>=1.10.0,<1.11.0`).
3.1.0
[3.1.0]
Sockeye is now exclusively based on PyTorch.
Changed
- Renamed `x_pt` modules to `x`. Updated entry points in `setup.py`.
Removed
- Removed MXNet from the codebase.
- Removed device locking / GPU acquisition logic. Removed dependency on `portalocker`.
- Removed arguments `--softmax-temperature`, `--weight-init-*`, `--mc-dropout`, `--horovod`, `--device-ids`.
- Removed all MXNet-related tests.
3.0.15
[3.0.15]
Fixed
- Fixed GPU-based scoring by copying tensors to the CPU before converting them to NumPy arrays.
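The essence of the fix, as a sketch: calling `.numpy()` directly on a CUDA tensor raises a `TypeError`, so the tensor has to be moved to host memory first.

```python
import torch

scores = torch.rand(4)
if torch.cuda.is_available():
    scores = scores.cuda()

# scores.numpy() would raise TypeError for a CUDA tensor;
# copying to the CPU first always works.
host_scores = scores.cpu().numpy()
```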
[3.0.14]
Added
- Added support for the Translation Error Rate (TER) metric as implemented in sacrebleu==1.4.14.
  Checkpoint decoder metrics will now include TER scores, and early stopping can be determined
  via TER improvements (`--optimized-metric ter`).
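A hedged usage sketch of the referenced sacrebleu implementation, assuming the `corpus_ter` helper that sacrebleu 1.4.x exposes alongside `corpus_bleu`:

```python
import sacrebleu  # sacrebleu>=1.4.14 provides the TER implementation

hyps = ["the cat sat on the mat"]
refs = [["the cat is on the mat"]]  # one list per reference stream

ter = sacrebleu.corpus_ter(hyps, refs)
print(ter.score)  # lower is better; --optimized-metric ter minimizes this
```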
3.0.13
[3.0.13]
Changed
- Use `expand` instead of `repeat` for attention masks to avoid allocating additional memory.
- Avoid repeated `transpose` calls when initializing cached encoder-attention states in the decoder (a caching sketch follows this list).
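A sketch of the second point, with illustrative names and shapes only: transpose the encoder output once when the decoder cache is initialized, rather than once per decode step.

```python
import torch

encoder_out = torch.randn(10, 2, 512)  # (src_len, batch, hidden), illustrative

# One-time transpose when the cache is initialized ...
cached_states = encoder_out.transpose(0, 1).contiguous()  # (batch, src_len, hidden)

for step in range(5):
    # ... so each decode step reuses cached_states without re-transposing.
    attend_to = cached_states
```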
[3.0.12]
Removed
- Removed unused code for Weight Normalization. Minor code cleanups.
[3.0.11]
Fixed
- Fixed training with a single, fixed learning rate instead of a rate scheduler (`--learning-rate-scheduler none --initial-learning-rate ...`).
3.0.10
[3.0.10]
Changed
- End-to-end trace of `decode_step` of the Sockeye model. This creates less overhead during decoding and yields a small speedup.
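A minimal sketch of the idea with a toy stand-in module (not Sockeye's actual `decode_step`): tracing the whole step produces a single graph call, removing per-step Python overhead from the decoding loop.

```python
import torch

class DecodeStep(torch.nn.Module):
    """Toy stand-in for a model's single decoding step."""
    def __init__(self, vocab: int = 32, hidden: int = 16):
        super().__init__()
        self.embed = torch.nn.Embedding(vocab, hidden)
        self.output = torch.nn.Linear(hidden, vocab)

    def forward(self, prev_tokens: torch.Tensor) -> torch.Tensor:
        return self.output(self.embed(prev_tokens))

step = DecodeStep().eval()
example = torch.tensor([[1], [2]])       # (batch, 1) previous tokens
traced = torch.jit.trace(step, example)  # end-to-end traced step
logits = traced(example)                 # called once per decoding step
```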
[3.0.9]
Fixed
- Fixed not calling the traced target embedding module during inference.
[3.0.8]
Changed
- Added support for JIT tracing source/target embeddings and JIT scripting the output layer during inference.
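For the scripting half, a sketch with a toy stand-in layer (not Sockeye's actual class): unlike tracing, `torch.jit.script` also compiles data-dependent control flow, which can matter for an output layer with optional branches.

```python
import torch

class OutputLayer(torch.nn.Module):
    """Toy stand-in for an output projection."""
    def __init__(self, hidden: int = 16, vocab: int = 32):
        super().__init__()
        self.proj = torch.nn.Linear(hidden, vocab)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return self.proj(hidden_states)

scripted = torch.jit.script(OutputLayer())  # compiled, control flow included
logits = scripted(torch.randn(2, 16))
```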
3.0.7
[3.0.7]
Changed
- Improved training speed by using `torch.nn.functional.multi_head_attention_forward` for self- and encoder-attention
  during training. This requires reorganizing the parameter layout of the key-value input projections,
  as the current Sockeye attention interleaves them for faster inference.
  Attention masks (both source masks and autoregressive masks) need some shape adjustments, as the requirements
  of the fused MHA op differ slightly. A layout-conversion sketch follows this list.
  - Non-interleaved format for joint key-value input projection parameters:
    `in_features=hidden, out_features=2*hidden -> Shape: (2*hidden, hidden)`
  - Interleaved format for the joint key-value input projection stores key and value parameters, grouped by heads:
    `Shape: ((num_heads * 2 * hidden_per_head), hidden)`
  - Models save and load key-value projection parameters in interleaved format.
  - When `model.training == True`, key-value projection parameters are put into
    non-interleaved format for `torch.nn.functional.multi_head_attention_forward`.
  - When `model.training == False`, i.e. `model.eval()` is called, key-value projection
    parameters are converted back into interleaved format in place.
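A sketch of the layout conversion under the shapes given above (illustrative code, not Sockeye's implementation; the key-then-value block ordering is assumed):

```python
import torch

num_heads, hidden = 4, 16
hidden_per_head = hidden // num_heads

# Interleaved storage format: key and value rows grouped per head.
interleaved = torch.randn(num_heads * 2 * hidden_per_head, hidden)

# Convert to the non-interleaved (2*hidden, hidden) layout expected by
# torch.nn.functional.multi_head_attention_forward: all keys, then all values
# (ordering assumed here for illustration).
grouped = interleaved.view(num_heads, 2, hidden_per_head, hidden)
keys = grouped[:, 0].reshape(hidden, hidden)
values = grouped[:, 1].reshape(hidden, hidden)
non_interleaved = torch.cat([keys, values], dim=0)  # (2*hidden, hidden)
```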
[3.0.6]
Fixed
- Fixed a checkpoint decoder issue that prevented using `bleu` as `--optimized-metric` for distributed training (#995).
[3.0.5]
Fixed
- Fixed data download in multilingual tutorial.
3.0.4
[3.0.4]
- Make sure data permutation indices are in int64 format (doesn't seem to be the case by default on all platforms).
[3.0.3]
Fixed
- Fixed ensemble decoding for models without target factors.
[3.0.2]
Changed
- `sockeye-translate`: Beam search now computes and returns secondary target factor scores. Secondary target factors
  do not participate in beam search, but are greedily chosen at every time step (see the sketch after this list).
  Accumulated scores for secondary factors are not normalized by length. Factor scores are included in JSON output (`--output-type json`).
- `sockeye-score` now returns tab-separated scores for each target factor. Users can decide how to combine factor scores
  depending on the downstream application. Scores for the first, primary factor (i.e. output words) are normalized;
  scores for other factors are not.
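A sketch of the greedy rule for one secondary factor at a single time step (shapes and names are illustrative, not Sockeye internals):

```python
import torch

beam_size, factor_vocab = 5, 7
factor_logits = torch.randn(beam_size, factor_vocab)  # one secondary factor

# Secondary factors skip beam search: choose greedily per time step ...
factor_ids = factor_logits.argmax(dim=-1)

# ... and accumulate unnormalized log-probabilities as the factor's score.
log_probs = factor_logits.log_softmax(dim=-1)
step_scores = log_probs.gather(1, factor_ids.unsqueeze(1)).squeeze(1)
```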
[3.0.1]
Fixed
- Parameter averaging (`sockeye-average`) now always uses the CPU, which enables averaging parameters from GPU-trained models on CPU-only hosts.
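A minimal sketch of CPU-side averaging, assuming checkpoints that are plain state dicts (file names hypothetical): `map_location="cpu"` is what makes GPU-trained checkpoints loadable on CPU-only hosts.

```python
import torch

paths = ["params.00010", "params.00011"]  # hypothetical checkpoint files
states = [torch.load(p, map_location="cpu") for p in paths]  # force CPU tensors
average = {name: sum(s[name] for s in states) / len(states)
           for name in states[0]}
torch.save(average, "params.average")
```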
3.0.0
[3.0.0] Sockeye 3: Fast Neural Machine Translation with PyTorch
Sockeye is now based on PyTorch.
We maintain backwards compatibility with MXNet models in version 2.3.x until 3.1.0.
If MXNet 2.x is installed, Sockeye can run with either PyTorch or MXNet, but MXNet is no longer strictly required.
Added
- Added model converter CLI `sockeye.mx_to_pt` that converts MXNet models to PyTorch models.
- Added `--apex-amp` training argument that runs the entire model in FP16 mode, replacing `--dtype float16` (requires Apex).
- Training automatically uses Apex fused optimizers if available (requires Apex).
- Added training argument `--label-smoothing-impl` to choose the label smoothing implementation (the default, `mxnet`, uses the same logic as MXNet Sockeye 2); a generic sketch follows this list.
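A generic label-smoothing sketch (not either of Sockeye's implementations): mix the one-hot target with a uniform distribution over the vocabulary.

```python
import torch
import torch.nn.functional as F

def smoothed_cross_entropy(logits: torch.Tensor, labels: torch.Tensor,
                           alpha: float = 0.1) -> torch.Tensor:
    # (1 - alpha) * NLL of the target + alpha * uniform cross-entropy.
    log_probs = F.log_softmax(logits, dim=-1)
    nll = -log_probs.gather(1, labels.unsqueeze(1)).squeeze(1)
    uniform = -log_probs.mean(dim=-1)
    return ((1.0 - alpha) * nll + alpha * uniform).mean()

loss = smoothed_cross_entropy(torch.randn(4, 10), torch.tensor([1, 2, 3, 4]))
```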
Changed
- CLI names point to the PyTorch code base (e.g. `sockeye-train` etc.).
- MXNet-based CLIs are now accessible via `sockeye-<name>-mx`.
- MXNet code requires MXNet >= 2.0 since we adopted the new numpy interface.
- `sockeye-train` now uses PyTorch's distributed data-parallel mode for multi-process (multi-GPU) training. Launch with: `torchrun --no_python --nproc_per_node N sockeye-train --dist ...`
- Updated the quickstart tutorial to cover multi-device training with PyTorch Sockeye.
- Changed `--device-ids` argument (plural) to `--device-id` (singular). For multi-GPU training, see the distributed mode noted above.
- Updated default value: `--pad-vocab-to-multiple-of 8`
- Removed `--horovod` argument used with `horovodrun` (use `--dist` with `torchrun`).
- Removed `--optimizer-params` argument (use `--optimizer-betas`, `--optimizer-eps`).
- Removed `--no-hybridization` argument (use `PYTORCH_JIT=0`, see Disable JIT for Debugging).
- Removed `--omp-num-threads` argument (use `--env=OMP_NUM_THREADS=N`).
Removed
- Removed support for constrained decoding (both positive and negative lexical constraints).
- Removed support for beam histories.
- Removed `--amp-scale-interval` argument.
- Removed `--kvstore` argument.
- Removed arguments: `--weight-init`, `--weight-init-scale`, `--weight-init-xavier-factor-type`, `--weight-init-xavier-rand-type`.
- Removed `--decode-and-evaluate-device-id` argument.
- Removed arguments: `--monitor-pattern`, `--monitor-stat-func`.
- Removed CUDA-specific requirements files in `requirements/`.