EDAtt agent for Simultaneous Speech Translation (ACL 2023)

Code for the paper: "Attention as a Guide for Simultaneous Speech Translation" published at ACL 2023.

📎 Requirements

To run the agent, please make sure that SimulEval v1.0.2 (commit d1a8b2f) is installed and set --port accordingly.

📌 Pre-trained offline models

We release the offline ST models used for EDAtt simultaneous inference.

🤖 Inference

Set --source, --target, and --config as described in the Fairseq Simultaneous Translation repository. --model-path is the path to the offline ST model checkpoint (either en-de or en-es), and --attn-threshold is the value of alpha used for inference (the paper uses alpha = 0.6, 0.4, 0.2, 0.1, 0.05, and 0.03).
The output is saved in the directory passed to --output.

simuleval \
    --agent ${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/v1_0/simul_offline_edatt.py \
    --source ${SRC_LIST_OF_AUDIO} \
    --target ${TGT_FILE} \
    --data-bin ${DATA_ROOT} \
    --config config_simul.yaml \
    --model-path ${ST_SAVE_DIR}/checkpoint_avg7.pt \
    --extract-attn-from-layer 3 \
    --frame-num 2 --attn-threshold $ALPHA \
    --speech-segment-factor 20 \
    --output ${OUT_DIR} \
    --port ${PORT} \
    --gpu \
    --scores
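
To cover every alpha value listed above, the same command can be repeated once per threshold. The following is a minimal sketch of such a sweep in POSIX shell, not part of the released code. It reuses only the flags and environment variables from the command above; the per-alpha output layout (${OUT_DIR}/alpha_<value>), the DRY_RUN switch, and the helper names out_dir_for and run_edatt are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: run the EDAtt agent once per attention threshold (alpha).
# Assumed, not from the original README: the ${OUT_DIR}/alpha_<value> layout,
# the DRY_RUN switch, and the helper names below.

ALPHAS="0.6 0.4 0.2 0.1 0.05 0.03"

# Hypothetical helper: output directory used for one alpha value.
out_dir_for() {
    printf '%s/alpha_%s\n' "${OUT_DIR}" "$1"
}

# Hypothetical helper: print (DRY_RUN=1, the default) or run the command for one alpha.
run_edatt() {
    threshold="$1"
    # POSIX sh has no arrays, so the command is built in the positional parameters.
    set -- simuleval \
        --agent "${FBK_FAIRSEQ_ROOT}/examples/speech_to_text/simultaneous_translation/agents/v1_0/simul_offline_edatt.py" \
        --source "${SRC_LIST_OF_AUDIO}" \
        --target "${TGT_FILE}" \
        --data-bin "${DATA_ROOT}" \
        --config config_simul.yaml \
        --model-path "${ST_SAVE_DIR}/checkpoint_avg7.pt" \
        --extract-attn-from-layer 3 \
        --frame-num 2 --attn-threshold "${threshold}" \
        --speech-segment-factor 20 \
        --output "$(out_dir_for "${threshold}")" \
        --port "${PORT}" \
        --gpu \
        --scores
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Only show what would be executed.
        echo "$*"
    else
        mkdir -p "$(out_dir_for "${threshold}")" && "$@"
    fi
}

# Stop the sweep at the first failing run.
for alpha in ${ALPHAS}; do
    run_edatt "${alpha}" || break
done
```

By default the script only prints the six commands; running it with DRY_RUN=0 executes them one after another, so a single --port value can be reused across runs.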

💬 Outputs

To ensure complete reproducibility, we also release the outputs obtained by EDAtt using SimulEval 1.0.2:

📍 Citation

@inproceedings{papi-et-al-2023-edatt,
title = "Attention as a Guide for Simultaneous Speech Translation",
author = {Papi, Sara and Negri, Matteo and Turchi, Marco},
booktitle = "Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics",
year = "2023",
address = "Toronto, Canada",
publisher = "Association for Computational Linguistics"
}