Commit
[!144][RELEASE] Release of the INES evaluation (WMT2023)
# Which work do we release?

"Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES" in WMT2023.

# What changes does this release refer to?

aebf19f029acc2516e67b6d4fd71e9673ee1ae33 3ca5b2666bc82d8902eb823435ffd1a39ede82e1
bsavoldi authored and mgaido91 committed Oct 18, 2023
1 parent 74dc0fb commit b2f67b5
Showing 2 changed files with 52 additions and 0 deletions.
2 changes: 2 additions & 0 deletions README.md
@@ -4,6 +4,8 @@ This repository contains the open source code by the MT unit of FBK.
Dedicated README for each work can be found in the `fbk_works` directory.

### 2023

- [[WMT 2023] **Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES**](fbk_works/INES_eval.md)
- [[ASRU 2023] **No Pitch Left Behind: Addressing Gender Unbalance in Automatic Speech Recognition Through Pitch Manipulation**](fbk_works/PITCH_MANIPULATION_ASR.md)
- [[TACL 2023] **Direct Speech Translation for Automatic Subtitling**](fbk_works/DIRECT_SUBTITLING.md)
- [[INTERSPEECH 2023] **AlignAtt: Using Attention-based Audio-Translation Alignments as a Guide for Simultaneous Speech Translation**](fbk_works/ALIGNATT_SIMULST_AGENT_INTERSPEECH2023.md)
50 changes: 50 additions & 0 deletions fbk_works/INES_eval.md
@@ -0,0 +1,50 @@
# INES Test Suite Evaluation (WMT 2023)
Code to evaluate MT systems on the INclusive Evaluation Suite (INES).

## INES Evaluation

We release the code of the FBK participation in the WMT Test Suite shared subtask: [**INES_eval.py**](../examples/speech_to_text/scripts/gender/INES_eval.py).
It allows assessing the ability of MT systems to generate inclusive language forms over non-inclusive ones when translating from English into German on the [**INES test set**](https://mt.fbk.eu/ines/).


For systems run on the INES test suite, the evaluation script `INES_eval.py` computes:

* **inclusivity_index** scores (INES official metric)

* **terms coverage** and **gender accuracy** scores (additional metrics)

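As an illustration of what these metrics measure, here is a minimal toy sketch (NOT the official `INES_eval.py`, whose exact metric definitions are given in the paper). It assumes each gold entry pairs a tokenized inclusive form with a non-inclusive alternative, and that metrics are derived from which of the two forms appears in the system output:

```python
def contains(tokens, phrase):
    # True if `phrase` occurs as a contiguous subsequence of `tokens`
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


def evaluate(outputs, gold):
    """Toy metric computation over (inclusive, not_inclusive) gold pairs."""
    found_incl = found_not_incl = 0
    for out, (inclusive, not_inclusive) in zip(outputs, gold):
        tokens = out.split()
        if contains(tokens, inclusive.split()):
            found_incl += 1
        elif contains(tokens, not_inclusive.split()):
            found_not_incl += 1
    covered = found_incl + found_not_incl
    return {
        # terms coverage: fraction of entries where either form was generated
        "terms_coverage": covered / len(gold),
        # gender accuracy: among covered entries, fraction that are inclusive
        "gender_accuracy": found_incl / covered if covered else 0.0,
        # inclusivity index (assumed here to mean: fraction of entries where
        # the NOT-inclusive form was avoided)
        "inclusivity_index": 1 - found_not_incl / len(gold),
    }
```

The names `contains` and `evaluate`, and the exact formula for the inclusivity index, are assumptions for illustration only; refer to the paper and the released script for the official definitions.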

### Usage

To work correctly, the script requires Python 3.

The script requires two mandatory arguments:

```
--input FILE
--tsv-definition FILE
```

namely, the output of the system you want to evaluate and the [**INES.tsv**](https://mt.fbk.eu/ines/) file (the gold standard). Note that the output must be tokenized (e.g. with Moses' `tokenizer.perl`).

You can run `INES_eval.py --help` to get the full list of parameters taken by the script.
Terms coverage and gender accuracy are computed only when requested via the corresponding optional argument.
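A hypothetical sketch of the command-line interface described above (the authoritative option list comes from `INES_eval.py --help`; the `--print-all` flag below is an assumed name for the optional-metrics switch):

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(description="INES test suite evaluation")
    # mandatory arguments named in the README
    parser.add_argument("--input", required=True,
                        help="tokenized MT system output, one sentence per line")
    parser.add_argument("--tsv-definition", required=True,
                        help="the INES.tsv gold-standard definition file")
    # assumed optional flag enabling terms coverage and gender accuracy
    parser.add_argument("--print-all", action="store_true",
                        help="also report terms coverage and gender accuracy")
    return parser
```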

Example usage:

```
python3 INES_eval.py --input MT_OUTPUT_FILE --tsv-definition INES.tsv
```


## 📍Citation

If you use this code, or for more information, please refer to:

```bibtex
@inproceedings{savoldi-etal-2023-test,
    title = {{Test Suites Task: Evaluation of Gender Fairness in MT with MuST-SHE and INES}},
author = {Savoldi, Beatrice and Gaido, Marco and Negri, Matteo and Bentivogli, Luisa},
    booktitle = {Proceedings of the Eighth Conference on Machine Translation (WMT 2023)},
month = dec,
year = "2023",
address = "Singapore",
publisher = "Association for Computational Linguistics",
}
```
