inclure

Automatic translation from Standard to Inclusive French, and vice-versa.

Source code and data for the paper "INCLURE: a Dataset and Toolkit for Inclusive French Translation" (Lerner and Grouin, 2024).

getting the INCLURE data

Load the dataset from the Hugging Face Hub with datasets.load_dataset, e.g. datasets.load_dataset("PaulLerner/oscar_inclure"): https://huggingface.co/datasets/PaulLerner/oscar_inclure

The out-of-domain (OOD) test set is available at https://huggingface.co/datasets/PaulLerner/cfi_sents_v_taln_2022/
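A minimal sketch of loading both datasets, assuming the datasets library is installed (pip install datasets). The helper name load_inclure and the two constants are hypothetical, introduced only for this example. Whether the OOD test set loads without extra arguments is also an assumption, since only the first call is shown above.

```python
# Repository IDs taken from the two Hugging Face URLs above.
OSCAR_INCLURE = "PaulLerner/oscar_inclure"
OOD_TEST_SET = "PaulLerner/cfi_sents_v_taln_2022"


def load_inclure(repo_id=OSCAR_INCLURE):
    """Download a dataset from the Hugging Face Hub, or reuse the local cache.

    `load_inclure` is a hypothetical helper, not part of the inclure package.
    """
    # Imported here so the module can be imported without `datasets` installed.
    from datasets import load_dataset

    return load_dataset(repo_id)
```

Calling print(load_inclure()) lists the available splits and columns, which is a quick way to inspect the data before training. Passing OOD_TEST_SET as the argument fetches the out-of-domain test set instead.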

experiments

trained models

train your own

python -m inclure.train experiments/inclure/train_config.json

python -m inclure.train experiments/exclure/train_config.json

evaluate

python -m inclure.train /path/to/test_config.json

where /path/to/test_config.json is one of experiments/*/*/test_config.json
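To evaluate several configurations in one go, the glob pattern above can be expanded from Python. This is a sketch under stated assumptions: the helper names (find_test_configs, build_eval_command, evaluate_all) are hypothetical, and it assumes the script is launched from the repository root with the inclure package importable.

```python
import glob
import os
import subprocess
import sys


def find_test_configs(root="experiments"):
    """Return every test_config.json exactly two directory levels below `root`."""
    pattern = os.path.join(root, "*", "*", "test_config.json")
    return sorted(glob.glob(pattern))


def build_eval_command(config_path):
    """Build the evaluation command shown above for one config file."""
    return [sys.executable, "-m", "inclure.train", config_path]


def evaluate_all(root="experiments", dry_run=True):
    """Print each evaluation command; run it too when dry_run is False."""
    for config_path in find_test_configs(root):
        command = build_eval_command(config_path)
        print(" ".join(command))
        if not dry_run:
            # check=True raises CalledProcessError if an evaluation fails.
            subprocess.run(command, check=True)
```

evaluate_all() only prints the commands by default, so the list can be checked before launching the runs with evaluate_all(dry_run=False).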

reproduce/extend INCLURE

Get OSCAR 22.01 from https://huggingface.co/datasets/oscar-corpus/OSCAR-2201

If you happen to have access to the Jean Zay supercomputer, use $DSDIR/OSCAR/fr_meta instead.

python -m inclure.x /path/to/oscar/fr_meta /output/folder

python -m inclure.data /output/folder
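The two commands above can be chained so that the second step runs only if the first succeeds. The sketch below assumes they must run in the order listed, which the shared /output/folder argument suggests but the text does not state. The helper names build_pipeline and run_pipeline are hypothetical.

```python
import subprocess
import sys


def build_pipeline(oscar_dir, output_dir):
    """Return the two commands listed above, in the order they appear."""
    return [
        [sys.executable, "-m", "inclure.x", oscar_dir, output_dir],
        [sys.executable, "-m", "inclure.data", output_dir],
    ]


def run_pipeline(oscar_dir, output_dir):
    """Run each step in turn, stopping at the first failure."""
    for command in build_pipeline(oscar_dir, output_dir):
        # check=True raises CalledProcessError, so later steps are skipped.
        subprocess.run(command, check=True)
```

For example, run_pipeline("/path/to/oscar/fr_meta", "/output/folder") reproduces the two invocations above with the placeholder paths replaced by real ones.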

reference

If you use our code or dataset, please cite:

@inproceedings{lerner:hal-04531938,
  TITLE = {{INCLURE: a Dataset and Toolkit for Inclusive French Translation}},
  AUTHOR = {Lerner, Paul and Grouin, Cyril},
  URL = {https://hal.science/hal-04531938},
  BOOKTITLE = {{The 17th Workshop on Building and Using Comparable Corpora (BUCC @ LREC 2024)}},
  ADDRESS = {Turin, Italy},
  YEAR = {2024},
  KEYWORDS = {Inclusive French ; Gender-neutral Language ; Parallel Corpus ; Neural Machine Translation},
  PDF = {https://hal.science/hal-04531938/file/bucc_lrec_2024_inclure%283%29.pdf},
  HAL_ID = {hal-04531938},
  HAL_VERSION = {v1},
}
