Post MTPE tasks

Analyze PE effort

Getting started

Clone this repo, create a virtual environment, activate it and install all dependencies:

python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

Make sure you use the python version indicated in file .python-version (if you use pyenv, it should take care of the this for you).

One additional dependency must be installed differently:

git clone https://github.com/capstanlqc/tmx2dataframe.git clone/tmx2dataframe
cd clone/tmx2dataframe
pip install .

Getting the data

The data to be analyzed will be extracted from OmegaT projects after the PE task is concluded.

Clone all team projects or unpack all the projects you want to analyze in a folder, let's call it repos. Copy file config.json" to that folder to.

Running

Run the script as:

python code/corr-score-edit.py --parent /path/to/repos --template "ysc-strata_*_omt"

where the two arguments are the following:

--parent: absolute path to the folder where all project have been cloned or unpacked
--template: pattern including wildcards to match the project folders that should be considered

Discussion

Background

What is this threshold: @Laura: how did you determine the threshold?

The threshold draws the line between the translations that require PE (those which have a QE score below the threshold) and those that require no or minimal PE (those which have a QE score above the threshold).

The ideal threshold would split all translations in two groups: high confidence translations (above the threshold, which have good quality without PE) and low confidence translations (below the threshold, which require PE).

Some fine-tuning is necessary to find the ideal threshold.

Goal

The goal of this analysis of PE effort is to ascertain to what degree the threshold used was ideal or far from ideal and how much fine-tuning is needed.

The general hypothesis is that there is a linear relationship between the QE score assigned to MT translations and the PE effort used to improve them.

Data

For each language pair (where English is always the source language), we collected the following data (for each segment in the project):

source sentence
initial MT version
PE version
edit distance (number of edits)
similarity ratio
QE score (Comet)
confidence category
deviation of QE score from threshold

The data sample includes the whole population. The two variables measured (similarity ratio and deviation of QE score from threshold) are paired. No data clean-up was done, so data might include outliers.

We also computed the average similarity ratio between the initial MT version and the PE version for each of the two categories (high confidence and low confidence).

Analysis

We analyzed correlation between two variables:

the quality evaluation (QE) score of each MT version
the post-editing (PE) distance to turn the initial MT version into the PE one

We have used two metrics to calculate the correlation coefficient (which measures the extent to which two variables tend to change together):

Pearson correlatoin coefficient
Spearman correlation coeficient

We have visualized and examined the relationship between the two variables with a scatterplot.

Results

Locale	Threshold	Pearson coefficient	Pearson P-value	Spearman coefficient	Spearman P-value
pt-PT	0.84	0.373566088788977	3.02915498707086E-131	0.270483131207319	2.73690796762657E-67
pl-PL	0.87	0.394038614190113	4.28759280116097E-147	0.372445306402085	2.0727032912648E-130
fr-FR	0.87	0.340461984939205	6.16300411319614E-108	0.194437929986979	5.27246757422632E-35
de-DE	0.86	0.468501065695487	4.58098741147011E-215	0.352645304844544	3.42173552600613E-116
ro-RO	0.9	0.305989685321915	1.62160426139411E-86	0.344508570234265	1.22620953749372E-110
es-ES	0.88	0.345382876098042	3.15963260617931E-111	0.339930141525419	1.38620442294041E-107

Here are the scatter plots for visual insight:

See below average similarity ratio between the initial MT version and the PE version calculated for each of the two categories (high confidence and low confidence).

Locale	High confidence	Low confidence
pt-PT	0.97	0.90
pl-PL	0.94	0.89
fr-FR	0.94	0.89
de-DE	0.95	0.88
ro-RO	0.91	0.82
es-ES	0.88	0.79

Interpretation

Variables: The two variables whose relationship is measured are the PE distance and the QE score.

Metrics: The Pearson correlation coefficient measures the linear relationship between two variables and the Spearman correlation coefficient measures the monotonic relationship (not limited to linear) between two variables.

Strength: All Pearson coefficients range between 0.3 and 0.46, indicating weak to moderate positive linear relationships. Spearman coefficients range between 0.19 and 0.37, showing weaker but still statistically significant monotonic relationships.

In most locales, Pearson coefficients are higher than Spearman coefficients, suggesting that the relationships are more linear than monotonic.

Significance: All Pearson p-values are extremely small, indicating that the linear relationships are statistically significant beyond any reasonable doubt, ruling out the possibility of random noise driving these results.

de-DE exhibits the strongest linear relationship, while pl-PL shows the strongest monotonic relationship. fr-FR has weaker relationships overall, suggesting potential differences in the underlying data trends for this locale.

Follow-up

We could explore why some locales show weaker relationships.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
analysis		analysis
user		user
.gitignore		.gitignore
.python-version		.python-version
README.md		README.md
corr-score-edit.py		corr-score-edit.py
corr-thrshld-deviation-vs-edit.py		corr-thrshld-deviation-vs-edit.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Post MTPE tasks

Analyze PE effort

Getting started

Getting the data

Running

Discussion

Background

Goal

Data

Analysis

Results

Interpretation

Follow-up

References

About

Releases

Packages

Languages

capstanlqc/mtpe-tasks

Folders and files

Latest commit

History

Repository files navigation

Post MTPE tasks

Analyze PE effort

Getting started

Getting the data

Running

Discussion

Background

Goal

Data

Analysis

Results

Interpretation

Follow-up

References

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages