Skip to content

Commit

Permalink
docs: rejected proposal the fuzzy strings
Browse files Browse the repository at this point in the history
  • Loading branch information
OmarIthawi committed Nov 4, 2023
1 parent 4541357 commit 0f23139
Show file tree
Hide file tree
Showing 6 changed files with 160 additions and 26 deletions.
21 changes: 5 additions & 16 deletions .github/workflows/automerge-transifex-app-prs.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,26 +15,15 @@ jobs:
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: merge pull request
uses: nick-fields/retry@v2
id: mergePR

- name: Auto-merge pull request
env:
# secrets can't be used in job conditionals, so we set them to env here
TRANSIFEX_APP_ACTOR_NAME: "${{ secrets.TRANSIFEX_APP_ACTOR_NAME }}"
TRANSIFEX_APP_ACTOR_ID: "${{ secrets.TRANSIFEX_APP_ACTOR_ID }}"
# This token requires Write access to the openedx-translations repo
GITHUB_TOKEN: ${{ secrets.EDX_TRANSIFEX_BOT_GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.number }}
if: "${{ github.actor == env.TRANSIFEX_APP_ACTOR_NAME && github.actor_id == env.TRANSIFEX_APP_ACTOR_ID }}"
with:
retry_wait_seconds: 60
max_attempts: 5
timeout_minutes: 15
retry_on: error
command: |
# Attempt to merge the PR
gh pr merge ${{ github.head_ref }} --rebase --auto
# The `fromdate | todate` are used merge to validate that `mergedAt` isn't null
# therefore verifying that the pull request was merged successfully.
gh pr view "$PR_NUMBER" --json mergedAt --jq '.mergedAt | fromdate | todate'
run: |
# Add the pull request to the merge queue with --rebase commit strategy
gh pr merge ${{ github.head_ref }} --rebase --auto
2 changes: 1 addition & 1 deletion .github/workflows/validate-translation-files.yml
Original file line number Diff line number Diff line change
Expand Up @@ -50,7 +50,7 @@ jobs:
:warning: There are errors in the translation files:
```
${{ steps.validate_translation_files.outputs.VALIDATION_ERRORS }}
${{ steps.validate_translation_files.outputs.VALIDATION_ERRORS || 'No errors were reported.' }}
```
This comment has been posted by the `validate-translation-files.yml` GitHub Actions workflow.
2 changes: 1 addition & 1 deletion Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,7 @@ test_requirements: ## Installs test.txt requirements
test: ## Run scripts tests
pytest -v -s --cov=. --cov-report=term-missing --cov-report=html scripts/tests

validate_translation_files: ## Run basic validation to ensure files are compilable
validate_translation_files: ## Run basic validation to ensure translation files are valid
python scripts/validate_translation_files.py

sync_translations: ## Syncs from the old projects to the new openedx-translations project
Expand Down
126 changes: 126 additions & 0 deletions docs/decisions/0002-mark-fuzzy-entries.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,126 @@
Mark invalid entries in po files as fuzzy
#########################################

Status
******

Rejected.

Context
*******
As of the `OEP-58`_, the Transifex GitHub App is used to sync Translations.
These translations are validated by `LexiQA`_, built-in Transifex quality
checks, and human reviewers. During this synchronization process, the
`Transifex GitHub App`_ creates pull requests to this repository. The
translations in the pull requests are then validated by ``msgfmt`` in CI.

Problem
*******
There are times when the translations being synchronized fail ``msgfmt``
validation. This prevents the pull requests from being merged.

Proposed Design Decision [Rejected]
***********************************

Rejection Reason
****************

This design decision was rejected because there were multiple issues that
had no feasible solution within the original scope of the `OEP-58`_
proposal. These issues are listed below:

- The workflow script edits the translations and effectively hide the
errors, which can be confusing for the translators. Therefore,
a good notification should be
sent to translators to avoid silencing errors.
- Writing to the pull request will not trigger the GitHub Actions workflow,
therefore the pull request cannot be merged unless the commit status
is overridden manually via `GitHub "create a commit status" API`_.
- The solution is ``.po`` file specific. And there's no similar solution for
``.json``, ``.yaml``, iOS, Android or other translations files.

Please refer to the original pull request which contains the rejected
implementation: `mark invalid entries as fuzzy | FC-0012`_.


Proposal Design
***************

A GitHub Actions workflow will be implemented to mark invalid entries in
synchronized ``.po`` files as ``fuzzy``. This will update pull requests
created by the `Transifex GitHub App`_.

To ensure a safe and reliable workflow, the following workflows will be
combined into one single workflow with multiple jobs:

#. Run ``python-tests.yml`` to validate the python code
#. Then, run ``validate-translation-files.yml`` which performs the following:

#. Validate the po-files using ``msgfmt``
#. Notify the translators about the invalid entries via the preferred
communication channel (Slack, Transifex, GitHub)
#. Edit the po files to mark invalid entries as ``fuzzy``, so it's
excluded from ``.mo`` files
#. Revalidate the files

#. Commit the updated entries and push to the PR branch
#. Automatically merge the PR


Informing the Translators
*************************
Translators can be notified about invalid translations via Slack, Transifex
or GitHub issues depending on the community's preference.

Pros and Cons
*************

**Pros**

* New valid strings would make it into the ``.mo`` files
* There's no need for manual intervention, therefore it's fast and won't
create a backlog of pull requests.
* Rejected strings are easily identifiable by looking in the code, so it's
easy to fix them.
* Translators can be notified about invalid translations via Slack, Transifex,
GitHub depending on the community's preference.


**Cons**

* The workflow script runs and edits the pull request, which can be
confusing for the reviewers.
* Previously valid entries are going to be overwritten with new ``fuzzy``
strings which will make those entries to be shown in source language
(English).

Rejected Alternative: Keep pull requests open
*********************************************

- **Create a Transifex issue only:** It would be possible to identify the
faulty entries and create a Transifex issue via API to the faulty entries.
However, this would require attention from the translators
team, which can take a rather long time therefore creating a backlog of
invalid entries. Additionally, Transifex issues are not understood or used
by most of the Open edX community.

- **Post errors on Slack only:** Post the errors on Slack and ask the
translators
to fix them. Same as the previous option, not all translators are on Slack.
Additionally, this option would have a slow feedback loop causing the pull
requests backlog to build-up.

- **Mark faulty entries as unreviewed only:** Use the Transifex API to mark
the the invalid entries as unreviewed, then pull only
reviewed entries into this repository.
This option would require an extensive use the of the Transifex API,
and pull again which can be complex to implement. Additionally, this option
would have a slow feedback loop causing the pull requests backlog to
build-up.


.. _OEP-58: https://open-edx-proposals.readthedocs.io/en/latest/architectural-decisions/oep-0058-arch-translations-management.html
.. _LexiQA: https://help.transifex.com/en/articles/6219179-lexiqa
.. _Transifex GitHub App: https://github.com/apps/transifex-integration
.. _GitHub "create a commit status" API: https://docs.github.com/en/rest/commits/statuses?apiVersion=2022-11-28#create-a-commit-status
.. _mark invalid entries as fuzzy | FC-0012: https://github.com/openedx/openedx-translations/pull/1944
10 changes: 5 additions & 5 deletions scripts/tests/test_validate_translation_files.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,11 @@
"""

import os.path
import re

from ..validate_translation_files import (
get_translation_files,
main,
validate_translation_files,
)

SCRIPT_DIR = os.path.dirname(__file__)
Expand Down Expand Up @@ -35,7 +36,7 @@ def test_main_on_invalid_files(capsys):
Integration test for the `main` function on some invalid files.
"""
mock_translations_dir = os.path.join(SCRIPT_DIR, 'mock_translations_dir')
exit_code = main(mock_translations_dir)
exit_code = validate_translation_files(mock_translations_dir)
out, err = capsys.readouterr()

assert 'VALID:' in out, 'Valid files should be printed in stdout'
Expand All @@ -44,8 +45,7 @@ def test_main_on_invalid_files(capsys):
assert 'hi/LC_MESSAGES/django.po' not in out, 'Invalid file should be printed in stderr'
assert 'en/LC_MESSAGES/django.po' not in out, 'Source file should not be validated'

assert 'INVALID:' in err
assert 'hi/LC_MESSAGES/django.po' in err
assert re.match(r'INVALID: .*hi/LC_MESSAGES/django.po', err)
assert '\'msgstr\' is not a valid Python brace format string, unlike \'msgid\'' in err
assert 'FAILURE: Some translations are invalid.' in err

Expand All @@ -57,7 +57,7 @@ def test_main_on_valid_files(capsys):
Integration test for the `main` function but only for the Arabic translations which is valid.
"""
mock_translations_dir = os.path.join(SCRIPT_DIR, 'mock_translations_dir/demo-xblock/conf/locale/ar')
exit_code = main(mock_translations_dir)
exit_code = validate_translation_files(mock_translations_dir)
out, err = capsys.readouterr()

assert 'VALID:' in out, 'Valid files should be printed in stdout'
Expand Down
25 changes: 22 additions & 3 deletions scripts/validate_translation_files.py
Original file line number Diff line number Diff line change
@@ -1,7 +1,15 @@
import sys
"""
Validate translation files using GNU gettext `msgfmt` command.
This script is used to validate translation files in the Open edX platform and mark
invalid entries as fuzzy.
"""

import argparse
import os
import os.path
import subprocess
import sys


def get_translation_files(translation_directory):
Expand Down Expand Up @@ -39,7 +47,9 @@ def validate_translation_file(po_file):
}


def main(translations_dir='translations'):
def validate_translation_files(
translations_dir='translations',
):
"""
Run GNU gettext `msgfmt` and print errors to stderr.
Expand Down Expand Up @@ -81,5 +91,14 @@ def main(translations_dir='translations'):
return exit_code


def main(): # pragma: no cover
parser = argparse.ArgumentParser(description=__doc__)
parser.add_argument('--dir', action='store', type=str, default='translations')
args = parser.parse_args()
sys.exit(validate_translation_files(
translations_dir=args.dir,
))


if __name__ == '__main__':
sys.exit(main()) # pragma: no cover
main() # pragma: no cover

0 comments on commit 0f23139

Please sign in to comment.