Better Intelligibility metric for CAD2 Task1 #419

Merged 44 commits on Oct 15, 2024

groadabike
Contributor

In CAD2 Task1, the original intelligibility metric used Whisper for transcription and jiwer to compute correctness.
However, jiwer was applied without any normalisation, which resulted in lower scores.
For example, capitalised and non-capitalised forms of the same word were counted as different, and punctuation was not excluded.

The changes are:

- Replace jiwer with alt-eval, which performs several normalisation steps before calling jiwer.
- Add a verbose option (True or False) to the MSGB HL model to reduce unnecessary prints. Warnings are still printed regardless of the verbose option.
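
To illustrate why the missing normalisation lowered the scores, here is a minimal sketch in plain Python. It does not call jiwer or alt-eval: `normalise`, `lcs_length` and `correctness` are hypothetical helpers, the normalisation shown (lower-casing and stripping ASCII punctuation) is only an assumed subset of what alt-eval does, and the score is a simplified stand-in (fraction of reference words matched in order) rather than the exact metric either library computes.

```python
import string


def normalise(text: str) -> list[str]:
    """Lower-case, strip ASCII punctuation, and split on whitespace (assumed steps)."""
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def lcs_length(a: list[str], b: list[str]) -> int:
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def correctness(reference: str, hypothesis: str, normalised: bool) -> float:
    """Fraction of reference words matched, in order, by the hypothesis."""
    ref = normalise(reference) if normalised else reference.split()
    hyp = normalise(hypothesis) if normalised else hypothesis.split()
    if not ref:
        return 0.0  # avoid dividing by zero on an empty reference
    return lcs_length(ref, hyp) / len(ref)


reference = "hello darkness my old friend"
hypothesis = "Hello, darkness, my old friend."  # typical ASR output with case and punctuation

print(correctness(reference, hypothesis, normalised=False))  # 0.4
print(correctness(reference, hypothesis, normalised=True))   # 1.0
```

Without normalisation, only "my" and "old" match exactly, so a transcription that is correct word for word scores 0.4 instead of 1.0.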

pre-commit-ci bot and others added 30 commits July 29, 2024 23:08
updates:
- [github.com/asottile/pyupgrade: v3.16.0 → v3.17.0](asottile/pyupgrade@v3.16.0...v3.17.0)
- [github.com/pre-commit/mirrors-mypy: v1.10.1 → v1.11.0](pre-commit/mirrors-mypy@v1.10.1...v1.11.0)
- [github.com/astral-sh/ruff-pre-commit: v0.5.1 → v0.5.5](astral-sh/ruff-pre-commit@v0.5.1...v0.5.5)
- [github.com/pycqa/pylint: v3.2.5 → v3.2.6](pylint-dev/pylint@v3.2.5...v3.2.6)
Add CAD2 to the README
…nfig

[pre-commit.ci] pre-commit-autoupdate
updates:
- [github.com/psf/black: 24.4.2 → 24.8.0](psf/black@24.4.2...24.8.0)
- [github.com/pycqa/flake8.git: 7.1.0 → 7.1.1](https://github.com/pycqa/flake8.git/compare/7.1.0...7.1.1)
- [github.com/nbQA-dev/nbQA: 1.8.5 → 1.8.7](nbQA-dev/nbQA@1.8.5...1.8.7)
- [github.com/pre-commit/mirrors-mypy: v1.11.0 → v1.11.1](pre-commit/mirrors-mypy@v1.11.0...v1.11.1)
- [github.com/astral-sh/ruff-pre-commit: v0.5.5 → v0.5.7](astral-sh/ruff-pre-commit@v0.5.5...v0.5.7)
…nfig

[pre-commit.ci] pre-commit-autoupdate
Signed-off-by: Gerardo Roa <[email protected]>
…everal-errors-in-jupyter-notebooks

Place imports at top of cell
updates:
- [github.com/DavidAnson/markdownlint-cli2: v0.13.0 → v0.14.0](DavidAnson/markdownlint-cli2@v0.13.0...v0.14.0)
- [github.com/pre-commit/mirrors-mypy: v1.11.1 → v1.11.2](pre-commit/mirrors-mypy@v1.11.1...v1.11.2)
- [github.com/astral-sh/ruff-pre-commit: v0.5.7 → v0.6.4](astral-sh/ruff-pre-commit@v0.5.7...v0.6.4)
- [github.com/pycqa/pylint: v3.2.6 → v3.2.7](pylint-dev/pylint@v3.2.6...v3.2.7)
Signed-off-by: Gerardo Roa <[email protected]>
Signed-off-by: Gerardo Roa <[email protected]>
Signed-off-by: Gerardo Roa <[email protected]>
…ith-notebooks

New errors in Jupyter Notebook
…nfig

[pre-commit.ci] pre-commit-autoupdate
updates:
- [github.com/astral-sh/ruff-pre-commit: v0.6.4 → v0.6.5](astral-sh/ruff-pre-commit@v0.6.4...v0.6.5)
groadabike and others added 13 commits September 20, 2024 10:47
change version of test
…nfig

[pre-commit.ci] pre-commit-autoupdate
updates:
- [github.com/pre-commit/pre-commit-hooks: v4.6.0 → v5.0.0](pre-commit/pre-commit-hooks@v4.6.0...v5.0.0)
- [github.com/psf/black: 24.8.0 → 24.10.0](psf/black@24.8.0...24.10.0)
- [github.com/astral-sh/ruff-pre-commit: v0.6.5 → v0.6.9](astral-sh/ruff-pre-commit@v0.6.5...v0.6.9)
- [github.com/pycqa/pylint: v3.2.7 → v3.3.1](pylint-dev/pylint@v3.2.7...v3.3.1)
…nfig

[pre-commit.ci] pre-commit-autoupdate
updates:
- [github.com/asottile/pyupgrade: v3.17.0 → v3.18.0](asottile/pyupgrade@v3.17.0...v3.18.0)
…nfig

[pre-commit.ci] pre-commit-autoupdate
Signed-off-by: Gerardo Roa <[email protected]>
Signed-off-by: Gerardo Roa <[email protected]>
Signed-off-by: Gerardo Roa <[email protected]>
@groadabike groadabike marked this pull request as ready for review October 15, 2024 14:30
@groadabike groadabike marked this pull request as draft October 15, 2024 14:31
@groadabike groadabike changed the base branch from main to v0.6 October 15, 2024 14:31
@groadabike groadabike marked this pull request as ready for review October 15, 2024 14:36
@groadabike groadabike merged commit 85ed818 into v0.6 Oct 15, 2024
1 check was pending
@sgraetzer
Contributor

> In CAD2 Task1, the original intelligibility metric used Whisper for transcription and jiwer to compute correctness. However, jiwer was applied without any normalisation, which resulted in lower scores. For example, capitalised and non-capitalised forms of the same word were counted as different, and punctuation was not excluded.
>
> The changes are:
>
> - Replace jiwer with alt-eval, which performs several normalisation steps before calling jiwer.
> - Add a verbose option (True or False) to the MSGB HL model to reduce unnecessary prints. Warnings are still printed regardless of the verbose option.

Excellent.
