Include Dictionary metric #6

Open
M3ssman opened this issue Nov 8, 2022 · 1 comment
Assignees
Labels
enhancement New feature or request

Comments

@M3ssman
Member

M3ssman commented Nov 8, 2022

Description

Include a dictionary metric as a text-based OCR metric.

Example implementation: https://github.com/ulb-sachsen-anhalt/ocr-pipeline/blob/master/lib/ocr_step.py#L337 ff.

@M3ssman M3ssman added the enhancement New feature or request label Nov 8, 2022
@einspunktnull einspunktnull self-assigned this May 24, 2023
@M3ssman
Member Author

M3ssman commented Jun 14, 2023

Please add parameters for:

  • language (e.g., mapped from metadata: ger => de-DE, fra => fr, en => en-GB)
  • DictLT => call to LanguageTool (default connection: localhost:8010/v2/check, default container name: digital-eval-languagetool)
  • DictXX => call to XX (with its default connection parameter)
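
As a rough illustration of the requested parameters, the language mapping and the default endpoint could be bundled as follows. This is only a sketch: `LANGUAGE_MAP`, `DictConfig` and `resolve_language` are hypothetical names, the mapping contains just the three examples listed above, and the `http://` scheme in front of the default connection is an assumption.

```python
from dataclasses import dataclass

# Metadata language code -> dictionary language code.
# Only the three examples from this issue; any further entry is an assumption.
LANGUAGE_MAP = {"ger": "de-DE", "fra": "fr", "en": "en-GB"}

# Default LanguageTool connection from this issue (scheme assumed to be http).
DEFAULT_LT_URL = "http://localhost:8010/v2/check"


@dataclass
class DictConfig:
    """Hypothetical parameter bundle for one dictionary metric run."""

    language: str
    endpoint: str = DEFAULT_LT_URL


def resolve_language(metadata_code: str) -> str:
    """Translate a metadata language code into the code sent to the endpoint."""
    try:
        return LANGUAGE_MAP[metadata_code]
    except KeyError:
        raise ValueError(f"no dictionary language known for {metadata_code!r}") from None


config = DictConfig(language=resolve_language("ger"))
print(config)
```

Keeping the endpoint as a field with a default would let callers override it per run instead of relying on a module-level constant.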

The dictionary endpoint (a connection URL, as in the current LanguageTool implementation, or a local call to a spellchecker such as hunspell) is currently set as a default at module level.

Please add a clear message for unavailable endpoints (e.g., service is down, container not running, tool not installed).
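
One possible shape for this, sketched with the standard library only: the request to LanguageTool is wrapped so that connection problems surface as a dedicated exception with a readable message. `DictEndpointUnavailable` and `languagetool_check` are hypothetical names, and the `http://` scheme is an assumption; the `text` and `language` form parameters and the `matches` key of the JSON response follow the LanguageTool HTTP API.

```python
import json
import urllib.error
import urllib.parse
import urllib.request


class DictEndpointUnavailable(RuntimeError):
    """Hypothetical error type: the dictionary endpoint could not be reached."""


def languagetool_check(text, language, url="http://localhost:8010/v2/check", timeout=10.0):
    """POST text to a LanguageTool server and return its list of matches."""
    payload = urllib.parse.urlencode({"text": text, "language": language}).encode("utf-8")
    request = urllib.request.Request(url, data=payload)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return json.load(response).get("matches", [])
    except urllib.error.HTTPError:
        # The service answered, so it is reachable; leave HTTP errors to the caller.
        raise
    except (urllib.error.URLError, OSError) as exc:
        # Connection refused, host not found, timeout and similar failures.
        raise DictEndpointUnavailable(
            f"dictionary endpoint {url} is not reachable ({exc}); "
            "is the service or container running?"
        ) from exc
```

A caller could catch `DictEndpointUnavailable` and skip or fail the dictionary metric with that message, rather than exposing a raw connection traceback.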

Please also add a clear message for languages that a specific endpoint does not support (cf. ulb-ocr-odem).
Besides German and French, recent languages seem to include Arabic and Persian as well.
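
A minimal sketch of such a check, assuming the set of codes an endpoint supports is already known (for LanguageTool it could be fetched from the server's `/v2/languages` endpoint). `UnsupportedLanguage` and `ensure_supported` are hypothetical names, and the supported set in the usage lines is purely illustrative, not a statement about any real endpoint.

```python
class UnsupportedLanguage(ValueError):
    """Hypothetical error type: the endpoint has no dictionary for the language."""


def ensure_supported(language, supported, endpoint_name):
    """Raise UnsupportedLanguage unless `language` is covered by `supported`.

    A code counts as covered if it matches exactly (ignoring case) or if its
    primary subtag does, e.g. "de-DE" is covered by "de".
    """
    normalized = {code.lower() for code in supported}
    primary = language.split("-")[0].lower()
    if language.lower() in normalized or primary in normalized:
        return
    raise UnsupportedLanguage(
        f"{endpoint_name} does not support language {language!r}; "
        f"supported: {', '.join(sorted(supported))}"
    )


# Illustrative set only; a real run would ask the endpoint.
supported_codes = {"de-DE", "fr", "en-GB"}
ensure_supported("fr", supported_codes, "DictLT")  # passes silently
try:
    ensure_supported("fa", supported_codes, "DictLT")  # "fa" = Persian
except UnsupportedLanguage as err:
    print(err)
```

Running this check before any text is sent would let the metric report an unsupported language up front instead of failing on the endpoint's own error response.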

Please review the test data and fixtures from ocr-pipeline.


2 participants