document_type for DocumentMetrics #343

Merged: 10 commits merged on Sep 14, 2023

@ArneBinder (Owner) commented Sep 14, 2023

Similar to TaskModules, DocumentMetrics require documents of a certain type as input. This PR adds functionality that lets DocumentMetrics signal which document type they need.

In detail:

  • create RequiresDocumentTypeMixin, which defines the class variable DOCUMENT_TYPE and the property document_type (which returns DOCUMENT_TYPE by default). It also defines the method convert_dataset(), which checks for several edge cases before calling dataset.to_document_type(self.document_type)
  • use RequiresDocumentTypeMixin for DocumentMetric and also for TaskModule
  • add the parameter document_type to DocumentStatistic; its value is returned by DocumentStatistic.document_type (it overrides DOCUMENT_TYPE)
  • adjust the logic of (Iterable)Dataset(Dict).to_document_type(): we now also allow converters that are registered for document types that are subclasses of the requested type (e.g. if we have a converter for DocWithEntitiesAndRelations, but just need DocWithEntities, we still use that converter)
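
To illustrate the points above, here is a minimal, self-contained sketch of the idea. It does not use the library's actual classes: `Document`, `RawDoc`, `ToyDataset` and `ToyStatistic` are hypothetical stand-ins, the only edge case shown in `convert_dataset()` (no document type declared) is an assumption, and the real implementation may differ in its details.

```python
from typing import Callable, Dict, List, Optional, Type


class Document:
    """Hypothetical stand-in for a document base class."""


class DocWithEntities(Document):
    pass


class DocWithEntitiesAndRelations(DocWithEntities):
    pass


class RawDoc(Document):
    pass


Converter = Callable[[Document], Document]


class ToyDataset:
    """Toy dataset (hypothetical): documents plus converters keyed by target type."""

    def __init__(
        self,
        document_type: Type[Document],
        documents: List[Document],
        converters: Optional[Dict[Type[Document], Converter]] = None,
    ) -> None:
        self.document_type = document_type
        self.documents = documents
        self.converters = dict(converters or {})

    def to_document_type(self, requested: Type[Document]) -> "ToyDataset":
        # The dataset already has the requested type (or a subclass of it).
        if issubclass(self.document_type, requested):
            return self
        # Accept a converter registered for the requested type or any subclass.
        for registered_type, convert in self.converters.items():
            if issubclass(registered_type, requested):
                converted = [convert(doc) for doc in self.documents]
                return ToyDataset(registered_type, converted, self.converters)
        raise ValueError(f"no converter registered for {requested.__name__}")


class RequiresDocumentTypeMixin:
    # Subclasses declare the document type they need here.
    DOCUMENT_TYPE: Optional[Type[Document]] = None

    @property
    def document_type(self) -> Optional[Type[Document]]:
        return self.DOCUMENT_TYPE

    def convert_dataset(self, dataset: ToyDataset) -> ToyDataset:
        # Assumed edge case: without a declared type, leave the dataset as is.
        if self.document_type is None:
            return dataset
        return dataset.to_document_type(self.document_type)


class ToyStatistic(RequiresDocumentTypeMixin):
    """Hypothetical statistic whose constructor argument overrides DOCUMENT_TYPE."""

    DOCUMENT_TYPE = DocWithEntitiesAndRelations

    def __init__(self, document_type: Optional[Type[Document]] = None) -> None:
        self._document_type = document_type

    @property
    def document_type(self) -> Optional[Type[Document]]:
        return self._document_type or super().document_type


# Only a converter to DocWithEntitiesAndRelations is registered ...
dataset = ToyDataset(
    RawDoc,
    [RawDoc()],
    {DocWithEntitiesAndRelations: lambda doc: DocWithEntitiesAndRelations()},
)
# ... but it is still used when just DocWithEntities is requested.
statistic = ToyStatistic(document_type=DocWithEntities)
converted = statistic.convert_dataset(dataset)
```

In this sketch, the converted dataset ends up with the more specific type `DocWithEntitiesAndRelations`, which still satisfies the request because it is a subclass of `DocWithEntities`.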

@ArneBinder ArneBinder force-pushed the document_type_for_metrics branch from e9d9906 to bd8f485 Compare September 14, 2023 15:08
@ArneBinder ArneBinder merged commit 4f26eca into main Sep 14, 2023
6 checks passed
@ArneBinder ArneBinder deleted the document_type_for_metrics branch September 14, 2023 17:10