New Term - verbatimMeasurementType #518

Open
sformel-usgs opened this issue Jul 10, 2024 · 7 comments

Comments

@sformel-usgs

New term

  • Submitter: Stephen Formel

  • Efficacy Justification (why is this term necessary?):

TL;DR: to preserve linkage to a bespoke MOF term when mapping to a controlled MOF term. Very similar to the discussion in #181.

One of the challenges of effectively implementing the MeasurementOrFact and extendedMeasurementOrFact extensions is reducing the noise across datasets by mapping bespoke terms to preferred terms (e.g. the OBIS community's preference for BODC terms). If the preferred terms aren't used in original data, then the original term is lost in the mapping. That loss can be difficult for data providers to be at peace with, and they will hesitate to accept the extensions as effective solutions for publishing their data. We think it's logical to ask data providers to map to standardized MOF terms during publishing, like we do for taxonomy and georeferencing. But we also think we should try to preserve the original term, so downstream users can have additional metadata that might provide context when examining a published paper, or raw version of the data.

  • Demand Justification (name at least two organizations that independently need this term):

    • USGS (GBIF-US/OBIS-USA)
    • SCAR Antarctic Biodiversity Portal
  • Stability Justification (what concerns are there that this might affect existing implementations?): None.

  • Implications for dwciri: namespace (does this change affect a dwciri term version)?: None

Proposed attributes of the new term:

  • Term name (in lowerCamelCase for properties, UpperCamelCase for classes): verbatimMeasurementType
  • Term label (English, not normative): Verbatim Measurement Type
  • Organized in Class (e.g., Occurrence, Event, Location, Taxon): MeasurementOrFact
  • Definition of the term (normative):
    • A string representing the measurement, or fact, type as it appeared in the original record.
  • Usage comments (recommendations regarding content, etc., not normative):
    • This term is meant to allow the capture of an unaltered original name for a measurement or fact type. This term is meant to be used in addition to dwc:measurementType, not instead of it.
  • Examples (not normative):

| verbatimMeasurementType | measurementType | measurementTypeID |
| --- | --- | --- |
| water_temp | Temperature of the water body | http://vocab.nerc.ac.uk/collection/P01/current/TEMPPR01/ |
| Fish biomass | Wet weight biomass of biological entity specified elsewhere per unit area of the bed | http://vocab.nerc.ac.uk/collection/P01/current/SDBIOL05/ |
| sampling net mesh size | Mesh size of sample collector | http://vocab.nerc.ac.uk/collection/P01/current/MSHSIZE1/ |

  • Refines (identifier of the broader term this term refines; normative): None
  • Replaces (identifier of the existing term that would be deprecated and replaced by this term; normative): None
  • ABCD 2.06 (XPATH of the equivalent term in ABCD or EFG; not normative): None
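
The mapping step described in this proposal could be sketched roughly as follows. This is only an illustration, not part of the proposal itself: the `PREFERRED_TERMS` table and the `map_measurement_type` function are hypothetical names, and the table holds just the three example rows from the proposal. The point it shows is that the original term is copied into verbatimMeasurementType before measurementType is overwritten with the preferred label.

```python
# Hypothetical lookup table built from the three example rows in the proposal.
# In practice a data manager would curate this mapping.
PREFERRED_TERMS = {
    "water_temp": (
        "Temperature of the water body",
        "http://vocab.nerc.ac.uk/collection/P01/current/TEMPPR01/",
    ),
    "Fish biomass": (
        "Wet weight biomass of biological entity specified elsewhere "
        "per unit area of the bed",
        "http://vocab.nerc.ac.uk/collection/P01/current/SDBIOL05/",
    ),
    "sampling net mesh size": (
        "Mesh size of sample collector",
        "http://vocab.nerc.ac.uk/collection/P01/current/MSHSIZE1/",
    ),
}


def map_measurement_type(record: dict) -> dict:
    """Return a copy of `record` with the original term kept as verbatim.

    The bespoke term found in measurementType is stored in
    verbatimMeasurementType. If a preferred term is known, measurementType
    and measurementTypeID are filled from the lookup table; otherwise
    measurementType is left unchanged.
    """
    original = record["measurementType"]
    mapped = dict(record)
    mapped["verbatimMeasurementType"] = original
    if original in PREFERRED_TERMS:
        label, iri = PREFERRED_TERMS[original]
        mapped["measurementType"] = label
        mapped["measurementTypeID"] = iri
    return mapped


row = map_measurement_type({"measurementType": "water_temp"})
print(row["verbatimMeasurementType"], "->", row["measurementType"])
```

Returning a copy rather than editing the record in place keeps the source row available for comparison during quality control.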
@jdpye
Member

jdpye commented Jul 10, 2024

I like this proposal from my data curator perspective! It would help record the original names of biological measurements that have evolved within regional or cultural groups. These map well to BODC, but their original names are relevant and might be integral to the information as it was collected.

@ymgan

ymgan commented Jul 11, 2024

Thank you so much Steve! We would like to support this proposal because we want to use a general vocabulary for certain measurements in measurementType and measurementTypeID while keeping the nuance in verbatimMeasurementType.

Our current challenge

At the moment, we (the Antarctic OBIS/GBIF node) are placing the verbatim text under measurementType, which can differ from what is shown at the source of the measurementTypeID. This is because the wording for the same measurementType can differ slightly from what our data providers use in their reports or papers, and their wording often contains important details.

| measurementType | measurementTypeID |
| --- | --- |
| The δ13C measured in the considered sample, expressed in per mille and relative to the international reference Vienna Pee Dee Belemnite. | https://vocab.nerc.ac.uk/collection/P01/current/C13BTX01/ |

Cleaner solution

| verbatimMeasurementType | measurementType | measurementTypeID |
| --- | --- | --- |
| The δ13C measured in the tegument of the considered sea star specimen, expressed in per mille and relative to the international reference Vienna Pee Dee Belemnite. | Enrichment with respect to Vienna Pee Dee Belemnite (VPDB) of carbon-13 {13C CAS 14762-74-4} {delta(13)C} in biota {biological entity specified elsewhere} by mass spectrometry | http://vocab.nerc.ac.uk/collection/P01/current/C13BTX01/ |
| The δ13C measured in the adductor muscle of the considered mussel specimen, expressed in per mille and relative to the international reference Vienna Pee Dee Belemnite. | Enrichment with respect to Vienna Pee Dee Belemnite (VPDB) of carbon-13 {13C CAS 14762-74-4} {delta(13)C} in biota {biological entity specified elsewhere} by mass spectrometry | http://vocab.nerc.ac.uk/collection/P01/current/C13BTX01/ |

We think that having verbatimMeasurementType will be cleaner, as it presents consistent information for measurementType and measurementTypeID while allowing the details to be kept under verbatimMeasurementType.

This will help us so much as:

@sformel-usgs
Author

Just updating to say that the OBIS community is discussing this suggestion extensively. We plan to discuss it more formally at the next OBIS Vocabulary meeting on Sept 18th and will update this issue with the conclusions.

@rubenpp7

Hi everyone,

I totally understand the need for a place to store the verbatim MeasurementType. Up to now in EurOBIS we have also been storing the verbatim data under the measurementType field, allowing it to differ from the "name" of the BODC term used.

Please let me know if my understanding is correct: the proposal is to put the BODC term "name" of a concept, exactly as it appears in BODC, in the measurementType field.

If I got that right and we expect data providers to add this extra value in their submissions, I think it may mean extra work for the data creator that really belongs to the creators of the data services. I see how it is more convenient for us data managers to have these 3 columns next to each other in a table for quality control and data filtering, but I wonder whether it's really needed at the level of the data standard.

An alternative would be to have the data services (QC tools, data portals) extract information from BODC, just as they do with other vocabulary systems (e.g. WoRMS), in order to quality check and filter data.
To me, the point of these vocabularies is to use the ID of a concept to extract all the other information (only) when needed.
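
As a rough sketch of that alternative, a data service could resolve the label from the measurementTypeID only when it needs it, and cache the result. Everything named here is hypothetical: `make_label_resolver`, `demo_fetch` and `_DEMO_LABELS` are illustrative, and `demo_fetch` is a local stand-in rather than a call to any real vocabulary server API.

```python
from typing import Callable, Dict


def make_label_resolver(fetch_label: Callable[[str], str]) -> Callable[[str], str]:
    """Wrap a label-fetching function with a simple in-memory cache."""
    cache: Dict[str, str] = {}

    def resolve(measurement_type_id: str) -> str:
        # Only call the (possibly slow) fetcher the first time an ID is seen.
        if measurement_type_id not in cache:
            cache[measurement_type_id] = fetch_label(measurement_type_id)
        return cache[measurement_type_id]

    return resolve


# Stand-in for a real vocabulary lookup (hypothetical): a local table
# holding one label quoted earlier in this thread.
_DEMO_LABELS = {
    "http://vocab.nerc.ac.uk/collection/P01/current/TEMPPR01/":
        "Temperature of the water body",
}


def demo_fetch(measurement_type_id: str) -> str:
    return _DEMO_LABELS[measurement_type_id]


resolve_label = make_label_resolver(demo_fetch)
print(resolve_label("http://vocab.nerc.ac.uk/collection/P01/current/TEMPPR01/"))
```

With this approach the published record only needs the ID, and the label shown to users always reflects whatever the fetcher returns.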

Sorry for playing the devil's advocate here. I just think that if we add the "name", what stops us from adding everything else? For example, the deprecated label is also quite relevant. My reasoning is that we should only add new terms to the standard when there is information that is not being, or could not be, captured otherwise.

Cheers!

@sformel-usgs
Author

Just adding a note to say that we had a good discussion on this yesterday but were not able to come to a conclusion. We will continue the discussion in October. Suffice it to say there is no clear best way, but the conversation is helping us focus our thoughts on how the current situation is perceived by different OBIS contributors and users.

@sformel-usgs
Author

An additional note to say that the OBIS vocabulary discussion group continues to explore this. We would like to keep it open, but don't see an immediate solution. Please feel free to continue to comment and discuss in this space if you have thoughts, especially outside of the OBIS needs.

@tucotuco
Member

Of potential interest is the solution we are hoping to use in an evolution of Darwin Core based on work on the GBIF Unified Model. In a Darwin Core version 2 publishing model, MeasurementOrFacts would be called Assertions. Assertions support an assertionType, which is where data that one might put into this verbatimMeasurementType would go. Formal vocabulary values could also go there, but would only be understood as values from a controlled vocabulary if that vocabulary was also referenced, in a term called assertionTypeVocabulary. In addition to these options, a term called assertionTypeIRI would support a controlled value term directly if that term had an IRI that could be resolved for its definition.

The proposed Assertion class would look like this:

assertionID
assertionTargetID
assertionTargetType
assertionType
assertionTypeIRI
assertionTypeVocabulary
assertionMadeDate
assertionEffectiveDate
assertionValue
assertionValueNumeric
assertionUnit
assertionUnitIRI
assertionUnitVocabulary
assertionBy
assertionByID
assertionProtocolID
assertionProtocol
assertionCitation
assertionRemarks

where most of that is probably straightforward, but assertionTargetType would declare what the Assertion was about ('Event', 'Occurrence', 'Material Entity', 'Media', 'Agent', 'Identification', ... - not just Event or Occurrence), and assertionTargetID would be the identifier for the record the Assertion is about (eventID, occurrenceID, materialEntityID, mediaID, agentID, identificationID, ...).
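
To make the shape of the proposed class concrete, here is a minimal sketch of it as a Python dataclass. The field names come from the list above; the Python types, the choice of which fields are required, and the identifier values in the example (`"a-1"`, `"occ-1"`) are assumptions made only for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class Assertion:
    # Assumed required: an identifier plus what the assertion is about.
    assertionID: str
    assertionTargetID: str      # e.g. an occurrenceID or eventID
    assertionTargetType: str    # e.g. 'Event', 'Occurrence', 'Media'
    # Assumed optional; all types below are guesses, not part of the proposal.
    assertionType: Optional[str] = None           # free text, e.g. a verbatim term
    assertionTypeIRI: Optional[str] = None        # resolvable controlled term
    assertionTypeVocabulary: Optional[str] = None
    assertionMadeDate: Optional[str] = None
    assertionEffectiveDate: Optional[str] = None
    assertionValue: Optional[str] = None
    assertionValueNumeric: Optional[float] = None
    assertionUnit: Optional[str] = None
    assertionUnitIRI: Optional[str] = None
    assertionUnitVocabulary: Optional[str] = None
    assertionBy: Optional[str] = None
    assertionByID: Optional[str] = None
    assertionProtocolID: Optional[str] = None
    assertionProtocol: Optional[str] = None
    assertionCitation: Optional[str] = None
    assertionRemarks: Optional[str] = None


# Hypothetical record: the verbatim term sits in assertionType and the
# controlled term is referenced through assertionTypeIRI.
example = Assertion(
    assertionID="a-1",
    assertionTargetID="occ-1",
    assertionTargetType="Occurrence",
    assertionType="water_temp",
    assertionTypeIRI="http://vocab.nerc.ac.uk/collection/P01/current/TEMPPR01/",
)
print([f.name for f in fields(Assertion)])
```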
