filter on dataset name does not work #190

Open
andersfi opened this issue Oct 14, 2024 · 10 comments
Labels
help wanted Extra attention is needed

Comments

@andersfi
Member

We use the term dataset name to register project data from NBIC's "artsprosjekt" so that NBIC can verify that the data have been published according to the agreement. It is therefore important that this workflow actually works. We at NTNU-VM have run into a bug here.

Filtering on dataset name in the GBIF portal does not seem to work. For example, the occurrence https://www.gbif.org/occurrence/4948367391 has the dataset name " Artsdatabanken Artsprosjekt_7-20_Rotifers - small coastal pounds from Agder, Trøms and Finnmark". Filtering for this on https://www.gbif.org/occurrence/search?publishing_org=a8144f37-5ff7-4137-9400-94b5b2ea4ec4&advanced=1&dataset_name=Artsdatabanken%20Artsprosjekt_7-20_Rotifers%20-%20small%20 does not yield any hits on the portal.

This error occurred after renaming and republishing the dataset.
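
To make the report easier to reproduce, here is a minimal Python sketch that rebuilds the portal search link from its parts and decodes the `dataset_name` value a link actually carries. It uses only the query parameters visible in the link above (`publishing_org`, `advanced`, `dataset_name`); the helper names `build_search_url` and `dataset_name_filter` are made up for illustration. How the portal matches the value (exact, prefix, whitespace-sensitive) is not known here, so the sketch only helps compare strings and does not explain the bug.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

# Portal search page, as used in the link in this issue.
PORTAL_SEARCH = "https://www.gbif.org/occurrence/search"


def build_search_url(publishing_org: str, dataset_name: str) -> str:
    """Build a portal search link with the same parameters as the issue's link."""
    params = {
        "publishing_org": publishing_org,
        "advanced": "1",
        "dataset_name": dataset_name,
    }
    # quote_via=quote encodes spaces as %20 (not '+'), matching the link above.
    return f"{PORTAL_SEARCH}?{urlencode(params, quote_via=quote)}"


def dataset_name_filter(url: str) -> str:
    """Return the decoded dataset_name value carried by a search link."""
    return parse_qs(urlsplit(url).query)["dataset_name"][0]


# Values copied from this issue.
ORG = "a8144f37-5ff7-4137-9400-94b5b2ea4ec4"
RECORD_NAME = (
    " Artsdatabanken Artsprosjekt_7-20_Rotifers - small coastal pounds"
    " from Agder, Trøms and Finnmark"
)
ISSUE_LINK = (
    "https://www.gbif.org/occurrence/search"
    "?publishing_org=a8144f37-5ff7-4137-9400-94b5b2ea4ec4&advanced=1"
    "&dataset_name=Artsdatabanken%20Artsprosjekt_7-20_Rotifers%20-%20small%20"
)

if __name__ == "__main__":
    filter_value = dataset_name_filter(ISSUE_LINK)
    print(repr(filter_value))
    # The value in the link differs from the name on the record.
    print(filter_value == RECORD_NAME)
    # A link carrying the full name exactly as quoted on the record:
    print(build_search_url(ORG, RECORD_NAME))
```

Comparing the decoded filter value with the name shown on the occurrence page makes it easy to rule out differences in the strings themselves (truncation, leading or trailing spaces) before looking at the index.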

andersfi added the "help wanted" label Oct 14, 2024
@aaltenburger2
Collaborator

Related to the original question, I think the Provenance section, and therein the term project, would be a more natural place to include this information.
[image]
I can't find either the section or the term at https://dwc.tdwg.org/terms/. Do we have a field in MusIT that gets exported to the project field?

@dagendresen
Member

I believe that these terms are from EML (Ecological Metadata Language), mixed with some terms created by GBIF at the dataset level only, and are thus not available at the record level. There has been a well-documented request for projectID, projectName, project funder, etc. for Darwin Core (and record-level documentation of project data), but somebody needs to make the effort to submit the term request and maybe create a TDWG Task Group to develop such terms.

@aaltenburger2
Collaborator

I am happy to contribute to requesting those terms. Can you share what has been documented already?

@dagendresen
Member

dagendresen commented Oct 14, 2024

Shall we try to collect some relevant references and GitHub issues together? I seem to recall that at least Sharon Grant (Field Museum) and Ming (AntaBIF) have posted requests and suggestions. I suggest searching the GitHub repositories under GBIF and TDWG for the keywords "datasetID" and "datasetName".
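
As a rough sketch of that search, the snippet below builds GitHub issue-search links for both organisations and both keywords. It assumes the organisation handles are `gbif` and `tdwg` (as they appear in references such as gbif/pipelines#836 and tdwg/dwc-qa) and uses GitHub's `org:` search qualifier; the function name `github_issue_search_url` is hypothetical.

```python
from itertools import product
from urllib.parse import urlencode

# Assumed organisation handles and the keywords suggested above.
ORGS = ["gbif", "tdwg"]
KEYWORDS = ["datasetID", "datasetName"]


def github_issue_search_url(org: str, keyword: str) -> str:
    """Build a GitHub search link for issues in one organisation."""
    query = f"org:{org} {keyword}"
    return "https://github.com/search?" + urlencode({"q": query, "type": "issues"})


if __name__ == "__main__":
    # One link per (organisation, keyword) pair, to open in a browser.
    for org, keyword in product(ORGS, KEYWORDS):
        print(github_issue_search_url(org, keyword))
```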

@aaltenburger2
Collaborator

Thank you Dag! I see, it has been discussed previously, and a workaround was implemented by allowing the projectID metadata field to accept multiple values (gbif/pipelines#836). However, this solution still does not support projectIDs on individual records. Instead of creating a new issue, should we reopen the above GitHub issue #836 and request that projectID be added to DwC at the record level? I anticipate this might create issues with the projectID field in the GBIF metadata. What I want/need/suggest are "project name," "projectID," "funder name," and "funder ID" as DwC terms at the record level.

@dagendresen
Member

dagendresen commented Oct 15, 2024

Hi, I think that the appropriate chain of actions would be to (1) introduce project terms to Darwin Core (or another data standard) and then (2) introduce the TDWG terms to the GBIF application profiles we use with the IPT etc. Sort of top-down, letting the data standards rule applications.

Jumping in at the GBIF issue 836 would mean minting temporary (?) project terms in the GBIF namespace that have not passed TDWG standardization. Sort of bottom-up letting practice rule the data standards... (if somebody makes the effort to integrate such practice into the data standards...)

Both paths are of course possible :-)

Then there is of course also the possibility to FIND the project terms in other non-TDWG data standards - and to promote using such standardized terms in the GBIF application profiles... Developing a new TDWG standard or addition to DwC would of course also involve exploring how other data standards describe projects!!!

@aaltenburger2
Collaborator

I agree with your top-down approach. I couldn't find a discussion about it on the TDWG GitHub. Should I start one?

@dagendresen
Member

I suspect that there might be plenty already on various TDWG repositories... :-)
There are always the DwC-QnA (FAQ) threads to explore further and to post follow-up questions to...?
https://github.com/tdwg/dwc-qa/issues?q=is%3Aissue+is%3Aopen+project
tdwg/dwc-qa#37
tdwg/dwc-qa#83
tdwg/dwc-qa#100
tdwg/dwc-qa#199

@aaltenburger2
Collaborator

tdwg/dwc#527
