Explain how to programmatically access a term description #502

Open
nickynicolson opened this issue Nov 3, 2023 · 2 comments

Comments

@nickynicolson
Member

I have a SIMPLE_DWC format download from GBIF and I want to programmatically access the Darwin Core (dwc) term description for each column header. For example, for the term basisOfRecord I would like the definition "The specific nature of the data record." as given on the human-readable page here: https://dwc.tdwg.org/list/#dwc_basisOfRecord
The resources listed under getting started don't show an easy way to do this: they are either human-readable (not designed for programmatic access), include the complete term version history (I just want the latest version), or are aimed at people encoding data in dwc rather than consuming it ("distribution documents").

It may be that content negotiation could give me a structured version of the terms and definitions, but it doesn't appear to be covered here. Could the documentation please be revised to explain this use case?

@baskaufs

baskaufs commented Nov 4, 2023

Hi @nickynicolson. I think what you are looking for is on the page http://rs.tdwg.org/index, which is now accessible under the TDWG website Technical menu as the "Accessing standards metadata" item. Section 3 of that page describes how to retrieve machine-readable metadata about pretty much every part of every TDWG standard.

If you don't want to go the content negotiation/RDF route, Section 3.2 describes the primary CSV files from which all human-readable and machine-readable documents are derived. So, for example, the table in Section 4.1 says that the metadata for literal-value Darwin Core terms are in the "terms" directory. Knowing that, Section 3.2 says that the primary metadata about Darwin Core literal-value terms is in the "terms.csv" file in the "terms" directory of the rs.tdwg.org repo. Thus you can go to the CSV table https://github.com/tdwg/rs.tdwg.org/blob/master/terms/terms.csv to get the authoritative metadata about the terms.

These tables include the metadata about the most recent versions of ALL terms, including deprecated ones, so you would want to filter out the rows that have true in the term_deprecated column.
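
As a rough illustration of the CSV route, here is a minimal Python sketch that reads the table and keeps only non-deprecated terms. Several things in it are assumptions rather than documented facts: the raw-file URL is simply the usual raw.githubusercontent.com counterpart of the blob URL above, and the column names `term_localName` and `definition` are guesses that should be checked against the actual header row (only `term_deprecated` is named in this thread). The function names are hypothetical.

```python
import csv
import io
import urllib.request

# Assumption: raw-file counterpart of the blob URL quoted in this thread.
TERMS_CSV_URL = (
    "https://raw.githubusercontent.com/tdwg/rs.tdwg.org/master/terms/terms.csv"
)


def fetch_csv_text(url=TERMS_CSV_URL):
    """Download the CSV and return it as text (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def current_definitions(csv_text,
                        name_col="term_localName",   # assumed column name
                        definition_col="definition",  # assumed column name
                        deprecated_col="term_deprecated"):
    """Return {term name: definition}, skipping rows flagged as deprecated."""
    reader = csv.DictReader(io.StringIO(csv_text))
    definitions = {}
    for row in reader:
        # Rows with "true" in the deprecated column are filtered out.
        if (row.get(deprecated_col) or "").strip().lower() == "true":
            continue
        definitions[row[name_col]] = row[definition_col]
    return definitions
```

Something like `current_definitions(fetch_csv_text()).get("basisOfRecord")` would then look up a column header from a SIMPLE_DWC download, provided the assumed column names match the file.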

As I noted above, all of the human-readable documents (like the List of Terms and Quick Reference Guide) and machine readable representations (RDF/XML, Turtle, and JSON-LD) are generated from this table, so they all should provide the same term definitions.
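
For the content negotiation route, a request for a machine-readable representation might look roughly like the sketch below. It assumes that a term's IRI is the Darwin Core terms namespace (`http://rs.tdwg.org/dwc/terms/`) followed by the column header, and that the server answers an `Accept` header with one of the serializations listed above; Section 3 of the "Accessing standards metadata" page is the place to confirm the exact behavior. The function names are hypothetical.

```python
import urllib.request

# Darwin Core terms namespace; a term IRI is this plus the local name.
DWC_NAMESPACE = "http://rs.tdwg.org/dwc/terms/"


def build_term_request(local_name, media_type="text/turtle"):
    """Build a request asking for one serialization of a term's metadata."""
    return urllib.request.Request(
        DWC_NAMESPACE + local_name,
        headers={"Accept": media_type},  # e.g. application/ld+json for JSON-LD
    )


def fetch_term_metadata(local_name, media_type="text/turtle"):
    """Send the request and return the response body as text (needs network)."""
    request = build_term_request(local_name, media_type)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

The response would still need to be parsed with an RDF library to pull out the definition, which is one reason the CSV table may be the simpler option for this use case.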

@tucotuco Can we update the documentation on the Getting Started page to point to this page?

@tucotuco
Member

tucotuco commented Mar 1, 2024

Yes. I have tagged this for implementation.
