Supporting non rdf local vocabularies

wu-lee edited this page Nov 9, 2022 · 4 revisions

There is quite an overhead in complexity required to support a full rdf vocabulary. Specifically: someone who understands SKOS needs to write a Turtle file defining all the vocabulary terms, version-number it, commit it to the map-sse repository and deploy it on the vocabs domain. Then the se-open-data codebase needs to be adapted to include this vocabulary, so that the sausage machine can be modified to refer to it. Updates to this vocab need to be made carefully in a backward-compatible way, particularly when more than one project uses it.

We often wish to support a vocabulary that is simple and very local to an individual project, where complex metadata is not required. It would be nice for our architecture to support this, allowing rapid, "code free" addition of these vocabs.

For example, the OBO project has started classifying initiatives it works with by how they relate to it. In a field called Type of Relationships the following terms are supported:

  • Partner project
  • ObO Staff Team
  • Consortium Partner
  • Anchor Institution
  • Community Activist or Innovator
  • Community Anchor Organisation
  • Funder
  • Steering Group
  • Support Organisation
  • Supporter or Ambassador
  • Enterprise of Initiative

It is unlikely that this specific set of classifications will be used by other projects in the near future, and it would be nice to be able to quickly start supporting it. (If this changes and new projects do take it up, we can then upgrade it to full rdf support.)

The simplest level of support for this is to start allowing arbitrarily named fields of arbitrary strings to be included in source csv files. They would get passed through to the triple store with no metadata. These fields could not be supported by sameas merging (which is fine, as they should only occur in single sources of data), and there would be restrictions on how they could be used within mykomap. For example, such a field couldn't be used to indicate that a certain type of marker should be used to display an initiative. It would violate multi-language support. It may not be possible to filter using it.
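As a rough illustration of the pass-through idea, the sketch below separates the columns of a csv row that already have vocabulary support from any extra, arbitrarily named columns, which are kept as plain strings. The names `KNOWN_FIELDS`, `split_row` and the sample data are hypothetical and are not taken from se-open-data; how the extras would actually reach the triple store is left open.

```python
import csv
import io

# Hypothetical set of fields that already have full support.
KNOWN_FIELDS = {"Identifier", "Name"}


def split_row(row):
    """Separate a csv row into supported fields and pass-through extras."""
    known = {k: v for k, v in row.items() if k in KNOWN_FIELDS}
    # Anything else is passed through as an uninterpreted string.
    extras = {k: v for k, v in row.items() if k not in KNOWN_FIELDS}
    return known, extras


# Made-up sample data, for illustration only.
SAMPLE_CSV = """Identifier,Name,Type of Relationship
1,Example Org,Funder
"""

rows = [split_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
```

Because the extras carry no metadata, nothing here can tell a misspelt value from a valid one, which is one reason the field could not safely drive markers or filters.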

A slightly better solution would be to capture a little structure in the mykomap's config file, essentially by limiting the range of the terms usable in the field, say with an enumerated type.

This enumerated type would be made available when filling in field entries for the initiative. In the OBO case, this would be the drop-down of options available when populating the Type of Relationship field in the Organisations Obo table in Airtable. This enum mapping would also need to be added to the config file for de-referencing when the mykomap loads the initiative data. This would make it possible for this field to be used nearly as flexibly within the mykomap as the rdf-defined vocabs. It would probably still break multi-language support (without a few more extensions), but we are thinking that this is for very local vocabularies, so this issue may not arise. (For an enum field to support multiple languages, the terms in each language would need to be configured in mykomap.)
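A minimal sketch of what such an enum mapping and its de-referencing might look like follows. It is written in Python purely for illustration: the dictionary stands in for a fragment of the config file, and the structure, the keys such as `funder`, and the `dereference` function are all assumptions rather than mykomap's actual config format or API. Only three of the OBO terms are shown.

```python
# Hypothetical config fragment: each local vocab maps enum keys to
# display labels, keyed by language code.
LOCAL_VOCABS = {
    "Type of Relationship": {
        "partner_project": {"en": "Partner project"},
        "funder": {"en": "Funder"},
        "steering_group": {"en": "Steering Group"},
    }
}


def dereference(field, key, lang="en", vocabs=LOCAL_VOCABS):
    """Look up the display label for an enum key in a local vocab.

    Raises ValueError if the key is outside the enumerated range.
    Falls back to the English label when the requested language is
    missing (an assumption made for this sketch).
    """
    terms = vocabs[field]
    if key not in terms:
        raise ValueError(f"{key!r} is not a valid term for {field!r}")
    labels = terms[key]
    return labels.get(lang, labels["en"])
```

With something like this, `dereference("Type of Relationship", "funder")` gives the label to display, and a value outside the enumerated range is rejected at load time. Adding further language codes to each term is one way the multi-language extension mentioned above could be configured.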
