Simple Darwin Core
This text has been adapted from the documentation by TDWG Darwin Core Task Group, 2015, [http://rs.tdwg.org/dwc/terms/simple/index.htm](http://rs.tdwg.org/dwc/terms/simple/index.htm).
- What is it? And why is it “simple”?
- But, there must be some rules to use Simple Darwin Core, right?
- Simple Darwin Core vs. Extensions
- Darwin Core Extensions - just a brief intro
Simple Darwin Core is a subset of the terms in the Darwin Core standard: those in common use across a wide variety of biodiversity applications. Using Simple Darwin Core is like using a plain spreadsheet or a single database table, with rows and columns. Each row corresponds to a record, and each column corresponds to a particular term (or “field”), so term names can be thought of as field names. Simple Darwin Core is therefore simple in that it assumes (and allows) no structure beyond rows and columns. We can see an example in the figure below.
Also, Simple Darwin Core has minimal restrictions on which fields are required: it requires none. You might argue that there should be certain required fields, that there isn't anything useful you can do without them. And that might be partially true, as a record with no fields in it wouldn't be very interesting. But then, requiring a particular field would mean that every record must have it, and that can be cumbersome, or even impossible in some cases. By having no required field restriction, Simple Darwin Core can be used to share any meaningful combination of fields - for example, to share "just names", or "just places", or observations of individuals detected in the wild at a given place and time following a method (a Darwin Core Occurrence). This flexibility promotes the reuse of the terms and sharing mechanisms for a wide variety of services.
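The rows-and-columns idea, with no required fields, can be sketched in a few lines of Python. This is only an illustration, not an official tool: the helper name `to_simple_dwc_csv` is hypothetical, and the place values in the second record are made up for the example. The first record shares "just names" and the second "just places"; fields a record does not use are simply left blank.

```python
import csv
import io

# Each dict is one record; the keys are Darwin Core term names used as field names.
records = [
    # "Just names": taxon-related fields only (species taken from the text).
    {
        "scientificName": "Vulpes zerda (Zimmermann, 1780)",
        "genus": "Vulpes",
        "specificEpithet": "zerda",
        "scientificNameAuthorship": "(Zimmermann, 1780)",
    },
    # "Just places": location fields only (illustrative values, not real data).
    {"continent": "Africa", "country": "Morocco"},
]


def to_simple_dwc_csv(records):
    """Write records as one flat table: one row per record, one column per term."""
    # Collect every field name once, in order of first appearance.
    fieldnames = []
    for record in records:
        for name in record:
            if name not in fieldnames:
                fieldnames.append(name)

    buffer = io.StringIO()
    # restval="" leaves a cell empty when a record does not use a field,
    # which is allowed because no field is required.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_simple_dwc_csv(records))
```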
Well, to keep it simple, there are just a few general guiding principles on how to make the best use of Simple Darwin Core:
A. Structuring:
- Any Darwin Core term name can be used as a field name.
- No field name may be repeated in a record.
- Do not use a Class term (Occurrence, [Organism](http://rs.tdwg.org/dwc/terms/index.htm#Event), MaterialSample, LivingSpecimen, PreservedSpecimen, FossilSpecimen, Event, HumanObservation, MachineObservation, Location, GeologicalContext, Identification, Taxon) as a field. Classes in Darwin Core are categories used to group fields that describe a broader concept.
- Provide data in as many fields as you can. For example, if one record has information about the occurrence of Vulpes zerda (Zimmermann, 1780), do not limit yourself to providing the scientificName field; you can also provide genus, specificEpithet and scientificNameAuthorship.
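Two of the structuring rules above (no repeated field names, no Class terms used as fields) are easy to check mechanically. The following is a minimal sketch under stated assumptions: the function name `check_header` is hypothetical, the set of Class terms is copied from the list above, and the sketch does not verify that each name is an actual Darwin Core term, since that would need the full term list.

```python
# Class terms as listed in the guideline above; they must not be used as field names.
CLASS_TERMS = {
    "Occurrence", "Organism", "MaterialSample", "LivingSpecimen",
    "PreservedSpecimen", "FossilSpecimen", "Event", "HumanObservation",
    "MachineObservation", "Location", "GeologicalContext",
    "Identification", "Taxon",
}


def check_header(field_names):
    """Return a list of problems found in a list of column (field) names."""
    problems = []
    seen = set()
    for name in field_names:
        # Rule: no field name may be repeated in a record.
        if name in seen:
            problems.append(f"repeated field name: {name}")
        seen.add(name)
        # Rule: a Class term may not be used as a field.
        if name in CLASS_TERMS:
            problems.append(f"class term used as field: {name}")
    return problems


if __name__ == "__main__":
    print(check_header(["scientificName", "genus", "specificEpithet"]))
    print(check_header(["scientificName", "Taxon", "scientificName"]))
```

An empty list means the header passes both checks; anything else is a list of human-readable messages.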
B. Some fields you should always try to use:
- Use the type field to provide the name of the Dublin Core type class (PhysicalObject, StillImage, MovingImage, Sound, Text) that the record represents. Do not confuse this field with typeStatus, which refers to nomenclatural types (holotype, paratype, lectotype, etc.).
- Use the basisOfRecord field to provide the name of the most specific Darwin Core class (LivingSpecimen, PreservedSpecimen, FossilSpecimen, MaterialSample, HumanObservation, MachineObservation, Event, Occurrence, Taxon, Identification, Organism, Location, GeologicalContext, MeasurementOrFact, ResourceRelationship) that the record represents.
C. Some best practices when providing content:
- Populate fields with data that match the definition of the field.
- Use controlled vocabularies for the values of fields that recommend them.
- If data are withheld, use the [informationWithheld](http://rs.tdwg.org/dwc/terms/index.htm#informationWithheld) field to say so.
- If data are shared in lower quality than the original, use [dataGeneralizations](http://rs.tdwg.org/dwc/terms/index.htm#dataGeneralizations) to say so.
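As an illustration of checking values against a controlled vocabulary, the sketch below tests the type and basisOfRecord fields against the value lists given in section B. It is a hedged example, not a validator from the Darwin Core project: the function name `check_record` is hypothetical, and because Simple Darwin Core requires no fields, a missing value is reported only as a warning.

```python
# Allowed values copied from the lists in section B above.
TYPE_VALUES = {"PhysicalObject", "StillImage", "MovingImage", "Sound", "Text"}
BASIS_OF_RECORD_VALUES = {
    "LivingSpecimen", "PreservedSpecimen", "FossilSpecimen", "MaterialSample",
    "HumanObservation", "MachineObservation", "Event", "Occurrence", "Taxon",
    "Identification", "Organism", "Location", "GeologicalContext",
    "MeasurementOrFact", "ResourceRelationship",
}

VOCABULARIES = {"type": TYPE_VALUES, "basisOfRecord": BASIS_OF_RECORD_VALUES}


def check_record(record):
    """Return warnings for recommended fields that are empty or off-vocabulary."""
    warnings = []
    for field, allowed in VOCABULARIES.items():
        value = record.get(field, "")
        if value == "":
            # Not an error: no field is required, but these two are recommended.
            warnings.append(f"{field} is empty (recommended field)")
        elif value not in allowed:
            warnings.append(f"{field} has unexpected value: {value}")
    return warnings


if __name__ == "__main__":
    print(check_record({"type": "PhysicalObject", "basisOfRecord": "PreservedSpecimen"}))
    print(check_record({"type": "Photo"}))
```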
As noted above, Simple Darwin Core does not allow two fields to have the same name. This becomes a problem when a record has several values for the same attribute. Let’s look at two examples (and you can take a look at the graph below as well):
a. Suppose we have a record that contains a total length measurement of some creature (total length = 194 cm). Where would we capture that information? There are three Simple Darwin Core terms that serve that purpose: measurementType, measurementValue and measurementUnit. We would use those three fields and populate them with the kind of measurement (“total length”), the actual value (194) and the unit (cm), respectively.
b. Now, suppose we have a record that contains both the total length of a creature and the length of its tail. Since Simple Darwin Core does not allow two fields called “measurementType”, we cannot capture both lengths (total and tail), and we face the same problem with the corresponding values and units. So… what do we do? Do we capture only one length and forget the rest? No, of course not! We need what we call an Extension.
Darwin Core Extensions are simply that: extensions, sets of fields added to Simple Darwin Core that allow us to capture information that we would not be able to capture if we only used the terms included in Simple Darwin Core. Examples of extensions are “Extended Measurement Or Facts”, “Vernacular Names”, “Identification History”, “GGBN DNA Cloning”, among many others.
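The measurement example can be sketched as two tables: one core row per record, plus a separate table holding any number of measurement rows that point back to it. This is a simplified illustration and not the actual extension definition. The linking column name `coreid`, the identifier `occ-1`, the helper `measurements_for` and the tail length value are all assumptions made up for the example; only the total length (194 cm) comes from the text.

```python
# Core table: one row per record, each field name used at most once.
core_records = [
    {"occurrenceID": "occ-1", "basisOfRecord": "PreservedSpecimen"},
]

# Extension table: many rows may point at the same core record, so the
# measurement fields can repeat per row instead of per column.
measurement_rows = [
    {"coreid": "occ-1", "measurementType": "total length",
     "measurementValue": "194", "measurementUnit": "cm"},
    # Tail length value below is invented purely for illustration.
    {"coreid": "occ-1", "measurementType": "tail length",
     "measurementValue": "60", "measurementUnit": "cm"},
]


def measurements_for(core_id, rows):
    """Return every extension row linked to the given core record identifier."""
    return [row for row in rows if row["coreid"] == core_id]


if __name__ == "__main__":
    for record in core_records:
        for row in measurements_for(record["occurrenceID"], measurement_rows):
            print(record["occurrenceID"], row["measurementType"],
                  row["measurementValue"], row["measurementUnit"])
```

Both lengths are kept, and the core table still has no repeated field names.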
“Hmmm…”, you may be thinking, “Then, is there an extension for every field that I cannot capture using Simple Darwin Core?” The answer is actually "No", for two reasons:
a. An extension is a set of fields that serves a particular purpose. This implies that a new field alone does not constitute an extension. In other words, we don’t build extensions for a single field, but rather to take care of several related fields.
b. Although there are many extensions available, extensions are built upon necessity and it could happen that none of the existing ones covers your needs. In this case, you are of course welcome to propose and test a new extension to address a particular problem (this is indeed done all the time).
If you want to take a glance at available extensions, you may visit the GBIF site http://tools.gbif.org/dwca-validator/extensions.do, where you can find a list with brief descriptions of what each extension addresses along with links to the details of each extension.
For a short video explaining the basics of Extensions, you can check our Extensions in Brief Webinar.