ELViS will handle various visit, loan, and digitisation requests. To do that, it needs at least three types of datasets:
- Institutional data (location, description, facilities etc)
- Researcher/expert profile data
- Collection Index.
In this test and example, we focus on the first two (institutional data and researcher/expert profile data). The collection index prototype is currently under construction.
The main goal is to gather and link relevant information from various authoritative data sources (such as GBIF, CETAF).
We also need to verify the data coming from different sources, for example by maintaining a list of trusted data providers.
(and related user stories)
- Provide researcher/expert/staff profile
DiSSCo#82 'Know who is in charge of the collection'
- Borrower profile
DiSSCo#10 DiSSCo#26 'Assess and validate the request for registration'
- Search and query by name, ORCID DiSSCo#138 'Search for experts'
- Link the person with the institution.
- Link the person with various research outputs (such as publications and specimen identifications).
- Provide facility data
DiSSCo#90 'Indicate what facilities one like to use during a visit' DiSSCo#6 'Visit a relevant institution for a schedule/taxonomic group'
- Minimum items needed: Linkable ID, name, description.
- API (or, if there is no API, some other way to import/export data)
- Standard parsable output (such as JSON, CSV)
- Linkable items (such as institutionCode, occurrenceID, facility id, lab id, ORCID).
- For people profiles we need:
- Linkable institutionCode
- Verification method? Trust?
- Speciality, expertise (such as how many publications, research topic etc).
- Data Verification method? Trust?
- Not all data sources provide an API.
- How do we link different sources, and what do we link? For example, the 'Institution code' in GBIF and in CETAF are not the same.
Steps:
- Using the Bloodhound web service, grab a list of public profiles (includes ORCID and Wikidata identifiers).
- For each of these profiles we get a JSON data dump (sourced from GBIF) of the specimen records associated with that person. For example, https://bloodhound-tracker.net/0000-0001-7618-5230/specimens.json provides a list of records associated with an ORCID. Using a simple Python JSON parser we can generate an item like this, which links to a specific GBIF record:
id|https://orcid.org/0000-0001-7618-5230
givenName|David Peter
identifiedBy|Shorthouse, D.
decimalLatitude|56.839
occurrenceID|urn:catalog:UASM:UASM329573
family|Linyphiidae
countryCode|CA
sameAs|https://gbif.org/occurrence/769279710
country|Canada
institutionCode|University of Alberta Museums (UAM)
catalogNumber|UASM329573
typeStatus|None
collectionCode|UASM
eventDate|2004-07
decimalLongitude|-118.340
scientificName|Oreonetides vaginatus
dateIdentified|2010
year|None
recordedBy|Pinzon, J.
@id|https://gbif.org/occurrence/769279710
@type|PreservedSpecimen
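The item above can be produced with very little code. The following is a minimal sketch, not the project's actual script: the `PERSON` and `SPECIMEN` dictionaries are hypothetical stand-ins for values already parsed out of `specimens.json` (whose real layout may differ), and `build_item` and `format_item` are illustrative names.

```python
# Hypothetical, already-parsed input. The real structure of specimens.json
# may differ; only a few of the fields from the item above are repeated here.
PERSON = {
    "id": "https://orcid.org/0000-0001-7618-5230",
    "givenName": "David Peter",
}
SPECIMEN = {
    "identifiedBy": "Shorthouse, D.",
    "occurrenceID": "urn:catalog:UASM:UASM329573",
    "sameAs": "https://gbif.org/occurrence/769279710",
    "institutionCode": "University of Alberta Museums (UAM)",
    "catalogNumber": "UASM329573",
    "typeStatus": None,
    "collectionCode": "UASM",
    "scientificName": "Oreonetides vaginatus",
    "@id": "https://gbif.org/occurrence/769279710",
    "@type": "PreservedSpecimen",
}


def build_item(person: dict, specimen: dict) -> dict:
    """Merge person-level fields with one specimen record into a flat item."""
    item = {"id": person["id"], "givenName": person.get("givenName")}
    item.update(specimen)
    return item


def format_item(item: dict) -> str:
    """Render an item as one 'key|value' line per field."""
    return "\n".join(f"{key}|{value}" for key, value in item.items())


if __name__ == "__main__":
    print(format_item(build_item(PERSON, SPECIMEN)))
```

Missing values are printed as `None`, matching the `typeStatus|None` line in the item above.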
Ideally, we want to link the 'institutionCode' (UASM) to the GBIF registry, which has a unique ID for the institution: https://www.gbif.org/grscicoll/institution/fb10ac30-e517-4f2b-8d11-c6be465c38a5
We also want to bring in the facilities data from CETAF.
- How many laboratories are in use in your institution?
- List of laboratories
- Number of permanent exhibitions
- List of permanent exhibitions
- Exhibition URL
- Number of recent exhibitions
- Recent Temporary Exhibitions
- Number of current exhibitions
- Current Temporary Exhibitions
- Number of future exhibitions
- Future exhibitions
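One way to approach the linking step is a small, hand-maintained crosswalk from the codes found in occurrence records to the GBIF registry entry. The sketch below is an assumption about how this could look, not an existing ELViS component: `InstitutionLink`, `CROSSWALK`, and `link_institution` are hypothetical names, and `cetaf_id` is an empty placeholder because no CETAF identifier is given here. Note that in the example record the short code `UASM` appears under `collectionCode`, while `institutionCode` holds the full name, so the lookup tries both fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class InstitutionLink:
    """Identifiers for one institution across sources (hypothetical shape)."""
    gbif_registry_url: str
    cetaf_id: Optional[str] = None  # placeholder, to be filled from CETAF data


# Hand-maintained crosswalk keyed by the code used in occurrence records.
CROSSWALK = {
    "UASM": InstitutionLink(
        gbif_registry_url=(
            "https://www.gbif.org/grscicoll/institution/"
            "fb10ac30-e517-4f2b-8d11-c6be465c38a5"
        ),
    ),
}


def link_institution(record: dict, crosswalk: dict = CROSSWALK) -> Optional[InstitutionLink]:
    """Return the linked institution for a record, or None if no code matches."""
    for key in ("institutionCode", "collectionCode"):
        code = record.get(key)
        if code in crosswalk:
            return crosswalk[code]
    return None


if __name__ == "__main__":
    record = {
        "institutionCode": "University of Alberta Museums (UAM)",
        "collectionCode": "UASM",
    }
    print(link_institution(record))
```

A static table like this does not scale, but it makes the mismatches between sources explicit and gives a single place where links can be reviewed and verified.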
This is just a simple proof of concept that uses the Bloodhound API: https://github.com/DiSSCo/user-stories/blob/master/generate-elvis-profile-data.py
$ python generate-elvis-profile-data.py 0000-0001-7618-5230
Shorthouse, D. Oreonetides vaginatus urn:catalog:UASM:UASM329573 University of Alberta Museums (UAM) UASM329573 https://gbif.org/occurrence/769279710
Shorthouse, D. Tunagyna debilis urn:catalog:UASM:UASM329612 University of Alberta Museums (UAM) UASM329612 https://gbif.org/occurrence/769279986
Shorthouse, D. Oreonetides vaginatus urn:catalog:UASM:UASM329574 University of Alberta Museums (UAM) UASM329574 https://gbif.org/occurrence/769281222