Merge pull request #11 from jdvorak001/support-multiple-versions
Support multiple versions of the OpenAIRE CRIS Guidelines
jdvorak001 authored Jul 6, 2023
2 parents 5737b1a + fbb98f6 commit 2370001
Showing 73 changed files with 12,641 additions and 134 deletions.
6 changes: 3 additions & 3 deletions .github/workflows/maven.yml
@@ -15,11 +15,11 @@ jobs:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v2
- uses: actions/checkout@v3
- name: Checkout the OpenAIRE Guidelines for CRIS Managers repo
run: git clone --branch=master https://github.com/openaire/guidelines-cris-managers.git ../guidelines-cris-managers
run: git clone --branch=v1.2 https://github.com/euroCRIS/guidelines-cris-managers.git ../guidelines-cris-managers
- name: Set up JDK 17
uses: actions/setup-java@v2
uses: actions/setup-java@v3
with:
java-version: '17'
distribution: 'adopt'
36 changes: 27 additions & 9 deletions CHECKS.md
@@ -2,28 +2,46 @@

The meaning of the SHALL keyword is specified in [RFC 2119](https://www.ietf.org/rfc/rfc2119.txt).

(0) Any XML response returned by the endpoint to the requests specified below SHALL validate with respect to the following XML Schemas:
<http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd> for the namespace <http://www.openarchives.org/OAI/2.0/>,
<http://www.openarchives.org/OAI/2.0/oai-identifier.xsd> for the namespace <http://www.openarchives.org/OAI/2.0/oai-identifier> and
<https://raw.githubusercontent.com/openaire/guidelines-cris-managers/master/schemas/openaire-cerif-profile.xsd> for the namespace <https://www.openaire.eu/cerif-profile/1.1/>.
Upon startup, the validator constructs the table of supported OpenAIRE CRIS metadata schemas (the *TSOACMS* in the sequel).
It currently includes the following schemas:
| XML namespace URI | XML Schema source location |
|--------------------------------------------|-----------------------------------------------------------------------------------|
| https://www.openaire.eu/cerif-profile/1.1/ | [src/main/resources/schemas/cerif_profile_1_1/openaire-cerif-profile.xsd](./src/main/resources/schemas/cerif_profile_1_1/openaire-cerif-profile.xsd) |
| https://www.openaire.eu/cerif-profile/1.2/ | [schemas/openaire-cerif-profile.xsd](../../../../openaire/guidelines-cris-managers/blob/v1.2/schemas/openaire-cerif-profile.xsd) in the [OpenAIRE CRIS Guidelines project](../../../../openaire/guidelines-cris-managers) |

(0) Any XML response returned by the endpoint to the requests specified below SHALL validate with respect to the XML Schemas from the *TSOACMS* and the following XML Schemas:
| XML namespace URI | XML Schema source location |
|-------------------|----------------------------|
| http://www.openarchives.org/OAI/2.0/ | http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd |
| http://www.openarchives.org/OAI/2.0/oai-identifier | http://www.openarchives.org/OAI/2.0/oai-identifier.xsd |

(1) The response to an `Identify` request SHALL include:
(a) exactly one `description` element that contains a `Service` element from namespace <https://www.openaire.eu/cerif-profile/1.1/> and
(a) exactly one `description` element that contains a `Service` element from a namespace from the *TSOACMS* and
(b) exactly one `description` element that contains an `oai-identifier` element from namespace <http://www.openarchives.org/OAI/2.0/oai-identifier>.
The `oai-identifier/repositoryIdentifier` from (b) will be referred to as `{CRIS_identifier}` in the sequel.
(c) The `Service/Acronym` from (a) SHALL be equal to the `{CRIS_identifier}`.
(d) The `baseURL` from the `Identify` response is equal to the base URL of the CRIS.

(2) The list of supported metadata formats returned by the general `ListMetadataFormats` request (i.e., no `identifier` parameter specified) SHALL include
`oai_cerif_openaire` with namespace <https://www.openaire.eu/cerif-profile/1.1/>
as per [specification](http://openaire-guidelines-for-cris-managers.readthedocs.io/en/latest/implementation.html#metadata-format-and-prefix).
(2) As per the [specification](http://openaire-guidelines-for-cris-managers.readthedocs.io/en/latest/implementation.html#metadata-format-and-prefix),
the list of supported metadata formats returned by the general `ListMetadataFormats` request (i.e., no `identifier` parameter specified)
(a) SHALL include at least one prefix starting with `oai_cerif_openaire`;
(b) If the metadata prefix starts with `oai_cerif_openaire`, the corresponding XML namespace URI SHALL start with <https://www.openaire.eu/cerif-profile/>.
(c) If the XML namespace URI starts with <https://www.openaire.eu/cerif-profile/>, the metadata prefix SHALL start with `oai_cerif_openaire`.
(d) The metadata prefixes SHALL be unique.
(e) The XML namespace URIs SHALL be unique.
(f) The XML Schema location URLs SHALL be unique.
(g) If the XML namespace starts with <https://www.openaire.eu/cerif-profile/>, it SHALL be found in the *TSOACMS*.
(h) If the XML namespace starts with <https://www.openaire.eu/cerif-profile/>, the XML Schema location SHALL be under <https://www.openaire.eu/schema/cris/>.
(i) If the XML namespace starts with <https://www.openaire.eu/cerif-profile/>, the filename of the XML Schema SHALL be `openaire-cerif-profile.xsd`.
(j) The targetNamespace of a referenced XML Schema SHALL be the same as the one advertised for the metadata format in the `ListMetadataFormats` response.
(k) If the Identify response in (1) includes a Service/Compatibility of a certain version of these Guidelines, then the ListMetadataFormats response SHALL include the corresponding namespace.
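
Rules (a) to (i) of check (2) can be pictured with a small Java sketch. This is only an illustration under stated assumptions, not the validator's actual code: the class `MetadataFormatChecks`, its `Format` holder and the `check` method are hypothetical names, and the *TSOACMS* is simplified to a plain set of namespace URIs. Rules (j) and (k) are left out because they need the referenced schema and the `Identify` response.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the validator: checks rules (2)(a)-(i).
final class MetadataFormatChecks {

    static final String PREFIX_START = "oai_cerif_openaire";
    static final String NS_START = "https://www.openaire.eu/cerif-profile/";
    static final String SCHEMA_BASE = "https://www.openaire.eu/schema/cris/";
    static final String SCHEMA_FILE = "openaire-cerif-profile.xsd";

    /** One metadataFormat entry of a ListMetadataFormats response. */
    static final class Format {
        final String prefix;
        final String schema;
        final String namespace;

        Format(String prefix, String schema, String namespace) {
            this.prefix = prefix;
            this.schema = schema;
            this.namespace = namespace;
        }
    }

    /**
     * Returns one message per violated rule, each starting with the rule letter.
     * An empty list means rules (a)-(i) hold. The supportedNamespaces set
     * stands in for the TSOACMS (an assumption of this sketch).
     */
    static List<String> check(List<Format> formats, Set<String> supportedNamespaces) {
        List<String> errors = new ArrayList<>();
        Set<String> prefixes = new HashSet<>();
        Set<String> namespaces = new HashSet<>();
        Set<String> schemas = new HashSet<>();
        boolean anyCerifPrefix = false;
        for (Format f : formats) {
            boolean cerifPrefix = f.prefix.startsWith(PREFIX_START);
            boolean cerifNs = f.namespace.startsWith(NS_START);
            anyCerifPrefix |= cerifPrefix;
            if (cerifPrefix && !cerifNs) errors.add("(b) unexpected namespace for " + f.prefix);
            if (cerifNs && !cerifPrefix) errors.add("(c) unexpected prefix for " + f.namespace);
            // Set.add returns false when the value was already present.
            if (!prefixes.add(f.prefix)) errors.add("(d) duplicate prefix " + f.prefix);
            if (!namespaces.add(f.namespace)) errors.add("(e) duplicate namespace " + f.namespace);
            if (!schemas.add(f.schema)) errors.add("(f) duplicate schema location " + f.schema);
            if (cerifNs) {
                if (!supportedNamespaces.contains(f.namespace)) errors.add("(g) unsupported namespace " + f.namespace);
                if (!f.schema.startsWith(SCHEMA_BASE)) errors.add("(h) schema not under " + SCHEMA_BASE);
                if (!f.schema.endsWith("/" + SCHEMA_FILE)) errors.add("(i) schema file is not " + SCHEMA_FILE);
            }
        }
        if (!anyCerifPrefix) errors.add("(a) no prefix starting with " + PREFIX_START);
        return errors;
    }
}
```

Run against the two `metadataFormat` entries of `samples/_verb=ListMetadataFormats.xml` and a set holding the 1.1 and 1.2 namespaces, this sketch returns an empty list.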

(3) The list of supported sets returned by the `ListSets` request SHALL include
all of the sets as per the [specification](http://openaire-guidelines-for-cris-managers.readthedocs.io/en/latest/implementation.html#openaire-oai-pmh-sets).

(4) - removed

(5) When all objects from the sets as per the [specification](http://openaire-guidelines-for-cris-managers.readthedocs.io/en/latest/implementation.html#openaire-oai-pmh-sets)
(5) When all objects from the sets and metadata formats as per the [specification](http://openaire-guidelines-for-cris-managers.readthedocs.io/en/latest/implementation.html#openaire-oai-pmh-sets)
are retrieved using the `ListRecords` requests and put together, the following statements SHALL hold:
(a) Any `id` attribute in the CERIF XML markup points at an OAI record with identifier constructed as per [specification](http://openaire-guidelines-for-cris-managers.readthedocs.io/en/latest/implementation.html#oai-identifiers), including the `{CRIS_identifier}`.
(b) CERIF XML markup contains no conflicts in properties: where a property value is given, the value does not differ from that in other places where the value of the same property is given.
14 changes: 7 additions & 7 deletions README.md
@@ -1,7 +1,7 @@
# OpenAIRE CRIS validator

A tool to assess whether an OAI-PMH endpoint can provide research information
complying with the [OpenAIRE Guidelines for CRIS Managers 1.1](https://github.com/openaire/guidelines-cris-managers).
complying with the [OpenAIRE Guidelines for CRIS Managers](https://github.com/openaire/guidelines-cris-managers) versions 1.1 and 1.2.
It covers [these checks](CHECKS.md).

This is a command-line Java tool that is organized as a [JUnit](https://junit.org/junit4/) test suite.
@@ -12,7 +12,7 @@ This is Open Source software, available under the terms of the [Apache 2.0 Licen


![CI workflow](https://github.com/euroCRIS/openaire-cris-validator/actions/workflows/maven.yml/badge.svg)
← checking if the software builds and runs on the [example files from the standard](https://github.com/openaire/guidelines-cris-managers/tree/master/samples).
← checking if the software builds and runs on the [example files](./samples).

## Usage

@@ -23,7 +23,7 @@ Then do:

mvn clean package

We compile for Java 17 by default, but you can switch to 11 or 1.8 [in the POM file](https://github.com/EuroCRIS/openaire-cris-validator/blob/master/pom.xml#L16).
We compile for Java 17 by default, but you can switch to 11 or 1.8 [in the POM file](./pom.xml#L16).

### Run

@@ -51,17 +51,17 @@ The validator copies the responses to the requests it makes into files in the `d
[CRISValidator](./src/main/java/org/eurocris/openaire/cris/validator/CRISValidator.java) is the main validator class. It is the JUnit4 test suite.
As it reads the metadata records from the CRIS:
* it does simple checks on the fly (using [CheckingIterable](./src/main/java/org/eurocris/openaire/cris/validator/util/CheckingIterable.java)); and
* it builds an internal representation: a [HashMap](https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html) of trees that consist of [CERIFNode](./src/main/java/org/eurocris/openaire/cris/validator/tree/CERIFNode.java)s. The last test, `check990_CheckReferentialIntegrityAndFunctionalDependency`, works on this internal representation.
* it builds an internal representation: a [HashMap](https://devdocs.io/openjdk~17/java.base/java/util/hashmap) of trees that consist of [CERIFNode](./src/main/java/org/eurocris/openaire/cris/validator/tree/CERIFNode.java)s. The last test, `check990_CheckReferentialIntegrityAndFunctionalDependency`, works on this internal representation.

[OAIPMHEndpoint](./src/main/java/org/eurocris/openaire/cris/validator/OAIPMHEndpoint.java) is an independent implementation
of an [OAI-PMH 2.0](https://www.openarchives.org/OAI/openarchivesprotocol.html) client in Java.
While it uses JAXB to map the OAI-PMH 2.0 markup to Java objects, any metadata payload is opaque to it.
For requests that list objects (i.e., `ListIdentifiers`, `ListRecords` or `ListSets`) an [Iterable](https://docs.oracle.com/javase/8/docs/api/java/lang/Iterable.html) is returned
For requests that list objects (i.e., `ListIdentifiers`, `ListRecords` or `ListSets`) an [Iterable](https://devdocs.io/openjdk~17/java.base/java/lang/iterable) is returned
that uses the protocol's resumption token mechanism to fetch successive chunks of objects.
This is entirely transparent to the class user.
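
The resumption-token paging described above can be sketched as a lazy `Iterable`. This is a minimal illustration and not the actual `OAIPMHEndpoint` code: `ResumptionTokenIterable`, `Chunk` and the `fetcher` function are hypothetical names, and the HTTP request is abstracted into a function that maps a token (`null` for the first request) to one chunk of results.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Function;

// Hypothetical sketch: iterates over all items of a chunked list response.
final class ResumptionTokenIterable<T> implements Iterable<T> {

    /** One chunk of results plus the token for the next chunk (null or empty when done). */
    static final class Chunk<T> {
        final List<T> items;
        final String resumptionToken;

        Chunk(List<T> items, String resumptionToken) {
            this.items = items;
            this.resumptionToken = resumptionToken;
        }
    }

    // Maps a resumption token (null for the initial request) to the next chunk.
    private final Function<String, Chunk<T>> fetcher;

    ResumptionTokenIterable(Function<String, Chunk<T>> fetcher) {
        this.fetcher = fetcher;
    }

    @Override
    public Iterator<T> iterator() {
        return new Iterator<T>() {
            private Iterator<T> current = Collections.emptyIterator();
            private String token = null;
            private boolean exhausted = false;

            @Override
            public boolean hasNext() {
                // Fetch further chunks only when the current one is used up.
                while (!current.hasNext() && !exhausted) {
                    Chunk<T> chunk = fetcher.apply(token);
                    current = chunk.items.iterator();
                    token = chunk.resumptionToken;
                    exhausted = (token == null || token.isEmpty());
                }
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }
}
```

A caller simply writes `for (T item : iterable) { ... }`; the next request is issued only once the previous chunk has been consumed, which is what keeps the paging invisible to the class user.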

If the OAI-PMH 2.0 data provider advertises support for a compression, the endpoint client object will use it.
[CompressionHandlingHttpURLConnectionAdapter](./src/main/java/org/eurocris/openaire/cris/validator/http/CompressionHandlingHttpURLConnectionAdapter.java) is a transparent compression-handling wrapper around an [HttpURLConnection](https://docs.oracle.com/javase/8/docs/api/java/net/HttpURLConnection.html).
[CompressionHandlingHttpURLConnectionAdapter](./src/main/java/org/eurocris/openaire/cris/validator/http/CompressionHandlingHttpURLConnectionAdapter.java) is a transparent compression-handling wrapper around an [HttpURLConnection](https://devdocs.io/openjdk~17/java.base/java/net/httpurlconnection).
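
As a rough picture of what such a compression-handling wrapper has to do, the sketch below picks a decoding stream based on the `Content-Encoding` of the response. It is an assumption-laden illustration, not the code of `CompressionHandlingHttpURLConnectionAdapter`: the class `CompressionSketch` and its methods `decode` and `openDecoded` are hypothetical, and only `gzip` and `deflate` are handled.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.util.Locale;
import java.util.zip.GZIPInputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical sketch of transparent response decompression.
final class CompressionSketch {

    /** Wraps the raw stream according to the Content-Encoding header value (may be null). */
    static InputStream decode(InputStream raw, String contentEncoding) throws IOException {
        if (contentEncoding == null) {
            return raw; // no encoding declared: pass the stream through
        }
        switch (contentEncoding.trim().toLowerCase(Locale.ROOT)) {
            case "gzip":
                return new GZIPInputStream(raw);
            case "deflate":
                return new InflaterInputStream(raw); // zlib-wrapped deflate data
            default:
                return raw; // unknown encoding: left to the caller
        }
    }

    /**
     * Asks for the given encoding and returns the decoded response body.
     * Must be called before the connection is opened, because request
     * headers cannot be changed afterwards.
     */
    static InputStream openDecoded(HttpURLConnection conn, String acceptEncoding) throws IOException {
        conn.setRequestProperty("Accept-Encoding", acceptEncoding);
        return decode(conn.getInputStream(), conn.getContentEncoding());
    }
}
```

In this sketch the caller would pass as `acceptEncoding` a value that the data provider advertised in its `Identify` response, so that compression is requested only when it is supported.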


## Feedback
@@ -72,7 +72,7 @@ Please submit a [github issue](https://github.com/euroCRIS/openaire-cris-validat

## License

Copyright 2018–2022 Jan Dvořák <a href="https://orcid.org/0000-0001-8985-152X" target="orcid.widget" rel="noopener noreferrer" style="vertical-align:top;"><img src="https://orcid.org/sites/default/files/images/orcid_16x16.png" style="width:1em;margin-right:.5em;" alt="ORCID iD icon">https://orcid.org/0000-0001-8985-152X</a> and other contributors
Copyright 2018–2023 Jan Dvořák <a href="https://orcid.org/0000-0001-8985-152X" target="orcid.widget" rel="noopener noreferrer" style="vertical-align:top;"><img src="https://orcid.org/sites/default/files/images/orcid_16x16.png" style="width:1em;margin-right:.5em;" alt="ORCID iD icon">https://orcid.org/0000-0001-8985-152X</a> and other contributors

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
77 changes: 61 additions & 16 deletions pom.xml
@@ -2,9 +2,9 @@
<modelVersion>4.0.0</modelVersion>
<groupId>org.eurocris</groupId>
<artifactId>openaire-cris-validator</artifactId>
<version>1.4.1-SNAPSHOT</version>
<version>2.0.0-SNAPSHOT</version>
<name>OpenAIRE CRIS Prototype Validator</name>
<description>A basic prototype validator to assess whether an OAI-PMH endpoint complies with the requirements set by the OpenAIRE Guidelines for CRIS Manager 1.1.</description>
<description>A basic prototype validator to assess whether an OAI-PMH endpoint complies with the requirements set by the OpenAIRE Guidelines for CRIS Managers 1.1 or higher.</description>
<organization>
<name>euroCRIS</name>
<url>https://www.eurocris.org/</url>
@@ -50,7 +50,6 @@
<encoding>UTF-8</encoding>
</configuration>
</plugin>

<plugin>
<groupId>org.jvnet.jaxb2.maven2</groupId>
<artifactId>maven-jaxb2-plugin</artifactId>
@@ -59,7 +58,7 @@
<dependency>
<groupId>org.glassfish.jaxb</groupId>
<artifactId>jaxb-runtime</artifactId>
<version>2.3.3</version>
<version>2.3.8</version>
</dependency>
</dependencies>
<executions>
@@ -72,11 +71,10 @@
</executions>
<configuration>
<extension>true</extension>
<schemaDirectory>${guidelines.project.dir}/schemas</schemaDirectory>
<schemaDirectory>src/main/xjc/schemas</schemaDirectory>
<schemaIncludes>
<schemaInclude>cached/oai-identifier.xsd</schemaInclude>
<schemaInclude>cached/OAI-PMH.xsd</schemaInclude>
<!-- schemaInclude>openaire-cerif-profile.xsd</schemaInclude -->
<include>cached/oai-identifier.xsd</include>
<include>cached/OAI-PMH.xsd</include>
</schemaIncludes>
<bindingDirectory>src/main/xjc</bindingDirectory>
<bindingIncludes>
@@ -120,24 +118,60 @@
<artifactId>maven-resources-plugin</artifactId>
<version>3.3.1</version>
<executions>
<execution>
<id>default-resources</id>
<phase>none</phase>
</execution>
<execution>
<id>copy-resources</id>
<phase>generate-sources</phase>
<goals>
<goal>copy-resources</goal>
</goals>
<configuration>
<outputDirectory>${basedir}/target/classes</outputDirectory>
<outputDirectory>${basedir}/target/classes/schemas</outputDirectory>
<resources>
<resource>
<directory>src/main/resources/schemas/</directory>
<directory>src/main/resources/schemas/cached</directory>
<include>**/*.xsd</include>
<targetPath>schemas</targetPath>
<targetPath>cached</targetPath>
</resource>
<resource>
<directory>${guidelines.project.dir}/schemas/</directory>
<directory>${guidelines.project.dir}/schemas/cached</directory>
<include>**/*.xsd</include>
<targetPath>schemas</targetPath>
<targetPath>cached</targetPath>
</resource>
<resource>
<directory>src/main/resources/schemas</directory>
<include>**/openaire-cerif-profile.xsd</include>
<include>**/includes/**/*.xsd</include>
<include>**/vocabularies/**/*.xsd</include>
<exclude>cached</exclude>
<targetPath>original</targetPath>
</resource>
<resource>
<directory>${guidelines.project.dir}/schemas</directory>
<include>openaire-cerif-profile.xsd</include>
<include>includes/**/*.xsd</include>
<include>vocabularies/**/*.xsd</include>
<targetPath>original/current</targetPath>
</resource>
</resources>
</configuration>
</execution>
<execution>
<id>copy-test-resources</id>
<phase>generate-sources</phase>
<goals>
<goal>testResources</goal>
</goals>
<configuration>
<outputDirectory>${basedir}/target</outputDirectory>
<resources>
<resource>
<directory>src/test/java</directory>
<include>**/_verb*.xml</include>
<targetPath>test-classes</targetPath>
</resource>
</resources>
</configuration>
@@ -153,24 +187,35 @@
<goals>
<goal>transform</goal>
</goals>
<phase>process-resources</phase>
</execution>
</executions>
<configuration>
<forceCreation>true</forceCreation>
<transformationSets>
<transformationSet>
<dir>${basedir}/target/classes/schemas</dir>
<dir>src/main/resources/schemas</dir>
<includes>
<include>**/openaire-cerif-profile.xsd</include>
<include>**/includes/**/*.xsd</include>
<include>**/vocabularies/**/*.xsd</include>
</includes>
<outputDir>${basedir}/target/classes/schemas/relaxed</outputDir>
<stylesheet>src/tools/xslt/relax-xml-schema-for-first-validation.xslt</stylesheet>
</transformationSet>
<transformationSet>
<dir>${guidelines.project.dir}/schemas</dir>
<includes>
<include>openaire-cerif-profile.xsd</include>
<include>includes/**/*.xsd</include>
<include>vocabularies/**/*.xsd</include>
</includes>
<outputDir>${basedir}/target/classes/schemas/relaxed</outputDir>
<outputDir>${basedir}/target/classes/schemas/relaxed/current</outputDir>
<stylesheet>src/tools/xslt/relax-xml-schema-for-first-validation.xslt</stylesheet>
</transformationSet>
</transformationSets>
<catalogs>
<catalog>${guidelines.project.dir}/schemas/cached/catalog.xml</catalog>
<catalog>file:./schemas/cached/catalog.xml</catalog>
<catalog>${guidelines.project.dir}/schemas/catalog.xml</catalog>
</catalogs>
</configuration>
1 change: 0 additions & 1 deletion samples/_verb=ListMetadataFormats.xml

This file was deleted.

28 changes: 28 additions & 0 deletions samples/_verb=ListMetadataFormats.xml
@@ -0,0 +1,28 @@
<?xml version="1.0" encoding="UTF-8"?>
<!--
Examples of metadata records in OpenAIRE CERIF XML format as OAI-PMH payload.
OAI-PMH response to the ListMetadataFormats request.
Provided as accompanying material to the "OpenAIRE Guidelines for CRIS Managers" version 1.2
-->
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.openarchives.org/OAI/2.0/ http://www.openarchives.org/OAI/2.0/OAI-PMH.xsd">

<responseDate>2023-06-18T10:27:20Z</responseDate>
<request verb="ListMetadataFormats">http://cris.example.org/openaire/connector</request>

<ListMetadataFormats>

<metadataFormat>
<metadataPrefix>oai_cerif_openaire_v1_2</metadataPrefix>
<schema>https://www.openaire.eu/schema/cris/1.2/openaire-cerif-profile.xsd</schema>
<metadataNamespace>https://www.openaire.eu/cerif-profile/1.2/</metadataNamespace>
</metadataFormat>

<metadataFormat>
<metadataPrefix>oai_cerif_openaire_v1_1</metadataPrefix>
<schema>https://www.openaire.eu/schema/cris/1.1/openaire-cerif-profile.xsd</schema>
<metadataNamespace>https://www.openaire.eu/cerif-profile/1.1/</metadataNamespace>
</metadataFormat>

</ListMetadataFormats>

</OAI-PMH>