Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow links between dataset types and metadata blocks #11001

Open
wants to merge 11 commits into
base: develop
Choose a base branch
from
12 changes: 12 additions & 0 deletions doc/release-notes/10519-dataset-types.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,12 @@
## Dataset Types can be linked to Metadata Blocks

Metadata blocks (e.g. "CodeMeta") can now be linked to dataset types (e.g. "software") using new superuser APIs.

This will have the following effects for the APIs used by the new Dataverse UI ( https://github.com/IQSS/dataverse-frontend ):

- The list of fields shown when creating a dataset will include fields marked as "displayoncreate" (in the tsv/database) for metadata blocks (e.g. "CodeMeta") that are linked to the dataset type (e.g. "software") that is passed to the API.
- The metadata blocks shown when editing a dataset will include metadata blocks (e.g. "CodeMeta") that are linked to the dataset type (e.g. "software") that is passed to the API.

The CodeMeta metadata block is now available in the Dockerized development environment.

For more information, see the guides and #10519.
39 changes: 39 additions & 0 deletions doc/sphinx-guides/source/api/native-api.rst
Original file line number Diff line number Diff line change
Expand Up @@ -529,6 +529,8 @@ The fully expanded example above (without environment variables) looks like this

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X DELETE "https://demo.dataverse.org/api/dataverses/root/assignments/6"

.. _list-metadata-blocks-for-a-collection:

List Metadata Blocks Defined on a Dataverse Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Expand Down Expand Up @@ -556,6 +558,9 @@ This endpoint supports the following optional query parameters:

- ``returnDatasetFieldTypes``: Whether or not to return the dataset field types present in each metadata block. If not set, the default value is false.
- ``onlyDisplayedOnCreate``: Whether or not to return only the metadata blocks that are displayed on dataset creation. If ``returnDatasetFieldTypes`` is true, only the dataset field types shown on dataset creation will be returned within each metadata block. If not set, the default value is false.
- ``datasetType``: Whether or not to return additional fields from metadata blocks that are linked with a particular dataset type.

are displayed on dataset creation. If ``returnDatasetFieldTypes`` is true, only the dataset field types shown on dataset creation will be returned within each metadata block. If not set, the default value is false.

An example using the optional query parameters is presented below:

Expand All @@ -573,6 +578,8 @@ The fully expanded example above (without environment variables) looks like this

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" "https://demo.dataverse.org/api/dataverses/root/metadatablocks?returnDatasetFieldTypes=true&onlyDisplayedOnCreate=true"

.. _define-metadata-blocks-for-a-dataverse-collection:

Define Metadata Blocks for a Dataverse Collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Expand All @@ -598,6 +605,8 @@ The fully expanded example above (without environment variables) looks like this

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X POST -H "Content-type:application/json" --upload-file define-metadatablocks.json "https://demo.dataverse.org/api/dataverses/root/metadatablocks"

An alternative to defining metadata blocks at a collection level is to create and use a dataset type. See :ref:`api-link-dataset-type`.

Determine if a Dataverse Collection Inherits Its Metadata Blocks from Its Parent
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Expand Down Expand Up @@ -3249,6 +3258,36 @@ The fully expanded example above (without environment variables) looks like this

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -X DELETE "https://demo.dataverse.org/api/datasets/datasetTypes/3"

.. _api-link-dataset-type:

Link Dataset Type with Metadata Blocks
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Linking a dataset type with one or more metadata blocks results in additional fields from those blocks appearing in the output from the :ref:`list-metadata-blocks-for-a-collection` API endpoint. The new frontend for Dataverse (https://github.com/IQSS/dataverse-frontend) uses the JSON output from this API endpoint to construct the page that users see when creating or editing a dataset. Once the frontend has been updated to pass in the dataset type (https://github.com/IQSS/dataverse-client-javascript/issues/210), specifying a dataset type in this way can be an alternative way to display additional metadata fields than the traditional method, which is to enabled a metadata block at the collection level (see :ref:`define-metadata-blocks-for-a-dataverse-collection`).

For example, a superuser could create a type called "software" and link it to the "CodeMeta" metadata block (this example is below). Then, once the new front end allows it, the user can specify that they want to create a dataset of type software and see the additional metadata fields from the CodeMeta block when creating or editing their dataset.

This API endpoint is superuser only.

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export TYPE=software
export JSON='["codeMeta20"]'

curl -H "X-Dataverse-key:$API_TOKEN" -H "Content-Type: application/json" "$SERVER_URL/api/datasets/datasetTypes/$TYPE" -X POST -d $JSON

The fully expanded example above (without environment variables) looks like this:

.. code-block:: bash

curl -H "X-Dataverse-key:xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" -H "Content-Type: application/json" "https://demo.dataverse.org/api/datasets/datasetTypes/software" -X POST -d '["codeMeta20"]'

To update the blocks that are linked, send an array with those blocks.

To remove all links to blocks, send an empty array.

Files
-----

Expand Down
6 changes: 4 additions & 2 deletions doc/sphinx-guides/source/user/dataset-management.rst
Original file line number Diff line number Diff line change
Expand Up @@ -790,21 +790,23 @@ If you deaccession the most recently published version of the dataset but not al
Dataset Types
=============

.. note:: Development of the dataset types feature is ongoing. Please see https://github.com/IQSS/dataverse-pm/issues/307 for details.

Out of the box, all datasets have a dataset type of "dataset". Superusers can add additional types such as "software" or "workflow" using the :ref:`api-add-dataset-type` API endpoint.

Once more than one type appears in search results, a facet called "Dataset Type" will appear allowing you to filter down to a certain type.

If your installation is configured to use DataCite as a persistent ID (PID) provider, the appropriate type ("Dataset", "Software", "Workflow") will be sent to DataCite when the dataset is published for those three types.

Currently, the dataset type can only be specified via API and only when the dataset is created. For details, see the following sections of the API guide:
Currently, specifying a type for a dataset can only be done via API and only when the dataset is created. The type can't currently be changed afterward. For details, see the following sections of the API guide:

- :ref:`api-create-dataset-with-type` (Native API)
- :ref:`api-semantic-create-dataset-with-type` (Semantic API)
- :ref:`import-dataset-with-type`

Dataset types can be listed, added, or deleted via API. See :ref:`api-dataset-types` in the API Guide for more.

Development of the dataset types feature is ongoing. Please see https://github.com/IQSS/dataverse/issues/10489 for details.
Dataset types can be linked with metadata blocks to make fields from those blocks available when datasets of that type are created or edited. See :ref:`api-link-dataset-type` for details.

.. |image1| image:: ./img/DatasetDiagram.png
:class: img-responsive
Expand Down
2 changes: 2 additions & 0 deletions docker-compose-dev.yml
Original file line number Diff line number Diff line change
Expand Up @@ -89,6 +89,8 @@ services:
- dev
networks:
- dataverse
volumes:
- ./docker-dev-volumes/solr/data:/var/solr

dev_dv_initializer:
container_name: "dev_dv_initializer"
Expand Down
9 changes: 9 additions & 0 deletions modules/container-configbaker/scripts/bootstrap/dev/init.sh
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,15 @@ export API_TOKEN
# ${ENV_OUT} comes from bootstrap.sh and will expose the saved information back to the host if enabled.
echo "API_TOKEN=${API_TOKEN}" >> "${ENV_OUT}"

echo "Loading CodeMeta metadata block (needed for API tests)..."
curl "${DATAVERSE_URL}/api/admin/datasetfield/load" -X POST --data-binary @/scripts/bootstrap/base/data/metadatablocks/codemeta.tsv -H "Content-type: text/tab-separated-values"

echo "Fetching Solr schema from Dataverse and running update-fields.sh..."
curl "${DATAVERSE_URL}/api/admin/index/solr/schema" | /scripts/update-fields.sh /var/solr/data/collection1/conf/schema.xml

echo "Reloading Solr..."
curl "http://solr:8983/solr/admin/cores?action=RELOAD&core=collection1"

echo "Publishing root dataverse..."
curl -H "X-Dataverse-key:$API_TOKEN" -X POST "${DATAVERSE_URL}/api/dataverses/:root/actions/:publish"

Expand Down
81 changes: 67 additions & 14 deletions src/main/java/edu/harvard/iq/dataverse/DatasetFieldServiceBean.java
Original file line number Diff line number Diff line change
@@ -1,5 +1,6 @@
package edu.harvard.iq.dataverse;

import edu.harvard.iq.dataverse.dataset.DatasetType;
import java.io.IOException;
import java.io.StringReader;
import java.net.URI;
Expand Down Expand Up @@ -871,15 +872,15 @@ public List<DatasetFieldType> findAllDisplayedOnCreateInMetadataBlock(MetadataBl
Root<MetadataBlock> metadataBlockRoot = criteriaQuery.from(MetadataBlock.class);
Root<DatasetFieldType> datasetFieldTypeRoot = criteriaQuery.from(DatasetFieldType.class);

Predicate requiredInDataversePredicate = buildRequiredInDataversePredicate(criteriaBuilder, datasetFieldTypeRoot);
Predicate fieldRequiredInTheInstallation = buildFieldRequiredInTheInstallationPredicate(criteriaBuilder, datasetFieldTypeRoot);

criteriaQuery.where(
criteriaBuilder.and(
criteriaBuilder.equal(metadataBlockRoot.get("id"), metadataBlock.getId()),
datasetFieldTypeRoot.in(metadataBlockRoot.get("datasetFieldTypes")),
criteriaBuilder.or(
criteriaBuilder.isTrue(datasetFieldTypeRoot.get("displayOnCreate")),
requiredInDataversePredicate
fieldRequiredInTheInstallation
)
)
);
Expand All @@ -890,16 +891,39 @@ public List<DatasetFieldType> findAllDisplayedOnCreateInMetadataBlock(MetadataBl
return typedQuery.getResultList();
}

public List<DatasetFieldType> findAllInMetadataBlockAndDataverse(MetadataBlock metadataBlock, Dataverse dataverse, boolean onlyDisplayedOnCreate) {
public List<DatasetFieldType> findAllInMetadataBlockAndDataverse(MetadataBlock metadataBlock, Dataverse dataverse, boolean onlyDisplayedOnCreate, DatasetType datasetType) {
if (!dataverse.isMetadataBlockRoot() && dataverse.getOwner() != null) {
return findAllInMetadataBlockAndDataverse(metadataBlock, dataverse.getOwner(), onlyDisplayedOnCreate);
return findAllInMetadataBlockAndDataverse(metadataBlock, dataverse.getOwner(), onlyDisplayedOnCreate, datasetType);
}

CriteriaBuilder criteriaBuilder = em.getCriteriaBuilder();
CriteriaQuery<DatasetFieldType> criteriaQuery = criteriaBuilder.createQuery(DatasetFieldType.class);

Root<MetadataBlock> metadataBlockRoot = criteriaQuery.from(MetadataBlock.class);
Root<DatasetFieldType> datasetFieldTypeRoot = criteriaQuery.from(DatasetFieldType.class);

// Build the main predicate to include fields that belong to the specified dataverse and metadataBlock and match the onlyDisplayedOnCreate value.
Predicate fieldPresentInDataverse = buildFieldPresentInDataversePredicate(dataverse, onlyDisplayedOnCreate, criteriaQuery, criteriaBuilder, datasetFieldTypeRoot, metadataBlockRoot);

// Build an additional predicate to include fields from the datasetType, if the datasetType is specified and contains the given metadataBlock.
Predicate fieldPresentInDatasetType = buildFieldPresentInDatasetTypePredicate(datasetType, criteriaQuery, criteriaBuilder, datasetFieldTypeRoot, metadataBlockRoot, onlyDisplayedOnCreate);

// Build the final WHERE clause by combining all the predicates.
criteriaQuery.where(
criteriaBuilder.equal(metadataBlockRoot.get("id"), metadataBlock.getId()), // Match the MetadataBlock ID.
datasetFieldTypeRoot.in(metadataBlockRoot.get("datasetFieldTypes")), // Ensure the DatasetFieldType is part of the MetadataBlock.
criteriaBuilder.or(
fieldPresentInDataverse,
fieldPresentInDatasetType
)
);

criteriaQuery.select(datasetFieldTypeRoot);

return em.createQuery(criteriaQuery).getResultList();
}

private Predicate buildFieldPresentInDataversePredicate(Dataverse dataverse, boolean onlyDisplayedOnCreate, CriteriaQuery<DatasetFieldType> criteriaQuery, CriteriaBuilder criteriaBuilder, Root<DatasetFieldType> datasetFieldTypeRoot, Root<MetadataBlock> metadataBlockRoot) {
Root<Dataverse> dataverseRoot = criteriaQuery.from(Dataverse.class);

// Join Dataverse with DataverseFieldTypeInputLevel on the "dataverseFieldTypeInputLevels" attribute, using a LEFT JOIN.
Expand Down Expand Up @@ -930,7 +954,7 @@ public List<DatasetFieldType> findAllInMetadataBlockAndDataverse(MetadataBlock m
Predicate hasNoInputLevelPredicate = criteriaBuilder.not(criteriaBuilder.exists(subquery));

// Define a predicate to include the required fields in Dataverse.
Predicate requiredInDataversePredicate = buildRequiredInDataversePredicate(criteriaBuilder, datasetFieldTypeRoot);
Predicate fieldRequiredInTheInstallation = buildFieldRequiredInTheInstallationPredicate(criteriaBuilder, datasetFieldTypeRoot);

// Define a predicate for displaying DatasetFieldTypes on create.
// If onlyDisplayedOnCreate is true, include fields that:
Expand All @@ -941,28 +965,57 @@ public List<DatasetFieldType> findAllInMetadataBlockAndDataverse(MetadataBlock m
? criteriaBuilder.or(
criteriaBuilder.or(
criteriaBuilder.isTrue(datasetFieldTypeRoot.get("displayOnCreate")),
requiredInDataversePredicate
fieldRequiredInTheInstallation
),
requiredAsInputLevelPredicate
)
: criteriaBuilder.conjunction();

// Build the final WHERE clause by combining all the predicates.
criteriaQuery.where(
// Combine all the predicates.
return criteriaBuilder.and(
criteriaBuilder.equal(dataverseRoot.get("id"), dataverse.getId()), // Match the Dataverse ID.
criteriaBuilder.equal(metadataBlockRoot.get("id"), metadataBlock.getId()), // Match the MetadataBlock ID.
metadataBlockRoot.in(dataverseRoot.get("metadataBlocks")), // Ensure the MetadataBlock is part of the Dataverse.
datasetFieldTypeRoot.in(metadataBlockRoot.get("datasetFieldTypes")), // Ensure the DatasetFieldType is part of the MetadataBlock.
criteriaBuilder.or(includedAsInputLevelPredicate, hasNoInputLevelPredicate), // Include DatasetFieldTypes based on the input level predicates.
displayedOnCreatePredicate // Apply the display-on-create filter if necessary.
);
}

criteriaQuery.select(datasetFieldTypeRoot).distinct(true);

return em.createQuery(criteriaQuery).getResultList();
private Predicate buildFieldPresentInDatasetTypePredicate(DatasetType datasetType,
CriteriaQuery<DatasetFieldType> criteriaQuery,
CriteriaBuilder criteriaBuilder,
Root<DatasetFieldType> datasetFieldTypeRoot,
Root<MetadataBlock> metadataBlockRoot,
boolean onlyDisplayedOnCreate) {
Predicate datasetTypePredicate = criteriaBuilder.isFalse(criteriaBuilder.literal(true)); // Initialize datasetTypePredicate to always false by default
if (datasetType != null) {
// Create a subquery to check for the presence of the specified metadataBlock within the datasetType
Subquery<Long> datasetTypeSubquery = criteriaQuery.subquery(Long.class);
Root<DatasetType> datasetTypeRoot = criteriaQuery.from(DatasetType.class);

// Define a predicate for displaying DatasetFieldTypes on create.
// If onlyDisplayedOnCreate is true, include fields that are either marked as displayed on create OR marked as required.
// Otherwise, use an always-true predicate (conjunction).
Predicate displayedOnCreatePredicate = onlyDisplayedOnCreate ?
criteriaBuilder.or(
criteriaBuilder.isTrue(datasetFieldTypeRoot.get("displayOnCreate")),
buildFieldRequiredInTheInstallationPredicate(criteriaBuilder, datasetFieldTypeRoot)
)
: criteriaBuilder.conjunction();

datasetTypeSubquery.select(criteriaBuilder.literal(1L))
.where(
criteriaBuilder.equal(datasetTypeRoot.get("id"), datasetType.getId()), // Match the DatasetType ID.
metadataBlockRoot.in(datasetTypeRoot.get("metadataBlocks")), // Ensure the metadataBlock is included in the datasetType's list of metadata blocks.
displayedOnCreatePredicate
);

// Now set the datasetTypePredicate to true if the subquery finds a matching metadataBlock
datasetTypePredicate = criteriaBuilder.exists(datasetTypeSubquery);
}
return datasetTypePredicate;
}

private Predicate buildRequiredInDataversePredicate(CriteriaBuilder criteriaBuilder, Root<DatasetFieldType> datasetFieldTypeRoot) {
private Predicate buildFieldRequiredInTheInstallationPredicate(CriteriaBuilder criteriaBuilder, Root<DatasetFieldType> datasetFieldTypeRoot) {
// Predicate to check if the current DatasetFieldType is required.
Predicate isRequired = criteriaBuilder.isTrue(datasetFieldTypeRoot.get("required"));

Expand Down
Loading
Loading