IQSS 9375 - Retention period #10336

Merged on May 1, 2024 (46 commits; this view shows changes from 20 commits)

Commits
5d07f7c
Implemented Retention class and its persistence in the database
PaulBoon Feb 20, 2024
1cd5968
Implemented set and unset retention API
PaulBoon Feb 21, 2024
c113e66
Implemented minimal retention output on the GUI
PaulBoon Feb 21, 2024
fd4a628
Initial implementation for the retention dialog
PaulBoon Feb 22, 2024
06c7551
Some fixes for the retention dialog
PaulBoon Feb 26, 2024
fa0ddf8
Added missing property for file.assignedRetention.success
PaulBoon Feb 27, 2024
677704b
Default Retention constructor now has 1000 years period
PaulBoon Feb 27, 2024
a353418
Remove referenced retentions with file deletion
PaulBoon Feb 27, 2024
f571f68
Initial implementation for making files unavailable after the retenti…
PaulBoon Feb 29, 2024
8a720a1
Disable access request for files after the retention period
PaulBoon Feb 29, 2024
013eee2
Disable access for superusers after the retention period
PaulBoon Mar 5, 2024
07c944c
Changed retention wording; files are unavailable when the retention h…
PaulBoon Mar 5, 2024
cc4c42a
Initial documentation for the retention
PaulBoon Mar 6, 2024
4a55b0a
Fixed isPubliclyDownloadable for the retention
PaulBoon Mar 7, 2024
d77c28f
Implemented unit tests for the file retention
PaulBoon Mar 7, 2024
3432afb
Added minimal integration test for the file retention
PaulBoon Mar 12, 2024
2be3c0a
Added file retention info to metadata export for OAI_ORE and datavers…
PaulBoon Mar 12, 2024
5b2f397
Use minimal retention period to initialise the date unavailable on th…
PaulBoon Mar 12, 2024
7d51ec4
Do not allow full-text indexing for files with a retention, even if i…
PaulBoon Mar 12, 2024
1077a4f
Merged with upstream develop
PaulBoon Mar 12, 2024
52c5be5
Undo unwanted change in astrophysics.properties
PaulBoon Mar 20, 2024
c03e635
Improved retention section of the dataset management guide
PaulBoon Mar 20, 2024
ada8e0f
Improved retention section of the native api guide
PaulBoon Mar 20, 2024
005c823
Fix file indexing with embargo and/or retention dates
PaulBoon Mar 20, 2024
ac96deb
Added searching with the RetentionPeriodExpired file access status
PaulBoon Mar 20, 2024
3257b87
Added RetentionPeriodExpired to the native api guide
PaulBoon Mar 20, 2024
21db794
Change 'Retention Expired' to 'Retention Period Expired'
PaulBoon Mar 21, 2024
81cd03d
Fix metadata display of end dates of embargo and retention
PaulBoon Mar 25, 2024
6005657
File status background red (label-danger) for retention expired
PaulBoon Mar 25, 2024
d95cd7f
Changed the retention metadata GUI text
PaulBoon Mar 26, 2024
9587c1f
minor text changes and removing extra changes in schema.xml
qqmyers Mar 27, 2024
a760bb2
Fixing file publication date metadata field display
PaulBoon Mar 27, 2024
e29fcfc
Merge pull request #7 from GlobalDataverseCommunityConsortium/Retenti…
PaulBoon Mar 27, 2024
83ab52a
Removed unused embargoDate.label from the Bundle.properties
PaulBoon Mar 27, 2024
0c665eb
Improved file publication date metadata field display
PaulBoon Mar 28, 2024
d5b5331
Use label-warning for status labels at dataset level and label-danger…
PaulBoon Mar 28, 2024
8826e7c
Improved file publication date metadata field display
PaulBoon Mar 28, 2024
857833f
Merge branch 'develop' into RetentionPeriod
PaulBoon Apr 15, 2024
00f9dc1
Renamed flyway SQL upgrade script for the retention
PaulBoon Apr 15, 2024
8758557
Added release notes for the retention period
PaulBoon Apr 18, 2024
35f9da0
Merge branch 'develop' of github.com:IQSS/dataverse into RetentionPeriod
PaulBoon Apr 29, 2024
3a3f6cd
Changed selected file download 'warning' messages to include the expi…
PaulBoon Apr 30, 2024
43fbbf2
Improve messages for the set-retention and unset-retention API
PaulBoon May 1, 2024
3cdd3c6
Improve input validation with response messages for the set-retention…
PaulBoon May 1, 2024
639c162
More improvements on input validation with response messages for the …
PaulBoon May 1, 2024
8f65a46
Improvements on input validation with response messages for the unset…
PaulBoon May 1, 2024
24 changes: 13 additions & 11 deletions conf/solr/9.3.0/schema.xml
@@ -157,7 +157,9 @@
<field name="publicationStatus" type="string" stored="true" indexed="true" multiValued="true"/>
<field name="externalStatus" type="string" stored="true" indexed="true" multiValued="false"/>
<field name="embargoEndDate" type="plong" stored="true" indexed="true" multiValued="false"/>


<field name="retentionEndDate" type="plong" stored="true" indexed="true" multiValued="false"/>

<field name="subtreePaths" type="string" stored="true" indexed="true" multiValued="true"/>

<field name="fileName" type="text_en" stored="true" indexed="true" multiValued="true"/>
@@ -288,12 +290,12 @@
<field name="coverage.Temporal.StopTime" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="dataCollectionSituation" type="text_en" multiValued="false" stored="true" indexed="true"/>
<field name="dataCollector" type="text_en" multiValued="false" stored="true" indexed="true"/>
<field name="dataSources" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="datasetContact" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="datasetContactAffiliation" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="datasetContactEmail" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="datasetContactName" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="datasetLevelErrorNotes" type="text_en" multiValued="false" stored="true" indexed="true"/>
<field name="dataSources" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="dateOfCollection" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="dateOfCollectionEnd" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="dateOfCollectionStart" type="text_en" multiValued="true" stored="true" indexed="true"/>
@@ -379,13 +381,13 @@
<field name="studyAssayOrganism" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayOtherMeasurmentType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayOtherOrganism" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayPlatform" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayOtherPlatform" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayTechnologyType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayOtherTechnologyType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayPlatform" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyAssayTechnologyType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyDesignType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyOtherDesignType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyFactorType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyOtherDesignType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="studyOtherFactorType" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="subject" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="subtitle" type="text_en" multiValued="false" stored="true" indexed="true"/>
@@ -397,10 +399,10 @@
<field name="timePeriodCoveredEnd" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="timePeriodCoveredStart" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="title" type="text_en" multiValued="false" stored="true" indexed="true"/>
<field name="topicClassification" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="topicClassValue" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="topicClassVocab" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="topicClassVocabURI" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="topicClassification" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="unitOfAnalysis" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="universe" type="text_en" multiValued="true" stored="true" indexed="true"/>
<field name="weighting" type="text_en" multiValued="false" stored="true" indexed="true"/>
@@ -527,12 +529,12 @@
<copyField source="coverage.Temporal.StopTime" dest="_text_" maxChars="3000"/>
<copyField source="dataCollectionSituation" dest="_text_" maxChars="3000"/>
<copyField source="dataCollector" dest="_text_" maxChars="3000"/>
<copyField source="dataSources" dest="_text_" maxChars="3000"/>
<copyField source="datasetContact" dest="_text_" maxChars="3000"/>
<copyField source="datasetContactAffiliation" dest="_text_" maxChars="3000"/>
<copyField source="datasetContactEmail" dest="_text_" maxChars="3000"/>
<copyField source="datasetContactName" dest="_text_" maxChars="3000"/>
<copyField source="datasetLevelErrorNotes" dest="_text_" maxChars="3000"/>
<copyField source="dataSources" dest="_text_" maxChars="3000"/>
<copyField source="dateOfCollection" dest="_text_" maxChars="3000"/>
<copyField source="dateOfCollectionEnd" dest="_text_" maxChars="3000"/>
<copyField source="dateOfCollectionStart" dest="_text_" maxChars="3000"/>
@@ -618,13 +620,13 @@
<copyField source="studyAssayOrganism" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayOtherMeasurmentType" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayOtherOrganism" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayPlatform" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayOtherPlatform" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayTechnologyType" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayOtherTechnologyType" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayPlatform" dest="_text_" maxChars="3000"/>
<copyField source="studyAssayTechnologyType" dest="_text_" maxChars="3000"/>
<copyField source="studyDesignType" dest="_text_" maxChars="3000"/>
<copyField source="studyOtherDesignType" dest="_text_" maxChars="3000"/>
<copyField source="studyFactorType" dest="_text_" maxChars="3000"/>
<copyField source="studyOtherDesignType" dest="_text_" maxChars="3000"/>
<copyField source="studyOtherFactorType" dest="_text_" maxChars="3000"/>
<copyField source="subject" dest="_text_" maxChars="3000"/>
<copyField source="subtitle" dest="_text_" maxChars="3000"/>
@@ -636,10 +638,10 @@
<copyField source="timePeriodCoveredEnd" dest="_text_" maxChars="3000"/>
<copyField source="timePeriodCoveredStart" dest="_text_" maxChars="3000"/>
<copyField source="title" dest="_text_" maxChars="3000"/>
<copyField source="topicClassification" dest="_text_" maxChars="3000"/>
<copyField source="topicClassValue" dest="_text_" maxChars="3000"/>
<copyField source="topicClassVocab" dest="_text_" maxChars="3000"/>
<copyField source="topicClassVocabURI" dest="_text_" maxChars="3000"/>
<copyField source="topicClassification" dest="_text_" maxChars="3000"/>
<copyField source="unitOfAnalysis" dest="_text_" maxChars="3000"/>
<copyField source="universe" dest="_text_" maxChars="3000"/>
<copyField source="weighting" dest="_text_" maxChars="3000"/>
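
The schema change declares `retentionEndDate` as a `plong`, next to `embargoEndDate`, so the end date has to be indexed as a number. This excerpt does not show which numeric encoding is used. The sketch below assumes days since the Unix epoch, and the class and method names are hypothetical, not taken from the PR.

```java
import java.time.LocalDate;

class RetentionEndDateIndexSketch {

    // Assumption: the date is stored in the plong field as days since 1970-01-01.
    static long toIndexValue(LocalDate retentionEndDate) {
        return retentionEndDate.toEpochDay();
    }

    // Inverse of toIndexValue, for reading a stored value back.
    static LocalDate fromIndexValue(long indexedValue) {
        return LocalDate.ofEpochDay(indexedValue);
    }

    // Solr range query matching files whose retention end date lies strictly before 'today'.
    static String expiredBeforeQuery(LocalDate today) {
        return "retentionEndDate:[* TO " + (today.toEpochDay() - 1) + "]";
    }
}
```

A numeric field keeps range queries such as the one built by `expiredBeforeQuery` simple, which may be why a `plong` was chosen over a text field.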
33 changes: 32 additions & 1 deletion doc/sphinx-guides/source/api/native-api.rst
@@ -2498,7 +2498,38 @@ The API call requires a Json body that includes the list of the fileIds that the
export JSON='{"fileIds":[300,301]}'

curl -H "X-Dataverse-key: $API_TOKEN" -H "Content-Type:application/json" "$SERVER_URL/api/datasets/:persistentId/files/actions/:unset-embargo?persistentId=$PERSISTENT_IDENTIFIER" -d "$JSON"


Set a Retention on Files in a Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``/api/datasets/$dataset-id/files/actions/:set-retention`` can be used to set a retention on one or more files in a dataset. Retentions can be set on files that are only in a draft dataset version (and are not in any previously published version) by anyone who can edit the dataset. The same API call can be used by a superuser to add a retention to files that have already been released as part of a previously published dataset version.

The API call requires a JSON body that includes the retention's end date (dateUnavailable), an optional short reason, and a list of the fileIds that the retention should be set on. The dateUnavailable must be after the current date, and the duration (dateUnavailable minus today's date) must be longer than the minimum specified by the :ref:`:MinRetentionDurationInMonths` setting. All files listed must be in the specified dataset. For example:

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV
export JSON='{"dateUnavailable":"2051-12-31", "reason":"Standard project retention", "fileIds":[300,301,302]}'

curl -H "X-Dataverse-key: $API_TOKEN" -H "Content-Type:application/json" "$SERVER_URL/api/datasets/:persistentId/files/actions/:set-retention?persistentId=$PERSISTENT_IDENTIFIER" -d "$JSON"

Remove a Retention on Files in a Dataset
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

``/api/datasets/$dataset-id/files/actions/:unset-retention`` can be used to remove a retention on one or more files in a dataset. Retentions can be removed from files that are only in a draft dataset version (and are not in any previously published version) by anyone who can edit the dataset. The same API call can be used by a superuser to remove retentions from files that have already been released as part of a previously published dataset version.

The API call requires a JSON body that includes a list of the fileIds that the retention should be removed from. All files listed must be in the specified dataset. For example:

.. code-block:: bash

export API_TOKEN=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
export SERVER_URL=https://demo.dataverse.org
export PERSISTENT_IDENTIFIER=doi:10.5072/FK2/7U7YBV
export JSON='{"fileIds":[300,301]}'

curl -H "X-Dataverse-key: $API_TOKEN" -H "Content-Type:application/json" "$SERVER_URL/api/datasets/:persistentId/files/actions/:unset-retention?persistentId=$PERSISTENT_IDENTIFIER" -d "$JSON"
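
For clients that do not shell out to curl, the same set-retention call can be sketched with the JDK's `java.net.http` classes. This is a minimal sketch, not part of the PR: the class and method names are hypothetical, the JSON is assembled by hand with only naive escaping, and the request is built but not sent. The endpoint, headers and body fields are the ones used in the curl examples above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.List;
import java.util.stream.Collectors;

class RetentionRequestSketch {

    // Builds the body: dateUnavailable, the optional reason, and the fileIds list.
    static String setRetentionBody(LocalDate dateUnavailable, String reason, List<Long> fileIds) {
        String ids = fileIds.stream().map(String::valueOf).collect(Collectors.joining(","));
        StringBuilder json = new StringBuilder("{\"dateUnavailable\":\"").append(dateUnavailable).append("\"");
        if (reason != null && !reason.isBlank()) {
            // Naive escaping of backslashes and quotes; a real client would use a JSON library.
            String escaped = reason.replace("\\", "\\\\").replace("\"", "\\\"");
            json.append(",\"reason\":\"").append(escaped).append("\"");
        }
        return json.append(",\"fileIds\":[").append(ids).append("]}").toString();
    }

    // Builds the POST request; curl sends POST when -d is given.
    static HttpRequest setRetentionRequest(String serverUrl, String apiToken,
                                           String persistentId, String jsonBody) {
        String pid = URLEncoder.encode(persistentId, StandardCharsets.UTF_8);
        URI uri = URI.create(serverUrl
                + "/api/datasets/:persistentId/files/actions/:set-retention?persistentId=" + pid);
        return HttpRequest.newBuilder(uri)
                .header("X-Dataverse-key", apiToken)
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
                .build();
    }
}
```

The request could then be sent with `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())`. An unset-retention variant would differ only in the action name and in a body that carries just `fileIds`.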

.. _Archival Status API:

12 changes: 12 additions & 0 deletions doc/sphinx-guides/source/installation/config.rst
@@ -4316,6 +4316,18 @@ can enter for an embargo end date. This limit will be enforced in the popup dial

``curl -X PUT -d 24 http://localhost:8080/api/admin/settings/:MaxEmbargoDurationInMonths``

.. _:MinRetentionDurationInMonths:

:MinRetentionDurationInMonths
+++++++++++++++++++++++++++++

This setting controls whether retentions are allowed in a Dataverse instance and can set the minimum duration users are allowed to specify. A value of 0 months, or a missing setting,
indicates that retentions are not supported. A value of -1 allows retentions of any length. Any other value is the minimum number of months (from the current date) that a user
can enter for a retention end date. This limit is enforced in the popup dialog in which users enter the retention date. For example, to set a ten-year minimum:

``curl -X PUT -d 120 http://localhost:8080/api/admin/settings/:MinRetentionDurationInMonths``
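
The three cases of the setting (0 or missing, -1, any other value) can be illustrated with a small validation sketch. It is an illustration under stated assumptions, not the code from this PR: the class name, method name and messages are hypothetical, and a date exactly on the minimum is treated as acceptable, which the text above does not settle.

```java
import java.time.LocalDate;
import java.util.Optional;

class MinRetentionCheckSketch {

    /**
     * Returns an error message when dateUnavailable is not acceptable, or empty when it is.
     * minMonths mirrors :MinRetentionDurationInMonths (null stands for a missing setting).
     */
    static Optional<String> validate(Integer minMonths, LocalDate today, LocalDate dateUnavailable) {
        if (minMonths == null || minMonths == 0) {
            // 0 or a missing setting: retentions are not supported at all.
            return Optional.of("Retentions are not supported on this installation");
        }
        if (!dateUnavailable.isAfter(today)) {
            return Optional.of("dateUnavailable must be after the current date");
        }
        if (minMonths == -1) {
            // -1: any retention length is allowed.
            return Optional.empty();
        }
        // Any other value: minimum number of months counted from the current date.
        LocalDate earliest = today.plusMonths(minMonths);
        if (dateUnavailable.isBefore(earliest)) {
            return Optional.of("dateUnavailable must be on or after " + earliest);
        }
        return Optional.empty();
    }
}
```

With the ten-year minimum from the example (120 months) and a current date of 2024-05-01, the earliest accepted end date in this sketch is 2034-05-01.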


:DataverseMetadataValidatorScript
+++++++++++++++++++++++++++++++++

8 changes: 8 additions & 0 deletions doc/sphinx-guides/source/user/dataset-management.rst
@@ -735,6 +735,14 @@ Once a dataset with embargoed files has been published, no further action is nee

As the primary use case of embargoes is to make the existence of data known now, with a promise (to a journal, project team, etc.) that the data itself will become available at a given future date, users cannot change an embargo once a dataset version is published. Dataverse instance administrators do have the ability to correct mistakes and make changes if/when circumstances warrant.

Retentions
==========

Support for file-level retentions can also be configured in a Dataverse instance. Retentions make file content inaccessible after the retention end date, which means that file previews and file downloads are blocked. The effect is similar to restricting a file, except that it takes effect automatically on the specified date without further action, and once the retention period has expired, requests for file access cannot be made.

Retentions are intended to support use cases where files must be made unavailable (or even destroyed) after a certain period or date.
Actual destruction is not handled automatically; if needed, it has to be carried out on the storage itself.

Dataset Versions
================

12 changes: 12 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/DataFile.java
@@ -242,6 +242,18 @@ public void setEmbargo(Embargo embargo) {
this.embargo = embargo;
}

@ManyToOne
@JoinColumn(name="retention_id")
private Retention retention;

public Retention getRetention() {
return retention;
}

public void setRetention(Retention retention) {
this.retention = retention;
}

public DataFile() {
this.fileMetadatas = new ArrayList<>();
initFileReplaceAttributes();
@@ -1365,7 +1365,10 @@ public Embargo findEmbargo(Long id) {
DataFile d = find(id);
return d.getEmbargo();
}


public boolean isRetentionExpired(FileMetadata fm) {
return FileUtil.isRetentionExpired(fm);
}
/**
* Checks if the supplied DvObjectContainer (Dataset or Collection; although
* only collection-level storage quotas are officially supported as of now)
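
The new `isRetentionExpired(FileMetadata fm)` method delegates to `FileUtil.isRetentionExpired`, whose body is not part of this excerpt. As a rough illustration of what such a check might look like, the sketch below uses a hypothetical stand-in for the `Retention` entity and assumes that a file counts as expired only on days strictly after `dateUnavailable`. Neither the stand-in class nor that boundary choice is confirmed by the diff.

```java
import java.time.LocalDate;

class RetentionExpirySketch {

    // Hypothetical stand-in for the Retention entity; only the end date matters here.
    static final class Retention {
        final LocalDate dateUnavailable;

        Retention(LocalDate dateUnavailable) {
            this.dateUnavailable = dateUnavailable;
        }
    }

    // A file without a retention never expires; otherwise compare against the end date.
    static boolean isRetentionExpired(Retention retention, LocalDate today) {
        return retention != null
                && retention.dateUnavailable != null
                && today.isAfter(retention.dateUnavailable);
    }
}
```

Passing the current date as a parameter, instead of calling `LocalDate.now()` inside the method, keeps the check easy to unit test.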