-
Notifications
You must be signed in to change notification settings - Fork 493
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
IQSS 9375 - Retention period #10336
IQSS 9375 - Retention period #10336
Conversation
src/main/java/edu/harvard/iq/dataverse/DataFileServiceBean.java
Outdated
Show resolved
Hide resolved
…s has not expired
src/main/java/edu/harvard/iq/dataverse/search/IndexServiceBean.java
Outdated
Show resolved
Hide resolved
@sekmiller hope it's OK now. |
Thanks @PaulBoon. I was away yesterday, but I will make time today to take a look. |
@PaulBoon One thing that I noticed - if I use the multiple file download button when one of the files I have selected is no longer available due to the retention expiration the warning message refers to "Restricted Files" |
Also can you add a release note that briefly describes the feature and notes that the schema.xml must be updated in the upgrade process. Thanks |
@sekmiller The best solution I can think of is to change these messages in a way that they also cover these files with an expired retention. A multiple selection can contain both types and it would be difficult to adapt the message to that dynamically. |
FWIW: There's a similar issue w.r.t. files that are only available for Globus transfer and I've made changes to these messages and display logic in #10451. I'd suggest adding retention period expiration into those changes messages rather than tweaking the existing ones. (That PR is also in QA - not sure if it can get merged and picked up here in a reasonable time or not.) |
@PaulBoon - FYI: we had a discussion in the daily standup mtg and think the way to go is to get #10451 merged and have you update this PR to include it from develop, and you or I can then make changes. For the changes themselves, I think the idea was to go with the simpler title of "Inaccessible Files Selected" and then include retention period expired as one of the options in the descriptive text. |
@qqmyers Just added the release notes, and as with most of this PR I reused yours. I will wait for that other PR to be merged while I am on a short vacation (just a week). |
…red retention period as possibility
@PaulBoon Hi Paul, Thanks for the updates.
|
This is strange, because I don't see that happening. Maybe your CSS is not the same? If I inspect the page I see that this 'Draft: .." string is inside a |
@sekmiller About the observations:
|
@sekmiller Just commited additions to the input handling of that set-retention API end point.
|
@PaulBoon Thanks for the updates they look good. Your error message for improper json led me to try leaving some data out of a well-formed json object. This is what I found: If I use the api with a json object that doesn't include "fileIds", I get a success message and a record is written to the retention table, but of course there are no records in the datafile table with that retention_id. If I leave out "reason" I get an empty response {} and nothing seems to happen to the retention table. If I leave out "dateUnavailable" I get a proper error message - no change needed here. |
At some point the Dataverse APIs should have some formal specification which we can hopefully use to do some automated 'schema' validation against. This sort of 'manually' handling of all possible issues with the input is cumbersome. But until then... |
Thanks for hanging in there and making the updates. I will merge it shortly. |
What this PR does / why we need it:
The minimal implementation of the retention feature, which will make a file unavailable after the retention period (or end date) has expired. This is opposite to the embargo feature, because that makes it available after the embargo expires.
Which issue(s) this PR closes:
Closes #9375
Special notes for your reviewer:
After the retention expires any (optional) full-text indexing will remain, potentially exposing information via searching that should not be exposed anymore. This PR prevents this by not allowing full-text indexing for files with a retention, even if is has not expired. This is similar to how restricted files are handled. This is not optimal, but save and simple.
Suggestions on how to test this:
The Solr schema.xml has changed, so you need to update this in your existing dev machine.
Copy the schema.xml file; with the docker dev setup this wil be
docker cp conf/solr/9.3.0/schema.xml 3bfcc68124cb:/var/solr/data/collection1/conf/schema.xml
.When you have custom metadata blocks you then need to run the update script, which is in
conf/solr/9.3.0/update-fields.sh
.Let Solr reload that changed schema.xml
curl "http://localhost:8983/solr/admin/cores?action=RELOAD&core=collection1"
Configure the system to allow retention:
Set the minimum at 10 years for example:
curl -X PUT -d 120 http://localhost:8080/api/admin/settings/:MinRetentionDurationInMonths
Add a retention to a file
Make the retention period expire, by changing the database record
On the docker dev box, the command is
psql -U dataverse -W
and the password is 'secret'.Change the 'dateunavailable' to somtime in the past; for example:
UPDATE retention SET dateunavailable='2015-02-28' WHERE id='1';
The id is '1' in this example, but you could check the retention_id field in the datafile table for that specific file.
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
There is a dialog to set the Retention:
When the retention has expired, the Dataset result shows that as a label, like with embargoed.:
Note that this test dataset also has an embargoed file.
When trying to download it will not provide it, but instead show that the retention has expired:
Is there a release notes update needed for this change?:
Yes, The Solr schema must be updated
Additional documentation:
NOTE the flyway script
src/main/resources/db/migration/V6.1.0.99__xxxx-retention.sql
should be renamed depending on the last number when merging.