Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: issues in exporters and citations for PermaLink/non-DOI PIDs #10790

Open
wants to merge 16 commits into
base: develop
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 8 additions & 7 deletions src/main/java/edu/harvard/iq/dataverse/DataCitation.java
Original file line number Diff line number Diff line change
Expand Up @@ -293,11 +293,13 @@ public void writeAsBibtexCitation(OutputStream os) throws IOException {
out.write("version = {");
out.write(version);
out.write("},\r\n");
out.write("doi = {");
out.write(persistentId.getAuthority());
out.write("/");
out.write(persistentId.getIdentifier());
out.write("},\r\n");
if("doi".equals(persistentId.getProtocol())) {
out.write("doi = {");
out.write(persistentId.getAuthority());
out.write("/");
out.write(persistentId.getIdentifier());
out.write("},\r\n");
}
out.write("url = {");
out.write(persistentId.asURL());
out.write("}\r\n");
Expand Down Expand Up @@ -619,8 +621,7 @@ private void createEndNoteXML(XMLStreamWriter xmlw) throws XMLStreamException {
}
if (persistentId != null) {
xmlw.writeStartElement("electronic-resource-num");
String electResourceNum = persistentId.getProtocol() + "/" + persistentId.getAuthority() + "/"
+ persistentId.getIdentifier();
String electResourceNum = persistentId.asString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has it been verified that an initial : after protocol is OK for EndNote import? It seems like the original code used / because ICPSR did. Can someone look at what EndNote exports and make sure we match it here? @adam3smith says he can check this next week if no one else has access.

Copy link
Contributor Author

@vera vera Aug 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I wondered about that slash there. I tried to find some info on the EndNote XML format but found only this schema, which didn't specify anything about the format of electronic-resource-num, so I thought it might have been a mistake to use the slash. It would be great if someone could check.

xmlw.writeCharacters(electResourceNum);
xmlw.writeEndElement();
}
Expand Down
16 changes: 15 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/DvObject.java
Original file line number Diff line number Diff line change
Expand Up @@ -144,12 +144,14 @@ public String visit(DataFile df) {
@Column(insertable = false, updatable = false) private String dtype;

/*
* Add DOI related fields
* Add PID related fields
*/

private String protocol;
private String authority;

private String separator;

@Temporal(value = TemporalType.TIMESTAMP)
private Date globalIdCreateTime;

Expand Down Expand Up @@ -324,6 +326,16 @@ public void setAuthority(String authority) {
globalId=null;
}

public String getSeparator() {
return separator;
}

public void setSeparator(String separator) {
this.separator = separator;
//Remove cached value
globalId=null;
}

public Date getGlobalIdCreateTime() {
return globalIdCreateTime;
}
Expand Down Expand Up @@ -354,11 +366,13 @@ public void setGlobalId( GlobalId pid ) {
if ( pid == null ) {
setProtocol(null);
setAuthority(null);
setSeparator(null);
setIdentifier(null);
} else {
//These reset globalId=null
setProtocol(pid.getProtocol());
setAuthority(pid.getAuthority());
setSeparator(pid.getSeparator());
setIdentifier(pid.getIdentifier());
}
}
Expand Down
4 changes: 4 additions & 0 deletions src/main/java/edu/harvard/iq/dataverse/GlobalId.java
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,10 @@ public String getAuthority() {
return authority;
}

public String getSeparator() {
return separator;
}

public String getIdentifier() {
return identifier;
}
Expand Down
11 changes: 10 additions & 1 deletion src/main/java/edu/harvard/iq/dataverse/api/dto/DatasetDTO.java
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,7 @@ public class DatasetDTO implements java.io.Serializable {
private String identifier;
private String protocol;
private String authority;
private String separator;
private String globalIdCreateTime;
private String publisher;
private String publicationDate;
Expand Down Expand Up @@ -51,6 +52,14 @@ public void setAuthority(String authority) {
this.authority = authority;
}

public String getSeparator() {
return separator;
}

public void setSeparator(String separator) {
this.separator = separator;
}

public String getGlobalIdCreateTime() {
return globalIdCreateTime;
}
Expand Down Expand Up @@ -94,7 +103,7 @@ public void setPublicationDate(String publicationDate) {

@Override
public String toString() {
return "DatasetDTO{" + "id=" + id + ", identifier=" + identifier + ", protocol=" + protocol + ", authority=" + authority + ", globalIdCreateTime=" + globalIdCreateTime + ", datasetVersion=" + datasetVersion + ", dataFiles=" + dataFiles + '}';
return "DatasetDTO{" + "id=" + id + ", identifier=" + identifier + ", protocol=" + protocol + ", authority=" + authority + ", separator=" + separator + ", globalIdCreateTime=" + globalIdCreateTime + ", datasetVersion=" + datasetVersion + ", dataFiles=" + dataFiles + '}';
}

public void setMetadataLanguage(String metadataLanguage) {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -162,11 +162,12 @@ private static void createStdyDscr(XMLStreamWriter xmlw, DatasetDTO datasetDto)

String persistentAuthority = datasetDto.getAuthority();
String persistentId = datasetDto.getIdentifier();
String persistentSeparator = datasetDto.getSeparator();

String pid = persistentProtocol + ":" + persistentAuthority + "/" + persistentId;
String pid = persistentProtocol + ":" + persistentAuthority + persistentSeparator + persistentId;
String pidUri = pid;
//Some tests don't send real PIDs - don't try to get their URL form
if(!pidUri.equals("null:null/null")) {
if(!pidUri.equals("null:nullnullnull")) {
pidUri= PidUtil.parseAsGlobalID(persistentProtocol, persistentAuthority, persistentId).asURL();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you avoid line 167 by checking for parseAsGlobalId returning null here and setting pid/pidUri using GlobalId if it exists and to the constant "null:nullnullnull" if it doesn't? (Mostly a minor style thing but it avoids writing out unexpected invalid identifier entries as well.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

}
// The "persistentAgency" tag is used for the "agency" attribute of the
Expand Down Expand Up @@ -337,14 +338,15 @@ private static void writeDocDescElement (XMLStreamWriter xmlw, DatasetDTO datase

String persistentAuthority = datasetDto.getAuthority();
String persistentId = datasetDto.getIdentifier();
String persistentSeparator = datasetDto.getSeparator();

xmlw.writeStartElement("docDscr");
xmlw.writeStartElement("citation");
xmlw.writeStartElement("titlStmt");
writeFullElement(xmlw, "titl", dto2Primitive(version, DatasetFieldConstant.title), datasetDto.getMetadataLanguage());
xmlw.writeStartElement("IDNo");
writeAttribute(xmlw, "agency", persistentAgency);
xmlw.writeCharacters(persistentProtocol + ":" + persistentAuthority + "/" + persistentId);
xmlw.writeCharacters(persistentProtocol + ":" + persistentAuthority + persistentSeparator + persistentId);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same - can you parse the authority/separator/identifier and use GlobalId.asString() ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

xmlw.writeEndElement(); // IDNo
xmlw.writeEndElement(); // titlStmt
xmlw.writeStartElement("distStmt");
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -397,6 +397,7 @@ public static JsonObjectBuilder json(Dataset ds, Boolean returnOwners) {
.add("persistentUrl", ds.getPersistentURL())
.add("protocol", ds.getProtocol())
.add("authority", ds.getAuthority())
.add("separator", ds.getSeparator())
.add("publisher", BrandingUtil.getInstallationBrandName())
.add("publicationDate", ds.getPublicationDateFormattedYYYYMMDD())
.add("storageIdentifier", ds.getStorageIdentifier());
Expand Down
3 changes: 3 additions & 0 deletions src/main/resources/db/migration/V6.3.0.2.sql
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
ALTER TABLE dvobject ADD COLUMN IF NOT EXISTS separator character varying(255) DEFAULT '';

UPDATE dvobject SET separator='/' WHERE protocol = 'doi' OR protocol = 'hdl';
4 changes: 2 additions & 2 deletions src/test/java/edu/harvard/iq/dataverse/DataCitationTest.java
Original file line number Diff line number Diff line change
Expand Up @@ -266,7 +266,7 @@ public void testToEndNoteString_withTitleAndAuthor() throws ParseException {
"<edition>V1</edition>" +
"<publisher>LibraScholar</publisher>" +
"<urls><related-urls><url>https://doi.org/10.5072/FK2/LK0D1H</url></related-urls></urls>" +
"<electronic-resource-num>doi/10.5072/FK2/LK0D1H</electronic-resource-num>" +
"<electronic-resource-num>doi:10.5072/FK2/LK0D1H</electronic-resource-num>" +
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's weird that this old test expected doi/10... instead of doi:10.... It feels like this PR is cleaning things up! ❤️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@qqmyers commented above that this slash might be required by EndNote, although it does look weird to me 🤔

"</record>" +
"</records>" +
"</xml>";
Expand Down Expand Up @@ -295,7 +295,7 @@ public void testToEndNoteString_withoutTitleAndAuthor() throws ParseException {
"<edition>V1</edition>" +
"<publisher>LibraScholar</publisher>" +
"<urls><related-urls><url>https://doi.org/10.5072/FK2/LK0D1H</url></related-urls></urls>" +
"<electronic-resource-num>doi/10.5072/FK2/LK0D1H</electronic-resource-num>" +
"<electronic-resource-num>doi:10.5072/FK2/LK0D1H</electronic-resource-num>" +
"</record>" +
"</records>" +
"</xml>";
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
"persistentUrl": "https://doi.org/10.5072/FK2/WKUKGV",
"protocol": "doi",
"authority": "10.5072/FK2",
"separator": "/",
"publisher": "Root",
"publicationDate": "2020-02-19",
"datasetVersion": {
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
"persistentUrl": "https://doi.org/10.5072/FK2/PCA2E3",
"protocol": "doi",
"authority": "10.5072/FK2",
"separator": "/",
"metadataLanguage": "en",
"datasetVersion": {
"id": 2,
Expand Down
Loading