Task: Evaluate QDR's HEAL metadata block for possible HDV integration #235

Open
Tracked by #174
cmbz opened this issue Apr 30, 2024 · 7 comments

@cmbz
Contributor

cmbz commented Apr 30, 2024

Overview

  • Evaluate the existing HEAL metadata block from QDR for possible HDV integration

Resources

@cmbz cmbz added Project: NIH GREI Tasks related to the NIH GREI project GREI 2 Consistent Metadata labels Apr 30, 2024
@cmbz
Contributor Author

cmbz commented May 7, 2024

Status: May 2024

  • @jggautier started gathering information from stakeholders, including managers of Harvard Dataverse and QDR, to learn more about their work with representatives from HEAL and about how they plan to use the metadata block.
  • @landreev created a Dataverse instance with the metadata block so that @sbarbosadataverse and anyone else can review it.

@cmbz
Contributor Author

cmbz commented May 28, 2024

Status: June 2024

@jggautier

jggautier commented Jun 18, 2024

@sbarbosadataverse asked me to summarize what we know and what we need to learn about the goals of the HEAL metadata block and how QDR and Harvard Dataverse plan to support those goals. Most of this is based on what I could find on the HEAL Data Platform site (https://healdata.org/landing) and on an email conversation with Sebastian Karcher, Jim Myers, Sonia and Ceilyn:

  • The HEAL Data Platform indexes data from other repositories to improve discovery and access to the data of studies that are funded by HEAL or relevant to HEAL.
  • It also supports the discovery of "registered studies", which I'm taking to mean published plans to do research, so there may not be research data yet, and the data might be published in a different research object and with a different persistent ID.
  • HEAL's selection guide page at https://www.healdatafair.org/resources/guidance/selection lists Harvard Dataverse and QDR as HEAL-compliant repositories, which it defines as "a data repository that is NIH-supported and ideally has an API for metadata and data permissions calls".
  • Sonia's spoken with folks from HEAL.
  • Sebastian wrote that the metadata block was created "mostly automatically", using a script that reads the HEAL Study Metadata JSON and generates the metadata block from it.
  • The script, info about how it works, and the metadata block TSV file are in QDR's GitHub repository at https://github.com/QualitativeDataRepository/heal-json
  • A Dataverse test instance with the metadata block has been at http://ec2-18-209-237-44.compute-1.amazonaws.com since mid-May 2024. Anyone can create an account there and create a dataset to see what the metadata block looks like.
  • Sebastian wrote that they "found some glitches with existing HEAL metadata which doesn’t appear to follow its own schema" and that "the metadatablock doesn’t look great when published: The only way to make that happen is to get rid of a lot of the hierarchies that are built into HEAL, which we decided to keep to facilitate conversion – but there are definitely alternative approaches to take that will look nicer on DV"
  • As far as I could tell, QDR hasn't yet used the metadata block to publish any outputs from research that is funded by HEAL or relevant to HEAL.
  • Sonia said that she's talking with people who are interested in using Harvard Dataverse to publish their HEAL-funded or HEAL-relevant research outputs, so Harvard Dataverse is already working directly with researchers who need to publish those outputs.
  • Sonia wrote that HEAL will likely publish content across all NIH-GREI repositories.
  • The HEAL Platform Documentation includes a Data File Download page about how they're hosting some dataset files, too, and making those available through their portal as long as the byte size of each file in the dataset is under 250 MB
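
For illustration only, here is a rough idea of how a script might turn a JSON schema into metadata block field rows. This is not QDR's script: the function names, the sample schema fragment, and the reduced set of four columns are all assumptions made for this sketch. A real Dataverse metadata block TSV has more columns and separate `#metadataBlock` and `#controlledVocabulary` sections.

```python
def schema_to_rows(properties, parent=""):
    """Flatten JSON-schema-style properties into field rows (hypothetical helper).

    Objects with their own properties become compound parents; their
    properties become child rows that point back to the parent by name.
    """
    rows = []
    for name, spec in properties.items():
        is_array = spec.get("type") == "array"
        # For arrays, the shape of the values is described under "items".
        item = spec.get("items", {}) if is_array else spec
        children = item.get("properties")
        rows.append({
            "name": name,
            # Assumption: compound parents use fieldType "none", leaves use "text".
            "fieldType": "none" if children else "text",
            "allowmultiples": is_array,
            "parent": parent,
        })
        if children:
            rows.extend(schema_to_rows(children, parent=name))
    return rows


def rows_to_tsv(rows):
    """Serialize rows as tab-separated lines with a reduced, illustrative header."""
    header = ["name", "fieldType", "allowmultiples", "parent"]
    lines = ["\t".join(header)]
    for row in rows:
        lines.append("\t".join([
            row["name"],
            row["fieldType"],
            "TRUE" if row["allowmultiples"] else "FALSE",
            row["parent"],
        ]))
    return "\n".join(lines)


# Hypothetical schema fragment, not taken from the HEAL Study Metadata JSON.
example_schema = {
    "study": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "keywords": {"type": "array", "items": {"type": "string"}},
        },
    }
}

print(rows_to_tsv(schema_to_rows(example_schema)))
```

Keeping every nested object as a parent field, as this sketch does, preserves the source hierarchy at the cost of a more cluttered display, which is the trade-off Sebastian describes above.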

Questions

  • Do QDR and Harvard Dataverse plan to support the publication of registered studies, which I'm taking to mean published plans to do research?
  • Has QDR or Harvard Dataverse spoken with HEAL about how they plan to or might index outputs published in those repositories? For example, have they spoken about the use of Dataverse APIs for getting the metadata published in QDR or Harvard Dataverse?
  • Does HEAL plan to download files from QDR and Harvard Dataverse to make them available on the HEAL Platform?
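
To make the last two questions more concrete, here is a minimal sketch of what an indexer might do with the Dataverse native API: build the URL for a dataset's JSON metadata, then check which files exceed the 250 MB limit mentioned above. Nothing here reflects what HEAL actually does or plans to do. The helper names, the example host, and the DOI are placeholders, and reading "250 MB" as 250 × 10^6 bytes is an assumption.

```python
from __future__ import annotations

from urllib.parse import urlencode

# Assumption: "250 MB" means 250 * 10**6 bytes; HEAL's documentation may mean something else.
HEAL_FILE_LIMIT_BYTES = 250 * 10**6


def dataset_url(base_url: str, persistent_id: str) -> str:
    """Build the native API URL for a dataset's JSON metadata (hypothetical helper)."""
    query = urlencode({"persistentId": persistent_id})
    return f"{base_url.rstrip('/')}/api/datasets/:persistentId/?{query}"


def oversized_files(dataset_json: dict, limit: int = HEAL_FILE_LIMIT_BYTES) -> list[str]:
    """Return names of files in the latest version that are larger than `limit` bytes.

    Assumes the response shape data -> latestVersion -> files -> dataFile,
    with `filename` and `filesize` keys on each dataFile.
    """
    files = dataset_json["data"]["latestVersion"]["files"]
    return [
        entry["dataFile"]["filename"]
        for entry in files
        if entry["dataFile"]["filesize"] > limit
    ]


# Placeholder host and DOI, for illustration only.
print(dataset_url("https://dataverse.example.org", "doi:10.5072/FK2/EXAMPLE"))
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the parsed JSON to `oversized_files` would show which files a portal with that size limit could not host.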

@sbarbosadataverse

SB and Julian followed up on this issue. SB will follow up with HEAL and QDR individually on the metadata block, and with HEAL on the other questions from the June 18 comment.

@sbarbosadataverse

We are waiting on a meeting, likely in October, to discuss a HEAL collection in HDV as well as the metadata block.

@cmbz
Contributor Author

cmbz commented Oct 10, 2024

October 2024

  • Met with members of the HEAL team to gain insight into their metadata and data support needs. They plan to review the Dataverse API and other documentation, then make a proposal for how we could proceed with support.

@qqmyers
Member

qqmyers commented Dec 20, 2024

FYI: One thing I've noticed about the HEAL block from QDR is that it allows multiple values in child fields, which isn't really supported on the display page. The ones I've seen are controlled vocab, and on the edit pane you can select more than one value. However, on the display page, the combined display for the parent field only shows one value per child field. This probably needs to be fixed, assuming the requirement is real. It's probably easy to just update the display code to support multiple values (versus trying to restructure the block to only use multiple vals in primitive fields).
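
As a language-neutral sketch of the display fix described above (this is not Dataverse's display code, and the function and field names are hypothetical), the idea is to treat every child value as a list and join all of its entries rather than showing only one:

```python
def render_compound(children, child_sep="; ", value_sep=", "):
    """Render a compound field, showing every value of multi-valued child fields.

    `children` maps child field names to either a single value or a list of values.
    """
    parts = []
    for value in children.values():
        # Normalize single values to a one-element list so both cases share one path.
        values = value if isinstance(value, list) else [value]
        parts.append(value_sep.join(str(v) for v in values))
    return child_sep.join(parts)


# Hypothetical child fields: one multi-valued controlled vocabulary, one single value.
print(render_compound({"studyType": ["Clinical", "Observational"], "phase": "II"}))
```

Normalizing to a list keeps the single-value case unchanged, so existing blocks would display as before.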
