Build scheduled Lambda function to search NIH/DOE for related works #60

Open
briri opened this issue Oct 2, 2023 · 3 comments

@briri
Collaborator

briri commented Oct 2, 2023

We need to build a fairly generic Lambda (aka ApiScheduler) that can be used to schedule runs of our different API searches.

Consider having this Lambda fetch DMP IDs that are scan candidates:

  • Ones with no grant id (no grant_id and funding_status: 'planned')
  • Ones approaching their project end date (or n-months past the start date?)
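
A minimal sketch of how the candidate check above might look. The flat record layout (`dmp_id`, `grant_id`, `funding_status`, `project_end`) and the 90-day look-ahead window are assumptions made for illustration, not the actual DMP record schema.

```python
from datetime import date, timedelta


def is_scan_candidate(dmp: dict, today: date, end_window_days: int = 90) -> bool:
    """Return True when a (hypothetical, flattened) DMP record is worth scanning."""
    # Criterion 1: no grant id yet and the funding is still only planned.
    if not dmp.get("grant_id") and dmp.get("funding_status") == "planned":
        return True
    # Criterion 2: the project end date falls inside the look-ahead window.
    project_end = dmp.get("project_end")
    if project_end:
        end_date = date.fromisoformat(project_end)
        return today <= end_date <= today + timedelta(days=end_window_days)
    return False


def scan_candidates(dmps: list, today: date) -> list:
    """Collect the DMP IDs of every record that passes the check."""
    return [dmp["dmp_id"] for dmp in dmps if is_scan_candidate(dmp, today)]
```

The "n-months past the start date" variant could be added as a third criterion in the same function once we settle on a value for n.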

Then have this Lambda use EventBridge messaging to trigger asynchronous scans in a second Lambda function (aka ApiScanner), which will perform the following:

  • Call the specified API for the designated DMP IDs
  • If the item has an opportunity number that was clearly awarded to another project, mark it as funding_status: 'rejected'
  • Add the grant_id and set funding_status: 'granted' for plans whose award is found
  • Add dmproadmap_related_identifiers
  • Add PIs if not already present

All of the above changes should be placed into the dmphub_modifications section of the record!
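
A rough sketch of the ApiScheduler to ApiScanner hand-off, assuming one EventBridge event per DMP ID. The `Source`, `DetailType`, bus name and detail payload values are placeholders, not existing conventions in this project. The entries are grouped in tens because EventBridge PutEvents accepts at most 10 entries per request.

```python
import json

MAX_ENTRIES_PER_CALL = 10  # PutEvents limit on entries per request


def build_scan_entries(api_name: str, dmp_ids: list, bus_name: str = "example-event-bus") -> list:
    """Build one PutEvents-style entry per DMP ID (all values are hypothetical)."""
    return [
        {
            "Source": "example.api_scheduler",  # placeholder source name
            "DetailType": "ApiScan",            # placeholder detail type
            "Detail": json.dumps({"api": api_name, "dmp_id": dmp_id}),
            "EventBusName": bus_name,
        }
        for dmp_id in dmp_ids
    ]


def batches(entries: list, size: int = MAX_ENTRIES_PER_CALL):
    """Yield successive slices that fit in a single PutEvents request."""
    for start in range(0, len(entries), size):
        yield entries[start:start + size]
```

Each batch could then be sent with the boto3 `events` client, e.g. `client.put_events(Entries=batch)`, and an EventBridge rule would route the events to ApiScanner.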

@briri
Collaborator Author

briri commented Oct 5, 2023

Update the dmphub_modifications array so that each entry includes a confidence score as well as a note about how the match was determined (e.g. 'matched on grant_id and PI names').
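
Something like the following could work as the shape of an entry. Only `confidence` and `note` come from the description above; the other keys (`provenance`, `timestamp`, `changes`) are hypothetical names used for illustration.

```python
from datetime import datetime, timezone


def build_modification(provenance: str, changes: dict, confidence: str, note: str) -> dict:
    """Assemble one dmphub_modifications entry (key names are assumptions)."""
    return {
        "provenance": provenance,  # which API/harvester proposed the change
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "confidence": confidence,  # the confidence score for the match
        "note": note,              # how the match was determined
        "changes": changes,        # proposed values, e.g. grant_id or funding_status
    }


def append_modification(dmp: dict, modification: dict) -> dict:
    """Add the entry to the record, creating the array if it is missing."""
    dmp.setdefault("dmphub_modifications", []).append(modification)
    return dmp
```

For example, `append_modification(dmp, build_modification("nih_awards_api", {"funding_status": "granted"}, "High", "matched on grant_id and PI names"))` leaves the rest of the record untouched and only appends the proposed change.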

@mariapraetzellis, let me know what you think about the following.

Here is preliminary logic for generating a confidence score:

Confidence levels:

  • Auto - We are confident enough that no human intervention is required (humans can still reject the match if needed)
  • High - We are 75%+ certain that the items are related, but human confirmation is needed
  • Med - We are 50% - 75% certain that the items are related; human confirmation is definitely needed
  • Low - We are less than 50% certain, so human confirmation is required
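
As a sketch, the levels above could be derived from a certainty value between 0 and 1. Two assumptions here: exactly 75% is treated as High (the ranges overlap at that point), and Auto is driven by an exact identifier match rather than by a percentage, since no threshold is given for it.

```python
def confidence_level(certainty: float, exact_identifier_match: bool = False) -> str:
    """Map a certainty in [0, 1] to one of the four confidence levels."""
    if not 0.0 <= certainty <= 1.0:
        raise ValueError("certainty must be between 0 and 1")
    if exact_identifier_match:  # assumption: e.g. a grant ID or DMP ID match
        return "Auto"
    if certainty >= 0.75:
        return "High"
    if certainty >= 0.5:
        return "Med"
    return "Low"


def needs_human_confirmation(level: str) -> bool:
    """Everything except Auto waits for a person to confirm it."""
    return level != "Auto"
```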

Related Works (e.g. detected by DataCite EventData, OpenAlex, etc.):

Confidence         Match type
--------------------------------------------------------------------------------------------------
Auto               Grant ID match (in a scenario where the external system had record of the grant ID)
Auto               DMP ID match (in a scenario where the external system had record of the DMP ID)

High               1+ PI ORCID matches, funders match and title/abstract keywords match
High               1+ PI ROR matches, funders match and title/abstract keywords match
High               Funders match, repository match and title/abstract keywords match
High               1+ PI names match, repository match, output type match and title/abstract keywords match

Med                1+ PI names match and title/abstract keywords match
Med                Funders match and title/abstract keywords match
Med                one or more PI names match and title/abstract keywords match

Low                1+ PI names and title/abstract keywords match
Low                Funders match, repository match and title/abstract keywords match
Low                Funders match and title/abstract keywords match
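
One way the Related Works table could be encoded is as an ordered rule list where the first rule whose criteria are all satisfied wins. The criterion names (`pi_orcid`, `keywords`, etc.) are made up for this sketch. The Low rows repeat combinations that already appear under High or Med, so with first-match-wins ordering they would never be reached and are left out here until we decide what distinguishes them.

```python
from typing import Optional

# (confidence, criteria that must all have matched), highest confidence first
RELATED_WORK_RULES = [
    ("Auto", {"grant_id"}),
    ("Auto", {"dmp_id"}),
    ("High", {"pi_orcid", "funder", "keywords"}),
    ("High", {"pi_ror", "funder", "keywords"}),
    ("High", {"funder", "repository", "keywords"}),
    ("High", {"pi_name", "repository", "output_type", "keywords"}),
    ("Med", {"pi_name", "keywords"}),
    ("Med", {"funder", "keywords"}),
]


def score_related_work(matched: set) -> Optional[tuple]:
    """Return (confidence, note) for the first satisfied rule, or None."""
    for level, required in RELATED_WORK_RULES:
        if required <= matched:  # every required criterion was matched
            return level, "matched on " + ", ".join(sorted(required))
    return None
```

The returned note could feed the "how the match was determined" text that goes into dmphub_modifications.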

Grant Information (e.g. detected by NIH Awards API, etc.):

Confidence         Match type
--------------------------------------------------------------------------------------------------
High               Funder match, opportunity number match and 1+ PI names match
High               Funder match, title exact match, 1+ PI names match
High               Funder match, 1+ PI names match, project start/end match and title/abstract keywords match

Med                Funder match, 1+ PI names match and title/abstract keywords match
Med                Funder match, 1+ PI names match and project start/end match
Med                Funder match, opportunity number match and title/abstract keywords match

Low                Funder match, 1+ PI names match and project start/end match
Low                Funder match, project start/end match and title/abstract keywords match

@briri
Collaborator Author

briri commented Dec 11, 2023

The API functionality is there and is being used by the new React UI. We just need to build a schedulable Lambda that will call that functionality to search for info. We will do that once we've identified a scanning pattern that reduces the burden on the external API (see issues #66 and #77).

@briri
Collaborator Author

briri commented Jul 22, 2024

We could add a new harvester Lambda that searches for award information using the logic we built for the pilot project's funder project search page.
Any discovered information could be added to the HARVESTER_MODS item in the DynamoDB table for a DMP and then tied into future curation workflows in the new system.
