R package for semantic similarity calculation using non-KB and offline resources #20

Open
hlapp opened this issue Jan 23, 2023 · 3 comments
Labels
project idea Idea(s) for a subgroup project at the event

Comments

@hlapp
Member

hlapp commented Jan 23, 2023

Create an R package for computing pairwise and profile semantic similarity metrics similar to the Rphenoscape package, but instead of using the Phenoscape KB API to obtain subsumers of input terms, use a more general online service (in particular, Ubergraph / Relationgraph queries using SPARQL), and ultimately offline sources (in particular, downloaded relation graph edge tables, and Ubergraph in the form of SemSQL table downloads).
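As a rough illustration of the offline route, the following Python sketch reads a relation-graph-style edge table and collects the subsumers of each term. It rests on several assumptions that are not taken from any actual download: the table is tab-separated with subject, predicate, and object columns; it already contains the transitive closure of the subclass relation; and the `EX:` identifiers and the `load_subsumers` helper are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample of an edge table (subject, predicate, object).
# Assumed to already hold the transitive closure of rdfs:subClassOf.
SAMPLE_EDGES = (
    "EX:fin\trdfs:subClassOf\tEX:appendage\n"
    "EX:fin\trdfs:subClassOf\tEX:anatomical_structure\n"
    "EX:appendage\trdfs:subClassOf\tEX:anatomical_structure\n"
    "EX:fin\tBFO:0000050\tEX:organism\n"  # non-subclass edge, skipped below
)


def load_subsumers(edge_file, predicate="rdfs:subClassOf"):
    """Map each term to the set of its subsumers, including the term itself."""
    subsumers = defaultdict(set)
    for row in csv.reader(edge_file, delimiter="\t"):
        if len(row) != 3:
            continue  # ignore malformed or blank lines
        subject, pred, obj = row
        if pred != predicate:
            continue
        subsumers[subject].add(obj)
        # Every term subsumes itself (reflexive closure).
        subsumers[subject].add(subject)
        subsumers[obj].add(obj)
    return dict(subsumers)


subsumers = load_subsumers(io.StringIO(SAMPLE_EDGES))
```

A real edge table would be opened from disk rather than from a string, and the column layout and predicate identifier would need to be checked against the actual download.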

Rphenoscape includes methods for calculating a variety of both pairwise and profile semantic similarity metrics, but relies on the Phenoscape KB API to obtain subsumers (and, for IC-based metrics, term frequencies). This limits these capabilities to the ontologies that are part of the current KB build. Adding new ontologies to the KB build is both a non-trivial undertaking and beyond the control of researchers outside the SCATE / Phenoscape project. The goal here is to make the semantic similarity algorithms much more easily available for existing or future ontologies that aren't directly used within Phenoscape.
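To make concrete why subsumers and term frequencies are the only inputs needed, here is a minimal Python sketch of two widely used pairwise metrics: Jaccard similarity over subsumer sets and Resnik similarity based on information content (IC). This is not Rphenoscape's implementation; the function names, the toy terms, and the frequency counts are all made up for illustration.

```python
import math


def jaccard(subsumers_a, subsumers_b):
    """Jaccard similarity: shared subsumers divided by all subsumers."""
    union = subsumers_a | subsumers_b
    if not union:
        return 0.0
    return len(subsumers_a & subsumers_b) / len(union)


def information_content(term, freq, corpus_size):
    """IC of a term: negative log of its relative annotation frequency."""
    return -math.log(freq[term] / corpus_size)


def resnik(subsumers_a, subsumers_b, freq, corpus_size):
    """Resnik similarity: IC of the most informative common subsumer."""
    common = subsumers_a & subsumers_b
    return max(
        (information_content(t, freq, corpus_size) for t in common),
        default=0.0,
    )


# Hypothetical toy data: subsumer sets (each including the term itself)
# and made-up annotation counts in a corpus of 10 annotated items.
fin = {"fin", "appendage", "anatomical_structure"}
limb = {"limb", "appendage", "anatomical_structure"}
freq = {"anatomical_structure": 10, "appendage": 4, "fin": 2, "limb": 2}
corpus_size = 10

jaccard_score = jaccard(fin, limb)                   # 2 shared of 4 total
resnik_score = resnik(fin, limb, freq, corpus_size)  # IC of "appendage"
```

Both functions only consume sets and counts, so the subsumers could equally come from a SPARQL query or from a downloaded edge table.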

@hlapp hlapp added the project idea Idea(s) for a subgroup project at the event label Jan 23, 2023
@wdahdul

wdahdul commented Jan 24, 2023

Would modified mutual exclusivity functions be appropriate to add here? Currently mutual exclusivity is returned based on evidence from studies in the KB, but it could potentially query a user-defined dataset. The optional quality_opposites parameter for mutual exclusivity will be user-specified, so opening the overall function to other datasets might make sense.

@hlapp
Member Author

hlapp commented Jan 28, 2023

@wdahdul it's not impossible to add this in some way to ubeRsim, but we'd first need to understand a lot better how we would want to define this.

More specifically, in Rphenoscape mutual exclusivity is determined for phenotypes in the KB. Phenotypes could be in Ubergraph, but presumably only as pre-composed named classes from the requisite phenotype ontologies (such as HPO or MP), and not as the anonymous class expressions we use for annotating natural trait data.

Though perhaps it's time to create a pre-composed phenotype ontology for natural trait data as well. Or try to add our phenotypes to one that's already being developed (OBA? UBERPHENO?). (Thoughts @balhoff ?)

@hlapp
Member Author

hlapp commented Jan 28, 2023

I'm actually moving this to phenoscape/ubeRsim#2. Please add further comments to that issue, or create a new one on the ubeRsim issue tracker.

@phenoscape phenoscape locked as resolved and limited conversation to collaborators Jan 28, 2023