Add Epitope data from TDC #96
I added the Epitope data from TDC: https://tdcommons.ai/single_pred_tasks/epitope/ I need help validating my approach. In particular, we need to check whether the indices in the epitope active-binding column start at 0 or 1.
for more information, see https://pre-commit.ci
@phalem Thanks a lot for your efforts; it is really great to have you take the lead on the datasets; we could only have that much progress thanks to your contributions. Do you think that makes sense? If you want help with that or to discuss this, I'm happy to do so.
Thanks, Kevin, for your kind words. I think the problem arises from the description that TDC provides. This can be solved by break
Add benchmark field
data/iedb_jespersen_et_al/meta.yaml
Outdated
units: ''
type: categorical
names:
- amino acids sequence active in binding
so the target is a sequence and the input is the full sequence?
Yeah, I forgot to comment on that. The input is a sequence and the target is a sequence as well. We could make it categorical, but there would be a lot of categories. I don't know what your opinion on that is.
Yes, this is how it is currently implemented. I set it up to give only the corresponding epitope sequence, without the _ for the other parts of the sequence.
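For illustration only, here is a minimal Python sketch of the two target representations discussed here. It assumes each record carries the full antigen sequence plus a list of 0-based positions of the epitope residues; the function names and the example sequence are hypothetical and not taken from this PR.

```python
def epitope_residues(sequence, indices):
    """Return only the residues at the given 0-based positions, in sequence order."""
    return "".join(sequence[i] for i in sorted(indices))


def masked_epitope(sequence, indices, mask="_"):
    """Return a full-length string with every non-epitope residue replaced by `mask`."""
    keep = set(indices)
    return "".join(aa if i in keep else mask for i, aa in enumerate(sequence))


# Hypothetical toy record (not real IEDB data).
seq = "MKTAYIAK"
idx = [1, 2, 5]

print(epitope_residues(seq, idx))  # KTI
print(masked_epitope(seq, idx))    # _KT__I__
```

The first form matches the "epitope sequence only" target described in this comment; the second keeps positional information at the cost of a longer target string.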
Once more: Thanks 💯
(And similar comments)
As the data was complex and obtained in an indirect way, I didn't implement a split.
Before I forget, @kjappelbaum: we need someone to check whether the indices start at 0 or 1, as the two conventions give different epitope sequences.
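One possible way to check this, sketched under the assumption that the data is available as parallel lists of sequences and index lists: look for boundary evidence. An index equal to 0 can only occur with 0-based indexing, and an index equal to the sequence length can only occur with 1-based indexing. The function name is hypothetical, and the heuristic returns `None` when no record touches either boundary.

```python
def infer_index_base(sequences, index_lists):
    """Infer whether residue indices are 0- or 1-based from boundary evidence.

    Returns 0, 1, or None if no index touches either end of its sequence.
    Raises ValueError if the evidence is contradictory.
    """
    saw_zero = False    # some index equals 0 -> only valid if 0-based
    saw_length = False  # some index equals len(seq) -> only valid if 1-based
    for seq, indices in zip(sequences, index_lists):
        if not indices:
            continue
        if min(indices) == 0:
            saw_zero = True
        if max(indices) == len(seq):
            saw_length = True
    if saw_zero and saw_length:
        raise ValueError("indices are inconsistent with both 0- and 1-based indexing")
    if saw_zero:
        return 0
    if saw_length:
        return 1
    return None


# Hypothetical toy inputs (not real IEDB data).
print(infer_index_base(["ACD"], [[0, 2]]))  # 0
print(infer_index_base(["ACD"], [[1, 3]]))  # 1
print(infer_index_base(["ACD"], [[1, 2]]))  # None
```

If the result is `None`, comparing a few extracted epitopes against the original source records would still be needed.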
@MicPie we need to carefully look into this. I don't think this is a simple classification/regression task.
Also need to check how this overlaps with #67