Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

expand makeGRangesFromDataFrame functionality #32

Open
wants to merge 2 commits into
base: devel
Choose a base branch
from
Open

expand makeGRangesFromDataFrame functionality #32

wants to merge 2 commits into from

Conversation

LiNk-NY
Copy link
Contributor

@LiNk-NY LiNk-NY commented Feb 3, 2020

Hi Hervé, @hpages

I thought something like this would be useful.
Let me know what you think.
Particularly for inherits(x, "GenomicRanges"), it could also work
for GRanges but I thought GenomicRanges would be more
general.

if (length(mcols) == 0L && inherits(x, "GenomicRanges"))

Thanks,
Marcel

@LiNk-NY LiNk-NY requested a review from hpages February 3, 2020 23:29
@LiNk-NY LiNk-NY self-assigned this Feb 4, 2020
@hpages
Copy link
Contributor

hpages commented Feb 4, 2020

Thanks Marcel,

I would expect all 3 forms to work:

makeGPosFromDataFrame(data.frame(seqnames=1:6, pos=11:16))
makeGPosFromDataFrame(data.frame(seqnames=1:6, start=11:16))
makeGPosFromDataFrame(data.frame(seqnames=1:6, end=11:16))

but right now only the 1st one works.

I would also expect this to fail but not in such an obscure way:

makeGPosFromDataFrame(data.frame(seqnames=1:6, start=11:16, end=20))
# Error in stop_if_wrong_length("'seqnames'", ans_len) :
#   'seqnames' must have the length of the object to construct (45) or
#  length 1

Seems like we should start by agreeing on what behavior we expect exactly. A good way to do this is to come up with a set of unit tests that cover all the situations we anticipate.

I don't think makeGPosFromDataFrame() should have the start.field and end.field arguments. We only need a pos.field argument to let the user specify which data frame column contains the pos information in case we fail to auto-detect.

Also from a package organization point of view I'd prefer if makeGPosFromDataFrame was next to makeGRangesFromDataFrame i.e. the 2 functions should be implemented in the same file, documented in the same file, and have their unit tests in the same file. Thanks again!

H.

@hpages
Copy link
Contributor

hpages commented Feb 4, 2020

I should add that a GPos is a GRanges object so I wonder if we actually need 2 functions. Given that a GPos object can generally be used in any place where a GRanges is expected it sounds like it would be acceptable if makeGRangesFromDataFrame() was returning a GPos under certain conditions. We could add an argument, say as, to provide more control. It would be set to "auto" by default, but the user could set it to "GRanges" or "GPos" to enforce one type or the other.

@LiNk-NY LiNk-NY changed the title add helper makeGPosFromDataFrame expand makeGRangesFromDataFrame functionality Feb 4, 2020
@LiNk-NY
Copy link
Contributor Author

LiNk-NY commented Feb 4, 2020

Hi Hervé, @hpages
I've updated the PR to reflect your feedback.
I hacked the index localization a bit in R/makeGRangesFromDataFrame.R
so that GPos is supported.
Thanks
-MR

@LiNk-NY
Copy link
Contributor Author

LiNk-NY commented Feb 11, 2020

Hi Hervé, @hpages
I know you're busy but when you have a chance let me know
if there is anything else that you'd like me to add. Thanks.
-Marcel

@liutiming
Copy link

liutiming commented Apr 14, 2021

I was thinking of this functionality too. However, I do have difficulty in remembering the functions/arguments in many Bioconductor packages. Should makeGRangesFromDataFrame have a logic that when pos exists as a column, it sets both start and end according to that column (if not otherwise specified) and display a warning?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants