Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

propagate mcols() through setOps where possible #12

Open
rcorty opened this issue Apr 10, 2019 · 6 comments
Open

propagate mcols() through setOps where possible #12

rcorty opened this issue Apr 10, 2019 · 6 comments

Comments

@rcorty
Copy link

rcorty commented Apr 10, 2019

Just an idea -- where possible, propagate mcols().

For example,

mcols(c(a,b)) = rbind(mcols(a), mcols(b))

(IRanges already does this)

mcols(intersect(a,b))[i, ] = cbind(mcols(a)[j, ], mcols(b)[k, ])

where j is the row in a that gave rise to row i in the intersect and k is the same for b.

mcols(setdiff(a, b))[i, ] = mcols(a)[j, ]

where j is the row in a that gave rise to row i in the setdiff.

I don't think this idea applies to union, since a single output row doesn't necessarily map back to a single row in each input IRanges.

I think similar logic could apply to findOverlaps methods.

@hpages
Copy link
Contributor

hpages commented Apr 10, 2019

Unfortunately this idea doesn't apply to intersect or setdiff either because for these operations too a range in the output doesn't necessarily map back to a single range in each input object. This happens for example when some ranges have overlaps within the individual input objects:

> library(IRanges)
> a <- IRanges(c(2, 11), c(10, 15))
> b <- IRanges(c(8, 11), c(12, 20))
> intersect(a, b)
IRanges object with 1 range and 0 metadata columns:
          start       end     width
      <integer> <integer> <integer>
  [1]         8        15         8

This is because the ranges in the input object are conceptually reduced before the intersection is computed i.e. everything happens as if intersect(reduce(a), reduce(b)) was performed.

This idea doesn't apply to findOverlaps either for the same reason.

@rcorty
Copy link
Author

rcorty commented Apr 10, 2019

Ah, good point. Could/should there be a separate method for NormalRanges that does this propagation? If I'm understanding correctly, the funny business arises only in situations where the inputs aren't normal?

@rcorty
Copy link
Author

rcorty commented Apr 10, 2019

If the maintainers believe this is a good idea, I'll be happy to take a stab at it.

@hpages
Copy link
Contributor

hpages commented Apr 10, 2019

Sounds good to me. Just to clarify, we want to propagate the metadata columns from the left input object (x) for intersect() and setdiff(). So we only need to overwrite the intersect,IntegerRanges,IntegerRanges and setdiff,IntegerRanges,IntegerRanges methods with intersect,NormalIRanges,IntegerRanges and intersect,NormalIRanges,IntegerRanges methods. Yes this introduces a small dissymmetry in intersect() but propagating from the first argument is consistent with many other operations (and it's easy enough for the user to do intersect(b, a) instead of intersect(a, b) in case b is the NormalIRanges object). Also these new methods should probably return a NormalIRanges object. Thanks!

@rcorty
Copy link
Author

rcorty commented Apr 10, 2019

You're very welcome -- the IRanges package has been extremely useful to me.

@hpages
Copy link
Contributor

hpages commented Apr 10, 2019

One more thing, and this is just to put as much things as possible on the table, I should mention that reduce() and a few other inter-range transformations (e.g. disjoin() and range()) support the with.revmap argument for mapping back the output ranges to the input ranges. It sounds like set ops union(), intersect(), and setdiff() could also support this argument to map back the output ranges to the ranges in their first argument. If we had this, it would be trivial to implement the new intersect and setdiff methods discussed above. But unfortunately we don't have this :-/ So I think the easiest way for you is to use the heavy findOverlaps() hammer to compute this mapping. Unless you can think of something else...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants