Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Propagate mcols NormalIRanges #13

Closed
wants to merge 4 commits into from
Closed

Propagate mcols NormalIRanges #13

wants to merge 4 commits into from

Conversation

rcorty
Copy link

@rcorty rcorty commented Apr 11, 2019

As discussed in #12

For intersect, I implemented two new signatures:

  1. NormalIRanges, NormalIRanges -- can propagate mcols of none (default), both, x, or y
  2. NormalIRanges, IntegerRanges -- can propagate mcols of none (default) or x

I wrote some tests for these new signatures and they pass

For setdiff, I implemented one new signature:

  1. NormalIRanges, IntegerRanges -- can propagate mcols of none (default) or x

I wrote tests for this new signature, which currently fail. (https://github.com/rcorty/IRanges/blob/aaa83ff9fb7fcc54d2c78029839f1a45375ebcf4/inst/unitTests/test_setops-methods.R#L71-L84) I haven't been able to figure out why. When I run the individual commands in the function definition on the inputs, the right result is produced, but inside the package, it doesn't work. Can someone more experienced have a look? I wonder if it has to do with getting the call dispatched to the correct signature?

NormalIRanges, NormalIRanges can propagate mcols of none, x, y, or both (default none)
NormalIRanges, IntegerRanges can propagate mols of x or none (default none)
NormalIRanges, IntegerRanges can propagate mcols of x or none — defaults to none

not working — new tests fail and I can’t figure out why — appears that the new function _does not work correctly_ as demonstrated in the new test (ie. not a problem with the test)
@hpages
Copy link
Contributor

hpages commented Apr 13, 2019

The current proposal with the propagate.mcols feels over engineered. I would prefer that we stick to the plan discussed here #12 (comment) , that is, only 2 new methods that propagate the metadata columns. So no need for propagate.mcols. If you think it's important to let the user decide whether the mcols should be propagated or not, please use the use.mcols argument (TRUE or FALSE, FALSE by default). This toggle is already used in various places (e.g. IRanges::ranges(), IRanges::rglist(), GenomicRanges::granges(), GenomicRanges::grglist()).

It's important that we try to preserve as much coherence as possible to the IRanges/GenomicRanges stack.

Please use this style

setMethod("intersect", c("NormalIRanges", "IntegerRanges"),
    function(x, y, propagate.mcols = c('none', 'both', 'x', 'y'))
    {
        ...
    }
)

instead of

setMethod(
    f = "intersect",
    signature = c("NormalIRanges", "IntegerRanges"),
    definition = function(
        x, 
        y, 
        propagate.mcols = c('none', 'both', 'x', 'y')) {
            ...
    }
)

for consistency with the existing style.

Always use double quotes (") instead of single quotes (') for string literals.

It is usually better to use callNextMethod():

ans <- callNextMethod()

than doing:

ans <- intersect(as(x, "IntegerRanges"), y)

Finally the setops-methods.Rd man page would need to be updated (add required aliases, add the 2 new methods to Usage section, and add some examples showing the behavior of these new methods).

And yes adding tests for these new methods is important.

Thanks!

@rcorty
Copy link
Author

rcorty commented Apr 13, 2019

Okay, will do. The reason I think it's important to allow the option not to propage the mcols is that, as you said, the mcols requires a call to findOverlaps which may be the slowest part.

Are you saying that the intersect(NormalIRanges,NormalIRanges) is not a good idea? The benefit I see is that it (almost) avoids asymmetry. (Of course the column order of mcols(ans) changes depending on the argument order.

@hpages
Copy link
Contributor

hpages commented Apr 13, 2019

I missed the possible usefulness of intersect(NormalIRanges, NormalIRanges, propagate.mcols="both"), sorry. So let's go for it. But for the sake of more consistency with the established argument naming scheme, I suggest that you use 2 arguments, use.x.mcols and use.y.mcols, both toggles (FALSE by default), instead of one (propagate.mcols), to control which mcols to propagate. Thanks!

still having problems with dispatch — some tests failing
@rcorty
Copy link
Author

rcorty commented Apr 13, 2019

I've made the edits you recommended. But I am having some issues with S4 method dispatch. For example, when I call intersect on two NormalIRanges objects (see line 44 of test-setops-method.R), I get this warning:

Note: method with signature ‘NormalIRanges#IRanges’ chosen for function ‘intersect’,
target signature ‘NormalIRanges#NormalIRanges’.
"NormalIRanges#IntegerRanges" would also be valid

@hpages
Copy link
Contributor

hpages commented Apr 14, 2019

I can't reproduce this:

hpages@spectre:~/sandbox$ git clone https://github.com/rcorty/IRanges.git
hpages@spectre:~/sandbox$ cd IRanges/
hpages@spectre:~/sandbox/IRanges$ git checkout propagate_mcols_NormalIRanges
hpages@spectre:~/sandbox/IRanges$ R CMD INSTALL .

Then from R:

library(IRanges)
showMethods("intersect")
# Function: intersect (package BiocGenerics)
# x="ANY", y="ANY"
# x="ANY", y="Rle"
# x="CompressedAtomicList", y="CompressedAtomicList"
# x="CompressedIRangesList", y="CompressedIRangesList"
# x="IntegerRanges", y="IntegerRanges"
# x="IntegerRangesList", y="IntegerRangesList"
# x="NormalIRanges", y="IntegerRanges"
# x="NormalIRanges", y="NormalIRanges"
# x="Pairs", y="missing"
# x="Rle", y="ANY"
# x="Rle", y="Rle"
# x="Vector", y="Vector"

library(RUnit)
x <- as(IRanges(c(1, 9), c(7, 14)), "NormalIRanges")
mcols(x)$test1 <- c("a", "b")
y <- as(IRanges(c(2, 2, 10), c(2, 3, 12)), "NormalIRanges")
mcols(y)$test2 = c("c", "d")
ans0 <- as(IRanges(c(2,10),c(3,12)), "NormalIRanges")
checkIdentical(intersect(x, y), ans0)
# [1] TRUE

Please only name optional arguments (i.e. arguments with a default value) in your function calls. So for example do:

ans <- as(callNextMethod(), "NormalIRanges")

instead of:

ans <- as(object = callNextMethod(), Class = "NormalIRanges")

or:

findOverlaps(ans, x)

instead of:

findOverlaps(query = ans, subject = x)

Thanks!

sessionInfo()

> sessionInfo()
R version 3.6.0 beta (2019-04-12 r76385)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS:   /home/hpages/R/R-3.6.r76385/lib/libRblas.so
LAPACK: /home/hpages/R/R-3.6.r76385/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C              
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8    
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8   
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C            
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets 
[8] methods   base     

other attached packages:
[1] RUnit_0.4.32        IRanges_2.17.4      S4Vectors_0.21.23  
[4] BiocGenerics_0.29.2

loaded via a namespace (and not attached):
[1] compiler_3.6.0

@rcorty
Copy link
Author

rcorty commented Apr 16, 2019

I fixed the argument naming issue. I'm still seeing the same bugs, though.

I wonder if it could have to do with the fact that my BiocGenerics version is behind yours. Seems like I'd have to update to a pre-release version of R to get BiocGenerics 0.29. Could that be the problem? Typically, I wait until the release version of R to download.

> library(IRanges)
> library(RUnit)
> RUnit::runTestFile(absFileName = 'inst/unitTests/test_setops-methods.R')

Executing test function test_IntegerRangesList_setops  ...  done successfully.
Executing test function test_IRanges_intersect  ...  done successfully.
Executing test function test_IRanges_pgap  ...  done successfully.
Executing test function test_IRanges_pintersect  ...  done successfully.
Executing test function test_IRanges_psetdiff  ...  done successfully.
Executing test function test_IRanges_punion  ...  done successfully.
Executing test function test_IRanges_setdiff  ...  done successfully.
Executing test function test_IRanges_union  ...  done successfully.

Executing test function test_NormalIRanges_IRanges_intersect  ... Note: method with signature ‘IRanges#IRanges’ chosen for function ‘setdiff’,
 target signature ‘NormalIRanges#IRanges’.
 "IntegerRanges#IntegerRanges" would also be valid
 done successfully.

Executing test function test_NormalIRanges_IRanges_setdiff  ... Timing stopped at: 0.017 0.001 0.022
Error in .local(x, y, ...) : unused argument (use.mcols = use.mcols)
 done successfully.

Executing test function test_NormalIRanges_NormalIRanges_intersect  ... Note: method with signature ‘NormalIRanges#IRanges’ chosen for function ‘intersect’,
 target signature ‘NormalIRanges#NormalIRanges’.
 "NormalIRanges#IntegerRanges" would also be valid
Note: method with signature ‘IRanges#IRanges’ chosen for function ‘setdiff’,
 target signature ‘NormalIRanges#IRanges’.
 "IntegerRanges#IntegerRanges" would also be valid
Timing stopped at: 0.071 0.002 0.122
Error in .local(x, y, ...) : unused argument (use.x.mcols = use.x.mcols)
 done successfully.

Number of test functions: 11 
Number of errors: 2 
Number of failures: 0 

And here's my 'intersect' methods and sessionInfo...

> sessionInfo()
R version 3.5.3 (2019-03-11)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RUnit_0.4.32        IRanges_2.17.4      S4Vectors_0.21.23   BiocGenerics_0.28.0

loaded via a namespace (and not attached):
[1] zlibbioc_1.28.0 compiler_3.5.3  XVector_0.22.0  tools_3.5.3     yaml_2.2.0     
>
>
> showMethods('intersect')
Function: intersect (package BiocGenerics)
x="ANY", y="ANY"
x="ANY", y="Rle"
x="CompressedAtomicList", y="CompressedAtomicList"
x="CompressedIRangesList", y="CompressedIRangesList"
x="IntegerRanges", y="IntegerRanges"
x="IntegerRangesList", y="IntegerRangesList"
x="IRanges", y="IRanges"
    (inherited from: x="IntegerRanges", y="IntegerRanges")
x="NormalIRanges", y="IntegerRanges"
x="NormalIRanges", y="IRanges"
    (inherited from: x="NormalIRanges", y="IntegerRanges")
x="NormalIRanges", y="NormalIRanges"
x="Pairs", y="missing"
x="Rle", y="ANY"
x="Rle", y="Rle"
x="SimpleIRangesList", y="SimpleIRangesList"
    (inherited from: x="IntegerRangesList", y="IntegerRangesList")
x="Vector", y="Vector"

@hpages
Copy link
Contributor

hpages commented Apr 16, 2019

You have a mix of release (BioC 3.8) and devel (BioC 3.9) packages. This can only cause trouble. Make sure you always install packages with BiocManager::install() to avoid this. Use BiocManager::valid() to check that your packages are in sync with the version of Bioconductor you are using.

Bioconductor packages should always be developed against the current devel version of Bioconductor, which is BioC 3.9 at the moment, and is tied with R 3.6. So yes, you need a recent R 3.6 (e.g. R 3.6 beta) and BioC 3.9. Loading BiocManager will tell you what version of Bioconductor you are using:

> library(BiocManager)
Bioconductor version 3.9 (BiocManager 1.30.4), ?BiocManager::install for help

@rcorty
Copy link
Author

rcorty commented Apr 16, 2019

Ah, I see. I've gotten R 3.6 and the devel version of Bioconductor and the errors seem the same to me. The only difference I see between your sessionInfo and mine is that my version of IRanges is 2.17.5, which I had to bump from 2.17.4 to get BiocManager::valid() to return TRUE. Maybe I should pull in some changes made to IRanges since I branched off?

> library(RUnit)
> library(IRanges)
Loading required package: BiocGenerics
Loading required package: parallel

Attaching package: ‘BiocGenerics’

The following objects are masked from ‘package:parallel’:

    clusterApply, clusterApplyLB, clusterCall, clusterEvalQ, clusterExport,
    clusterMap, parApply, parCapply, parLapply, parLapplyLB, parRapply,
    parSapply, parSapplyLB

The following objects are masked from ‘package:stats’:

    IQR, mad, sd, var, xtabs

The following objects are masked from ‘package:base’:

    anyDuplicated, append, as.data.frame, basename, cbind, colMeans, colnames,
    colSums, dirname, do.call, duplicated, eval, evalq, Filter, Find, get, grep,
    grepl, intersect, is.unsorted, lapply, Map, mapply, match, mget, order,
    paste, pmax, pmax.int, pmin, pmin.int, Position, rank, rbind, Reduce,
    rowMeans, rownames, rowSums, sapply, setdiff, sort, table, tapply, union,
    unique, unsplit, which, which.max, which.min

Loading required package: S4Vectors
Loading required package: stats4

Attaching package: ‘S4Vectors’

The following object is masked from ‘package:base’:

    expand.grid

> runTestFile(absFileName = 'inst/unitTests/test_setops-methods.R')

Executing test function test_IntegerRangesList_setops  ... Timing stopped at: 0.122 0.004 0.127
Error in .range_CompressedIRangesList(x, with.revmap = with.revmap) : 
  the XVector package is required by the "range" method for CompressedIRangesList objects
 done successfully.

Executing test function test_IRanges_intersect  ...  done successfully.

Executing test function test_IRanges_pgap  ...  done successfully.

Executing test function test_IRanges_pintersect  ...  done successfully.

Executing test function test_IRanges_psetdiff  ...  done successfully.

Executing test function test_IRanges_punion  ...  done successfully.

Executing test function test_IRanges_setdiff  ...  done successfully.

Executing test function test_IRanges_union  ...  done successfully.

Executing test function test_NormalIRanges_IRanges_intersect  ... Note: method with signature ‘IRanges#IRanges’ chosen for function ‘setdiff’,
 target signature ‘NormalIRanges#IRanges’.
 "IntegerRanges#IntegerRanges" would also be valid
 done successfully.

Executing test function test_NormalIRanges_IRanges_setdiff  ... Timing stopped at: 0.009 0 0.009
Error in .local(x, y, ...) : unused argument (use.mcols = use.mcols)
 done successfully.

Executing test function test_NormalIRanges_NormalIRanges_intersect  ... Note: method with signature ‘NormalIRanges#IRanges’ chosen for function ‘intersect’,
 target signature ‘NormalIRanges#NormalIRanges’.
 "NormalIRanges#IntegerRanges" would also be valid
Note: method with signature ‘IRanges#IRanges’ chosen for function ‘setdiff’,
 target signature ‘NormalIRanges#IRanges’.
 "IntegerRanges#IntegerRanges" would also be valid
Timing stopped at: 0.06 0.002 0.059
Error in .local(x, y, ...) : unused argument (use.x.mcols = use.x.mcols)
 done successfully.

Number of test functions: 11 
Number of errors: 3 
Number of failures: 0 
> sessionInfo()
R version 3.6.0 beta (2019-04-15 r76395)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Mojave 10.14.4

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.6/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    parallel  stats     graphics  grDevices utils     datasets  methods  
[9] base     

other attached packages:
[1] IRanges_2.17.5      S4Vectors_0.21.23   BiocGenerics_0.29.2 RUnit_0.4.32       

loaded via a namespace (and not attached):
[1] compiler_3.6.0 tools_3.6.0   
> BiocManager::valid()
[1] TRUE

@rcorty
Copy link
Author

rcorty commented Apr 16, 2019

Nevermind on that idea -- the only change since I forked was incrementing the version number.

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants