diff --git a/NEWS.md b/NEWS.md index b239f5f..fe80b46 100644 --- a/NEWS.md +++ b/NEWS.md @@ -62,7 +62,7 @@ R-release due to pandoc version not available preventing the `pvalue-function` v test is performed. * Add a function to compute p-value functions for sets of null hypotheses. * Draft of article illustrating the computation of p-value functions with -[**flipr**](https://lmjl-alea.github.io/flipr/). +[**flipr**](https://permaverse.github.io/flipr/). * Add $t$, mean and Fisher test statistics. * Correct two-tail p-value computation. * Better API for pvalue function. diff --git a/README.Rmd b/README.Rmd index f2d4a92..1c5330b 100644 --- a/README.Rmd +++ b/README.Rmd @@ -13,22 +13,22 @@ knitr::opts_chunk$set( ) ``` -# Overview +# Overview -[![R-CMD-check](https://github.com/LMJL-Alea/flipr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/LMJL-Alea/flipr/actions/workflows/R-CMD-check.yaml) -[![test-coverage](https://github.com/LMJL-Alea/flipr/workflows/test-coverage/badge.svg)](https://github.com/LMJL-Alea/flipr/actions) -[![Codecov test coverage](https://codecov.io/gh/LMJL-Alea/flipr/branch/master/graph/badge.svg)](https://codecov.io/gh/LMJL-Alea/flipr?branch=master) -[![pkgdown](https://github.com/LMJL-Alea/flipr/workflows/pkgdown/badge.svg)](https://github.com/LMJL-Alea/flipr/actions) +[![R-CMD-check](https://github.com/permaverse/flipr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/permaverse/flipr/actions/workflows/R-CMD-check.yaml) +[![test-coverage](https://github.com/permaverse/flipr/workflows/test-coverage/badge.svg)](https://github.com/permaverse/flipr/actions) +[![Codecov test coverage](https://codecov.io/gh/permaverse/flipr/branch/master/graph/badge.svg)](https://codecov.io/gh/permaverse/flipr?branch=master) +[![pkgdown](https://github.com/permaverse/flipr/workflows/pkgdown/badge.svg)](https://github.com/permaverse/flipr/actions) [![CRAN 
status](https://www.r-pkg.org/badges/version/flipr)](https://CRAN.R-project.org/package=flipr) -The goal of the [**flipr**](https://lmjl-alea.github.io/flipr/) package is to +The goal of the [**flipr**](https://permaverse.github.io/flipr/) package is to provide a flexible framework for making inference via permutation. The idea is to promote the permutation framework as an incredibly well-suited tool for inference on complex data. You supply your data, as complex as it might be, in the form of lists in which each entry stores one data point in a representation -that suits you and [**flipr**](https://lmjl-alea.github.io/flipr/) takes care of +that suits you and [**flipr**](https://permaverse.github.io/flipr/) takes care of the permutation magic and provides you with either point estimates or confidence regions or $p$-value of hypothesis tests. Permutation tests are especially appealing because they are exact no matter how small or big your sample sizes @@ -36,7 +36,7 @@ are. You can also use the so-called *non-parametric combination* approach in this setting to combine several statistics to better target the alternative hypothesis you are testing against. Asymptotic consistency is also guaranteed under mild conditions on the statistic you use. The -[**flipr**](https://lmjl-alea.github.io/flipr/) package provides a flexible +[**flipr**](https://permaverse.github.io/flipr/) package provides a flexible permutation framework for making inference such as point estimation, confidence intervals or hypothesis testing, on any kind of data, be it univariate, multivariate, or more complex such as network-valued data, topological data, @@ -44,18 +44,21 @@ functional data or density-valued data. 
## Installation -You can install the latest stable version of -[**flipr**](https://lmjl-alea.github.io/flipr/) on CRAN with: +You can install the package from [CRAN](https://CRAN.R-project.org) with: + ``` r install.packages("flipr") ``` -Or you can install the development version from [GitHub](https://github.com/) with: +Alternatively, you can install the development version of +[**flipr**](https://permaverse.github.io/flipr/) from +[GitHub](https://github.com/) with: ``` r -# install.packages("remotes") -remotes::install_github("LMJL-Alea/flipr") +# install.packages("pak") +pak::pak("permaverse/flipr") ``` + ## Example ```{r} @@ -64,7 +67,7 @@ library(flipr) We hereby use the very simple t-test for comparing the means of two univariate samples to show how easy it is to carry out a permutation test with -[**flipr**](https://lmjl-alea.github.io/flipr/). +[**flipr**](https://permaverse.github.io/flipr/). ### Data generation @@ -98,12 +101,12 @@ null_spec <- function(y, parameters) { Next, we need to decide which test statistic(s) we are going to use for performing the test. Here, we are only interested in one parameter, namely the mean difference $\delta$. Since the two samples share the same variance, we can use for example the $t$-statistic with a pooled estimate of the common variance. -This statistic can be easily computed using `stats::t.test(x, y, var.equal = TRUE)$statistic`. However, we want to extend its evaluation to any permuted version of the data. +This statistic can be easily computed using `stats::t.test(x, y, var.equal = TRUE)$statistic`. However, we want to extend its evaluation to any permuted version of the data.
Test statistic functions compatible with [**flipr**](https://permaverse.github.io/flipr/) should have at least two mandatory input arguments: - `data` which is either a concatenated list of size $n_x + n_y$ regrouping the data points of both samples or a distance matrix of size $(n_x + n_y) \times (n_x + n_y)$ stored as an object of class `dist`. - `indices1` which is an integer vector of size $n_x$ storing the indices of the data points belonging to the first sample in the current permuted version of the data. -Some test statistics are already implemented in [**flipr**](https://lmjl-alea.github.io/flipr/) and ready to use. User-defined test statistics can be used as well, with the use of the helper function `use_stat(nsamples = 2, stat_name = )`. This function creates and saves an `.R` file in the `R/` folder of the current working directory and populates it with the following template: +Some test statistics are already implemented in [**flipr**](https://permaverse.github.io/flipr/) and ready to use. User-defined test statistics can be used as well, with the use of the helper function `use_stat(nsamples = 2, stat_name = )`. 
This function creates and saves an `.R` file in the `R/` folder of the current working directory and populates it with the following template: ```{r, eval=FALSE} #' Test Statistic for the Two-Sample Problem #' @@ -141,7 +144,7 @@ stat_{{{name}}} <- function(data, indices1) { } ``` -For instance, a [**flipr**](https://lmjl-alea.github.io/flipr/)-compatible version of the $t$-statistic with pooled variance will look like: +For instance, a [**flipr**](https://permaverse.github.io/flipr/)-compatible version of the $t$-statistic with pooled variance will look like: ```{r} my_t_stat <- function(data, indices1) { n <- if (inherits(data, "dist")) @@ -172,7 +175,7 @@ stat_functions <- list(my_t_stat) ### Assign test statistics to parameters -Finally we need to define a named list that tells [**flipr**](https://lmjl-alea.github.io/flipr/) which test statistics among the ones declared in the `stat_functions` list should be used for each parameter under investigation. This is used to determine bounds on each parameter for the plausibility function. This list, often termed `stat_assignments`, should therefore have as many elements as there are parameters under investigation. Each element should be named after a parameter under investigation and should list the indices corresponding to the test statistics that should be used for that parameter in `stat_functions`. In our example, it boils down to: +Finally we need to define a named list that tells [**flipr**](https://permaverse.github.io/flipr/) which test statistics among the ones declared in the `stat_functions` list should be used for each parameter under investigation. This is used to determine bounds on each parameter for the plausibility function. This list, often termed `stat_assignments`, should therefore have as many elements as there are parameters under investigation. 
Each element should be named after a parameter under investigation and should list the indices corresponding to the test statistics that should be used for that parameter in `stat_functions`. In our example, it boils down to: ```{r} stat_assignments <- list(delta = 1) ``` diff --git a/README.md b/README.md index eb826c3..d3b17c1 100644 --- a/README.md +++ b/README.md @@ -1,26 +1,26 @@ -# Overview +# Overview -[![R-CMD-check](https://github.com/LMJL-Alea/flipr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/LMJL-Alea/flipr/actions/workflows/R-CMD-check.yaml) -[![test-coverage](https://github.com/LMJL-Alea/flipr/workflows/test-coverage/badge.svg)](https://github.com/LMJL-Alea/flipr/actions) +[![R-CMD-check](https://github.com/permaverse/flipr/actions/workflows/R-CMD-check.yaml/badge.svg)](https://github.com/permaverse/flipr/actions/workflows/R-CMD-check.yaml) +[![test-coverage](https://github.com/permaverse/flipr/workflows/test-coverage/badge.svg)](https://github.com/permaverse/flipr/actions) [![Codecov test -coverage](https://codecov.io/gh/LMJL-Alea/flipr/branch/master/graph/badge.svg)](https://codecov.io/gh/LMJL-Alea/flipr?branch=master) -[![pkgdown](https://github.com/LMJL-Alea/flipr/workflows/pkgdown/badge.svg)](https://github.com/LMJL-Alea/flipr/actions) +coverage](https://codecov.io/gh/permaverse/flipr/branch/master/graph/badge.svg)](https://codecov.io/gh/permaverse/flipr?branch=master) +[![pkgdown](https://github.com/permaverse/flipr/workflows/pkgdown/badge.svg)](https://github.com/permaverse/flipr/actions) [![CRAN status](https://www.r-pkg.org/badges/version/flipr)](https://CRAN.R-project.org/package=flipr) -The goal of the [**flipr**](https://lmjl-alea.github.io/flipr/) package +The goal of the [**flipr**](https://permaverse.github.io/flipr/) package is to provide a flexible framework for making inference via permutation. The idea is to promote the permutation framework as an incredibly well-suited tool for inference on complex data. 
You supply your data, as complex as it might be, in the form of lists in which each entry stores one data point in a representation that suits you and -[**flipr**](https://lmjl-alea.github.io/flipr/) takes care of +[**flipr**](https://permaverse.github.io/flipr/) takes care of the permutation magic and provides you with either point estimates or confidence regions or $p$-value of hypothesis tests. Permutation tests are especially appealing because they are exact no matter how small or @@ -29,7 +29,7 @@ big your sample sizes are. You can also use the so-called statistics to better target the alternative hypothesis you are testing against. Asymptotic consistency is also guaranteed under mild conditions on the statistic you use. The -[**flipr**](https://lmjl-alea.github.io/flipr/) package provides a +[**flipr**](https://permaverse.github.io/flipr/) package provides a flexible permutation framework for making inference such as point estimation, confidence intervals or hypothesis testing, on any kind of data, be it univariate, multivariate, or more complex such as @@ -38,19 +38,20 @@ data. ## Installation -You can install the latest stable version of -[**flipr**](https://lmjl-alea.github.io/flipr/) on CRAN with: +You can install the package from [CRAN](https://CRAN.R-project.org) +with: ``` r install.packages("flipr") ``` -Or you can install the development version from +Alternatively, you can install the development version of +[**flipr**](https://permaverse.github.io/flipr/) from [GitHub](https://github.com/) with: ``` r -# install.packages("remotes") -remotes::install_github("LMJL-Alea/flipr") +# install.packages("pak") +pak::pak("permaverse/flipr") ``` ## Example @@ -61,7 +62,7 @@ library(flipr) We hereby use the very simple t-test for comparing the means of two univariate samples to show how easy it is to carry out a permutation -test with [**flipr**](https://lmjl-alea.github.io/flipr/). +test with [**flipr**](https://permaverse.github.io/flipr/).
### Data generation @@ -119,8 +120,8 @@ This statistic can be easily computed using `stats::t.test(x, y, var.equal = TRUE)$statistic`. However, we want to extend its evaluation to any permuted version of the data. Test statistic functions compatible with -[**flipr**](https://lmjl-alea.github.io/flipr/) should have at least two -mandatory input arguments: +[**flipr**](https://permaverse.github.io/flipr/) should have at least +two mandatory input arguments: - `data` which is either a concatenated list of size $n_x + n_y$ regrouping the data points of both samples or a distance matrix of @@ -131,7 +132,7 @@ mandatory input arguments: current permuted version of the data. Some test statistics are already implemented in -[**flipr**](https://lmjl-alea.github.io/flipr/) and ready to use. +[**flipr**](https://permaverse.github.io/flipr/) and ready to use. User-defined test statistics can be used as well, with the use of the helper function `use_stat(nsamples = 2, stat_name = )`. This function creates and saves an `.R` file in the `R/` folder of the current working @@ -175,7 +176,7 @@ stat_{{{name}}} <- function(data, indices1) { ``` For instance, a -[**flipr**](https://lmjl-alea.github.io/flipr/)-compatible version of +[**flipr**](https://permaverse.github.io/flipr/)-compatible version of the $t$-statistic with pooled variance will look like: ``` r @@ -214,7 +215,7 @@ stat_functions <- list(my_t_stat) ### Assign test statistics to parameters Finally we need to define a named list that tells -[**flipr**](https://lmjl-alea.github.io/flipr/) which test statistics +[**flipr**](https://permaverse.github.io/flipr/) which test statistics among the ones declared in the `stat_functions` list should be used for each parameter under investigation. This is used to determine bounds on each parameter for the plausibility function. 
This list, often termed diff --git a/vignettes/flipr.Rmd b/vignettes/flipr.Rmd index 5ad548b..7cb07a7 100644 --- a/vignettes/flipr.Rmd +++ b/vignettes/flipr.Rmd @@ -26,12 +26,12 @@ one to perform point estimation, confidence regions and hypothesis tests under mild assumptions about the collected data and no distributional assumption. In this article, we briefly illustrate how each of these aspects can be treated from a permutation point of view using the -[**flipr**](https://lmjl-alea.github.io/flipr/) package. This package has been +[**flipr**](https://permaverse.github.io/flipr/) package. This package has been written and is intended as a low-level implementation of the permutation framework in the context of statistical inference. The mathematical object behind the scene is the so-called plausibility function, sometimes called p-value function. This article explains what the plausibility function is and shows how it can be easily computed using the -permutation framework with [**flipr**](https://lmjl-alea.github.io/flipr/). We +permutation framework with [**flipr**](https://permaverse.github.io/flipr/). We illustrate the shape of the plausibility function using both Gaussian and Gamma distributions. @@ -100,12 +100,12 @@ represents the variation of the $p$-value of a test in which the null hypothesis is $\delta = \delta_0$ as a function of $\delta_0$ [@martin2017; @fraser2019; @infanger2019]. -With [**flipr**](https://lmjl-alea.github.io/flipr/), it is easy to trace +With [**flipr**](https://permaverse.github.io/flipr/), it is easy to trace such a plausibility function. 
Three ingredients are required alongside the data to instantiate such a function: -- a null specification function that tells [**flipr**](https://lmjl-alea.github.io/flipr/) how the second sample should be transformed in order to make it exchangeable with the first sample under the null hypothesis; +- a null specification function that tells [**flipr**](https://permaverse.github.io/flipr/) how the second sample should be transformed in order to make it exchangeable with the first sample under the null hypothesis; - a list of test statistics to use for detecting differences between the distributions that generated the two observed samples; -- a list of index assignments that tells [**flipr**](https://lmjl-alea.github.io/flipr/) which test statistics to use for each parameter under investigation. +- a list of index assignments that tells [**flipr**](https://permaverse.github.io/flipr/) which test statistics to use for each parameter under investigation. ### Null specification @@ -127,26 +127,26 @@ null_spec <- function(y, parameters) { Next, we need to decide which test statistic(s) we are going to use for performing the test. Here, we are only interested in one parameter, namely the mean difference $\delta$. Since the two samples share the same variance, we can use for example the $t$-statistic with a pooled estimate of the common variance. -This statistic can be easily computed using `stats::t.test(x, y, var.equal = TRUE)$statistic`. However, we want to extend its evaluation to any permuted version of the data. Test statistic functions compatible with [**flipr**](https://lmjl-alea.github.io/flipr/) should have at least two mandatory input arguments: +This statistic can be easily computed using `stats::t.test(x, y, var.equal = TRUE)$statistic`. However, we want to extend its evaluation to any permuted version of the data. 
Test statistic functions compatible with [**flipr**](https://permaverse.github.io/flipr/) should have at least two mandatory input arguments: - `data` which is either a concatenated list of size $n_x + n_y$ regrouping the data points of both samples or a distance matrix of size $(n_x + n_y) \times (n_x + n_y)$ stored as an object of class `dist`. - `indices1` which is an integer vector of size $n_x$ storing the indices of the data points belonging to the first sample in the current permuted version of the data. -A [**flipr**](https://lmjl-alea.github.io/flipr/)-compatible version of the t-statistic is already implemented in [**flipr**](https://lmjl-alea.github.io/flipr/) and ready to use as `stat_student` or its alias `stat_t`. Here, we are only going to use the $t$-statistic for this example, but we might be willing to use more than one statistic for a parameter or we might have several parameters under investigation, each one of them requiring a different test statistic. We therefore group all the test statistics that we need into a single list: +A [**flipr**](https://permaverse.github.io/flipr/)-compatible version of the t-statistic is already implemented in [**flipr**](https://permaverse.github.io/flipr/) and ready to use as `stat_student` or its alias `stat_t`. Here, we are only going to use the $t$-statistic for this example, but we might be willing to use more than one statistic for a parameter or we might have several parameters under investigation, each one of them requiring a different test statistic. We therefore group all the test statistics that we need into a single list: ```{r} stat_functions <- list(stat_t) ``` ### Statistic assignments -Finally we need to define a named list that tells [**flipr**](https://lmjl-alea.github.io/flipr/) which test statistics among the ones declared in the `stat_functions` list should be used for each parameter under investigation. This is used to determine bounds on each parameter for the plausibility function. 
This list, often termed `stat_assignments`, should therefore have as many elements as there are parameters under investigation. Each element should be named after a parameter under investigation and should list the indices corresponding to the test statistics that should be used for that parameter in `stat_functions`. In our example, it boils down to: +Finally we need to define a named list that tells [**flipr**](https://permaverse.github.io/flipr/) which test statistics among the ones declared in the `stat_functions` list should be used for each parameter under investigation. This is used to determine bounds on each parameter for the plausibility function. This list, often termed `stat_assignments`, should therefore have as many elements as there are parameters under investigation. Each element should be named after a parameter under investigation and should list the indices corresponding to the test statistics that should be used for that parameter in `stat_functions`. In our example, it boils down to: ```{r} stat_assignments <- list(delta = 1) ``` ### Instantiation of the plausibility function -In [**flipr**](https://lmjl-alea.github.io/flipr/), the plausibility function is implemented as an [R6Class](https://r6.r-lib.org/reference/R6Class.html) object. Assume we observed two samples stored in lists `x` and `y`, we therefore instantiate a plausibility function for this data as follows: +In [**flipr**](https://permaverse.github.io/flipr/), the plausibility function is implemented as an [R6Class](https://r6.r-lib.org/reference/R6Class.html) object. 
Assume we observed two samples stored in lists `x` and `y`, we therefore instantiate a plausibility function for this data as follows: ```{r, eval=FALSE} pf <- PlausibilityFunction$new( null_spec = null_spec, diff --git a/vignettes/parallelization.Rmd b/vignettes/parallelization.Rmd index 646bdef..4b52fc3 100644 --- a/vignettes/parallelization.Rmd +++ b/vignettes/parallelization.Rmd @@ -18,7 +18,7 @@ time_without_parallelization <- df_parallelization$time_without_par time_with_parallelization <- df_parallelization$time_par ``` -The [**flipr**](https://lmjl-alea.github.io/flipr/) package uses functions +The [**flipr**](https://permaverse.github.io/flipr/) package uses functions contained in the [**furrr**](https://future.futureverse.org/index.html) package for parallel processing. The setting of parallelization has to be done on the user side. We illustrate here how to achieve asynchronous evaluation. We use the @@ -34,7 +34,7 @@ define a default cluster with `parallel::setDefaultCluster()`. Then, to enable the visualization of evaluation progress, we can put the code in the `progressr::with_progress()` function, or more simply set it for all the following code with the `progressr::handlers()` function. After these settings, -[**flipr**](https://lmjl-alea.github.io/flipr/) functions can be used, as shown +[**flipr**](https://permaverse.github.io/flipr/) functions can be used, as shown in this example. To show the benefit of parallel processing, we compare here the processing times @@ -87,7 +87,7 @@ define a default cluster with `parallel::setDefaultCluster()`. Then, to enable the visualization of evaluation progress, we can put the code in the `progressr::with_progress()` function, or more simply set it for all the following code with the `progressr::handlers()` function. 
After these settings, -[**flipr**](https://lmjl-alea.github.io/flipr/) functions can be used, as shown +[**flipr**](https://permaverse.github.io/flipr/) functions can be used, as shown in this example. ```{r, eval=FALSE}