From 424a320816d5eab0befd89bd2a1e1e5cca94b442 Mon Sep 17 00:00:00 2001 From: Indrajeet Patil Date: Sat, 22 Jan 2022 19:44:45 +0100 Subject: [PATCH] 80-char width formatting --- DESCRIPTION | 2 +- R/estimate_means.R | 2 + R/estimate_predicted.R | 107 +++++++++++++++++++++++++---------- R/estimate_slopes.R | 72 +++++++++++++++++++---- R/model_emmeans.R | 13 +++-- R/zero_crossings.R | 1 + man/estimate_contrasts.Rd | 66 +++++++++++++++++++--- man/estimate_expectation.Rd | 110 +++++++++++++++++++++++++----------- man/estimate_means.Rd | 66 +++++++++++++++++++--- man/estimate_slopes.Rd | 76 +++++++++++++++++++++---- man/model_emmeans.Rd | 13 +++-- 11 files changed, 420 insertions(+), 108 deletions(-) diff --git a/DESCRIPTION b/DESCRIPTION index 2510046f..7465053e 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -35,7 +35,7 @@ Depends: R (>= 3.4) Imports: bayestestR (>= 0.11.5), - effectsize (>= 0.5.0), + effectsize (>= 0.6.0), insight (>= 0.15.0), datawizard (>= 0.2.2), parameters (>= 0.16.0), diff --git a/R/estimate_means.R b/R/estimate_means.R index d2d43d8a..2a67f25b 100644 --- a/R/estimate_means.R +++ b/R/estimate_means.R @@ -124,6 +124,7 @@ estimate_means <- function(model, #' @keywords internal .clean_names_bayesian <- function(means, model, transform, type = "mean") { vars <- names(means)[names(means) %in% c("Median", "Mean", "MAP")] + if (length(vars) == 1) { if (type == "contrast") { if (insight::model_info(model)$is_logit & transform == "response") { @@ -143,6 +144,7 @@ estimate_means <- function(model, names(means)[names(means) == vars] <- "Coefficient" } } + means$CI <- NULL means$ROPE_CI <- NULL means$ROPE_low <- NULL diff --git a/R/estimate_predicted.R b/R/estimate_predicted.R index da17738c..96d1ba89 100644 --- a/R/estimate_predicted.R +++ b/R/estimate_predicted.R @@ -1,50 +1,75 @@ #' Model-based response estimates and uncertainty #' -#' After fitting a model, it is useful generate model-based estimates of the response variables for different combinations of predictor values. -#' Such estimates can be used to make inferences about relationships between variables and to make predictions about individual cases. +#' After fitting a model, it is useful generate model-based estimates of the +#' response variables for different combinations of predictor values. Such +#' estimates can be used to make inferences about relationships between +#' variables and to make predictions about individual cases. #' \cr\cr -#' Model-based response estimates and uncertainty can be generated for both the conditional average response values (the regression line or expectation) and for predictions about individual cases. -#' See below for details. +#' Model-based response estimates and uncertainty can be generated for both the +#' conditional average response values (the regression line or expectation) and +#' for predictions about individual cases. See below for details. #' #' @section Expected (average) values: #' -#' The most important way that various types of response estimates differ is in terms of what quantity is being estimated and the meaning of the uncertainty intervals. -#' The major choices are **expected values** for uncertainty in the regression line and **predicted values** for uncertainty in the individual case predictions. +#' The most important way that various types of response estimates differ is in +#' terms of what quantity is being estimated and the meaning of the uncertainty +#' intervals. 
The major choices are **expected values** for uncertainty in the +#' regression line and **predicted values** for uncertainty in the individual +#' case predictions. #' -#' **Expected values** refer the the fitted regression line---the estimated *average* response value (i.e., the "expectation") for individuals with specific predictor values. -#' For example, in a linear model *y* = 2 + 3*x* + 4*z* + *e*, the estimated average *y* for individuals with *x* = 1 and *z* = 2 is 11. +#' **Expected values** refer the the fitted regression line---the estimated +#' *average* response value (i.e., the "expectation") for individuals with +#' specific predictor values. For example, in a linear model *y* = 2 + 3*x* + +#' 4*z* + *e*, the estimated average *y* for individuals with *x* = 1 and *z* = +#' 2 is 11. #' -#' For expected values, uncertainty intervals refer to uncertainty in the estimated **conditional average** (where might the true regression line actually fall)? -#' Uncertainty intervals for expected values are also called "confidence intervals". +#' For expected values, uncertainty intervals refer to uncertainty in the +#' estimated **conditional average** (where might the true regression line +#' actually fall)? Uncertainty intervals for expected values are also called +#' "confidence intervals". #' -#' Expected values and their uncertainty intervals are useful for describing the relationship between variables and for describing how precisely a model has been estimated. +#' Expected values and their uncertainty intervals are useful for describing the +#' relationship between variables and for describing how precisely a model has +#' been estimated. #' #' For generalized linear models, expected values are reported on one of two scales: #' -#' - The **link scale** refers to scale of the fitted regression line, after transformation by the link function. -#' For example, for a logistic regression (logit binomial) model, the link scale gives expected log-odds. -#' For a log-link Poisson model, the link scale gives the expected log-count. +#' - The **link scale** refers to scale of the fitted regression line, after +#' transformation by the link function. For example, for a logistic regression +#' (logit binomial) model, the link scale gives expected log-odds. For a +#' log-link Poisson model, the link scale gives the expected log-count. #' -#' - The **response scale** refers to the original scale of the response variable (i.e., without any link function transformation). -#' Expected values on the link scale are back-transformed to the original response variable metric (e.g., expected probabilities for binomial models, expected counts for Poisson models). +#' - The **response scale** refers to the original scale of the response +#' variable (i.e., without any link function transformation). Expected values +#' on the link scale are back-transformed to the original response variable +#' metric (e.g., expected probabilities for binomial models, expected counts +#' for Poisson models). #' #' #' @section Individual case predictions: #' -#' In contrast to expected values, **predicted values** refer to predictions for **individual cases**. -#' Predicted values are also called "posterior predictions" or "posterior predictive draws". +#' In contrast to expected values, **predicted values** refer to predictions for +#' **individual cases**. Predicted values are also called "posterior +#' predictions" or "posterior predictive draws". 
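# Illustrative sketch (not shipped in this diff): a minimal R example of the
# distinction documented above, using the built-in mtcars data and a plain
# linear model. estimate_expectation() returns expected values whose intervals
# are confidence intervals around the regression line, while
# estimate_prediction() returns predicted values whose intervals are the wider
# prediction intervals for individual cases.
library(modelbased)

model <- lm(mpg ~ wt, data = mtcars)

estimate_expectation(model) # conditional averages + confidence intervals
estimate_prediction(model)  # individual-case predictions + prediction intervals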
#' -#' For predicted values, uncertainty intervals refer to uncertainty in the **individual response values for each case** (where might any single case actually fall)? -#' Uncertainty intervals for predicted values are also called "prediction intervals" or "posterior predictive intervals". +#' For predicted values, uncertainty intervals refer to uncertainty in the +#' **individual response values for each case** (where might any single case +#' actually fall)? Uncertainty intervals for predicted values are also called +#' "prediction intervals" or "posterior predictive intervals". #' -#' Predicted values and their uncertainty intervals are useful for forecasting the range of values that might be observed in new data, for making decisions about individual cases, and for checking if model predictions are reasonable ("posterior predictive checks"). +#' Predicted values and their uncertainty intervals are useful for forecasting +#' the range of values that might be observed in new data, for making decisions +#' about individual cases, and for checking if model predictions are reasonable +#' ("posterior predictive checks"). #' -#' Predicted values and intervals are always on the scale of the original response variable (not the link scale). +#' Predicted values and intervals are always on the scale of the original +#' response variable (not the link scale). #' #' #' @section Functions for estimating predicted values and uncertainty: #' -#' *modelbased* provides 4 functions for generating model-based response estimates and their uncertainty: +#' *modelbased* provides 4 functions for generating model-based response +#' estimates and their uncertainty: #' #' - **`estimate_expectation()`**: #' - Generates **expected values** (conditional average) on the **response scale**. @@ -54,7 +79,8 @@ #' - **`estimate_link()`**: #' - Generates **expected values** (conditional average) on the **link scale**. #' - The uncertainty interval is a *confidence interval*. -#' - By default, values are computed using a reference grid spanning the observed range of predictor values (see [visualisation_matrix()]). +#' - By default, values are computed using a reference grid spanning the +#' observed range of predictor values (see [visualisation_matrix()]). #' #' - **`estimate_prediction()`**: #' - Generates **predicted values** (for individual cases) on the **response scale**. @@ -66,28 +92,49 @@ #' - Useful for visualizing a model. #' - Generates **expected values** (conditional average) on the **response scale**. #' - The uncertainty interval is a *confidence interval*. -#' - By default, values are computed using a reference grid spanning the observed range of predictor values (see [visualisation_matrix()]). +#' - By default, values are computed using a reference grid spanning the +#' observed range of predictor values (see [visualisation_matrix()]). #' #' `estimate_response()` is a deprecated alias for `estimate_expectation()`. #' #' @section Data for predictions: #' -#' If the `data = NULL`, values are estimated using the data used to fit the model. If `data = "grid"`, values are computed using a reference grid spanning the observed range of predictor values with [visualisation_matrix()]. This can be useful for model visualization. The number of predictor values used for each variable can be controlled with the `length` argument. `data` can also be a data frame containing columns with names matching the model frame (see [insight::get_data()]). 
This can be used to generate model predictions for specific combinations of predictor values. +#' If the `data = NULL`, values are estimated using the data used to fit the +#' model. If `data = "grid"`, values are computed using a reference grid +#' spanning the observed range of predictor values with +#' [visualisation_matrix()]. This can be useful for model visualization. The +#' number of predictor values used for each variable can be controlled with the +#' `length` argument. `data` can also be a data frame containing columns with +#' names matching the model frame (see [insight::get_data()]). This can be used +#' to generate model predictions for specific combinations of predictor values. #' #' @note #' -#' These functions are built on top of [insight::get_predicted()] and correspond to different specifications of its parameters. It may be useful to read its [documentation](https://easystats.github.io/insight/reference/get_predicted.html), in particular the description of the `predict` argument for additional details on the difference between expected vs. predicted values and link vs. response scales. +#' These functions are built on top of [insight::get_predicted()] and correspond +#' to different specifications of its parameters. It may be useful to read its +#' [documentation](https://easystats.github.io/insight/reference/get_predicted.html), +#' in particular the description of the `predict` argument for additional +#' details on the difference between expected vs. predicted values and link vs. +#' response scales. #' -#' Additional control parameters can be used to control results from [visualisation_matrix()] (when `data = "grid"`) and from [insight::get_predicted()] (the function used internally to compute predictions). +#' Additional control parameters can be used to control results from +#' [visualisation_matrix()] (when `data = "grid"`) and from +#' [insight::get_predicted()] (the function used internally to compute +#' predictions). #' -#' For plotting, check the examples in [visualisation_recipe()]. Also check out the [Vignettes](https://easystats.github.io/modelbased/articles/) and [README examples](https://easystats.github.io/modelbased/index.html#features) for various examples, tutorials and usecases. +#' For plotting, check the examples in [visualisation_recipe()]. Also check out +#' the [Vignettes](https://easystats.github.io/modelbased/articles/) and [README +#' examples](https://easystats.github.io/modelbased/index.html#features) for +#' various examples, tutorials and usecases. #' #' @inheritParams estimate_means #' @inheritParams bayestestR::describe_posterior #' @param data A data frame with model's predictors to estimate the response. If #' `NULL`, the model's data is used. If "grid", the model matrix is obtained #' (through [visualisation_matrix()]). -#' @param ... You can add all the additional control arguments from [visualisation_matrix()] (used when `data = "grid"`) and [insight::get_predicted()]. +#' @param ... You can add all the additional control arguments from +#' [visualisation_matrix()] (used when `data = "grid"`) and +#' [insight::get_predicted()]. #' #' @examples #' library(modelbased) diff --git a/R/estimate_slopes.R b/R/estimate_slopes.R index 31688a17..078aa9e2 100644 --- a/R/estimate_slopes.R +++ b/R/estimate_slopes.R @@ -1,24 +1,74 @@ #' Estimate Marginal Effects #' -#' Estimate the slopes (i.e., the coefficient) of a predictor over or within different -#' factor levels, or alongside a numeric variable . 
In other words, to assess the effect of a predictor *at* specific configurations data. Other related -#' functions based on marginal estimations includes [estimate_contrasts()] and -#' [estimate_means()]. +#' Estimate the slopes (i.e., the coefficient) of a predictor over or within +#' different factor levels, or alongside a numeric variable . In other words, to +#' assess the effect of a predictor *at* specific configurations data. Other +#' related functions based on marginal estimations includes +#' [estimate_contrasts()] and [estimate_means()]. #' \cr\cr -#' See the **Details** section below, and don't forget to also check out the [Vignettes](https://easystats.github.io/modelbased/articles/estimate_slopes.html) and [README examples](https://easystats.github.io/modelbased/index.html#features) for various examples, tutorials and usecases. +#' +#' See the **Details** section below, and don't forget to also check out the +#' [Vignettes](https://easystats.github.io/modelbased/articles/estimate_slopes.html) +#' and [README examples](https://easystats.github.io/modelbased/index.html#features) for +#' various examples, tutorials and use cases. #' #' @inheritParams model_emmeans #' @inheritParams estimate_means #' -#' @details The [estimate_slopes()], [estimate_means()] and [estimate_contrasts()] functions are forming a group, as they are all based on *marginal* estimations (estimations based on a model). All three are also built on the \pkg{emmeans} package, so reading its documentation (for instance for [emmeans::emmeans()] and [emmeans::emtrends()]) is recommended to understand the idea behind these types of procedures. +#' @details The [estimate_slopes()], [estimate_means()] and +#' [estimate_contrasts()] functions are forming a group, as they are all based +#' on *marginal* estimations (estimations based on a model). All three are +#' also built on the \pkg{emmeans} package, so reading its documentation (for +#' instance for [emmeans::emmeans()] and [emmeans::emtrends()]) is recommended +#' to understand the idea behind these types of procedures. #' #' \itemize{ -#' \item Model-based **predictions** is the basis for all that follows. Indeed, the first thing to understand is how models can be used to make predictions (see [estimate_link()]). This corresponds to the predicted response (or "outcome variable") given specific predictor values of the predictors (i.e., given a specific data configuration). This is why the concept of [`reference grid()`][visualisation_matrix] is so important for direct predictions. -#' \item **Marginal "means"**, obtained via [estimate_means()], are an extension of such predictions, allowing to "average" (collapse) some of the predictors, to obtain the average response value at a specific predictors configuration. This is typically used when some of the predictors of interest are factors. Indeed, the parameters of the model will usually give you the intercept value and then the "effect" of each factor level (how different it is from the intercept). Marginal means can be used to directly give you the mean value of the response variable at all the levels of a factor. Moreover, it can also be used to control, or average over predictors, which is useful in the case of multiple predictors with or without interactions. -#' \item **Marginal contrasts**, obtained via [estimate_contrasts()], are themselves at extension of marginal means, in that they allow to investigate the difference (i.e., the contrast) between the marginal means. 
This is, again, often used to get all pairwise differences between all levels of a factor. It works also for continuous predictors, for instance one could also be interested in whether the difference at two extremes of a continuous predictor is significant. -#' \item Finally, **marginal effects**, obtained via [estimate_slopes()], are different in that their focus is not values on the response variable, but the model's parameters. The idea is to assess the effect of a predictor at a specific configuration of the other predictors. This is relevant in the case of interactions or non-linear relationships, when the effect of a predictor variable changes depending on the other predictors. Moreover, these effects can also be "averaged" over other predictors, to get for instance the "general trend" of a predictor over different factor levels. +#' \item Model-based **predictions** is the basis for all that follows. Indeed, +#' the first thing to understand is how models can be used to make predictions +#' (see [estimate_link()]). This corresponds to the predicted response (or +#' "outcome variable") given specific predictor values of the predictors (i.e., +#' given a specific data configuration). This is why the concept of [`reference +#' grid()`][visualisation_matrix] is so important for direct predictions. +#' +#' \item **Marginal "means"**, obtained via [estimate_means()], are an extension +#' of such predictions, allowing to "average" (collapse) some of the predictors, +#' to obtain the average response value at a specific predictors configuration. +#' This is typically used when some of the predictors of interest are factors. +#' Indeed, the parameters of the model will usually give you the intercept value +#' and then the "effect" of each factor level (how different it is from the +#' intercept). Marginal means can be used to directly give you the mean value of +#' the response variable at all the levels of a factor. Moreover, it can also be +#' used to control, or average over predictors, which is useful in the case of +#' multiple predictors with or without interactions. +#' +#' \item **Marginal contrasts**, obtained via [estimate_contrasts()], are +#' themselves at extension of marginal means, in that they allow to investigate +#' the difference (i.e., the contrast) between the marginal means. This is, +#' again, often used to get all pairwise differences between all levels of a +#' factor. It works also for continuous predictors, for instance one could also +#' be interested in whether the difference at two extremes of a continuous +#' predictor is significant. +#' +#' \item Finally, **marginal effects**, obtained via [estimate_slopes()], are +#' different in that their focus is not values on the response variable, but the +#' model's parameters. The idea is to assess the effect of a predictor at a +#' specific configuration of the other predictors. This is relevant in the case +#' of interactions or non-linear relationships, when the effect of a predictor +#' variable changes depending on the other predictors. Moreover, these effects +#' can also be "averaged" over other predictors, to get for instance the +#' "general trend" of a predictor over different factor levels. #' } -#' **Example:** let's imagine the following model `lm(y ~ condition * x)` where `condition` is a factor with 3 levels A, B and C and `x` a continuous variable (like age for example). 
One idea is to see how this model performs, and compare the actual response y to the one predicted by the model (using [estimate_response()]). Another idea is evaluate the average mean at each of the condition's levels (using [estimate_means()]), which can be useful to visualize them. Another possibility is to evaluate the difference between these levels (using [estimate_contrasts()]). Finally, one could also estimate the effect of x averaged over all conditions, or instead within each condition (`using [estimate_slopes]`). +#' +#' **Example:** let's imagine the following model `lm(y ~ condition * x)` where +#' `condition` is a factor with 3 levels A, B and C and `x` a continuous +#' variable (like age for example). One idea is to see how this model performs, +#' and compare the actual response y to the one predicted by the model (using +#' [estimate_response()]). Another idea is evaluate the average mean at each of +#' the condition's levels (using [estimate_means()]), which can be useful to +#' visualize them. Another possibility is to evaluate the difference between +#' these levels (using [estimate_contrasts()]). Finally, one could also estimate +#' the effect of x averaged over all conditions, or instead within each +#' condition (`using [estimate_slopes]`). #' #' #' @examples diff --git a/R/model_emmeans.R b/R/model_emmeans.R index 0b9f7fee..b35d015d 100644 --- a/R/model_emmeans.R +++ b/R/model_emmeans.R @@ -1,9 +1,11 @@ #' Easy 'emmeans' and 'emtrends' #' #' The `model_emmeans` function is a wrapper to facilitate the usage of -#' `emmeans::emmeans()` and `emmeans::emtrends()`, providing a -#' somewhat simpler and intuitive API to find the specifications and variables of interest. -#' It is meanly made to for the developers to facilitate the organization and debugging, and end-users should rather use the `estimate_*` series of functions. +#' `emmeans::emmeans()` and `emmeans::emtrends()`, providing a somewhat simpler +#' and intuitive API to find the specifications and variables of interest. It is +#' meanly made to for the developers to facilitate the organization and +#' debugging, and end-users should rather use the `estimate_*` series of +#' functions. #' #' @param model A statistical model. #' @param fixed A character vector indicating the names of the predictors to be @@ -19,7 +21,10 @@ #' log-odds (probabilities on logit scale) and `"response"` in terms of #' probabilities. #' @param levels,modulate Deprecated, use `at` instead. -#' @param at The predictor variable(s) *at* which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them). +#' @param at The predictor variable(s) *at* which to evaluate the desired effect +#' / mean / contrasts. Other predictors of the model that are not included +#' here will be collapsed and "averaged" over (the effect will be estimated +#' across them). #' @param ... Other arguments passed for instance to [visualisation_matrix()]. 
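# Illustrative sketch (not shipped in this diff): a hypothetical call showing
# the `at` argument documented above for model_emmeans(). It assumes the
# built-in iris data and that the 'emmeans' package is installed; as noted
# above, end users would normally reach for the estimate_*() functions instead.
library(modelbased)

model <- lm(Sepal.Width ~ Species * Petal.Length, data = iris)

# Evaluate at each level of Species; Petal.Length is not listed in `at`, so it
# is collapsed ("averaged") over
model_emmeans(model, at = "Species")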
#' #' @examples diff --git a/R/zero_crossings.R b/R/zero_crossings.R index 362f38da..7675c43b 100644 --- a/R/zero_crossings.R +++ b/R/zero_crossings.R @@ -19,6 +19,7 @@ zero_crossings <- function(x) { if (length(zerocrossings) == 0) { return(NA) } + zerocrossings } diff --git a/man/estimate_contrasts.Rd b/man/estimate_contrasts.Rd index 156fb7ea..97684bf7 100644 --- a/man/estimate_contrasts.Rd +++ b/man/estimate_contrasts.Rd @@ -21,7 +21,10 @@ estimate_contrasts( \item{contrast}{A character vector indicating the name of the variable(s) for which to compute the contrasts.} -\item{at}{The predictor variable(s) \emph{at} which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them).} +\item{at}{The predictor variable(s) \emph{at} which to evaluate the desired effect +/ mean / contrasts. Other predictors of the model that are not included +here will be collapsed and "averaged" over (the effect will be estimated +across them).} \item{fixed}{A character vector indicating the names of the predictors to be "fixed" (i.e., maintained), so that the estimation is made at these values.} @@ -54,15 +57,64 @@ factor. See also other related functions such as \code{\link[=estimate_means]{es and \code{\link[=estimate_slopes]{estimate_slopes()}}. } \details{ -The \code{\link[=estimate_slopes]{estimate_slopes()}}, \code{\link[=estimate_means]{estimate_means()}} and \code{\link[=estimate_contrasts]{estimate_contrasts()}} functions are forming a group, as they are all based on \emph{marginal} estimations (estimations based on a model). All three are also built on the \pkg{emmeans} package, so reading its documentation (for instance for \code{\link[emmeans:emmeans]{emmeans::emmeans()}} and \code{\link[emmeans:emtrends]{emmeans::emtrends()}}) is recommended to understand the idea behind these types of procedures. +See the \strong{Details} section below, and don't forget to also check out the +\href{https://easystats.github.io/modelbased/articles/estimate_slopes.html}{Vignettes} +and \href{https://easystats.github.io/modelbased/index.html#features}{README examples} for +various examples, tutorials and use cases. + +The \code{\link[=estimate_slopes]{estimate_slopes()}}, \code{\link[=estimate_means]{estimate_means()}} and +\code{\link[=estimate_contrasts]{estimate_contrasts()}} functions are forming a group, as they are all based +on \emph{marginal} estimations (estimations based on a model). All three are +also built on the \pkg{emmeans} package, so reading its documentation (for +instance for \code{\link[emmeans:emmeans]{emmeans::emmeans()}} and \code{\link[emmeans:emtrends]{emmeans::emtrends()}}) is recommended +to understand the idea behind these types of procedures. \itemize{ -\item Model-based \strong{predictions} is the basis for all that follows. Indeed, the first thing to understand is how models can be used to make predictions (see \code{\link[=estimate_link]{estimate_link()}}). This corresponds to the predicted response (or "outcome variable") given specific predictor values of the predictors (i.e., given a specific data configuration). This is why the concept of \code{\link[=visualisation_matrix]{reference grid()}} is so important for direct predictions. 
-\item \strong{Marginal "means"}, obtained via \code{\link[=estimate_means]{estimate_means()}}, are an extension of such predictions, allowing to "average" (collapse) some of the predictors, to obtain the average response value at a specific predictors configuration. This is typically used when some of the predictors of interest are factors. Indeed, the parameters of the model will usually give you the intercept value and then the "effect" of each factor level (how different it is from the intercept). Marginal means can be used to directly give you the mean value of the response variable at all the levels of a factor. Moreover, it can also be used to control, or average over predictors, which is useful in the case of multiple predictors with or without interactions. -\item \strong{Marginal contrasts}, obtained via \code{\link[=estimate_contrasts]{estimate_contrasts()}}, are themselves at extension of marginal means, in that they allow to investigate the difference (i.e., the contrast) between the marginal means. This is, again, often used to get all pairwise differences between all levels of a factor. It works also for continuous predictors, for instance one could also be interested in whether the difference at two extremes of a continuous predictor is significant. -\item Finally, \strong{marginal effects}, obtained via \code{\link[=estimate_slopes]{estimate_slopes()}}, are different in that their focus is not values on the response variable, but the model's parameters. The idea is to assess the effect of a predictor at a specific configuration of the other predictors. This is relevant in the case of interactions or non-linear relationships, when the effect of a predictor variable changes depending on the other predictors. Moreover, these effects can also be "averaged" over other predictors, to get for instance the "general trend" of a predictor over different factor levels. +\item Model-based \strong{predictions} is the basis for all that follows. Indeed, +the first thing to understand is how models can be used to make predictions +(see \code{\link[=estimate_link]{estimate_link()}}). This corresponds to the predicted response (or +"outcome variable") given specific predictor values of the predictors (i.e., +given a specific data configuration). This is why the concept of \code{\link[=visualisation_matrix]{reference grid()}} is so important for direct predictions. + +\item \strong{Marginal "means"}, obtained via \code{\link[=estimate_means]{estimate_means()}}, are an extension +of such predictions, allowing to "average" (collapse) some of the predictors, +to obtain the average response value at a specific predictors configuration. +This is typically used when some of the predictors of interest are factors. +Indeed, the parameters of the model will usually give you the intercept value +and then the "effect" of each factor level (how different it is from the +intercept). Marginal means can be used to directly give you the mean value of +the response variable at all the levels of a factor. Moreover, it can also be +used to control, or average over predictors, which is useful in the case of +multiple predictors with or without interactions. + +\item \strong{Marginal contrasts}, obtained via \code{\link[=estimate_contrasts]{estimate_contrasts()}}, are +themselves at extension of marginal means, in that they allow to investigate +the difference (i.e., the contrast) between the marginal means. This is, +again, often used to get all pairwise differences between all levels of a +factor. 
It works also for continuous predictors, for instance one could also +be interested in whether the difference at two extremes of a continuous +predictor is significant. + +\item Finally, \strong{marginal effects}, obtained via \code{\link[=estimate_slopes]{estimate_slopes()}}, are +different in that their focus is not values on the response variable, but the +model's parameters. The idea is to assess the effect of a predictor at a +specific configuration of the other predictors. This is relevant in the case +of interactions or non-linear relationships, when the effect of a predictor +variable changes depending on the other predictors. Moreover, these effects +can also be "averaged" over other predictors, to get for instance the +"general trend" of a predictor over different factor levels. } -\strong{Example:} let's imagine the following model \code{lm(y ~ condition * x)} where \code{condition} is a factor with 3 levels A, B and C and \code{x} a continuous variable (like age for example). One idea is to see how this model performs, and compare the actual response y to the one predicted by the model (using \code{\link[=estimate_response]{estimate_response()}}). Another idea is evaluate the average mean at each of the condition's levels (using \code{\link[=estimate_means]{estimate_means()}}), which can be useful to visualize them. Another possibility is to evaluate the difference between these levels (using \code{\link[=estimate_contrasts]{estimate_contrasts()}}). Finally, one could also estimate the effect of x averaged over all conditions, or instead within each condition (\code{using [estimate_slopes]}). + +\strong{Example:} let's imagine the following model \code{lm(y ~ condition * x)} where +\code{condition} is a factor with 3 levels A, B and C and \code{x} a continuous +variable (like age for example). One idea is to see how this model performs, +and compare the actual response y to the one predicted by the model (using +\code{\link[=estimate_response]{estimate_response()}}). Another idea is evaluate the average mean at each of +the condition's levels (using \code{\link[=estimate_means]{estimate_means()}}), which can be useful to +visualize them. Another possibility is to evaluate the difference between +these levels (using \code{\link[=estimate_contrasts]{estimate_contrasts()}}). Finally, one could also estimate +the effect of x averaged over all conditions, or instead within each +condition (\code{using [estimate_slopes]}). } \examples{ library(modelbased) diff --git a/man/estimate_expectation.Rd b/man/estimate_expectation.Rd index 3b752056..7e5e7ac4 100644 --- a/man/estimate_expectation.Rd +++ b/man/estimate_expectation.Rd @@ -50,67 +50,104 @@ bootstrapped or Bayesian models. They will be added as additional columns named \verb{iter_1, iter_2, ...}. You can reshape them to a long format by running \code{\link[bayestestR:reshape_iterations]{reshape_iterations()}}.} -\item{...}{You can add all the additional control arguments from \code{\link[=visualisation_matrix]{visualisation_matrix()}} (used when \code{data = "grid"}) and \code{\link[insight:get_predicted]{insight::get_predicted()}}.} +\item{...}{You can add all the additional control arguments from +\code{\link[=visualisation_matrix]{visualisation_matrix()}} (used when \code{data = "grid"}) and +\code{\link[insight:get_predicted]{insight::get_predicted()}}.} } \value{ A data frame of predicted values and uncertainty intervals, with class \code{"estimate_predicted"}. 
Methods for \code{\link[=visualisation_recipe.estimate_predicted]{visualisation_recipe()}} and \code{\link[=visualisation_recipe.estimate_predicted]{plot()}} are available. } \description{ -After fitting a model, it is useful generate model-based estimates of the response variables for different combinations of predictor values. -Such estimates can be used to make inferences about relationships between variables and to make predictions about individual cases. +After fitting a model, it is useful generate model-based estimates of the +response variables for different combinations of predictor values. Such +estimates can be used to make inferences about relationships between +variables and to make predictions about individual cases. \cr\cr -Model-based response estimates and uncertainty can be generated for both the conditional average response values (the regression line or expectation) and for predictions about individual cases. -See below for details. +Model-based response estimates and uncertainty can be generated for both the +conditional average response values (the regression line or expectation) and +for predictions about individual cases. See below for details. } \note{ -These functions are built on top of \code{\link[insight:get_predicted]{insight::get_predicted()}} and correspond to different specifications of its parameters. It may be useful to read its \href{https://easystats.github.io/insight/reference/get_predicted.html}{documentation}, in particular the description of the \code{predict} argument for additional details on the difference between expected vs. predicted values and link vs. response scales. - -Additional control parameters can be used to control results from \code{\link[=visualisation_matrix]{visualisation_matrix()}} (when \code{data = "grid"}) and from \code{\link[insight:get_predicted]{insight::get_predicted()}} (the function used internally to compute predictions). - -For plotting, check the examples in \code{\link[=visualisation_recipe]{visualisation_recipe()}}. Also check out the \href{https://easystats.github.io/modelbased/articles/}{Vignettes} and \href{https://easystats.github.io/modelbased/index.html#features}{README examples} for various examples, tutorials and usecases. +These functions are built on top of \code{\link[insight:get_predicted]{insight::get_predicted()}} and correspond +to different specifications of its parameters. It may be useful to read its +\href{https://easystats.github.io/insight/reference/get_predicted.html}{documentation}, +in particular the description of the \code{predict} argument for additional +details on the difference between expected vs. predicted values and link vs. +response scales. + +Additional control parameters can be used to control results from +\code{\link[=visualisation_matrix]{visualisation_matrix()}} (when \code{data = "grid"}) and from +\code{\link[insight:get_predicted]{insight::get_predicted()}} (the function used internally to compute +predictions). + +For plotting, check the examples in \code{\link[=visualisation_recipe]{visualisation_recipe()}}. Also check out +the \href{https://easystats.github.io/modelbased/articles/}{Vignettes} and \href{https://easystats.github.io/modelbased/index.html#features}{README examples} for +various examples, tutorials and usecases. } \section{Expected (average) values}{ -The most important way that various types of response estimates differ is in terms of what quantity is being estimated and the meaning of the uncertainty intervals. 
-The major choices are \strong{expected values} for uncertainty in the regression line and \strong{predicted values} for uncertainty in the individual case predictions. +The most important way that various types of response estimates differ is in +terms of what quantity is being estimated and the meaning of the uncertainty +intervals. The major choices are \strong{expected values} for uncertainty in the +regression line and \strong{predicted values} for uncertainty in the individual +case predictions. -\strong{Expected values} refer the the fitted regression line---the estimated \emph{average} response value (i.e., the "expectation") for individuals with specific predictor values. -For example, in a linear model \emph{y} = 2 + 3\emph{x} + 4\emph{z} + \emph{e}, the estimated average \emph{y} for individuals with \emph{x} = 1 and \emph{z} = 2 is 11. +\strong{Expected values} refer the the fitted regression line---the estimated +\emph{average} response value (i.e., the "expectation") for individuals with +specific predictor values. For example, in a linear model \emph{y} = 2 + 3\emph{x} + +4\emph{z} + \emph{e}, the estimated average \emph{y} for individuals with \emph{x} = 1 and \emph{z} = +2 is 11. -For expected values, uncertainty intervals refer to uncertainty in the estimated \strong{conditional average} (where might the true regression line actually fall)? -Uncertainty intervals for expected values are also called "confidence intervals". +For expected values, uncertainty intervals refer to uncertainty in the +estimated \strong{conditional average} (where might the true regression line +actually fall)? Uncertainty intervals for expected values are also called +"confidence intervals". -Expected values and their uncertainty intervals are useful for describing the relationship between variables and for describing how precisely a model has been estimated. +Expected values and their uncertainty intervals are useful for describing the +relationship between variables and for describing how precisely a model has +been estimated. For generalized linear models, expected values are reported on one of two scales: \itemize{ -\item The \strong{link scale} refers to scale of the fitted regression line, after transformation by the link function. -For example, for a logistic regression (logit binomial) model, the link scale gives expected log-odds. -For a log-link Poisson model, the link scale gives the expected log-count. -\item The \strong{response scale} refers to the original scale of the response variable (i.e., without any link function transformation). -Expected values on the link scale are back-transformed to the original response variable metric (e.g., expected probabilities for binomial models, expected counts for Poisson models). +\item The \strong{link scale} refers to scale of the fitted regression line, after +transformation by the link function. For example, for a logistic regression +(logit binomial) model, the link scale gives expected log-odds. For a +log-link Poisson model, the link scale gives the expected log-count. +\item The \strong{response scale} refers to the original scale of the response +variable (i.e., without any link function transformation). Expected values +on the link scale are back-transformed to the original response variable +metric (e.g., expected probabilities for binomial models, expected counts +for Poisson models). 
} } \section{Individual case predictions}{ -In contrast to expected values, \strong{predicted values} refer to predictions for \strong{individual cases}. -Predicted values are also called "posterior predictions" or "posterior predictive draws". +In contrast to expected values, \strong{predicted values} refer to predictions for +\strong{individual cases}. Predicted values are also called "posterior +predictions" or "posterior predictive draws". -For predicted values, uncertainty intervals refer to uncertainty in the \strong{individual response values for each case} (where might any single case actually fall)? -Uncertainty intervals for predicted values are also called "prediction intervals" or "posterior predictive intervals". +For predicted values, uncertainty intervals refer to uncertainty in the +\strong{individual response values for each case} (where might any single case +actually fall)? Uncertainty intervals for predicted values are also called +"prediction intervals" or "posterior predictive intervals". -Predicted values and their uncertainty intervals are useful for forecasting the range of values that might be observed in new data, for making decisions about individual cases, and for checking if model predictions are reasonable ("posterior predictive checks"). +Predicted values and their uncertainty intervals are useful for forecasting +the range of values that might be observed in new data, for making decisions +about individual cases, and for checking if model predictions are reasonable +("posterior predictive checks"). -Predicted values and intervals are always on the scale of the original response variable (not the link scale). +Predicted values and intervals are always on the scale of the original +response variable (not the link scale). } \section{Functions for estimating predicted values and uncertainty}{ -\emph{modelbased} provides 4 functions for generating model-based response estimates and their uncertainty: +\emph{modelbased} provides 4 functions for generating model-based response +estimates and their uncertainty: \itemize{ \item \strong{\code{estimate_expectation()}}: \itemize{ @@ -122,7 +159,8 @@ Predicted values and intervals are always on the scale of the original response \itemize{ \item Generates \strong{expected values} (conditional average) on the \strong{link scale}. \item The uncertainty interval is a \emph{confidence interval}. -\item By default, values are computed using a reference grid spanning the observed range of predictor values (see \code{\link[=visualisation_matrix]{visualisation_matrix()}}). +\item By default, values are computed using a reference grid spanning the +observed range of predictor values (see \code{\link[=visualisation_matrix]{visualisation_matrix()}}). } \item \strong{\code{estimate_prediction()}}: \itemize{ @@ -136,7 +174,8 @@ Predicted values and intervals are always on the scale of the original response \item Useful for visualizing a model. \item Generates \strong{expected values} (conditional average) on the \strong{response scale}. \item The uncertainty interval is a \emph{confidence interval}. -\item By default, values are computed using a reference grid spanning the observed range of predictor values (see \code{\link[=visualisation_matrix]{visualisation_matrix()}}). +\item By default, values are computed using a reference grid spanning the +observed range of predictor values (see \code{\link[=visualisation_matrix]{visualisation_matrix()}}). 
} } @@ -146,7 +185,14 @@ Predicted values and intervals are always on the scale of the original response \section{Data for predictions}{ -If the \code{data = NULL}, values are estimated using the data used to fit the model. If \code{data = "grid"}, values are computed using a reference grid spanning the observed range of predictor values with \code{\link[=visualisation_matrix]{visualisation_matrix()}}. This can be useful for model visualization. The number of predictor values used for each variable can be controlled with the \code{length} argument. \code{data} can also be a data frame containing columns with names matching the model frame (see \code{\link[insight:get_data]{insight::get_data()}}). This can be used to generate model predictions for specific combinations of predictor values. +If the \code{data = NULL}, values are estimated using the data used to fit the +model. If \code{data = "grid"}, values are computed using a reference grid +spanning the observed range of predictor values with +\code{\link[=visualisation_matrix]{visualisation_matrix()}}. This can be useful for model visualization. The +number of predictor values used for each variable can be controlled with the +\code{length} argument. \code{data} can also be a data frame containing columns with +names matching the model frame (see \code{\link[insight:get_data]{insight::get_data()}}). This can be used +to generate model predictions for specific combinations of predictor values. } \examples{ diff --git a/man/estimate_means.Rd b/man/estimate_means.Rd index 3fb947bf..da510e6a 100644 --- a/man/estimate_means.Rd +++ b/man/estimate_means.Rd @@ -16,7 +16,10 @@ estimate_means( \arguments{ \item{model}{A statistical model.} -\item{at}{The predictor variable(s) \emph{at} which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them).} +\item{at}{The predictor variable(s) \emph{at} which to evaluate the desired effect +/ mean / contrasts. Other predictors of the model that are not included +here will be collapsed and "averaged" over (the effect will be estimated +across them).} \item{fixed}{A character vector indicating the names of the predictors to be "fixed" (i.e., maintained), so that the estimation is made at these values.} @@ -45,15 +48,64 @@ other related functions such as \code{\link[=estimate_contrasts]{estimate_contra \code{\link[=estimate_slopes]{estimate_slopes()}}. } \details{ -The \code{\link[=estimate_slopes]{estimate_slopes()}}, \code{\link[=estimate_means]{estimate_means()}} and \code{\link[=estimate_contrasts]{estimate_contrasts()}} functions are forming a group, as they are all based on \emph{marginal} estimations (estimations based on a model). All three are also built on the \pkg{emmeans} package, so reading its documentation (for instance for \code{\link[emmeans:emmeans]{emmeans::emmeans()}} and \code{\link[emmeans:emtrends]{emmeans::emtrends()}}) is recommended to understand the idea behind these types of procedures. +See the \strong{Details} section below, and don't forget to also check out the +\href{https://easystats.github.io/modelbased/articles/estimate_slopes.html}{Vignettes} +and \href{https://easystats.github.io/modelbased/index.html#features}{README examples} for +various examples, tutorials and use cases. 
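# Illustrative sketch (not shipped in this diff): a minimal estimate_means()
# call matching the `at` argument documented above, using the built-in iris
# data (the 'emmeans' package must be installed).
library(modelbased)

model <- lm(Petal.Length ~ Species + Sepal.Width, data = iris)

# Marginal mean of Petal.Length at each level of Species, averaging over
# Sepal.Width
estimate_means(model, at = "Species")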
+ +The \code{\link[=estimate_slopes]{estimate_slopes()}}, \code{\link[=estimate_means]{estimate_means()}} and +\code{\link[=estimate_contrasts]{estimate_contrasts()}} functions are forming a group, as they are all based +on \emph{marginal} estimations (estimations based on a model). All three are +also built on the \pkg{emmeans} package, so reading its documentation (for +instance for \code{\link[emmeans:emmeans]{emmeans::emmeans()}} and \code{\link[emmeans:emtrends]{emmeans::emtrends()}}) is recommended +to understand the idea behind these types of procedures. \itemize{ -\item Model-based \strong{predictions} is the basis for all that follows. Indeed, the first thing to understand is how models can be used to make predictions (see \code{\link[=estimate_link]{estimate_link()}}). This corresponds to the predicted response (or "outcome variable") given specific predictor values of the predictors (i.e., given a specific data configuration). This is why the concept of \code{\link[=visualisation_matrix]{reference grid()}} is so important for direct predictions. -\item \strong{Marginal "means"}, obtained via \code{\link[=estimate_means]{estimate_means()}}, are an extension of such predictions, allowing to "average" (collapse) some of the predictors, to obtain the average response value at a specific predictors configuration. This is typically used when some of the predictors of interest are factors. Indeed, the parameters of the model will usually give you the intercept value and then the "effect" of each factor level (how different it is from the intercept). Marginal means can be used to directly give you the mean value of the response variable at all the levels of a factor. Moreover, it can also be used to control, or average over predictors, which is useful in the case of multiple predictors with or without interactions. -\item \strong{Marginal contrasts}, obtained via \code{\link[=estimate_contrasts]{estimate_contrasts()}}, are themselves at extension of marginal means, in that they allow to investigate the difference (i.e., the contrast) between the marginal means. This is, again, often used to get all pairwise differences between all levels of a factor. It works also for continuous predictors, for instance one could also be interested in whether the difference at two extremes of a continuous predictor is significant. -\item Finally, \strong{marginal effects}, obtained via \code{\link[=estimate_slopes]{estimate_slopes()}}, are different in that their focus is not values on the response variable, but the model's parameters. The idea is to assess the effect of a predictor at a specific configuration of the other predictors. This is relevant in the case of interactions or non-linear relationships, when the effect of a predictor variable changes depending on the other predictors. Moreover, these effects can also be "averaged" over other predictors, to get for instance the "general trend" of a predictor over different factor levels. +\item Model-based \strong{predictions} is the basis for all that follows. Indeed, +the first thing to understand is how models can be used to make predictions +(see \code{\link[=estimate_link]{estimate_link()}}). This corresponds to the predicted response (or +"outcome variable") given specific predictor values of the predictors (i.e., +given a specific data configuration). This is why the concept of \code{\link[=visualisation_matrix]{reference grid()}} is so important for direct predictions. 
+ +\item \strong{Marginal "means"}, obtained via \code{\link[=estimate_means]{estimate_means()}}, are an extension +of such predictions, allowing to "average" (collapse) some of the predictors, +to obtain the average response value at a specific predictors configuration. +This is typically used when some of the predictors of interest are factors. +Indeed, the parameters of the model will usually give you the intercept value +and then the "effect" of each factor level (how different it is from the +intercept). Marginal means can be used to directly give you the mean value of +the response variable at all the levels of a factor. Moreover, it can also be +used to control, or average over predictors, which is useful in the case of +multiple predictors with or without interactions. + +\item \strong{Marginal contrasts}, obtained via \code{\link[=estimate_contrasts]{estimate_contrasts()}}, are +themselves at extension of marginal means, in that they allow to investigate +the difference (i.e., the contrast) between the marginal means. This is, +again, often used to get all pairwise differences between all levels of a +factor. It works also for continuous predictors, for instance one could also +be interested in whether the difference at two extremes of a continuous +predictor is significant. + +\item Finally, \strong{marginal effects}, obtained via \code{\link[=estimate_slopes]{estimate_slopes()}}, are +different in that their focus is not values on the response variable, but the +model's parameters. The idea is to assess the effect of a predictor at a +specific configuration of the other predictors. This is relevant in the case +of interactions or non-linear relationships, when the effect of a predictor +variable changes depending on the other predictors. Moreover, these effects +can also be "averaged" over other predictors, to get for instance the +"general trend" of a predictor over different factor levels. } -\strong{Example:} let's imagine the following model \code{lm(y ~ condition * x)} where \code{condition} is a factor with 3 levels A, B and C and \code{x} a continuous variable (like age for example). One idea is to see how this model performs, and compare the actual response y to the one predicted by the model (using \code{\link[=estimate_response]{estimate_response()}}). Another idea is evaluate the average mean at each of the condition's levels (using \code{\link[=estimate_means]{estimate_means()}}), which can be useful to visualize them. Another possibility is to evaluate the difference between these levels (using \code{\link[=estimate_contrasts]{estimate_contrasts()}}). Finally, one could also estimate the effect of x averaged over all conditions, or instead within each condition (\code{using [estimate_slopes]}). + +\strong{Example:} let's imagine the following model \code{lm(y ~ condition * x)} where +\code{condition} is a factor with 3 levels A, B and C and \code{x} a continuous +variable (like age for example). One idea is to see how this model performs, +and compare the actual response y to the one predicted by the model (using +\code{\link[=estimate_response]{estimate_response()}}). Another idea is evaluate the average mean at each of +the condition's levels (using \code{\link[=estimate_means]{estimate_means()}}), which can be useful to +visualize them. Another possibility is to evaluate the difference between +these levels (using \code{\link[=estimate_contrasts]{estimate_contrasts()}}). 
Finally, one could also estimate +the effect of x averaged over all conditions, or instead within each +condition (\code{using [estimate_slopes]}). } \examples{ library(modelbased) diff --git a/man/estimate_slopes.Rd b/man/estimate_slopes.Rd index d96b37a3..9ffa8169 100644 --- a/man/estimate_slopes.Rd +++ b/man/estimate_slopes.Rd @@ -12,7 +12,10 @@ estimate_slopes(model, trend = NULL, at = NULL, ci = 0.95, ...) \item{trend}{A character indicating the name of the variable for which to compute the slopes.} -\item{at}{The predictor variable(s) \emph{at} which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them).} +\item{at}{The predictor variable(s) \emph{at} which to evaluate the desired effect +/ mean / contrasts. Other predictors of the model that are not included +here will be collapsed and "averaged" over (the effect will be estimated +across them).} \item{ci}{Confidence Interval (CI) level. Default to \code{0.95} (\verb{95\%}).} @@ -22,23 +25,72 @@ for which to compute the slopes.} A data.frame of class \code{estimate_slopes}. } \description{ -Estimate the slopes (i.e., the coefficient) of a predictor over or within different -factor levels, or alongside a numeric variable . In other words, to assess the effect of a predictor \emph{at} specific configurations data. Other related -functions based on marginal estimations includes \code{\link[=estimate_contrasts]{estimate_contrasts()}} and -\code{\link[=estimate_means]{estimate_means()}}. +Estimate the slopes (i.e., the coefficient) of a predictor over or within +different factor levels, or alongside a numeric variable . In other words, to +assess the effect of a predictor \emph{at} specific configurations data. Other +related functions based on marginal estimations includes +\code{\link[=estimate_contrasts]{estimate_contrasts()}} and \code{\link[=estimate_means]{estimate_means()}}. \cr\cr -See the \strong{Details} section below, and don't forget to also check out the \href{https://easystats.github.io/modelbased/articles/estimate_slopes.html}{Vignettes} and \href{https://easystats.github.io/modelbased/index.html#features}{README examples} for various examples, tutorials and usecases. } \details{ -The \code{\link[=estimate_slopes]{estimate_slopes()}}, \code{\link[=estimate_means]{estimate_means()}} and \code{\link[=estimate_contrasts]{estimate_contrasts()}} functions are forming a group, as they are all based on \emph{marginal} estimations (estimations based on a model). All three are also built on the \pkg{emmeans} package, so reading its documentation (for instance for \code{\link[emmeans:emmeans]{emmeans::emmeans()}} and \code{\link[emmeans:emtrends]{emmeans::emtrends()}}) is recommended to understand the idea behind these types of procedures. +See the \strong{Details} section below, and don't forget to also check out the +\href{https://easystats.github.io/modelbased/articles/estimate_slopes.html}{Vignettes} +and \href{https://easystats.github.io/modelbased/index.html#features}{README examples} for +various examples, tutorials and use cases. + +The \code{\link[=estimate_slopes]{estimate_slopes()}}, \code{\link[=estimate_means]{estimate_means()}} and +\code{\link[=estimate_contrasts]{estimate_contrasts()}} functions are forming a group, as they are all based +on \emph{marginal} estimations (estimations based on a model). 
+All three are also built on the \pkg{emmeans} package, so reading its
+documentation (for instance for \code{\link[emmeans:emmeans]{emmeans::emmeans()}} and \code{\link[emmeans:emtrends]{emmeans::emtrends()}})
+is recommended to understand the idea behind these types of procedures.
\itemize{
-\item Model-based \strong{predictions} is the basis for all that follows. Indeed, the first thing to understand is how models can be used to make predictions (see \code{\link[=estimate_link]{estimate_link()}}). This corresponds to the predicted response (or "outcome variable") given specific predictor values of the predictors (i.e., given a specific data configuration). This is why the concept of \code{\link[=visualisation_matrix]{reference grid()}} is so important for direct predictions.
-\item \strong{Marginal "means"}, obtained via \code{\link[=estimate_means]{estimate_means()}}, are an extension of such predictions, allowing to "average" (collapse) some of the predictors, to obtain the average response value at a specific predictors configuration. This is typically used when some of the predictors of interest are factors. Indeed, the parameters of the model will usually give you the intercept value and then the "effect" of each factor level (how different it is from the intercept). Marginal means can be used to directly give you the mean value of the response variable at all the levels of a factor. Moreover, it can also be used to control, or average over predictors, which is useful in the case of multiple predictors with or without interactions.
-\item \strong{Marginal contrasts}, obtained via \code{\link[=estimate_contrasts]{estimate_contrasts()}}, are themselves at extension of marginal means, in that they allow to investigate the difference (i.e., the contrast) between the marginal means. This is, again, often used to get all pairwise differences between all levels of a factor. It works also for continuous predictors, for instance one could also be interested in whether the difference at two extremes of a continuous predictor is significant.
-\item Finally, \strong{marginal effects}, obtained via \code{\link[=estimate_slopes]{estimate_slopes()}}, are different in that their focus is not values on the response variable, but the model's parameters. The idea is to assess the effect of a predictor at a specific configuration of the other predictors. This is relevant in the case of interactions or non-linear relationships, when the effect of a predictor variable changes depending on the other predictors. Moreover, these effects can also be "averaged" over other predictors, to get for instance the "general trend" of a predictor over different factor levels.
+\item Model-based \strong{predictions} are the basis for all that follows. Indeed,
+the first thing to understand is how models can be used to make predictions
+(see \code{\link[=estimate_link]{estimate_link()}}). This corresponds to the predicted response (or
+"outcome variable") given specific values of the predictors (i.e.,
+given a specific data configuration). This is why the concept of \code{\link[=visualisation_matrix]{reference grid()}} is so important for direct predictions.
+
+\item \strong{Marginal "means"}, obtained via \code{\link[=estimate_means]{estimate_means()}}, are an extension
+of such predictions, allowing one to "average" (collapse) some of the predictors
+to obtain the average response value at a specific configuration of predictors.
+This is typically used when some of the predictors of interest are factors.
+Indeed, the parameters of the model will usually give you the intercept value
+and then the "effect" of each factor level (how different it is from the
+intercept). Marginal means can directly give you the mean value of
+the response variable at all the levels of a factor. Moreover, they can also be
+used to control for, or average over, other predictors, which is useful in the
+case of multiple predictors with or without interactions.
+
+\item \strong{Marginal contrasts}, obtained via \code{\link[=estimate_contrasts]{estimate_contrasts()}}, are
+themselves an extension of marginal means, in that they allow one to investigate
+the difference (i.e., the contrast) between the marginal means. This is,
+again, often used to get all pairwise differences between all levels of a
+factor. It also works for continuous predictors; for instance, one could
+be interested in whether the difference at two extremes of a continuous
+predictor is significant.
+
+\item Finally, \strong{marginal effects}, obtained via \code{\link[=estimate_slopes]{estimate_slopes()}}, are
+different in that their focus is not on the values of the response variable, but
+on the model's parameters. The idea is to assess the effect of a predictor at a
+specific configuration of the other predictors. This is relevant in the case
+of interactions or non-linear relationships, when the effect of a predictor
+variable changes depending on the other predictors. Moreover, these effects
+can also be "averaged" over other predictors, to get, for instance, the
+"general trend" of a predictor across different factor levels.
 }
-\strong{Example:} let's imagine the following model \code{lm(y ~ condition * x)} where \code{condition} is a factor with 3 levels A, B and C and \code{x} a continuous variable (like age for example). One idea is to see how this model performs, and compare the actual response y to the one predicted by the model (using \code{\link[=estimate_response]{estimate_response()}}). Another idea is evaluate the average mean at each of the condition's levels (using \code{\link[=estimate_means]{estimate_means()}}), which can be useful to visualize them. Another possibility is to evaluate the difference between these levels (using \code{\link[=estimate_contrasts]{estimate_contrasts()}}). Finally, one could also estimate the effect of x averaged over all conditions, or instead within each condition (\code{using [estimate_slopes]}).
+
+\strong{Example:} let's imagine the following model \code{lm(y ~ condition * x)}, where
+\code{condition} is a factor with 3 levels (A, B and C) and \code{x} a continuous
+variable (like age, for example). One idea is to see how this model performs,
+by comparing the actual response \code{y} to the one predicted by the model (using
+\code{\link[=estimate_response]{estimate_response()}}). Another idea is to evaluate the mean of the response at
+each of the condition's levels (using \code{\link[=estimate_means]{estimate_means()}}), which can be useful to
+visualize them. Another possibility is to evaluate the difference between
+these levels (using \code{\link[=estimate_contrasts]{estimate_contrasts()}}). Finally, one could also estimate
+the effect of \code{x} averaged over all conditions, or instead within each
+condition (using \code{\link[=estimate_slopes]{estimate_slopes()}}).
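+
+For instance, a minimal sketch of this workflow (assuming a hypothetical data
+frame \code{df} that contains \code{y}, \code{condition} and \code{x}) could
+look like this:
+
+\preformatted{
+# `df` is a hypothetical data frame with columns y, condition and x
+model <- lm(y ~ condition * x, data = df)
+
+estimate_response(model)                               # predictions for each observation
+estimate_means(model, at = "condition")                # average response per level
+estimate_contrasts(model, contrast = "condition")      # pairwise differences between levels
+estimate_slopes(model, trend = "x", at = "condition")  # effect of x within each level
+}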
}

\examples{
# Get an idea of the data

diff --git a/man/model_emmeans.Rd b/man/model_emmeans.Rd
index 96954243..90adf250 100644
--- a/man/model_emmeans.Rd
+++ b/man/model_emmeans.Rd
@@ -42,7 +42,10 @@ model_emtrends(
 \item{contrast}{A character vector indicating the name of the variable(s)
 for which to compute the contrasts.}

-\item{at}{The predictor variable(s) \emph{at} which to evaluate the desired effect / mean / contrasts. Other predictors of the model that are not included here will be collapsed and "averaged" over (the effect will be estimated across them).}
+\item{at}{The predictor variable(s) \emph{at} which to evaluate the desired effect
+/ mean / contrasts. Other predictors of the model that are not included
+here will be collapsed and "averaged" over (the effect will be estimated
+across them).}

 \item{fixed}{A character vector indicating the names of the predictors to be
 "fixed" (i.e., maintained), so that the estimation is made at these values.}

@@ -66,9 +69,11 @@ for which to compute the slopes.}
 }
\description{
The \code{model_emmeans} function is a wrapper to facilitate the usage of
-\code{emmeans::emmeans()} and \code{emmeans::emtrends()}, providing a
-somewhat simpler and intuitive API to find the specifications and variables of interest.
-It is meanly made to for the developers to facilitate the organization and debugging, and end-users should rather use the \verb{estimate_*} series of functions.
+\code{emmeans::emmeans()} and \code{emmeans::emtrends()}, providing a somewhat simpler
+and more intuitive API to find the specifications and variables of interest.
+It is mainly intended for developers, to facilitate organization and
+debugging; end-users should rather use the \verb{estimate_*} series of
+functions.
 }
\examples{
# Basic usage
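# For instance (a sketch assuming the built-in iris dataset; model_emmeans()
# simply passes the model and the focal predictors on to 'emmeans'):
model <- lm(Sepal.Width ~ Species, data = iris)
model_emmeans(model, at = "Species")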