Vignette about "Nested" interactions #332

Open
strengejacke opened this issue Nov 10, 2020 · 1 comment

@strengejacke

We should use "*" consistently for interactions. Currently, we use ":" to denote nested effects, but this might be confusing (because ":" is also used to add specific interactions in formulas)... maybe another symbol would be better, like a dash or "/" (so that it's straightforward).

Mmh, let me try to recap, as I feel like this is pretty nightmarish stuff, and getting it right would tremendously help users understand the meaning of the parameters. Warning: it might be hair-pulling.

all interactions '*' vs. specific interactions ':' vs. nested effects '/'

Simple case 1 - factor and numeric

'*' vs. ':'

library(see)
library(ggplot2)
library(dplyr)

m1 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris) 
m2 <- lm(Sepal.Length ~ Species + Species:Petal.Length, data = iris) 
m3 <- lm(Sepal.Length ~ Petal.Length + Species:Petal.Length, data = iris) 
m4 <- lm(Sepal.Length ~ Petal.Length + Species + Species:Petal.Length, data = iris) 

p1 <- modelbased::estimate_link(m1, preserve_range=FALSE) %>% 
  ggplot(aes(x=Petal.Length, y=Predicted, color=Species)) + 
  geom_line()
p2 <- modelbased::estimate_link(m2, preserve_range=FALSE) %>% 
  ggplot(aes(x=Petal.Length, y=Predicted, color=Species)) + 
  geom_line()
p3 <- modelbased::estimate_link(m3, preserve_range=FALSE) %>% 
  ggplot(aes(x=Petal.Length, y=Predicted, color=Species)) + 
  geom_line()
p4 <- modelbased::estimate_link(m4, preserve_range=FALSE) %>% 
  ggplot(aes(x=Petal.Length, y=Predicted, color=Species)) + 
  geom_line()
see::plots(p1, p2, p3, p4)

Created on 2020-11-10 by the reprex package (v0.3.0)

So here, m1, m2 and m4 are the same model: each allows a different intercept for each Species plus a Species-specific slope for Petal.Length. m3 is different because, while the slope is allowed to vary by Species, no separate intercept is allowed for each Species (all the lines must share a single intercept, i.e., they meet at Petal.Length = 0).
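As a quick sanity check of that claim, here is a minimal sketch using only base R (the helper `same_fit` and the data frame `at_zero` are names made up for this illustration, not part of any package). It compares fitted values across the four models and looks at the predictions at Petal.Length = 0:

```r
# Refit the four models from the example above (base R only).
m1 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)
m2 <- lm(Sepal.Length ~ Species + Species:Petal.Length, data = iris)
m3 <- lm(Sepal.Length ~ Petal.Length + Species:Petal.Length, data = iris)
m4 <- lm(Sepal.Length ~ Petal.Length + Species + Species:Petal.Length, data = iris)

# Hypothetical helper: TRUE if two models give the same fitted values
# (up to numerical tolerance).
same_fit <- function(a, b) {
  isTRUE(all.equal(unname(fitted(a)), unname(fitted(b))))
}

same_fit(m1, m2)  # TRUE: same model, different parametrisation
same_fit(m1, m4)  # TRUE: only the order of terms differs
same_fit(m1, m3)  # FALSE: m3 has no Species-specific intercept

# Predictions for each Species at Petal.Length = 0.
at_zero <- data.frame(
  Species = factor(levels(iris$Species), levels = levels(iris$Species)),
  Petal.Length = 0
)
pred_m1 <- unname(predict(m1, newdata = at_zero))  # three different intercepts
pred_m3 <- unname(predict(m3, newdata = at_zero))  # one shared intercept
```

In m3 the three predictions at zero coincide with the single `(Intercept)` coefficient, which is why the lines fan out from one common point rather than each having their own starting height.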

That said, it's all about interactions here (i.e., a * b is shorthand for a + b + a:b), so in parameters all of them should be denoted by * (which we use for interactions instead of ':'; I find it clearer, and since R uses ':' for both interactions and nested effects, we need to differentiate). This is not the case currently:

parameters::parameters(m1)
#> Parameter                           | Coefficient |   SE |         95% CI | t(144) |      p
#> -------------------------------------------------------------------------------------------
#> (Intercept)                         |        4.21 | 0.41 | [ 3.41,  5.02] |  10.34 | < .001
#> Species [versicolor]                |       -1.81 | 0.60 | [-2.99, -0.62] |  -3.02 | 0.003 
#> Species [virginica]                 |       -3.15 | 0.63 | [-4.41, -1.90] |  -4.97 | < .001
#> Petal.Length                        |        0.54 | 0.28 | [ 0.00,  1.09] |   1.96 | 0.052 
#> Species [versicolor] * Petal.Length |        0.29 | 0.30 | [-0.30,  0.87] |   0.97 | 0.334 
#> Species [virginica] * Petal.Length  |        0.45 | 0.29 | [-0.12,  1.03] |   1.56 | 0.120
parameters::parameters(m2)
#> Parameter                           | Coefficient |   SE |         95% CI | t(144) |      p
#> -------------------------------------------------------------------------------------------
#> (Intercept)                         |        4.21 | 0.41 | [ 3.41,  5.02] |  10.34 | < .001
#> Species [versicolor]                |       -1.81 | 0.60 | [-2.99, -0.62] |  -3.02 | 0.003 
#> Species [virginica]                 |       -3.15 | 0.63 | [-4.41, -1.90] |  -4.97 | < .001
#> Species [setosa] : Petal.Length     |        0.54 | 0.28 | [ 0.00,  1.09] |   1.96 | 0.052 
#> Species [versicolor] : Petal.Length |        0.83 | 0.10 | [ 0.63,  1.03] |   8.10 | < .001
#> Species [virginica] : Petal.Length  |        1.00 | 0.09 | [ 0.82,  1.17] |  11.43 | < .001
parameters::parameters(m3)
#> Parameter                        | Coefficient |   SE |         95% CI | t(146) |      p
#> ----------------------------------------------------------------------------------------
#> (Intercept)                      |        2.74 | 0.27 | [ 2.20,  3.28] |  10.00 | < .001
#> Petal.Length                     |        1.54 | 0.19 | [ 1.16,  1.91] |   8.16 | < .001
#> Petal.Length : Speciesversicolor |       -0.78 | 0.13 | [-1.03, -0.53] |  -6.19 | < .001
#> Petal.Length : Speciesvirginica  |       -0.84 | 0.14 | [-1.12, -0.56] |  -5.97 | < .001
parameters::parameters(m4)
#> Parameter                           | Coefficient |   SE |         95% CI | t(144) |      p
#> -------------------------------------------------------------------------------------------
#> (Intercept)                         |        4.21 | 0.41 | [ 3.41,  5.02] |  10.34 | < .001
#> Petal.Length                        |        0.54 | 0.28 | [ 0.00,  1.09] |   1.96 | 0.052 
#> Species [versicolor]                |       -1.81 | 0.60 | [-2.99, -0.62] |  -3.02 | 0.003 
#> Species [virginica]                 |       -3.15 | 0.63 | [-4.41, -1.90] |  -4.97 | < .001
#> Petal.Length * Species [versicolor] |        0.29 | 0.30 | [-0.30,  0.87] |   0.97 | 0.334 
#> Petal.Length * Species [virginica]  |        0.45 | 0.29 | [-0.12,  1.03] |   1.56 | 0.120
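
The statement that a * b is just shorthand for a + b + a:b can be checked directly on the formula, without fitting anything. A small sketch using base R's `terms()` (the helper `term_labels` and the symbols `y`, `a`, `b` are placeholders for this illustration):

```r
# Hypothetical helper: list the terms R expands a formula into.
term_labels <- function(f) attr(terms(f), "term.labels")

star <- term_labels(y ~ a * b)          # "a" "b" "a:b"
long <- term_labels(y ~ a + b + a:b)    # "a" "b" "a:b"
only_colon <- term_labels(y ~ a + a:b)  # "a" "a:b" (no main effect of b)

identical(star, long)  # TRUE
```

So at the formula level ':' only ever adds the product term; whether the resulting coefficients read as "interaction" or "nested" effects depends on which main effects accompany it.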


'*' vs. '/'

library(see)
library(ggplot2)
library(dplyr)

m1 <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)
m2 <- lm(Sepal.Length ~ Species / Petal.Length, data = iris)

p1 <- modelbased::estimate_link(m1) %>%
  ggplot(aes(x=Petal.Length, y=Predicted, color=Species)) +
  geom_line()
p2 <- modelbased::estimate_link(m2) %>%
  ggplot(aes(x=Petal.Length, y=Predicted, color=Species)) +
  geom_line()
see::plots(p1, p2)

parameters::parameters(m1)
#> Parameter                           | Coefficient |   SE |         95% CI | t(144) |      p
#> -------------------------------------------------------------------------------------------
#> (Intercept)                         |        4.21 | 0.41 | [ 3.41,  5.02] |  10.34 | < .001
#> Species [versicolor]                |       -1.81 | 0.60 | [-2.99, -0.62] |  -3.02 | 0.003 
#> Species [virginica]                 |       -3.15 | 0.63 | [-4.41, -1.90] |  -4.97 | < .001
#> Petal.Length                        |        0.54 | 0.28 | [ 0.00,  1.09] |   1.96 | 0.052 
#> Species [versicolor] * Petal.Length |        0.29 | 0.30 | [-0.30,  0.87] |   0.97 | 0.334 
#> Species [virginica] * Petal.Length  |        0.45 | 0.29 | [-0.12,  1.03] |   1.56 | 0.120
parameters::parameters(m2)
#> Parameter                           | Coefficient |   SE |         95% CI | t(144) |      p
#> -------------------------------------------------------------------------------------------
#> (Intercept)                         |        4.21 | 0.41 | [ 3.41,  5.02] |  10.34 | < .001
#> Species [versicolor]                |       -1.81 | 0.60 | [-2.99, -0.62] |  -3.02 | 0.003 
#> Species [virginica]                 |       -3.15 | 0.63 | [-4.41, -1.90] |  -4.97 | < .001
#> Species [setosa] : Petal.Length     |        0.54 | 0.28 | [ 0.00,  1.09] |   1.96 | 0.052 
#> Species [versicolor] : Petal.Length |        0.83 | 0.10 | [ 0.63,  1.03] |   8.10 | < .001
#> Species [virginica] : Petal.Length  |        1.00 | 0.09 | [ 0.82,  1.17] |  11.43 | < .001


Here, it is the same model (at least when a variable is nested in a factor; spoiler, it gets weird when it's nested in a numeric), but the parameters represent different things. In the nested model, the effects are pretty much regular effects (i.e., the slope coefficient) estimated "within" each level. So it's conceptually different from an interaction (which evaluates the change in another effect). Currently we denote nested effects by ':', which might be confusing, so maybe we should replace it with '/' or '|' to show that it's "the effect of x within the factor level".
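
To make the "different things" concrete, here is a minimal base R sketch (the object names `m_int`, `m_nest`, `slopes_from_int` and `slopes_nested` are made up for this illustration). It rebuilds the within-level slopes of the nested model from the interaction model, assuming the default treatment contrasts with setosa as the reference level:

```r
m_int  <- lm(Sepal.Length ~ Species * Petal.Length, data = iris)
m_nest <- lm(Sepal.Length ~ Species / Petal.Length, data = iris)

b_int  <- coef(m_int)
b_nest <- coef(m_nest)

# Interaction model: reference slope plus the change in slope per level.
slopes_from_int <- unname(c(
  b_int["Petal.Length"],
  b_int["Petal.Length"] + b_int["Speciesversicolor:Petal.Length"],
  b_int["Petal.Length"] + b_int["Speciesvirginica:Petal.Length"]
))

# Nested model: the slope within each level is reported directly.
slopes_nested <- unname(b_nest[c(
  "Speciessetosa:Petal.Length",
  "Speciesversicolor:Petal.Length",
  "Speciesvirginica:Petal.Length"
)])

all.equal(slopes_from_int, slopes_nested)  # TRUE
```

This matches the tables above: 0.54 + 0.29 gives the 0.83 reported for versicolor in the nested model. The interaction coefficients are differences between slopes, while the nested coefficients are the slopes themselves.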

Now I'm trying to wrap my head around nested effects for continuous variables... and I didn't even want to look at more than 2 variables ^^

Originally posted by @DominiqueMakowski in #330 (comment)

@strengejacke strengejacke added the Docs 📚 Something to be adressed in docs and/or vignettes label Nov 12, 2020
@strengejacke

See #330 and #155
