Dirichlet regression: how to supply a two-part formula separated by pipes? #8
Hi Joseph! Thank you for developing this package!

(This is a follow-up on a Stack Overflow question on the same topic: the solution suggested there no longer works.)

I would like to ask your advice for using […] The problem is that I cannot run the domir::domir() function on a Dirichlet regression. Strangely enough, I already asked you about this over at Stack Overflow in July 2023, and that approach no longer works. I've tried with […] Can you provide any insight? Sorry to bother you again with the same problem!

See the reproducible example below. The toy example contains a single category, but it yields the same error message as when applied to my data (with 4 categories). In my research I'll rely on another pseudo-R2, but the […].

library(DirichletReg)
#> Warning: package 'DirichletReg' was built under R version 4.2.3
#> Loading required package: Formula
library(domir)
library(performance)
#> Warning: package 'performance' was built under R version 4.2.3
RS <- ReadingSkills
response.d <- DR_data(RS$accuracy)
#> only one variable in [0, 1] supplied - beta-distribution assumed.
#> check this assumption.
# Fit Dirichlet regression
rs2 <- DirichReg(response.d ~ dyslexia + iq | dyslexia + iq, data = RS, model = "alternative")
performance::r2(rs2)[[1]]
#> Nagelkerke's R2
#> 0.4590758
summary(rs2)
#> Call:
#> DirichReg(formula = response.d ~ dyslexia + iq | dyslexia + iq, data = RS,
#> model = "alternative")
#>
#> Standardized Residuals:
#>                   Min       1Q  Median      3Q     Max
#> 1 - accuracy  -1.5279  -0.7798  -0.343  0.6992  2.4213
#> accuracy      -2.4213  -0.6992   0.343  0.7798  1.5279
#>
#> MEAN MODELS:
#> ------------------------------------------------------------------
#> Coefficients for variable no. 1: 1 - accuracy
#> - variable omitted (reference category) -
#> ------------------------------------------------------------------
#> Coefficients for variable no. 2: accuracy
#>             Estimate Std. Error z value Pr(>|z|)
#> (Intercept)  2.22386    0.28087   7.918 2.42e-15 ***
#> dyslexiayes -1.81261    0.29696  -6.104 1.04e-09 ***
#> iq          -0.02676    0.06900  -0.388    0.698
#> ------------------------------------------------------------------
#>
#> PRECISION MODEL:
#> ------------------------------------------------------------------
#>             Estimate Std. Error z value Pr(>|z|)
#> (Intercept)  1.71017    0.32697   5.230 1.69e-07 ***
#> dyslexiayes  2.47521    0.55055   4.496 6.93e-06 ***
#> iq           0.04097    0.27537   0.149    0.882
#> ------------------------------------------------------------------
#> Significance codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> Log-likelihood: 61.26 on 6 df (33 BFGS + 1 NR Iterations)
#> AIC: -110.5, BIC: -99.81
#> Number of Observations: 44
#> Links: Logit (Means) and Log (Precision)
#> Parametrization: alternative
domir::domir(response.d ~ dyslexia + iq,
             function(y) {
               iv <- attr(terms(y), "term.labels")
               fml <- paste0("response.d ~ ", paste0(iv, collapse = "+"), "| dyslexia + iq", collapse = "")
               print(fml)
               performance::r2(DirichReg(formula(fml), data = RS, model = "alternative"))[[1]]
             })
#> [1] "response.d ~ dyslexia+iq| dyslexia + iq"
#> Error: '.fct' produced an error when applied to '.obj'.
#> Also, check arguments passed to '.fct'
domir::domir(response.d ~ dyslexia + iq,
             function(y) {
               performance::r2(DirichReg(update(y, . ~ . | dyslexia + iq), data = RS, model = "alternative"))[[1]]
             })
#> Error: '.fct' produced an error when applied to '.obj'.
#> Also, check arguments passed to '.fct'
domir::domir(response.d ~ dyslexia + iq | dyslexia + iq,
             function(y) {
               performance::r2(DirichReg(y, data = RS, model = "alternative"))[[1]]
             })
#> Error: At least two subsets are needed for a dominance analysis.
sessionInfo()
#> R version 4.2.0 (2022-04-22 ucrt)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 22631)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=Spanish_Spain.utf8 LC_CTYPE=Spanish_Spain.utf8
#> [3] LC_MONETARY=Spanish_Spain.utf8 LC_NUMERIC=C
#> [5] LC_TIME=Spanish_Spain.utf8
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] performance_0.10.5 domir_1.0.1 DirichletReg_0.7-1 Formula_1.2-4
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.13 knitr_1.39 magrittr_2.0.3 insight_0.19.5
#> [5] lattice_0.20-45 rlang_1.1.1 fastmap_1.1.0 stringr_1.5.0
#> [9] highr_0.9 tools_4.2.0 grid_4.2.0 xfun_0.40
#> [13] cli_3.6.1 withr_2.5.0 htmltools_0.5.6 maxLik_1.5-2
#> [17] miscTools_0.6-28 yaml_2.3.5 digest_0.6.33 lifecycle_1.0.3
#> [21] vctrs_0.6.3 fs_1.5.2 glue_1.6.2 evaluate_0.15
#> [25] rmarkdown_2.14 sandwich_3.0-1 reprex_2.0.2 stringi_1.7.6
#> [29] compiler_4.2.0 generics_0.1.2 zoo_1.8-10

Created on 2024-02-22 with reprex v2.0.2
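A possible reading of the last error in the example ("At least two subsets are needed for a dominance analysis"), sketched with base R only: `|` is not one of the operators that `terms()` expands, so a two-part right-hand side appears to be treated as a single term. The formulas reuse the variable names from the example; no data or model fitting is involved, and whether domir inspects the formula in exactly this way is an assumption.

```r
# Base R only: compare how terms() splits a plain formula and a piped one.
plain <- response.d ~ dyslexia + iq
piped <- response.d ~ dyslexia + iq | dyslexia + iq

plain_terms <- attr(terms(plain), "term.labels")
piped_terms <- attr(terms(piped), "term.labels")

print(plain_terms)          # the two predictors as separate terms
print(length(piped_terms))  # number of terms found in the piped formula
```

If the piped formula yields only one term, there is nothing for a dominance analysis to compare, which would be consistent with the message domir reports.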
Replies: 2 comments
Hi Marc,

I was able to replicate the error but, fundamentally, it is the same issue as last time. That is, an issue with […]

That error suggests that the formula is being passed internally in a way that prevents it from accessing the value of […] It might be worth looking through the documentation more closely, asking about it on StackExchange, or contacting the author to see if they have any ideas.

Worth noting that […]

Which should work as well if […]

I do think an issue like this came up before with another package (it might have been survey); I will have to look to see if I was able to resolve it. If I can find what I did with the survey package, I will respond here. I will also give this some more thought, time permitting.
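When a wrapper only reports a generic message such as "'.fct' produced an error when applied to '.obj'", one general way to see the underlying error is to call the function directly on a formula and capture the condition. The sketch below uses base R only, with `lm()` and `mtcars` as stand-ins for the Dirichlet model and data; `fit_r2` and `probe` are hypothetical names, not part of domir.

```r
# Stand-in for the function handed to domir(): fits a model, returns an R2.
fit_r2 <- function(fml) {
  summary(lm(fml, data = mtcars))$r.squared
}

# Hypothetical helper: run `fct` on `fml` and keep the error message, if any.
probe <- function(fct, fml) {
  tryCatch(
    list(ok = TRUE, value = fct(fml)),
    error = function(e) list(ok = FALSE, message = conditionMessage(e))
  )
}

good <- probe(fit_r2, mpg ~ wt)           # a formula the model can fit
bad  <- probe(fit_r2, mpg ~ missing_var)  # refers to a variable that does not exist

print(good$ok)
print(bad$message)  # the original error text, not a generic wrapper message
```

Running the same kind of probe on the anonymous functions from the example, with the full formula as input, might show which object the model-fitting call fails to find.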
Hi Joseph!
library(DirichletReg)
#> Warning: package 'DirichletReg' was built under R version 4.1.3
#> Loading required package: Formula
#> Warning: package 'Formula' was built under R version 4.1.1
library(domir)
library(performance)
#> Warning: package 'performance' was built under R version 4.1.3
RS <- ReadingSkills
response.d <- DR_data(RS$accuracy)
#> only one variable in [0, 1] supplied - beta-distribution assumed.
#> check this assumption.
# Fit Dirichlet regression
rs2 <- DirichReg(response.d ~ dyslexia + iq | dyslexia + iq, data = RS, model = "alternative")
performance::r2(rs2)[[1]]
#> Nagelkerke's R2
#> 0.4590758
domir::domir(response.d ~ dyslexia + iq,
             function(y) {
               iv <- attr(terms(y), "term.labels")
               fml <- paste0("response.d ~ ", paste0(iv, collapse = "+"), "| dyslexia + iq", collapse = "")
               print(fml)
               performance::r2(update(rs2, formula(fml)))[[1]]
             })
#> [1] "response.d ~ dyslexia+iq| dyslexia + iq"
#> [1] "response.d ~ dyslexia| dyslexia + iq"
#> [1] "response.d ~ iq| dyslexia + iq"
#> Overall Value: 0.4590758
#>
#> General Dominance Values:
#>          General Dominance Standardized Ranks
#> dyslexia       0.455042413   0.99121418     1
#> iq             0.004033357   0.00878582     2
#>
#> Conditional Dominance Values:
#>          Subset Size: 1 Subset Size: 2
#> dyslexia    0.457263643    0.452821183
#> iq          0.006254587    0.001812127
#>
#> Complete Dominance Designations:
#>                  Dmnated?dyslexia Dmnated?iq
#> Dmnates?dyslexia               NA       TRUE
#> Dmnates?iq                  FALSE         NA
sessionInfo()
#> R version 4.1.0 (2021-05-18)
#> Platform: x86_64-w64-mingw32/x64 (64-bit)
#> Running under: Windows 10 x64 (build 19045)
#>
#> Matrix products: default
#>
#> locale:
#> [1] LC_COLLATE=Spanish_Spain.1252 LC_CTYPE=Spanish_Spain.1252
#> [3] LC_MONETARY=Spanish_Spain.1252 LC_NUMERIC=C
#> [5] LC_TIME=Spanish_Spain.1252
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] performance_0.10.0 domir_1.0.1 DirichletReg_0.7-1 Formula_1.2-4
#>
#> loaded via a namespace (and not attached):
#> [1] rstudioapi_0.13 knitr_1.38 magrittr_2.0.3 insight_0.19.1
#> [5] lattice_0.20-44 rlang_1.1.0 fastmap_1.1.0 stringr_1.5.0
#> [9] highr_0.9 tools_4.1.0 grid_4.1.0 xfun_0.39
#> [13] cli_3.6.0 withr_2.5.2 htmltools_0.5.5 maxLik_1.5-2
#> [17] miscTools_0.6-28 yaml_2.3.5 digest_0.6.29 lifecycle_1.0.4
#> [21] vctrs_0.6.1 fs_1.5.2 glue_1.6.2 evaluate_0.15
#> [25] rmarkdown_2.13 sandwich_3.0-1 reprex_2.0.1 stringi_1.7.6
#> [29] compiler_4.1.0 generics_0.1.2 zoo_1.8-9

Created on 2024-02-26 by the reprex package (v2.0.1)
I think I have found a solution building on the solution you suggested over at Stack Overflow: fit a Dirichlet model outside the domir() call, then use update() inside domir() to pass the formula to the model.

Thank you very much for your help! I'll be happy to thank you in the Acknowledgements of a future publication.
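For anyone reusing this approach, the formula-building step can be pulled out into a small function. This is a minimal sketch in base R, assuming the precision part stays fixed at `dyslexia + iq` as in the example; `two_part_formula` is a hypothetical helper, not part of domir or DirichletReg.

```r
# Hypothetical helper: append a fixed precision part to the mean-model
# formula that the dominance analysis supplies for each subset.
two_part_formula <- function(fml, precision_rhs = "dyslexia + iq") {
  response <- deparse(fml[[2]])          # left-hand side, e.g. "response.d"
  iv <- attr(terms(fml), "term.labels")  # predictors in this subset
  if (length(iv) == 0) iv <- "1"         # fall back to an intercept-only mean model
  as.formula(
    paste(response, "~", paste(iv, collapse = " + "), "|", precision_rhs),
    env = environment(fml)
  )
}

full <- two_part_formula(response.d ~ dyslexia + iq)
sub  <- two_part_formula(response.d ~ iq)
print(full)
print(sub)
```

Inside the function passed to domir(), the result could take the place of `formula(fml)` in the `update(rs2, ...)` call; that combination has not been run here.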