Warn user for invalid formula #562

Merged: 3 commits on Nov 21, 2024

@strengejacke strengejacke commented Nov 21, 2024

The message shown when standardizing failed due to "problematic formulas" was not always clear. It should now be clearer to users why a model cannot be standardized.
Inspired by this SO post: https://stackoverflow.com/questions/79207876/variable-names-and-easystats-reports

data(mtcars)
m <- lm(mpg ~ hp, data = mtcars)
datawizard::standardise(m)
#> 
#> Call:
#> lm(formula = mpg ~ hp, data = data_std)
#> 
#> Coefficients:
#> (Intercept)           hp  
#>  -3.149e-17   -7.762e-01

colnames(mtcars)[1] <- "1_mpg"
m <- lm(`1_mpg` ~ hp, data = mtcars)
datawizard::standardise(m)
#> Warning: Looks like you are using invalid syntactically variables names, quoted
#>   in backticks: `1_mpg`. This may result in unexpected behaviour. Please
#>   rename your variables (e.g., `X1_mpg` instead of `1_mpg`) and fit the
#>   model again.
#> Model cannot be standardized.
#> 
#> Call:
#> lm(formula = `1_mpg` ~ hp, data = mtcars)
#> 
#> Coefficients:
#> (Intercept)           hp  
#>    30.09886     -0.06823

data(mtcars)
m <- lm(mtcars$mpg ~ mtcars$hp)
datawizard::standardise(m)
#> Warning: Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: mpg ~ hp, data = mtcars
#> Model cannot be standardized.
#> 
#> Call:
#> lm(formula = mtcars$mpg ~ mtcars$hp)
#> 
#> Coefficients:
#> (Intercept)    mtcars$hp  
#>    30.09886     -0.06823

m <- lm(mtcars[, 1] ~ hp, data = mtcars)
datawizard::standardise(m)
#> Warning: Using indexed data frames, such as `df[, 5]`, as model response can
#>   produce unexpected results. Specify your model using the literal name of
#>   the response variable instead.
#> Model cannot be standardized.
#> 
#> Call:
#> lm(formula = mtcars[, 1] ~ hp, data = mtcars)
#> 
#> Coefficients:
#> (Intercept)           hp  
#>    30.09886     -0.06823

Created on 2024-11-21 with reprex v2.1.1
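
For readers wondering how the three formula patterns in the reprex above could be detected at all, here is a minimal sketch in base R. It is not the check merged in this PR: the helper name `find_formula_problems()` and its return labels are hypothetical, and it flags `[` anywhere in the formula rather than only in the response.

```r
# Hypothetical helper, not the code merged in this PR: flags the three
# formula patterns from the reprex using base R only.
find_formula_problems <- function(f) {
  stopifnot(inherits(f, "formula"))
  problems <- character(0)

  # all.names() returns operators and function names as well as variables,
  # so `$` and `[` show up if they are used anywhere in the formula.
  tokens <- all.names(f)
  if ("$" %in% tokens) {
    problems <- c(problems, "dollar")
  }
  if ("[" %in% tokens) {
    problems <- c(problems, "index")
  }

  # make.names() leaves syntactically valid names unchanged, so any
  # variable it would alter needs backticks inside a formula.
  vars <- all.vars(f)
  if (any(make.names(vars) != vars)) {
    problems <- c(problems, "backticks")
  }

  problems
}

find_formula_problems(mpg ~ hp)
find_formula_problems(`1_mpg` ~ hp)
find_formula_problems(mtcars$mpg ~ mtcars$hp)
find_formula_problems(mtcars[, 1] ~ hp)
```

The formulas are only inspected, never evaluated, so the sketch runs even when the variables they mention do not exist.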

@strengejacke strengejacke changed the title from "Warn user for invalif formula" to "Warn user for invalid formula" Nov 21, 2024
@strengejacke strengejacke marked this pull request as ready for review November 21, 2024 15:22

@etiennebacher etiennebacher left a comment

Nice messages, very clear! I have a doubt about the behavior when the formula is not OK, and there are some tests to fix, but otherwise LGTM.


Edit: actually shouldn't it be "using syntactically invalid variables names" instead of "using invalid syntactically variables names"?

Review threads (outdated, resolved): R/standardize.models.R, tests/testthat/test-standardize_models.R
@strengejacke

"using syntactically invalid variables names" instead of "using invalid syntactically variables names"?

yeah, but that's an issue in insight ;-) will fix.

@strengejacke

I just saw that some tests are out of date, because those code lines are no longer reached for the exceptions. I'll look into it.
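
As an illustration of what an updated test for one of these cases might look like, here is a hedged testthat sketch. It is not taken from tests/testthat/test-standardize_models.R. It assumes an installed datawizard version that already includes this change and raises an error whose message contains "cannot be standardized"; the test description is made up for the example.

```r
library(testthat)

# Hypothetical test, not copied from the package's test suite.
test_that("standardise() errors when the formula uses `$`", {
  # Skipped when datawizard is not available in the test environment.
  skip_if_not_installed("datawizard")

  m <- lm(mtcars$mpg ~ mtcars$hp)

  # Assumption: the failure is an error (not a warning) whose message
  # contains "cannot be standardized".
  expect_error(
    datawizard::standardise(m),
    regexp = "cannot be standardized"
  )
})
```

Matching on a short fragment of the message keeps the test stable if the longer explanatory text, which comes from insight, is reworded.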

@strengejacke

data(mtcars)
m <- lm(mpg ~ hp, data = mtcars)
datawizard::standardise(m)
#> 
#> Call:
#> lm(formula = mpg ~ hp, data = data_std)
#> 
#> Coefficients:
#> (Intercept)           hp  
#>  -3.149e-17   -7.762e-01

colnames(mtcars)[1] <- "1_mpg"
m <- lm(`1_mpg` ~ hp, data = mtcars)
datawizard::standardise(m)
#> Error: Model cannot be standardized.
#>   Looks like you are using syntactically invalid variables names, quoted
#>   in backticks: `1_mpg`. This may result in unexpected behaviour. Please
#>   rename your variables (e.g., `X1_mpg` instead of `1_mpg`) and fit the
#>   model again.

data(mtcars)
m <- lm(mtcars$mpg ~ mtcars$hp)
datawizard::standardise(m)
#> Error: Model cannot be standardized.
#>   Using `$` in model formulas can produce unexpected results. Specify your
#>   model using the `data` argument instead.
#>   Try: mpg ~ hp, data = mtcars

m <- lm(mtcars[, 1] ~ hp, data = mtcars)
datawizard::standardise(m)
#> Error: Model cannot be standardized.
#>   Using indexed data frames, such as `df[, 5]`, as model response can
#>   produce unexpected results. Specify your model using the literal name of
#>   the response variable instead.

Created on 2024-11-21 with reprex v2.1.1
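
From the user's side, the messages above suggest renaming the variables and passing the data frame through the `data` argument. A minimal sketch of that fix follows; using `make.names()` for the renaming is one option chosen for this example, not something the PR prescribes.

```r
data(mtcars)
colnames(mtcars)[1] <- "1_mpg"

# make.names() rewrites only the names that are not syntactically valid,
# e.g. "1_mpg" becomes "X1_mpg"; the other columns keep their names.
colnames(mtcars) <- make.names(colnames(mtcars), unique = TRUE)

# Refit with plain variable names and the `data` argument:
# no backticks, no `$`, no `[` in the formula.
m <- lm(X1_mpg ~ hp, data = mtcars)

# standardise() needs the datawizard package; skipped if it is not installed.
if (requireNamespace("datawizard", quietly = TRUE)) {
  print(datawizard::standardise(m))
}
```

Renaming before fitting, rather than patching the formula afterwards, means the model call itself refers only to names that can be looked up in the data.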

@etiennebacher

Thanks!

@etiennebacher etiennebacher merged commit 2741cdc into main Nov 21, 2024
21 of 22 checks passed
@etiennebacher etiennebacher deleted the check_formula branch November 21, 2024 18:55
@strengejacke

I fixed "variables names" in insight; it should be "variable names" in the error message.
