error checking model from parsnip object: operator is invalid #301

Open
verajosemanuel opened this issue May 19, 2021 · 7 comments
Labels
3 investigators ❔❓ (Need to look further into this issue), Bug 🐛 (Something isn't working)

Comments

@verajosemanuel

verajosemanuel commented May 19, 2021

I tried to run check_model() on a very simple glmnet classification model.

Code from here:
https://stackoverflow.com/questions/65969913/extract-plain-model-from-tidymodel-object

library(magrittr)
library(tidymodels)
library(performance)

data(two_class_dat)

glm_spec <- logistic_reg() %>%
  set_engine("glmnet")

norm_rec <- recipe(Class ~ A + B, data = two_class_dat) %>%
  step_normalize(all_predictors())

glm_fit <- workflow() %>%
  add_recipe(norm_rec) %>%
  add_model(glm_spec) %>%
  fit(two_class_dat) %>%
  pull_workflow_fit()



performance::check_model(glm_fit)

Error: $ operator is invalid for atomic vectors

@verajosemanuel changed the title from "error checking model from parsnip object" to "error checking model from parsnip object: operator is invalid" on May 19, 2021
@strengejacke
Member

I'm not sure this is a parsnip issue. How would this code look in "regular form"? Something like glmnet(Class ~ A + B, data = two_class_dat)?

@strengejacke added the "3 investigators ❔❓" and "Bug 🐛" labels on May 19, 2021
@EmilHvitfeldt

library(performance)
data(two_class_dat, package = "modeldata")

fit <- glmnet::glmnet(two_class_dat[, 1:2], two_class_dat[, 3], family = "binomial")

check_model(fit)
#> Error: $ operator is invalid for atomic vectors

Created on 2021-05-19 by the reprex package (v2.0.0)

The error happens in insight::model_info() when it tries to subset the result of stats::family(fit) with `$`.
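As a minimal base-R sketch of that failure mode: it assumes stats::family() returns a plain named character vector for this kind of fit, which is what a later comment in this thread prints for the lognet object. The object names `fam_chr` and `fam_list` are made up for illustration, and no glmnet or insight code is run here.

```r
# Stand-in for the value printed by stats::family() on the lognet fit
# (assumption: a named character vector, as shown later in this thread).
fam_chr <- c(lognet = "binomial")

# `$` is only defined for recursive objects such as lists, so using it
# on an atomic vector raises an error. Capture the message instead of stopping.
msg <- tryCatch(
  fam_chr$family,
  error = function(e) conditionMessage(e)
)
print(msg)

# `[[` works on atomic vectors and returns the family name.
print(fam_chr[["lognet"]])

# A list-like object with a `family` element supports `$` without error.
fam_list <- list(family = "binomial", link = "logit")
print(fam_list$family)
```

In an English locale the captured message should match the error reported above, which is consistent with the failure coming from `$` being applied to the character vector rather than from parsnip.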

@strengejacke
Member

I just wanted to revisit this issue, but there seems to be a new issue, possibly in glmnet:

data(two_class_dat, package = "modeldata")
fit <- glmnet::glmnet(two_class_dat[, 1:2], two_class_dat[, 3], family = "binomial")
#> Warning in Ops.factor(left, right): '*' not meaningful for factors
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'drop': requires numeric/complex matrix/vector arguments

Created on 2021-06-16 by the reprex package (v2.0.0)

@IndrajeetPatil
Member

Strange.

If I run the code interactively, it works:

> data(two_class_dat, package = "modeldata")
> fit <- glmnet::glmnet(two_class_dat[, 1:2], two_class_dat[, 3], family = "binomial")
> fit

Call:  glmnet::glmnet(x = two_class_dat[, 1:2], y = two_class_dat[,      3], family = "binomial") 

   Df  %Dev   Lambda
1   0  0.00 0.308100
2   1  4.74 0.280700
3   1  8.72 0.255800
4   1 12.08 0.233100
5   1 14.96 0.212400
6   1 17.44 0.193500
7   1 19.57 0.176300
8   1 21.42 0.160600
9   1 23.02 0.146400
10  1 24.41 0.133400
11  1 25.62 0.121500
12  1 26.67 0.110700
13  1 27.58 0.100900
14  1 28.37 0.091930
15  1 29.05 0.083760
16  1 29.64 0.076320
17  1 30.15 0.069540
18  1 30.59 0.063360
19  1 30.97 0.057730
20  1 31.30 0.052610
21  1 31.58 0.047930
22  1 31.82 0.043670
23  1 32.02 0.039790
24  2 32.48 0.036260
25  2 33.27 0.033040
26  2 33.95 0.030100
27  2 34.54 0.027430
28  2 35.05 0.024990
29  2 35.48 0.022770
30  2 35.86 0.020750
31  2 36.18 0.018910
32  2 36.46 0.017230
33  2 36.70 0.015700
34  2 36.90 0.014300
35  2 37.08 0.013030
36  2 37.23 0.011870
37  2 37.35 0.010820
38  2 37.46 0.009857
39  2 37.55 0.008982
40  2 37.63 0.008184
41  2 37.70 0.007457
42  2 37.75 0.006794
43  2 37.80 0.006191
44  2 37.84 0.005641
45  2 37.87 0.005140
46  2 37.90 0.004683
47  2 37.92 0.004267
48  2 37.94 0.003888
49  2 37.96 0.003543
50  2 37.98 0.003228
51  2 37.99 0.002941
52  2 38.00 0.002680
53  2 38.01 0.002442
54  2 38.01 0.002225
55  2 38.02 0.002027
56  2 38.02 0.001847
57  2 38.03 0.001683
58  2 38.03 0.001533
59  2 38.03 0.001397
60  2 38.03 0.001273
61  2 38.04 0.001160
62  2 38.04 0.001057
63  2 38.04 0.000963
64  2 38.04 0.000878
65  2 38.04 0.000800

But if I try to create a reprex, it fails 🤔

data(two_class_dat, package = "modeldata")
fit <- glmnet::glmnet(two_class_dat[, 1:2], two_class_dat[, 3], family = "binomial")
#> Warning in Ops.factor(left, right): '*' not meaningful for factors
#> Error in h(simpleError(msg, call)): error in evaluating the argument 'x' in selecting a method for function 'drop': requires numeric/complex matrix/vector arguments

Created on 2021-06-16 by the reprex package (v2.0.0)

Session info
sessioninfo::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value                       
#>  version  R version 4.1.0 (2021-05-18)
#>  os       macOS Mojave 10.14.6        
#>  system   x86_64, darwin17.0          
#>  ui       X11                         
#>  language (EN)                        
#>  collate  en_US.UTF-8                 
#>  ctype    en_US.UTF-8                 
#>  tz       Europe/Berlin               
#>  date     2021-06-16                  
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date       lib source                       
#>  backports     1.2.1      2020-12-09 [1] CRAN (R 4.1.0)               
#>  cli           2.5.0.9000 2021-06-11 [1] Github (r-lib/cli@571fea6)   
#>  codetools     0.2-18     2020-11-04 [2] CRAN (R 4.1.0)               
#>  crayon        1.4.1      2021-02-08 [1] CRAN (R 4.1.0)               
#>  digest        0.6.27     2020-10-24 [1] CRAN (R 4.1.0)               
#>  ellipsis      0.3.2      2021-04-29 [1] CRAN (R 4.1.0)               
#>  evaluate      0.14       2019-05-28 [1] CRAN (R 4.1.0)               
#>  fansi         0.5.0      2021-05-25 [1] CRAN (R 4.1.0)               
#>  foreach       1.5.1      2020-10-15 [1] CRAN (R 4.1.0)               
#>  fs            1.5.0      2020-07-31 [1] CRAN (R 4.1.0)               
#>  glmnet        4.1-1      2021-02-21 [1] CRAN (R 4.1.0)               
#>  glue          1.4.2      2020-08-27 [1] CRAN (R 4.1.0)               
#>  highr         0.9        2021-04-16 [1] CRAN (R 4.1.0)               
#>  htmltools     0.5.1.1    2021-01-22 [1] CRAN (R 4.1.0)               
#>  iterators     1.0.13     2020-10-15 [1] CRAN (R 4.1.0)               
#>  knitr         1.33       2021-04-24 [1] CRAN (R 4.1.0)               
#>  lattice       0.20-44    2021-05-02 [2] CRAN (R 4.1.0)               
#>  lifecycle     1.0.0      2021-02-15 [1] CRAN (R 4.1.0)               
#>  magrittr      2.0.1      2020-11-17 [1] CRAN (R 4.1.0)               
#>  Matrix        1.3-3      2021-05-04 [2] CRAN (R 4.1.0)               
#>  pillar        1.6.1      2021-05-16 [1] CRAN (R 4.1.0)               
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.1.0)               
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.1.0)               
#>  reprex        2.0.0      2021-04-02 [1] CRAN (R 4.1.0)               
#>  rlang         0.4.11     2021-04-30 [1] CRAN (R 4.1.0)               
#>  rmarkdown     2.9        2021-06-15 [1] CRAN (R 4.1.0)               
#>  sessioninfo   1.1.1      2018-11-05 [1] CRAN (R 4.1.0)               
#>  shape         1.4.6      2021-05-19 [1] CRAN (R 4.1.0)               
#>  stringi       1.6.2      2021-05-17 [1] CRAN (R 4.1.0)               
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.1.0)               
#>  styler        1.4.1.9003 2021-06-09 [1] Github (r-lib/styler@a58a411)
#>  survival      3.2-11     2021-04-26 [2] CRAN (R 4.1.0)               
#>  tibble        3.1.2      2021-05-16 [1] CRAN (R 4.1.0)               
#>  utf8          1.2.1      2021-03-12 [1] CRAN (R 4.1.0)               
#>  vctrs         0.3.8      2021-04-29 [1] CRAN (R 4.1.0)               
#>  withr         2.4.2      2021-04-18 [1] CRAN (R 4.1.0)               
#>  xfun          0.24       2021-06-15 [1] CRAN (R 4.1.0)               
#>  yaml          2.2.1      2020-02-01 [1] CRAN (R 4.1.0)               
#> 
#> [1] /Users/patil/Library/R/x86_64/4.1/library
#> [2] /Library/Frameworks/R.framework/Versions/4.1/Resources/library

@strengejacke
Member

The reprex no longer works. Is there an updated example that reproduces the original error?

library(magrittr)
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
library(performance)
#> 
#> Attaching package: 'performance'
#> The following objects are masked from 'package:yardstick':
#> 
#>     mae, rmse

data(two_class_dat)

glm_spec <- logistic_reg() %>%
  set_engine("glmnet")

norm_rec <- recipe(Class ~ A + B, data = two_class_dat) %>%
  step_normalize(all_predictors())

glm_fit <- workflow() %>%
  add_recipe(norm_rec) %>%
  add_model(glm_spec) %>%
  fit(two_class_dat) %>%
  pull_workflow_fit()
#> Warning: `pull_workflow_fit()` was deprecated in workflows 0.2.3.
#> Please use `extract_fit_parsnip()` instead.
#> Error in `.check_glmnet_penalty_fit()`:
#> ! For the glmnet engine, `penalty` must be a single number (or a value of `tune()`).
#> * There are 0 values for `penalty`.
#> * To try multiple values for total regularization, use the tune package.
#> * To predict multiple penalties, use `multi_predict()`

Created on 2022-03-02 by the reprex package (v2.0.1)

@EmilHvitfeldt

Here is an updated reprex reflecting the changes in tidymodels 😃

library(magrittr)
library(tidymodels)
#> Registered S3 method overwritten by 'tune':
#>   method                   from   
#>   required_pkgs.model_spec parsnip
library(performance)
#> 
#> Attaching package: 'performance'
#> The following objects are masked from 'package:yardstick':
#> 
#>     mae, rmse

data(two_class_dat)

glm_spec <- logistic_reg(penalty = 1) %>%
  set_engine("glmnet")

norm_rec <- recipe(Class ~ A + B, data = two_class_dat) %>%
  step_normalize(all_predictors())

glm_fit <- workflow() %>%
  add_recipe(norm_rec) %>%
  add_model(glm_spec) %>%
  fit(two_class_dat) %>%
  extract_fit_parsnip()

performance::check_model(glm_fit)
#> Error: $ operator is invalid for atomic vectors

Created on 2022-03-02 by the reprex package (v2.0.1)

@etiennebacher
Member

Hello, this issue comes from the following lines in insight:::model_info.default(), which is called by performance::check_model():

https://github.com/easystats/insight/blob/e104d8a95c59c7092b5712e29da0118b05ced215/R/model_info.R#L116-L124

This is because, for these objects, stats::family() apparently returns only a named character vector, which is less information than it returns for other model objects:

> class(glm_fit$fit)
[1] "lognet" "glmnet"

> stats::family(glm_fit$fit)
    lognet 
"binomial" 

Compared to lme4::lmer objects for example:

library(lme4)
#> Loading required package: Matrix
m <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
stats::family(m)
#> 
#> Family: gaussian 
#> Link function: identity

Created on 2022-05-20 by the reprex package (v2.0.1)

However, I don't know what the arguments of .make_family() should be for this kind of object, so I can't fix this.
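One way to think about the gap is a small normalization step that runs before anything is subset with `$`. The sketch below is purely illustrative: `as_family_object()` is a hypothetical helper, not part of insight or glmnet, and it says nothing about what .make_family() actually expects. It only maps a few family names onto the constructors in the stats package and returns NULL for anything it does not recognize.

```r
# Hypothetical helper (not part of insight or glmnet): turn the value
# returned by stats::family() into a family object where that is possible.
as_family_object <- function(fam) {
  # Already a family object: return it unchanged.
  if (inherits(fam, "family")) {
    return(fam)
  }
  # A single family name such as c(lognet = "binomial").
  if (is.character(fam) && length(fam) == 1L) {
    constructor <- switch(
      unname(fam),
      binomial = stats::binomial,
      gaussian = stats::gaussian,
      poisson  = stats::poisson,
      NULL  # default: no matching constructor in this sketch
    )
    if (!is.null(constructor)) {
      return(constructor())
    }
  }
  # Anything else is reported as "no family object available"
  # instead of failing later on `$`.
  NULL
}

# Mimics the value printed for the lognet fit above.
fam_obj <- as_family_object(c(lognet = "binomial"))
print(fam_obj$family)
print(fam_obj$link)

# A name without a stats constructor in this sketch yields NULL.
print(is.null(as_family_object(c(multnet = "multinomial"))))
```

The constructors are called with their default links, which is an assumption: a penalized fit is not an ordinary GLM, so whether a default family object describes it well enough for check_model() is exactly the open question in this comment.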
