Fix snapshot tests #560

strengejacke · 2024-11-03T09:29:40Z

Fixes #559

strengejacke · 2024-11-03T12:16:46Z

Doesn't work perfect if the last row in the data frame is a duplicate - then, there are two table lines at the bottom, this still needs to be fixed:

And: which default? Which name for the argument? (currently: remove_duplicates)

print(datawizard::data_codebook(ggeffects::efc[, 1:4]), remove_duplicates = FALSE)
#> ggeffects::efc[, 1:4] (908 rows and 4 variables, 4 shown)
#> 
#> ID | Name     | Label                                    | Type    |  Missings
#> ---+----------+------------------------------------------+---------+----------
#> 1  | c12hour  | average number of hours of care per week | numeric |  6 (0.7%)
#> ---+----------+------------------------------------------+---------+----------
#> 2  | e15relat | relationship to elder                    | numeric |  7 (0.8%)
#>    |          |                                          |         |          
#>    |          |                                          |         |          
#>    |          |                                          |         |          
#>    |          |                                          |         |          
#>    |          |                                          |         |          
#>    |          |                                          |         |          
#>    |          |                                          |         |          
#> ---+----------+------------------------------------------+---------+----------
#> 3  | e16sex   | elder's gender                           | numeric |  7 (0.8%)
#>    |          |                                          |         |          
#> ---+----------+------------------------------------------+---------+----------
#> 4  | e17age   | elder' age                               | numeric | 17 (1.9%)
#> ------------------------------------------------------------------------------
#> 
#> ID |    Values | Value Labels            |           N
#> ---+-----------+-------------------------+------------
#> 1  |  [4, 168] |                         |         902
#> ---+-----------+-------------------------+------------
#> 2  |         1 | spouse/partner          | 171 (19.0%)
#>    |         2 | child                   | 473 (52.5%)
#>    |         3 | sibling                 |  29 ( 3.2%)
#>    |         4 | daughter or son -in-law |  85 ( 9.4%)
#>    |         5 | ancle/aunt              |  23 ( 2.6%)
#>    |         6 | nephew/niece            |  22 ( 2.4%)
#>    |         7 | cousin                  |   6 ( 0.7%)
#>    |         8 | other, specify          |  92 (10.2%)
#> ---+-----------+-------------------------+------------
#> 3  |         1 | male                    | 296 (32.9%)
#>    |         2 | female                  | 605 (67.1%)
#> ---+-----------+-------------------------+------------
#> 4  | [65, 103] |                         |         891
#> ------------------------------------------------------
print(datawizard::data_codebook(ggeffects::efc[, 1:4]), remove_duplicates = TRUE)
#> ggeffects::efc[, 1:4] (908 rows and 4 variables, 4 shown)
#> 
#> ID | Name     | Label                                    | Type    |  Missings
#> ---+----------+------------------------------------------+---------+----------
#> 1  | c12hour  | average number of hours of care per week | numeric |  6 (0.7%)
#> ---+----------+------------------------------------------+---------+----------
#> 2  | e15relat | relationship to elder                    | numeric |  7 (0.8%)
#> ---+----------+------------------------------------------+---------+----------
#> 3  | e16sex   | elder's gender                           | numeric |  7 (0.8%)
#> ---+----------+------------------------------------------+---------+----------
#> 4  | e17age   | elder' age                               | numeric | 17 (1.9%)
#> ------------------------------------------------------------------------------
#> 
#> ID |    Values | Value Labels            |           N
#> ---+-----------+-------------------------+------------
#> 1  |  [4, 168] |                         |         902
#> ---+-----------+-------------------------+------------
#> 2  |         1 | spouse/partner          | 171 (19.0%)
#>    |         2 | child                   | 473 (52.5%)
#>    |         3 | sibling                 |  29 ( 3.2%)
#>    |         4 | daughter or son -in-law |  85 ( 9.4%)
#>    |         5 | ancle/aunt              |  23 ( 2.6%)
#>    |         6 | nephew/niece            |  22 ( 2.4%)
#>    |         7 | cousin                  |   6 ( 0.7%)
#>    |         8 | other, specify          |  92 (10.2%)
#> ---+-----------+-------------------------+------------
#> 3  |         1 | male                    | 296 (32.9%)
#>    |         2 | female                  | 605 (67.1%)
#> ---+-----------+-------------------------+------------
#> 4  | [65, 103] |                         |         891
#> ------------------------------------------------------

print(datawizard::data_codebook(ggeffects::efc[, 1:3]), remove_duplicates = FALSE)
#> ggeffects::efc[, 1:3] (908 rows and 3 variables, 3 shown)
#> 
#> ID | Name     | Label                                    | Type    | Missings
#> ---+----------+------------------------------------------+---------+---------
#> 1  | c12hour  | average number of hours of care per week | numeric | 6 (0.7%)
#> ---+----------+------------------------------------------+---------+---------
#> 2  | e15relat | relationship to elder                    | numeric | 7 (0.8%)
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#> ---+----------+------------------------------------------+---------+---------
#> 3  | e16sex   | elder's gender                           | numeric | 7 (0.8%)
#> ---+----------+------------------------------------------+---------+---------
#> -----------------------------------------------------------------------------
#> 
#> ID |   Values | Value Labels            |           N
#> ---+----------+-------------------------+------------
#> 1  | [4, 168] |                         |         902
#> ---+----------+-------------------------+------------
#> 2  |        1 | spouse/partner          | 171 (19.0%)
#>    |        2 | child                   | 473 (52.5%)
#>    |        3 | sibling                 |  29 ( 3.2%)
#>    |        4 | daughter or son -in-law |  85 ( 9.4%)
#>    |        5 | ancle/aunt              |  23 ( 2.6%)
#>    |        6 | nephew/niece            |  22 ( 2.4%)
#>    |        7 | cousin                  |   6 ( 0.7%)
#>    |        8 | other, specify          |  92 (10.2%)
#> ---+----------+-------------------------+------------
#> 3  |        1 | male                    | 296 (32.9%)
#>    |        2 | female                  | 605 (67.1%)
#> -----------------------------------------------------
print(datawizard::data_codebook(ggeffects::efc[, 1:3]), remove_duplicates = TRUE)
#> ggeffects::efc[, 1:3] (908 rows and 3 variables, 3 shown)
#> 
#> ID | Name     | Label                                    | Type    | Missings
#> ---+----------+------------------------------------------+---------+---------
#> 1  | c12hour  | average number of hours of care per week | numeric | 6 (0.7%)
#> ---+----------+------------------------------------------+---------+---------
#> 2  | e15relat | relationship to elder                    | numeric | 7 (0.8%)
#> ---+----------+------------------------------------------+---------+---------
#> 3  | e16sex   | elder's gender                           | numeric | 7 (0.8%)
#> ---+----------+------------------------------------------+---------+---------
#> -----------------------------------------------------------------------------
#> 
#> ID |   Values | Value Labels            |           N
#> ---+----------+-------------------------+------------
#> 1  | [4, 168] |                         |         902
#> ---+----------+-------------------------+------------
#> 2  |        1 | spouse/partner          | 171 (19.0%)
#>    |        2 | child                   | 473 (52.5%)
#>    |        3 | sibling                 |  29 ( 3.2%)
#>    |        4 | daughter or son -in-law |  85 ( 9.4%)
#>    |        5 | ancle/aunt              |  23 ( 2.6%)
#>    |        6 | nephew/niece            |  22 ( 2.4%)
#>    |        7 | cousin                  |   6 ( 0.7%)
#>    |        8 | other, specify          |  92 (10.2%)
#> ---+----------+-------------------------+------------
#> 3  |        1 | male                    | 296 (32.9%)
#>    |        2 | female                  | 605 (67.1%)
#> -----------------------------------------------------

^{Created on 2024-11-03 with reprex v2.1.1}

mattansb · 2024-11-03T12:28:06Z

Why is this only an issue for the last row? 🤔

strengejacke · 2024-11-03T12:43:41Z

I'm looking for consecutive duplicated lines, and these will be removed (from first duplicated to the one before the last - we want to keep one line that separates data rows).

In the 2nd example with three columns, we have following for the last data row (e16sex):

 ---+----------+------------------------------------------+---------+---------
 ---+----------+------------------------------------------+---------+---------
 -----------------------------------------------------------------------------

one of the two duplicated lines is removed, the other one is preserved as separator for the next data row - but there's no next data row, there's only the last line (which is not a duplicate, because it has no + signs).

strengejacke · 2024-11-03T12:46:07Z

I think we have to fix this somewhere in the new code that removes the duplicates:

https://github.com/easystats/insight/pull/953/files#diff-cafb3e0523369ec1d779f9377d74656f6e10b4a9131008d3f87a4e2c8a5482e1R734-R777

strengejacke · 2024-11-03T12:57:58Z

ok, should be fixed. Once the PR in insight is merged, we can run checks here again

codecov · 2024-11-03T14:32:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 91.29%. Comparing base (003e2b8) to head (f2b8670).
Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #560   +/-   ##
=======================================
  Coverage   91.29%   91.29%           
=======================================
  Files          76       76           
  Lines        5983     5983           
=======================================
  Hits         5462     5462           
  Misses        521      521

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

etiennebacher · 2024-11-04T11:24:40Z

I'm confused about those changes. Does the default print method for data_codebook() produces a different output? If so, I don't know whether this change is necessary but it should be mentioned in the news as a breaking change (and the required version of insight should be bumped?).

mattansb · 2024-11-04T13:12:33Z

This is an easystats-wide change to make outputs more console-friendly.

The insight req versions should be bumped, yes.

strengejacke · 2024-11-04T17:18:35Z

Sorry, misunderstood

strengejacke · 2024-11-04T17:39:48Z

Basically, the table_width argument in insight::export_table() now defaults to "auto", and it can now split tables into as many parts as required to make the table-width fit into the console (previously, a maximum of two splits were possible).

The major issue is table_width = "auto" now. While previously wide tables were overflowing and printed in the next console line (and the table looked a bit messy), now tables are more readable. This doesn't work for this reprex of course, but the screenshots below show it quite well.

And these changes break certain snapshot tests. The other changes in this PR are just ensuring that ... is passed to insight::export_table() where necessary.

print(datawizard::data_codebook(ggeffects::efc[, 1:3]), table_width = Inf)
#> ggeffects::efc[, 1:3] (908 rows and 3 variables, 3 shown)
#> 
#> ID | Name     | Label                                    | Type    | Missings |   Values | Value Labels            |           N
#> ---+----------+------------------------------------------+---------+----------+----------+-------------------------+------------
#> 1  | c12hour  | average number of hours of care per week | numeric | 6 (0.7%) | [4, 168] |                         |         902
#> ---+----------+------------------------------------------+---------+----------+----------+-------------------------+------------
#> 2  | e15relat | relationship to elder                    | numeric | 7 (0.8%) |        1 | spouse/partner          | 171 (19.0%)
#>    |          |                                          |         |          |        2 | child                   | 473 (52.5%)
#>    |          |                                          |         |          |        3 | sibling                 |  29 ( 3.2%)
#>    |          |                                          |         |          |        4 | daughter or son -in-law |  85 ( 9.4%)
#>    |          |                                          |         |          |        5 | ancle/aunt              |  23 ( 2.6%)
#>    |          |                                          |         |          |        6 | nephew/niece            |  22 ( 2.4%)
#>    |          |                                          |         |          |        7 | cousin                  |   6 ( 0.7%)
#>    |          |                                          |         |          |        8 | other, specify          |  92 (10.2%)
#> ---+----------+------------------------------------------+---------+----------+----------+-------------------------+------------
#> 3  | e16sex   | elder's gender                           | numeric | 7 (0.8%) |        1 | male                    | 296 (32.9%)
#>    |          |                                          |         |          |        2 | female                  | 605 (67.1%)
#> --------------------------------------------------------------------------------------------------------------------------------

print(datawizard::data_codebook(ggeffects::efc[, 1:3]))
#> ggeffects::efc[, 1:3] (908 rows and 3 variables, 3 shown)
#> 
#> ID | Name     | Label                                    | Type    | Missings
#> ---+----------+------------------------------------------+---------+---------
#> 1  | c12hour  | average number of hours of care per week | numeric | 6 (0.7%)
#> ---+----------+------------------------------------------+---------+---------
#> 2  | e15relat | relationship to elder                    | numeric | 7 (0.8%)
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#>    |          |                                          |         |         
#> ---+----------+------------------------------------------+---------+---------
#> 3  | e16sex   | elder's gender                           | numeric | 7 (0.8%)
#>    |          |                                          |         |         
#> -----------------------------------------------------------------------------
#> 
#> ID |   Values | Value Labels            |           N
#> ---+----------+-------------------------+------------
#> 1  | [4, 168] |                         |         902
#> ---+----------+-------------------------+------------
#> 2  |        1 | spouse/partner          | 171 (19.0%)
#>    |        2 | child                   | 473 (52.5%)
#>    |        3 | sibling                 |  29 ( 3.2%)
#>    |        4 | daughter or son -in-law |  85 ( 9.4%)
#>    |        5 | ancle/aunt              |  23 ( 2.6%)
#>    |        6 | nephew/niece            |  22 ( 2.4%)
#>    |        7 | cousin                  |   6 ( 0.7%)
#>    |        8 | other, specify          |  92 (10.2%)
#> ---+----------+-------------------------+------------
#> 3  |        1 | male                    | 296 (32.9%)
#>    |        2 | female                  | 605 (67.1%)
#> -----------------------------------------------------

^{Created on 2024-11-04 with reprex v2.1.1}

etiennebacher · 2024-11-04T19:19:47Z

Ok, thanks for the explanation, this sounds good to me. I was wondering whether this would also change the output when the console is wide enough to support the full table without splitting, but apparently this is not the case. I don't believe anyone relies on the current overflowing behavior and the main change comes from insight, so I don't think we need to put this in news here.

etiennebacher · 2024-11-04T19:22:08Z

@strengejacke I'm not sure the conversation with @mattansb is resolved so I'll let you merge if it is.

strengejacke · 2024-11-04T19:30:56Z

Ok, thanks for the explanation, this sounds good to me. I was wondering whether this would also change the output when the console is wide enough to support the full table without splitting, but apparently this is not the case. I don't believe anyone relies on the current overflowing behavior and the main change comes from insight, so I don't think we need to put this in news here.

If the table width ("output") fits into the console (i.e. console is wide enough), no new behaviour occurs. It's literally only an improvement. And even in vignettes, e.g. here:
https://easystats.github.io/parameters/articles/model_parameters.html
the table is not split, because the markdown document sets the line width to 120:

knitr::opts_chunk$set(tidy.opts = list(width.cutoff = 120))

only tables wider than 120 chars would be split into several chunks.

strengejacke added 4 commits November 3, 2024 10:28

Fix snapshot tests

c976001

Fixes #559

update

ced22eb

pass ... to export_table

c6b475a

update tests

14d16d8

This comment was marked as outdated.

Sign in to view

pass ...

3117387

strengejacke added 2 commits November 3, 2024 14:58

update tests

777e2a9

docs

f2b8670

Trigger CI

630d987

etiennebacher approved these changes Nov 4, 2024

View reviewed changes

lintr

35a8481

strengejacke merged commit b723df2 into main Nov 4, 2024
22 checks passed

strengejacke deleted the strengejacke/issue559 branch November 4, 2024 21:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix snapshot tests #560

Fix snapshot tests #560

strengejacke commented Nov 3, 2024

This comment was marked as outdated.

This comment was marked as outdated.

strengejacke commented Nov 3, 2024 •

edited

Loading

mattansb commented Nov 3, 2024

strengejacke commented Nov 3, 2024 •

edited

Loading

strengejacke commented Nov 3, 2024

strengejacke commented Nov 3, 2024

codecov bot commented Nov 3, 2024

etiennebacher commented Nov 4, 2024

mattansb commented Nov 4, 2024

strengejacke commented Nov 4, 2024 •

edited

Loading

strengejacke commented Nov 4, 2024 •

edited

Loading

etiennebacher commented Nov 4, 2024 •

edited

Loading

etiennebacher commented Nov 4, 2024

strengejacke commented Nov 4, 2024

Fix snapshot tests #560

Fix snapshot tests #560

Conversation

strengejacke commented Nov 3, 2024

This comment was marked as outdated.

This comment was marked as outdated.

strengejacke commented Nov 3, 2024 • edited Loading

mattansb commented Nov 3, 2024

strengejacke commented Nov 3, 2024 • edited Loading

strengejacke commented Nov 3, 2024

strengejacke commented Nov 3, 2024

codecov bot commented Nov 3, 2024

Codecov Report

etiennebacher commented Nov 4, 2024

mattansb commented Nov 4, 2024

strengejacke commented Nov 4, 2024 • edited Loading

strengejacke commented Nov 4, 2024 • edited Loading

etiennebacher commented Nov 4, 2024 • edited Loading

etiennebacher commented Nov 4, 2024

strengejacke commented Nov 4, 2024

strengejacke commented Nov 3, 2024 •

edited

Loading

strengejacke commented Nov 3, 2024 •

edited

Loading

strengejacke commented Nov 4, 2024 •

edited

Loading

strengejacke commented Nov 4, 2024 •

edited

Loading

etiennebacher commented Nov 4, 2024 •

edited

Loading