unnormalize() with grouped data #415

Merged: etiennebacher merged 30 commits into main from unnormalize-grouped-data on Sep 12, 2023

@etiennebacher etiennebacher commented May 3, 2023

Close #375.

library(datawizard)

x <- iris |> 
  data_group(Species) |>
  normalize(select = Sepal.Length) |> 
  unnormalize(select = Sepal.Length) |> 
  data_ungroup()

identical(x, iris)
#> [1] TRUE

x <- iris |> 
  data_group(Species) |>
  standardize(select = Sepal.Length) |> 
  unstandardize(select = Sepal.Length) |> 
  data_ungroup()

identical(x, iris)
#> [1] TRUE

The idea is to store the dw_transformer attributes alongside the grouping attributes, so that the dw_transformer attributes can be recovered for each group separately. They then need to be passed to normalize.data.frame() and on to normalize.numeric().

Same thing for unstandardize().

Note: there are probably ways to improve the performance of this approach.
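To illustrate the idea of keeping per-group transformation parameters next to the grouping information, here is a minimal base R sketch. It is not the code from this PR: the helpers `normalize_by_group()` and `unnormalize_by_group()` and the attribute names `group_params` and `group_rows` are hypothetical and chosen only for this example.

```r
# Hypothetical helpers (base R only); not datawizard's internal implementation.
normalize_by_group <- function(data, column, group) {
  # Row indices of each group, named by group level.
  groups <- split(seq_len(nrow(data)), data[[group]])
  # One (min, range) pair per group, kept so the transform can be reversed.
  # Assumes every group has a non-zero range.
  params <- lapply(groups, function(rows) {
    values <- data[[column]][rows]
    c(min = min(values), range = diff(range(values)))
  })
  for (name in names(groups)) {
    rows <- groups[[name]]
    p <- params[[name]]
    data[[column]][rows] <- (data[[column]][rows] - p[["min"]]) / p[["range"]]
  }
  # Store the parameters next to the grouping information.
  attr(data, "group_params") <- params
  attr(data, "group_rows") <- groups
  data
}

unnormalize_by_group <- function(data, column) {
  params <- attr(data, "group_params")
  groups <- attr(data, "group_rows")
  if (is.null(params) || is.null(groups)) {
    stop("No stored per-group parameters found.", call. = FALSE)
  }
  # Reverse the transform group by group with that group's own parameters.
  for (name in names(groups)) {
    rows <- groups[[name]]
    p <- params[[name]]
    data[[column]][rows] <- data[[column]][rows] * p[["range"]] + p[["min"]]
  }
  attr(data, "group_params") <- NULL
  attr(data, "group_rows") <- NULL
  data
}

normalized <- normalize_by_group(iris, "Sepal.Length", "Species")
restored <- unnormalize_by_group(normalized, "Sepal.Length")
all.equal(restored, iris)
```

The sketch compares with `all.equal()` rather than `identical()` because the floating-point round trip in this simplified version is not guaranteed to be bit-for-bit exact.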


@etiennebacher etiennebacher marked this pull request as ready for review May 5, 2023 05:33
@etiennebacher etiennebacher requested a review from strengejacke May 5, 2023 05:33
@etiennebacher
Member Author

etiennebacher commented May 5, 2023

@strengejacke if you're ok with this code then I can expand it to unstandardize()

@etiennebacher etiennebacher mentioned this pull request May 30, 2023
4 tasks
@strengejacke
Member

I'm not sure, but I think @mattansb started with one of the un*() functions? Maybe you can better review this PR?

@mattansb
Member

Daniel keeps thinking that @DominiqueMakowski's code is mine 😅

@etiennebacher
Member Author

etiennebacher commented May 31, 2023

The code is still incomplete for unstandardize(): it doesn't work with the example in #375.

@strengejacke
Member

> Daniel keeps thinking that @DominiqueMakowski's code is mine 😅

But see the effectsize NEWS: in version 0.5.0, unstandardize() was moved to datawizard, and I think you made one of the first implementations of those un*() functions (I think I then tried to adapt this for unnormalize()).

@mattansb
Member

I'm like 90% certain it was Dom, and then I maintained the code.

@DominiqueMakowski
Member

What Daniel says becomes true tho 🤷


@etiennebacher
Member Author

@strengejacke, @DominiqueMakowski or @mattansb: I don't know who's best placed to review, but can one of you do it? 😄

@etiennebacher
Member Author

Last thing, I think we should harmonize the behavior of unstandardize() and unnormalize() when they don't have the necessary info. Currently, unstandardize() fails but unnormalize() only gives a warning:

library(datawizard)

unstandardize(mtcars, "mpg")
#> Error: You must provide the arguments `center`, `scale` or `reference`.

unnormalize(head(mtcars), "mpg")
#> Warning: Can't unnormalize variable. Information about range and/or minimum value
#>   is missing.
#>                    mpg cyl disp  hp drat    wt  qsec vs am gear carb
#> Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
#> Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
#> Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
#> Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
#> Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
#> Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

IMO both should fail. @strengejacke do you agree?
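A rough sketch of what "both should fail" could look like for the unnormalize case, in base R. The helper `unnormalize_strict()` and the attribute names `min_value` and `range_difference` are assumptions made for this illustration, not necessarily what datawizard uses internally.

```r
# Hypothetical helper illustrating "fail instead of warn"; not datawizard's API.
unnormalize_strict <- function(x) {
  # Attribute names are assumed for this sketch.
  min_value <- attr(x, "min_value")
  range_difference <- attr(x, "range_difference")
  if (is.null(min_value) || is.null(range_difference)) {
    # Raise an error rather than warn and return the input unchanged.
    stop(
      "Can't unnormalize variable. Information about range and/or minimum value is missing.",
      call. = FALSE
    )
  }
  as.numeric(x) * range_difference + min_value
}

# A vector that carries the required information round-trips.
mpg_min <- min(mtcars$mpg)
mpg_range <- diff(range(mtcars$mpg))
scaled <- structure(
  (mtcars$mpg - mpg_min) / mpg_range,
  min_value = mpg_min,
  range_difference = mpg_range
)
all.equal(unnormalize_strict(scaled), mtcars$mpg)

# A plain vector without that information now produces an error.
result <- tryCatch(
  unnormalize_strict(mtcars$mpg),
  error = function(e) conditionMessage(e)
)
```

Failing early this way would match the current behavior of unstandardize(), which stops when `center`, `scale` or `reference` cannot be determined.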

@strengejacke
Member

yes.

@etiennebacher
Member Author

Failures due to easystats/insight#804

@etiennebacher etiennebacher merged commit ad96b50 into main Sep 12, 2023
@etiennebacher etiennebacher deleted the unnormalize-grouped-data branch September 12, 2023 10:01
Development

Successfully merging this pull request may close these issues.

unstandardize() doesn't work with grouped data
5 participants