Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow named character vector in select, to rename variables #512

Merged
merged 13 commits into from
Jun 8, 2024
2 changes: 1 addition & 1 deletion DESCRIPTION
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
Type: Package
Package: datawizard
Title: Easy Data Wrangling and Statistical Transformations
Version: 0.11.0.1
Version: 0.11.0.2
Authors@R: c(
person("Indrajeet", "Patil", , "[email protected]", role = "aut",
comment = c(ORCID = "0000-0003-1995-6531", Twitter = "@patilindrajeets")),
Expand Down
5 changes: 5 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
# datawizard 0.11.0.1

## Changes

* `data_select()` can directly rename selected variables when a named vector
is provided in `select`, e.g. `data_select(mtcars, c(new = "mpg", old = "cyl"))`.
strengejacke marked this conversation as resolved.
Show resolved Hide resolved

# datawizard 0.11.0

BREAKING CHANGES
Expand Down
5 changes: 5 additions & 0 deletions R/data_select.R
Original file line number Diff line number Diff line change
Expand Up @@ -28,6 +28,11 @@ data_select <- function(data,

out <- data[columns]

# for named character vectors, we offer the service to directly rename the columns
if (!is.null(names(columns))) {
colnames(out) <- names(columns)
}

# add back attributes
out <- .replace_attrs(out, a)
out
Expand Down
7 changes: 7 additions & 0 deletions R/extract_column_names.R
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,10 @@
#'
#' @details
#'
#' Specifically for `data_select()`, `select` can also be a named character
#' vector. In this case, the names are used to rename the columns in the
#' output data frame. See 'Examples'.
#'
#' Note that it is possible to either pass an entire select helper or only the
#' pattern inside a select helper as a function argument:
#'
Expand Down Expand Up @@ -124,6 +128,9 @@
#' # find numeric with mean > 3.5
#' numeric_mean_35 <- function(x) is.numeric(x) && mean(x, na.rm = TRUE) > 3.5
#' extract_column_names(iris, numeric_mean_35)
#'
#' # rename returned columns for "data_select()"
#' head(data_select(mtcars, c(`Miles per Gallon` = "mpg", Cylinders = "cyl")))
#' @export
extract_column_names <- function(data,
select = NULL,
Expand Down
16 changes: 16 additions & 0 deletions R/select_nse.R
Original file line number Diff line number Diff line change
Expand Up @@ -75,6 +75,22 @@
out <- setdiff(out, grp_vars)
}

# for named character vectors, we offer the service to rename the columns
if (typeof(expr_select) == "language") {
strengejacke marked this conversation as resolved.
Show resolved Hide resolved
# safe evaluation of the expression, to get the named vector from "select"
new_names <- tryCatch(eval(expr_select), error = function(e) NULL)
# check if we really have a named vector
if (!is.null(new_names) && !is.null(names(new_names))) {
# if so, copy names
all_names <- names(new_names)
# if some of the element don't have a name, we set the value as name
strengejacke marked this conversation as resolved.
Show resolved Hide resolved
names(new_names)[!nzchar(all_names)] <- new_names[!nzchar(all_names)]
# after inclusion and exclusion, the original values in "select"
# may have changed, so we check that we only add names of valid values
out <- stats::setNames(out, names(new_names)[new_names %in% out])
}
}

out
}

Expand Down
7 changes: 7 additions & 0 deletions man/extract_column_names.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

24 changes: 20 additions & 4 deletions tests/testthat/test-data_select.R
Original file line number Diff line number Diff line change
Expand Up @@ -106,22 +106,22 @@
test_that("data_select works with negated select-functions", {
expect_identical(
data_select(iris, -is.numeric()),
iris[sapply(iris, function(i) !is.numeric(i))]
iris[sapply(iris, function(i) !is.numeric(i))] # nolint
)

expect_identical(
data_select(iris, -is.numeric),
iris[sapply(iris, function(i) !is.numeric(i))]
iris[sapply(iris, function(i) !is.numeric(i))] # nolint
)

expect_identical(
data_select(iris, -is.factor()),
iris[sapply(iris, function(i) !is.factor(i))]
iris[sapply(iris, function(i) !is.factor(i))] # nolint
)

expect_identical(
data_select(iris, -is.factor),
iris[sapply(iris, function(i) !is.factor(i))]
iris[sapply(iris, function(i) !is.factor(i))] # nolint
)

expect_identical(data_select(iris, -is.logical), iris)
Expand Down Expand Up @@ -363,7 +363,7 @@
)
})

test_that("select helpers work in functions and loops even if there's an object with the same name in the environment above", {

Check warning on line 366 in tests/testthat/test-data_select.R

View workflow job for this annotation

GitHub Actions / lint-changed-files / lint-changed-files

file=tests/testthat/test-data_select.R,line=366,col=121,[line_length_linter] Lines should not be more than 120 characters. This line is 127 characters.
i <- "Petal"
foo <- function(data, i) {
extract_column_names(data, select = starts_with(i))
Expand Down Expand Up @@ -402,3 +402,19 @@
c("Sepal.Length", "Sepal.Width")
)
})

test_that("data_select renames variables on the fly", {
data(mtcars)
out <- data_select(mtcars, c(new = "mpg", old = "cyl", hoho = "wt"))
expect_named(out, c("new", "old", "hoho"))
data_select(mtcars, c(new = "mpg", "cyl", hoho = "wt"))
strengejacke marked this conversation as resolved.
Show resolved Hide resolved
expect_named(out, c("new", "cyl", "hoho"))
data_select(mtcars, c("mpg", "cyl", "wt"))
expect_named(out, c("mpg", "cyl", "wt"))
# don't fail for non-existing columns
data_select(mtcars, c(new = "mpg", "cyl", hoho = "wt", test = "grea"))
expect_named(out, c("new", "cyl", "hoho"))
# check that excluded variables don't cause troubles
data_select(mtcars, c(new = "mpg", "cyl", hoho = "wt"), exclude = "wt")
expect_named(out, c("new", "cyl"))
})
Loading