Find variables and their paths in a Crunch dataset or variable folder #640

Closed
57 changes: 57 additions & 0 deletions R/findVariables.R
@@ -0,0 +1,57 @@
#' Find variables and their paths in a Crunch dataset or folder
#'
#' @param x Crunch dataset or variable folder
#' @param deep FALSE (the default) or TRUE; should subfolders be searched recursively?
#' @param include.hidden FALSE (the default) or TRUE; should hidden variables be included in the result?
#'
#' @return Data.frame with one row per Crunch variable and columns \code{alias}, \code{path}, \code{hidden}
#' @export
findVariables <- function(x, deep = FALSE, include.hidden = FALSE) {
Contributor

I think I'd prefer dropping the include.hidden param; if you ever wanted hidden variables, you'd have to explicitly do rbind(findVariables(ds), findVariables(hiddenFolder(ds))), and we'd make this clear in the documentation.

(otherwise needs an include.private option as well)

The API & documentation are kind of confusing about this, because it changed since I joined, but hidden variables are always in the hidden folder (before 2021, there could be hidden variables in any folder), so I think I'd prefer having the only indication in the output that it is hidden be the folder name.

Author

I can understand the case for not specifically dealing with hidden & private variables here, though for me the convenience of having include.hidden and include.private wins out. So the current version has them, but if you feel strongly about this, I'll drop them.
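
For reference, the explicit `rbind()` alternative suggested above can be sketched without a Crunch connection. The two data frames below are hypothetical stand-ins for what `findVariables(ds)` and `findVariables(hiddenFolder(ds))` might return; the aliases and the folder name "Hidden" are assumptions for illustration, not values taken from the crunch package.

```r
# Hypothetical stand-ins for the two findVariables() results.
# Aliases and the "Hidden" path are made up for this sketch.
visible <- data.frame(
    alias = c("age", "gender"),
    path = c("", "Demographics"),
    stringsAsFactors = FALSE
)
hidden <- data.frame(
    alias = "weight_raw",
    path = "Hidden",
    stringsAsFactors = FALSE
)

# The caller combines the results; the folder path alone marks hidden rows.
all_vars <- rbind(visible, hidden)
is_hidden <- all_vars$path == "Hidden"
```

With this approach the output needs no separate `hidden` column, at the cost of one extra call for users who want hidden variables.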

    # Normalise the input to a folder and set the path prefix.
    if (is.dataset(x)) {
        x <- cd(x, ".")
        startpath <- ""
    } else if (is.folder(x)) {
        startpath <- name(x)
    } else {
        halt('`x` should be "CrunchDataset" or "VariableFolder", not "', paste(class(x), collapse = ", "), '"')
    }
    # Both flags must be a single TRUE or FALSE.
    if (!isTRUE(deep) && !isFALSE(deep)) {
        halt("`deep` should be TRUE or FALSE")
    }
    if (!isTRUE(include.hidden) && !isFALSE(include.hidden)) {
        halt("`include.hidden` should be TRUE or FALSE")
    }
    # Shallow search: only the variables directly inside `x`.
    if (!deep) {
        vars <- aliases(variables(x))
        nvars <- length(vars)
        res <- data.frame(alias = vars, path = rep(startpath, nvars), hidden = rep(FALSE, nvars))
        return(res)
    }
    # Deep search: recurse into subfolders, then optionally append hidden variables.
    res <- .findVariables(x, startpath)
    res$hidden <- rep(FALSE, nrow(res))
    if (include.hidden) {
        hidden <- .findVariables(hiddenFolder(x), startpath)
        hidden$hidden <- rep(TRUE, nrow(hidden))
        res <- rbind(res, hidden)
    }
    res
}

.findVariables <- function(x, path) {
    vars <- variables(x)
    res <- data.frame(alias = aliases(vars), path = rep(path, length(vars)))
    dirs <- x[types(x) %in% "folder"]
    if (length(dirs) == 0) {
        return(res)
    }
    dirnames <- names(dirs)
    res2 <- lapply(seq_along(dirnames), function(i) {
        if (identical(path, "")) {
            new_path <- dirnames[i]
        } else {
            new_path <- paste(path, dirnames[i], sep = " | ")
        }
        .findVariables(dirs[[i]], new_path)
    })
    rbind(res, do.call(rbind, res2))
sluga marked this conversation as resolved.
}
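
The recursion in `.findVariables()` can be tried out without a Crunch dataset. The sketch below mirrors its path-building logic on a plain nested list; `tree` and `walk_tree()` are hypothetical names introduced here, and the list layout (`vars` for aliases, `folders` for subfolders) is an assumption for illustration, not the crunch folder API.

```r
# Plain-list stand-in for a folder tree (not the crunch API):
# `vars` holds variable aliases, `folders` holds named subfolders.
tree <- list(
    vars = c("age", "gender"),
    folders = list(
        Demographics = list(
            vars = "income",
            folders = list(
                Region = list(vars = c("state", "zip"), folders = list())
            )
        ),
        Media = list(vars = character(0), folders = list())
    )
)

# Collect one row per variable, joining nested folder names with " | ".
walk_tree <- function(node, path = "") {
    res <- data.frame(
        alias = node$vars,
        path = rep(path, length(node$vars)),
        stringsAsFactors = FALSE
    )
    children <- lapply(names(node$folders), function(nm) {
        new_path <- if (identical(path, "")) nm else paste(path, nm, sep = " | ")
        walk_tree(node$folders[[nm]], new_path)
    })
    # rbind() drops zero-row data frames, so empty folders add no rows.
    do.call(rbind, c(list(res), children))
}

found <- walk_tree(tree)
```

Variables at the root get an empty path, and each level of nesting appends one folder name, so `state` ends up under `Demographics | Region`.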