Fix #1139
wlandau-lilly committed Oct 12, 2023
1 parent 0b0fc88 commit 34a24ce
Showing 7 changed files with 45 additions and 35 deletions.
12 changes: 11 additions & 1 deletion NEWS.md
@@ -1,6 +1,16 @@
# targets 1.3.2.9001 (development)

-* Add function `tar_seed_set()` which sets a seed and the default RNG algorithms.
+## Invalidating changes
+
+Because of the changes below, upgrading to this version of `targets` will unavoidably invalidate previously built targets in existing pipelines. Your pipeline code should still work, but any targets you ran before will most likely need to rerun after the upgrade.
+
+* Use SHA512 during the creation of target-specific pseudo-random number generator seeds (#1139). This change decreases the risk of overlapping/correlated random number generator streams. See the "RNG overlap" section of the `tar_seed_create()` help file for details and justification.
+
+## Other changes
+
+* Add a new exported function `tar_seed_create()` which creates target-specific pseudo-random number generator seeds.
+* Add an "RNG overlap" section in the `tar_seed_create()` help file to justify and defend how `targets` and `tarchetypes` approach pseudo-random numbers.
+* Add function `tar_seed_set()` which sets a seed and sets all the RNG algorithms to their defaults in the R installation of the user. Each target now uses the `tar_seed_set()` function to set its seed before running its R command (#1139).
* Deprecate `tar_seed()` in favor of the new `tar_seed_get()` function.

# targets 1.3.2
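
A minimal usage sketch of the functions named in the NEWS.md entry above. It assumes a `targets` installation that includes this commit and relies on the default global seed from `tar_option_get("seed")`. The target name `"my_target"` is hypothetical.

```r
library(targets)

# Hypothetical target name, used only for illustration.
# The seed is derived from the name and the global pipeline seed.
seed <- tar_seed_create("my_target")

# tar_seed_set() sets the seed and resets the RNG algorithms to their defaults,
# so repeating the call should reproduce the same draw.
tar_seed_set(seed)
first <- runif(1)
tar_seed_set(seed)
second <- runif(1)

identical(first, second)
```

Because the seed depends only on the target name and the global seed, the same call should return the same value in a later session on the same installation.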
2 changes: 1 addition & 1 deletion R/class_branch.R
@@ -10,7 +10,7 @@ branch_init <- function(
command <- command_clone(command)
deps <- union(command$deps, deps)
command$deps <- setdiff(deps, settings$dimensions)
-command$seed <- produce_seed(child)
+command$seed <- tar_seed_create(child)
pedigree <- pedigree_new(settings$name, child, index)
settings <- settings_clone(settings)
settings$name <- child
2 changes: 1 addition & 1 deletion R/class_target.R
@@ -19,7 +19,7 @@ target_init <- function(
retrieval = "main",
cue = NULL
) {
-seed <- produce_seed(name)
+seed <- tar_seed_create(name)
command <- command_init(expr, packages, library, seed, deps, string)
cue <- cue %|||% cue_init()
if (any(grepl("^aws_", format))) {
16 changes: 8 additions & 8 deletions R/tar_seed_create.R
@@ -4,8 +4,8 @@
#' @description Create a seed for a target.
#' @section Seeds:
#' A target's random number generator seed
-#' is a deterministic function of its name and the global pipeline seed.
-#' Consequently,
+#' is a deterministic function of its name and the global pipeline seed
+#' from [tar_option_get("seed")]. Consequently,
#'
#' 1. Each target runs with a reproducible seed so that
#' different runs of the same pipeline in the same computing
@@ -24,14 +24,14 @@
#' correlated results. (For a discussion of the motivating problem,
#' see Section 6: "Random-number generation" in the `parallel`
#' package vignette: `vignette(topic = "parallel", package = "parallel")`.)
-#' However, this risk is extremely small in practice.
-#'
-#' `targets` and `tarchetypes` take the approach discussed in
+#' However, this risk is extremely small in practice, as shown by
#' L'Ecuyer et al. (2017) <https://doi.org/10.1016/j.matcom.2016.05.005>
-#' "A single RNG with a 'random' seed for each stream" (Section 4:
+#' under "A single RNG with a 'random' seed for each stream" (Section 4:
#' under "How to produce parallel streams and substreams").
-#' Here, [tar_seed_create()] plays the role
-#' of the upstream pseudo-random number generator (RNG) that produces
+#'
+#' `targets` and `tarchetypes` take the approach discussed in the
+#' aforementioned section of the paper, where [tar_seed_create()] plays the
+#' role of the upstream pseudo-random number generator (RNG) that produces
#' seeds for the subsequent parallel streams. Specifically,
#' [tar_seed_create()] acts as a counter-based RNG,
#' where the output function is the SHA512 hash algorithm.
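
The counter-based idea described in the roxygen text above can be sketched as follows. This is an illustration under stated assumptions, not necessarily the package's actual implementation. It uses the `digest` package (`digest()` with `algo = "sha512"`, then `digest2int()`). The helper name `sketch_seed()` and the way the name and global seed are combined are hypothetical.

```r
library(digest)

# Hypothetical helper: map a target name plus a global seed to an integer seed.
sketch_seed <- function(name, global_seed = 0L) {
  # Combine the inputs into one string (an assumed convention).
  input <- paste(name, global_seed)
  # SHA512 plays the role of the output function of the counter-based scheme.
  hash <- digest(input, algo = "sha512", serialize = FALSE)
  # Collapse the hexadecimal hash string into a single integer.
  digest2int(hash, seed = 0L)
}

seed_a <- sketch_seed("target_a")
seed_b <- sketch_seed("target_b")

# Each derived seed then initializes the downstream RNG stream of one target.
set.seed(seed_a)
draws_a <- runif(3)
set.seed(seed_b)
draws_b <- runif(3)
```

The design point is that the hash spreads nearby inputs (similar names, the same global seed) across the whole seed space, which is what makes overlapping streams unlikely.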
16 changes: 8 additions & 8 deletions man/tar_seed_create.Rd

Some generated files are not rendered by default.

16 changes: 8 additions & 8 deletions man/tar_seed_get.Rd


16 changes: 8 additions & 8 deletions man/tar_seed_set.Rd
