Releases · easystats/datawizard

03 Apr 16:06

etiennebacher

v0.7.1

79d85e4

datawizard 0.7.1

BREAKING CHANGES

add_labs() was renamed to assign_labels(). Since add_labs() existed
for only a few days, no alias is provided for backwards compatibility.

NEW FUNCTIONS

labels_to_levels(), to use value labels of factors as their levels.

MINOR CHANGES

data_read() now checks whether the imported object actually is a data frame
(or coercible to a data frame). If it is not, data_read() no longer errors but
gives an informative warning about the type of object that was imported.

BUG FIXES

Fix test for CRAN check on macOS arm64.

22 Mar 17:04

etiennebacher

v0.7.0

6395a1d

datawizard 0.7.0

BREAKING CHANGES

In selection patterns, expressions like -var1:var3 to exclude all variables
between var1 and var3 are no longer accepted. The correct expression is
-(var1:var3). This is for 2 reasons:
- to be consistent with the behavior for numerics (-1:2 is not accepted but
  -(1:2) is);
- to be consistent with dplyr::select(), which throws a warning and only
  uses the first variable in the first expression.
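
The numeric case mentioned above comes down to operator precedence in base R: unary minus binds more tightly than `:`. The following sketch uses only base R (no {datawizard} functions) and made-up values to show why the parenthesized form is the unambiguous one:

```r
# Unary minus binds more tightly than ":" in R, so these two differ.
a <- -1:2    # parsed as (-1):2, i.e. the sequence from -1 up to 2
b <- -(1:2)  # negates the whole range 1:2

x <- c(10, 20, 30, 40)

# Negative indices drop elements: this removes the first two values.
dropped <- x[b]

# Indexing with `a` mixes negative and positive subscripts, which is an error.
mixed_ok <- tryCatch({
  x[a]
  TRUE
}, error = function(e) FALSE)

print(a)
print(b)
print(dropped)
print(mixed_ok)
```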

NEW FUNCTIONS

recode_into(), similar to dplyr::case_when(), to recode values from one
or more variables into a new variable.
mean_sd() and median_mad() for summarizing vectors to their mean (or
median) and a range of one SD (or MAD) above and below.
data_write() as counterpart to data_read(), to write data frames into
CSV, SPSS, SAS, Stata files and many other file types. One advantage over
existing functions to write data in other packages is that labelled (numeric)
data can be converted into factors (with value labels used as factor levels)
even for text formats like CSV and similar. This allows exporting "labelled"
data into those file formats, too.
add_labs(), to manually add value and variable labels as attributes to
variables. These attributes are stored as "label" and "labels" attributes,
similar to the labelled class from the haven package.
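
To illustrate the "label" and "labels" attributes described above, here is a minimal base R sketch. It does not call add_labs() or data_write(); the variable, its label text, and the value labels are made up, and the conversion to a factor only approximates the idea of using value labels as factor levels:

```r
# A numeric variable with hypothetical codes 1 to 3.
x <- c(1, 2, 2, 3)

# Variable label and value labels stored as plain attributes,
# following the convention used by haven's labelled class
# (names are the labels, values are the codes).
attr(x, "label") <- "Education level"
attr(x, "labels") <- c(low = 1, mid = 2, high = 3)

# Use the value labels as factor levels.
value_labels <- attr(x, "labels")
f <- factor(
  as.vector(x),                    # drop attributes before converting
  levels = unname(value_labels),   # the numeric codes
  labels = names(value_labels)     # the text shown for each code
)

print(f)
```

A factor built this way can be written to a text format such as CSV without losing the meaning of the codes.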

MINOR CHANGES

data_rename() gets a verbose argument.
winsorize() now errors if the threshold is incorrect (previously, it gave
a warning and returned the unchanged data). The argument verbose no longer has
any effect but is kept for backward compatibility. The documentation now
contains details about the valid values for threshold (#357).
In all functions that have arguments select and/or exclude, there is now
one warning per misspelled variable. The previous behavior was to have only one
warning.
Fixed inconsistent behaviour in standardize() when only one of the arguments
center or scale was provided (#365).
unstandardize() and replace_nan_inf() now work with select helpers (#376).
Added informative warning and error messages to reverse(). Furthermore, the
docs now describe the range argument more clearly (#380).
unnormalize() errors with unexpected inputs (#383).

BUG FIXES

empty_columns() (and therefore remove_empty_columns()) now correctly detects
columns containing only NA_character_ (#349).
Select helpers now work in custom functions when argument is called select
(#356).
Fix unexpected warning in convert_na_to() when select is a list (#352).
Fixed issue with correct labelling of numeric variables with more than nine
unique values and associated value labels.

14 Dec 17:48

etiennebacher

v0.6.5

33e96b8

datawizard 0.6.5

MAJOR CHANGES

Etienne Bacher is the new maintainer.

MINOR CHANGES

standardize(), center(), normalize() and rescale() can be used in
model formulas, similar to base::scale().
data_codebook() now includes the proportion for each category/value, in
addition to the counts. Furthermore, if data contains tagged NA values,
these are included in the frequency table.
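
As a point of comparison for the model-formula support noted above, this sketch shows the base::scale() pattern that the entry refers to. It uses only base R and the built-in mtcars data so that it runs without the package; replacing scale() with standardize() is the usage the release note describes, but that call is not shown here:

```r
# Raw predictor versus a predictor standardized inside the formula.
fit_raw <- lm(mpg ~ wt, data = mtcars)
fit_std <- lm(mpg ~ scale(wt), data = mtcars)

slope_raw <- unname(coef(fit_raw)["wt"])
slope_std <- unname(coef(fit_std)["scale(wt)"])

# Standardizing the predictor multiplies the slope by sd(wt),
# so the two fits describe the same relationship on different scales.
print(slope_raw * sd(mtcars$wt))
print(slope_std)
```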

BUG FIXES

center(x) now works correctly when x is a single value and either
reference or center is specified (#324).
Fixed issue in data_codebook(), which failed for labelled vectors when
values of labels were not in sorted order.

20 Nov 07:47

IndrajeetPatil

0.6.4

fb2e94b

datawizard 0.6.4

NEW FUNCTIONS

data_codebook(): to generate codebooks of data frames.
New functions to deal with duplicates: data_duplicated() (keep all duplicates,
including the first occurrence) and data_unique() (returns the data, excluding
all duplicates except one instance of each, based on the selected method).
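
The difference between the two duplicate helpers can be sketched with base R's duplicated(). This is not the package's implementation; the example data are made up, and keeping the first instance is only one possible choice of method:

```r
df <- data.frame(
  id = c(1, 2, 2, 3, 3, 3),
  value = c("a", "b", "c", "d", "e", "f"),
  stringsAsFactors = FALSE
)

# Keep all duplicates, including the first occurrence
# (the behaviour described for data_duplicated()).
is_dup <- duplicated(df$id) | duplicated(df$id, fromLast = TRUE)
all_duplicates <- df[is_dup, ]

# Keep one instance per id, here the first one
# (one possible method for what data_unique() does).
first_only <- df[!duplicated(df$id), ]

print(all_duplicates)
print(first_only)
```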

MINOR CHANGES

.data.frame methods should now preserve custom attributes.
The include_bounds argument in normalize() can now also be a numeric
value, defining the limit to the upper and lower bound (i.e. the distance
to 1 and 0).
data_filter() now works with grouped data.

BUG FIXES

data_read() no longer prints message for empty columns when the data
actually had no empty columns.
data_to_wide() now drops columns that are not in id_cols (if specified),
names_from, or values_from. This is the behaviour observed in tidyr::pivot_wider().

22 Oct 12:48

IndrajeetPatil

0.6.3

0b0d117

datawizard 0.6.3

MAJOR CHANGES

There is a new publication about the {datawizard} package:
https://joss.theoj.org/papers/10.21105/joss.04684
Fixes failing tests due to changes in R-devel.
data_to_long() and data_to_wide() have had significant performance
improvements, sometimes as high as a ten-fold speedup.

MINOR CHANGES

When column names are misspelled, most functions now suggest which existing
columns might have been meant.
Miscellaneous performance gains.
convert_to_na() now requires argument na to be of class 'Date' to convert
specific dates to NA. For example, convert_to_na(x, na = "2022-10-17")
must be changed to convert_to_na(x, na = as.Date("2022-10-17")).

BUG FIXES

data_to_long() and data_to_wide() now correctly keep the date format.

04 Oct 15:17

IndrajeetPatil

0.6.2

0f7f292

datawizard 0.6.2

BREAKING CHANGES

Methods for grouped data frames (.grouped_df) no longer support
dplyr::group_by() for {dplyr} before version 0.8.0.
empty_columns() and remove_empty_columns() now also remove columns that
contain only empty characters. Likewise, empty_rows() and
remove_empty_rows() remove observations that consist entirely of missing or
empty character values.

CHANGES

data_arrange() now works with data frames that were grouped using
data_group() (#274).
data_read() gains a convert_factors argument, to turn off automatic
conversion from numeric variables into factors.

25 Sep 14:59

IndrajeetPatil

0.6.1

a7ce980

datawizard 0.6.1

Updates tests for upcoming changes in the {tidyselect} package (#267).

15 Sep 11:55

IndrajeetPatil

0.6.0

f9e48b5

datawizard 0.6.0

BREAKING CHANGES

The minimum needed R version has been bumped to 3.6.
The following deprecated functions have been removed:
data_cut(), data_recode(), data_shift(), data_reverse(), data_rescale(),
data_to_factor(), data_to_numeric()
A new text_format() alias is introduced for format_text(); the latter
will be removed in the next release.
A new recode_values() alias is introduced for change_code(); the latter
will be removed in the next release.
data_merge() now errors if columns specified in by are not in both datasets.
Using negative values in arguments select and exclude now removes the columns
from the selection/exclusion. The previous behavior was to start the
selection/exclusion from the end of the dataset, which was inconsistent with
the use of "-" with other selecting possibilities.

NEW FUNCTIONS

data_peek(): to peek at values and type of variables in a data frame.
coef_var(): to compute the coefficient of variation.
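
For reference, the coefficient of variation is the standard deviation divided by the mean. The base R sketch below computes it by hand on made-up values; it does not call coef_var(), and any additional options that function may offer are not covered:

```r
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Coefficient of variation: sample standard deviation relative to the mean.
cv <- sd(x) / mean(x)

print(cv)
```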

CHANGES

data_filter() will give more informative messages on malformed syntax of
the filter argument.
It is now possible to use curly brackets to pass variable names to
data_filter(). See the examples section in the documentation of
data_filter().
The regex argument was added to functions that use select-helpers and did
not already have this argument.
Select helpers starts_with(), ends_with(), and contains() now accept
several patterns, e.g., starts_with("Sep", "Petal").
Arguments select and exclude that are present in most functions have been
improved to work in loops and in custom functions. For example, the following
code now works:

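library(datawizard)

# `i` is defined inside the function and is still resolved by starts_with()
foo <- function(data) {
  i <- "Sep"
  find_columns(data, select = starts_with(i))
}
foo(iris)

# The loop variable can be passed to a select helper as well
# (the native pipe `|>` requires R >= 4.1)
for (i in c("Sepal", "Sp")) {
  head(iris) |>
    find_columns(select = starts_with(i)) |>
    print()
}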

There is now a vignette summarizing the various ways to select or exclude
variables in most {datawizard} functions.

18 Aug 04:49

IndrajeetPatil

0.5.1

0601442

datawizard 0.5.1

Fixes tests for {poorman} update

07 Aug 15:36

IndrajeetPatil

0.5.0

4c916b1

datawizard 0.5.0

MAJOR CHANGES

The following statistical transformation functions have been renamed to drop
the data_*() prefix. They do not work exclusively with data frames and are
typically used with vectors first of all, so the old names were misleading:
- data_cut() -> categorize()
- data_recode() -> change_code()
- data_shift() -> slide()
- data_reverse() -> reverse()
- data_rescale() -> rescale()
- data_to_factor() -> to_factor()
- data_to_numeric() -> to_numeric()
Note that these functions also have .data.frame() methods and still work
for data frames as well. Former function names are still available as aliases,
but will be deprecated and removed in a future release.
Bumps the needed minimum R version to 3.5.
Removed deprecated function data_findcols(). Please use its replacement,
data_find().
Removed alias extract() for data_extract() function since it collided with
tidyr::extract().
Argument training_proportion in data_partition() is deprecated. Please use
proportion now.
Given his continued and significant contributions to the package, Etienne
Bacher (@etiennebacher) is now included as an author.
unstandardise() now works for center(x)
unnormalize() now works for change_scale(x)
reshape_wider() now follows tidyr::pivot_wider() syntax more consistently.
Arguments colnames_from, sep, and rows_from are deprecated and should be
replaced by names_from, names_sep, and id_cols respectively.
reshape_wider() also gains an argument names_glue (#182, #198).
Similarly, reshape_longer() now follows tidyr::pivot_longer() syntax more
consistently. Argument colnames_to is deprecated and
should be replaced by names_to. reshape_longer() also gains new arguments:
names_prefix, names_sep, names_pattern, and values_drop_na (#189).

CHANGES

Some of the text formatting helpers (like text_concatenate()) gain an
enclose argument, to wrap text elements with surrounding characters.
winsorize() now accepts "raw" and "zscore" methods (in addition to
"percentile"). Additionally, when robust is set to TRUE together with
method = "zscore", winsorizes via the median and median absolute deviation
(MAD); else via the mean and standard deviation. (@rempsyc, #177, #49, #47).
data_partition() now allows creating multiple partitions from the data,
returning multiple training sets and a remaining test set.
Functions like center(), normalize() or standardize() no longer fail
when data contains infinite values (Inf).
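
The "zscore" idea described for winsorize() above can be sketched in base R as follows. This is an assumption-laden illustration, not the package's code: the helper name winsorize_zscore_sketch() is hypothetical, and clamping values to the boundary is assumed here rather than confirmed by the release notes:

```r
# Hypothetical helper: clamp values lying more than `threshold` spread units
# from the center. With robust = TRUE, center and spread are the median and
# MAD; otherwise the mean and standard deviation.
winsorize_zscore_sketch <- function(x, threshold = 2, robust = FALSE) {
  center <- if (robust) median(x) else mean(x)
  spread <- if (robust) mad(x) else sd(x)
  lower <- center - threshold * spread
  upper <- center + threshold * spread
  pmin(pmax(x, lower), upper)
}

x <- c(1, 2, 3, 4, 5, 100)

w_classic <- winsorize_zscore_sketch(x, threshold = 1)
w_robust <- winsorize_zscore_sketch(x, threshold = 1, robust = TRUE)

print(w_classic)
print(w_robust)
```

Because the outlier inflates the mean and standard deviation, the robust variant pulls the extreme value in much further than the classic one in this example.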

NEW FUNCTIONS

row_to_colnames() and colnames_to_row() to move a row to column names, and
column names to row (@etiennebacher, #169).

BUG FIXES

Fixed wrong column names in data_to_wide() (#173).

Contributors

rempsyc and etiennebacher
