[REVIEW]: datawizard: An R Package for Easy Data Preparation and Statistical Transformations #4684

editorialbot · 2022-08-19T16:05:11Z

Submitting author: @IndrajeetPatil (Indrajeet Patil)
Repository: https://github.com/easystats/datawizard
Branch with paper.md (empty if default branch):
Version: 0.6.2
Editor: @osorensen
Reviewers: @tomfaulkenberry, @garretrc
Archive: 10.5281/zenodo.7143971

Status

Status badge code:

HTML: <a href="https://joss.theoj.org/papers/e091e4c73d855e87d0d774eafed63964"><img src="https://joss.theoj.org/papers/e091e4c73d855e87d0d774eafed63964/status.svg"></a>
Markdown: [![status](https://joss.theoj.org/papers/e091e4c73d855e87d0d774eafed63964/status.svg)](https://joss.theoj.org/papers/e091e4c73d855e87d0d774eafed63964)

Reviewers and authors:

Please avoid lengthy details of difficulties in the review thread. Instead, please create a new issue in the target repository and link to those issues (especially acceptance-blockers) by leaving comments in the review thread below. (For completists: if the target issue tracker is also on GitHub, linking the review thread in the issue or vice versa will create corresponding breadcrumb trails in the link target.)

Reviewer instructions & questions

@tomfaulkenberry & @garretrc, your review will be checklist based. Each of you will have a separate checklist that you should update when carrying out your review.
First of all you need to run this command in a separate comment to create the checklist:

@editorialbot generate my checklist

The reviewer guidelines are available here: https://joss.readthedocs.io/en/latest/reviewer_guidelines.html. Any questions/concerns please let @osorensen know.

✨ Please start on your review when you are able, and be sure to complete your review in the next six weeks, at the very latest ✨

Checklists

📝 Checklist for @garretrc

📝 Checklist for @tomfaulkenberry

editorialbot · 2022-08-19T16:05:13Z

Hello humans, I'm @editorialbot, a robot that can help you with some common editorial tasks.

For a list of things I can do to help you, just type:

@editorialbot commands

For example, to regenerate the paper pdf after making changes in the paper's md or bib files, type:

@editorialbot generate pdf

editorialbot · 2022-08-19T16:05:17Z

Software report:

github.com/AlDanial/cloc v 1.88  T=0.21 s (706.1 files/s, 114435.9 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
R                              111           2566           4339          10862
Markdown                        25            464              0           1816
XML                              1              0            129           1787
Rmd                              4            435            788            496
TeX                              2             47              0            272
YAML                             7             56             17            237
-------------------------------------------------------------------------------
SUM:                           150           3568           5273          15470
-------------------------------------------------------------------------------


gitinspector failed to run statistical information for the repository

editorialbot · 2022-08-19T16:05:18Z

Wordcount for paper.md is 1417

editorialbot · 2022-08-19T16:05:49Z

Reference check summary (note 'MISSING' DOIs are suggestions that need verification):

OK DOIs

- 10.21105/joss.02815 is OK
- 10.21105/joss.02445 is OK
- 10.21105/joss.03393 is OK
- 10.21105/joss.03139 is OK
- 10.21105/joss.01412 is OK
- 10.21105/joss.02306 is OK
- 10.21105/joss.03167 is OK
- 10.21105/joss.01541 is OK
- 10.21105/joss.01686 is OK

MISSING DOIs

- None

INVALID DOIs

- None

editorialbot · 2022-08-19T16:06:45Z

👉📄 Download article proof 📄 View article proof on GitHub 📄 👈

garretrc · 2022-08-19T18:32:24Z

Review checklist for @garretrc

Conflict of interest

I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

I confirm that I read and will adhere to the JOSS code of conduct.

General checks

Repository: Is the source code for this software available at the https://github.com/easystats/datawizard?
License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
Contribution and authorship: Has the submitting author (@IndrajeetPatil) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?

Functionality

Installation: Does installation proceed as outlined in the documentation?
Functionality: Have the functional claims of the software been confirmed?
Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
State of the field: Do the authors describe how this software compares to other commonly-used packages?
Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

osorensen · 2022-09-09T06:18:37Z

👋 @tomfaulkenberry, @garretrc, could you please update us on how it's going with your reviews?

tomfaulkenberry · 2022-09-09T09:38:09Z

Should have it done today! I'm sorry for my delay...our semester got started right at the same time the review period got started. I appreciate the reminder :)

tomfaulkenberry · 2022-09-09T09:38:45Z

Review checklist for @tomfaulkenberry

Conflict of interest

I confirm that I have read the JOSS conflict of interest (COI) policy and that: I have no COIs with reviewing this work or that any perceived COIs have been waived by JOSS for the purpose of this review.

Code of Conduct

I confirm that I read and will adhere to the JOSS code of conduct.

General checks

Repository: Is the source code for this software available at the https://github.com/easystats/datawizard?
License: Does the repository contain a plain-text LICENSE file with the contents of an OSI approved software license?
Contribution and authorship: Has the submitting author (@IndrajeetPatil) made major contributions to the software? Does the full list of paper authors seem appropriate and complete?
Substantial scholarly effort: Does this submission meet the scope eligibility described in the JOSS guidelines?
Data sharing: If the paper contains original data, data are accessible to the reviewers. If the paper contains no original data, please check this item.
Reproducibility: If the paper contains original results, results are entirely reproducible by reviewers. If the paper contains no original results, please check this item.
Human and animal research: If the paper contains original data research on humans subjects or animals, does it comply with JOSS's human participants research policy and/or animal research policy? If the paper contains no such data, please check this item.

Functionality

Installation: Does installation proceed as outlined in the documentation?
Functionality: Have the functional claims of the software been confirmed?
Performance: If there are any performance claims of the software, have they been confirmed? (If there are no claims, please check off this item.)

Documentation

A statement of need: Do the authors clearly state what problems the software is designed to solve and who the target audience is?
Installation instructions: Is there a clearly-stated list of dependencies? Ideally these should be handled with an automated package management solution.
Example usage: Do the authors include examples of how to use the software (ideally to solve real-world analysis problems).
Functionality documentation: Is the core functionality of the software documented to a satisfactory level (e.g., API method documentation)?
Automated tests: Are there automated tests or manual steps described so that the functionality of the software can be verified?
Community guidelines: Are there clear guidelines for third parties wishing to 1) Contribute to the software 2) Report issues or problems with the software 3) Seek support

Software paper

Summary: Has a clear description of the high-level functionality and purpose of the software for a diverse, non-specialist audience been provided?
A statement of need: Does the paper have a section titled 'Statement of need' that clearly states what problems the software is designed to solve, who the target audience is, and its relation to other work?
State of the field: Do the authors describe how this software compares to other commonly-used packages?
Quality of writing: Is the paper well written (i.e., it does not require editing for structure, language, or writing quality)?
References: Is the list of references complete, and is everything cited appropriately that should be cited (e.g., papers, datasets, software)? Do references in the text use the proper citation syntax?

garretrc · 2022-09-09T19:51:20Z

thanks for the reminder, I'll start taking a look at the repo and docs today but may not be able to test functionality until next week

IndrajeetPatil · 2022-09-15T12:00:24Z

Version: 0.5.0

@osorensen We needed to make a new CRAN release. Therefore, can you please bump the version to 0.6.0? Thanks.

osorensen · 2022-09-15T12:03:50Z

@IndrajeetPatil, we set the final version once the paper is ready for acceptance, so no need to bump the version right now.

osorensen · 2022-09-15T12:04:18Z

But I can do it anyway:

osorensen · 2022-09-15T12:04:29Z

@editorialbot set 0.6.0 as version

editorialbot · 2022-09-15T12:04:31Z

Done! version is now 0.6.0

osorensen · 2022-09-26T13:38:32Z

👋 @tomfaulkenberry, @garretrc, could you please update us on how it's going with your reviews?

Feel free to add comments to your reviews in this thread, or to open issues in the source repository

garretrc · 2022-09-26T13:46:06Z

I've had the chance to incorporate many of the datawizard functions into my own workflow to test the functionality. Today I'll wrap up checking the remaining functions that I haven't been able to test yet.

This package has a lot of useful functions even for experienced R users, so I'm also thinking about ways this package could best communicate that.

garretrc · 2022-09-26T19:36:47Z

Review finished. Everything I've tested returns the correct results and runs quickly. A couple comments:

I don't think the readme highlights the benefits of datawizard for an experienced R user (yes, a lot of the functions are quick to code up on your own, but you might save 5-10 minutes/a trip to stack overflow by using some datawizard "magic"). However, the easystats ecosystem page addresses this concern a bit, and the datawizard readme seems very detailed already. If I think of a better way to highlight this I'll raise an issue in the repo.
When using some of the functions, it's not clear from the tooltip in RStudio what the arguments should be. For example, when I use rescale(), center(), slide(), and others, only the x argument is included in the tooltip. So, when using the package I found myself needing to consult the help documentation in cases where I was expecting to just glance at the tooltip. My understanding is that these functions have different arguments for different data types and different operations, so the second argument will change for different use cases. The flexibility of each function to work with different data types without needing something like dplyr/data.table/base R data.frame manipulation seems more important to the package's mission than the tooltips, but just wanted to share this experience as a user.

The above points are small nitpicks, but I think the selection of functions in the package provides an accessible route to many common data manipulation tasks for new R users. For experienced R users who have previously coded the included operations on their own or through tidyverse or other packages, it seems just as fast to learn a new datawizard function as to look up or code up an equivalent solution. The datawizard functions are well-curated toward tasks that are annoying to remember or implement, so I believe they will be easier to recall and quicker to access than equivalent solutions from more general packages.

osorensen · 2022-09-26T19:47:19Z

Thanks a lot for your review @garretrc!

IndrajeetPatil · 2022-09-27T16:03:48Z

Dear @garretrc,

Thanks a lot for your wonderful assessment of {datawizard}. We are delighted to hear that you found the package useful, and think that it is equally accessible and useful for both naive and advanced R users.

I will respond to your nitpicks point-by-point.

You are absolutely right that we could be doing a better job of outlining functionality offered by the package. I have created an issue to remind us (Improve README easystats/datawizard#271). Of course, if you think of some concrete suggestions, feel free to either comment on that issue or create a new issue.
Unfortunately, this is something that is beyond our control, and is a result of how the RStudio IDE works.

In this video, I demonstrate how, depending on the supplied argument, if you hit tab, the IDE will provide the accurate argument list. But if you put the cursor on the function name and hit tab, it always displays the same tooltip, irrespective of which S3 method is dispatched.

ide.mov
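
The S3 behaviour described above can be sketched in a few lines of R. This is only an illustration under stated assumptions: `rescale_demo()` and its methods are hypothetical names made up for this example and are not taken from the datawizard source. The generic exposes only `x` and `...`, which is all a tooltip built from the generic's signature can show, while each method adds its own arguments.

```r
# Hypothetical S3 generic (not datawizard code): its signature has only
# `x` and `...`, so that is all a tooltip on the generic can display.
rescale_demo <- function(x, ...) {
  UseMethod("rescale_demo")
}

# Numeric method: adds a `to` argument giving the target range.
rescale_demo.numeric <- function(x, to = c(0, 1), ...) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1]) * (to[2] - to[1]) + to[1]
}

# Data frame method: adds a `select` argument and forwards the
# remaining arguments to the column-level method.
rescale_demo.data.frame <- function(x, select = names(x), ...) {
  x[select] <- lapply(x[select], rescale_demo, ...)
  x
}

# The argument lists differ between the generic and its methods.
names(formals(rescale_demo))             # "x" "..."
names(formals(rescale_demo.numeric))     # "x" "to" "..."
names(formals(rescale_demo.data.frame))  # "x" "select" "..."

# Dispatch picks the method from the class of the first argument.
rescale_demo(c(1, 2, 3), to = c(0, 10))  # 0 5 10
```

Because the method is only known once the class of `x` is known, an editor can list method-specific arguments such as `to` or `select` only after the first argument has been supplied.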

Thanks again for your review, and let us know if you have any other comments or suggestions.

garretrc · 2022-09-27T16:15:17Z

@IndrajeetPatil I'll leave a comment on that issue if I think of anything!

I didn't know you could supply the first argument like that to change the tooltip, it definitely improves the experience. Not really in your control to make sure a user inputs the first argument before pressing tab.

osorensen · 2022-10-04T12:54:06Z

@tomfaulkenberry could you please update us on how it's going with your review?

editorialbot · 2022-10-04T16:00:56Z

👋 @openjournals/joss-eics, this paper is ready to be accepted and published.

Check final proof 👉📄 Download article

If the paper PDF and the deposit XML files look good in openjournals/joss-papers#3578, then you can now move forward with accepting the submission by compiling again with the command @editorialbot accept

IndrajeetPatil · 2022-10-05T12:35:50Z

Check final proof

@osorensen I have checked the final proofs and everything looks good to me.

Let me know if there is anything else that I need to do. Thanks.

arfon · 2022-10-09T08:08:13Z

@editorialbot accept

editorialbot · 2022-10-09T08:08:14Z

Doing it live! Attempting automated processing of paper acceptance...

editorialbot · 2022-10-09T08:09:52Z

🐦🐦🐦 👉 Tweet for this paper 👈 🐦🐦🐦

editorialbot · 2022-10-09T08:09:53Z

🚨🚨🚨 THIS IS NOT A DRILL, YOU HAVE JUST ACCEPTED A PAPER INTO JOSS! 🚨🚨🚨

Here's what you must now do:

Check final PDF and Crossref metadata that was deposited 👉 Creating pull request for 10.21105.joss.04684 joss-papers#3593
Wait a couple of minutes, then verify that the paper DOI resolves https://doi.org/10.21105/joss.04684
If everything looks good, then close this review issue.
Party like you just published a paper! 🎉🌈🦄💃👻🤘

Any issues? Notify your editorial technical team...

arfon · 2022-10-09T12:22:13Z

@tomfaulkenberry, @garretrc – many thanks for your reviews here and to @osorensen for editing this submission! JOSS relies upon the volunteer effort of people like you and we simply wouldn't be able to do this without you ✨

@IndrajeetPatil – your paper is now accepted and published in JOSS ⚡🚀💥

editorialbot · 2022-10-09T12:22:14Z

🎉🎉🎉 Congratulations on your paper acceptance! 🎉🎉🎉

If you would like to include a link to your paper from your README use the following code snippets:

Markdown:
[![DOI](https://joss.theoj.org/papers/10.21105/joss.04684/status.svg)](https://doi.org/10.21105/joss.04684)

HTML:
<a style="border-width:0" href="https://doi.org/10.21105/joss.04684">
  <img src="https://joss.theoj.org/papers/10.21105/joss.04684/status.svg" alt="DOI badge" >
</a>

reStructuredText:
.. image:: https://joss.theoj.org/papers/10.21105/joss.04684/status.svg
   :target: https://doi.org/10.21105/joss.04684

This is how it will look in your documentation:

We need your help!

The Journal of Open Source Software is a community-run journal and relies upon volunteer effort. If you'd like to support us please consider doing either one (or both) of the following:

Volunteering to review for us sometime in the future. You can add your name to the reviewer list here: https://joss.theoj.org/reviewer-signup.html
Making a small donation to support our running costs here: https://numfocus.org/donate-to-joss

xuanxu · 2024-03-22T09:23:58Z

Hey @arfon, the value for the archive is incorrect, it should be set to 10.5281/zenodo.7143971 and then reaccept the paper. Can you take care of this?

As reported here by @sdruskat

arfon · 2024-04-11T07:18:17Z

@editorialbot set 10.5281/zenodo.7143971 as archive

editorialbot · 2024-04-11T07:18:20Z

Done! archive is now 10.5281/zenodo.7143971

arfon · 2024-04-11T07:18:27Z

@editorialbot reaccept

editorialbot · 2024-04-11T07:18:30Z

Rebuilding paper!

editorialbot · 2024-04-11T07:19:02Z

⚠️ Couldn't update published paper. An error happened.

xuanxu · 2024-04-12T08:18:58Z

The error is caused by a space in the name of the directory containing the paper: /JOSS files/.
@IndrajeetPatil, could you please rename that folder?

…#4684 (comment)

etiennebacher · 2024-04-12T08:53:17Z

@xuanxu I've done it

arfon · 2024-04-12T09:17:17Z

@editorialbot reaccept

editorialbot · 2024-04-12T09:17:19Z

Rebuilding paper!

editorialbot · 2024-04-12T09:19:37Z

🌈 Paper updated!

New PDF and metadata files 👉 openjournals/joss-papers#5241

editorialbot added R review TeX Track: 5 (DSAIS) Data Science, Artificial Intelligence, and Machine Learning waitlisted Submissions in the JOSS backlog due to reduced service mode. labels Aug 19, 2022

editorialbot assigned osorensen Aug 19, 2022

editorialbot mentioned this issue Aug 19, 2022

[PRE REVIEW]: datawizard: An R Package for Easy Data Preparation and Statistical Transformations #4659

Closed

IndrajeetPatil mentioned this issue Aug 19, 2022

Writing a JOSS paper easystats/datawizard#59

Closed

editorialbot added the recommend-accept Papers recommended for acceptance in JOSS. label Oct 4, 2022

osorensen removed the waitlisted Submissions in the JOSS backlog due to reduced service mode. label Oct 7, 2022

editorialbot added accepted published Papers published in JOSS labels Oct 9, 2022

arfon closed this as completed Oct 9, 2022

This was referenced Dec 12, 2023

[PRE REVIEW]: REDCapTidieR: Extracting complex REDCap databases into tidy tables #6131

Closed

[PRE REVIEW]: dwctaxon, an R package for editing and validating taxonomic data in Darwin Core format #6163

Closed

sneakers-the-rat mentioned this issue Mar 22, 2024

Archive DOI not validated openjournals/buffy#103

Open

etiennebacher added a commit to easystats/datawizard that referenced this issue Apr 12, 2024

Rename "JOSS files" folder to "JOSS_files", openjournals/joss-reviews…

f044e18

…#4684 (comment)

editorialbot mentioned this issue Sep 9, 2024

[PRE REVIEW]: JMDSFCv1.0: an Interactive R/Shiny Application for Dataset Format Conversion with Real-Time Progress Monitoring #7202

Closed

editorialbot mentioned this issue Nov 9, 2024

[PRE REVIEW]: rdata: A Python library for R datasets #7442

Closed
