-
diff --git a/search.json b/search.json
index 6ef11ee..76a9622 100644
--- a/search.json
+++ b/search.json
@@ -627,7 +627,7 @@
"href": "presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html#in-r---flows",
"title": "System Dynamics in health and care",
"section": "in R - flows",
- "text": "in R - flows\nEasy to do with count, or group_by and summarise\n\n\n admit_d <- spell_dates |> \n group_by(date_admit) |>\n count(date_admit)\n\nhead(admit_d)\n\n\n# A tibble: 6 × 2\n# Groups: date_admit [6]\n date_admit n\n <date> <int>\n1 2022-01-01 23\n2 2022-01-02 31\n3 2022-01-03 25\n4 2022-01-04 20\n5 2022-01-05 23\n6 2022-01-06 23"
+ "text": "in R - flows\nEasy to do with count, or group_by and summarise\n\n\n admit_d <- spell_dates |> \n group_by(date_admit) |>\n count(date_admit)\n\nhead(admit_d)\n\n\n# A tibble: 6 × 2\n# Groups: date_admit [6]\n date_admit n\n <date> <int>\n1 2022-01-01 22\n2 2022-01-02 25\n3 2022-01-03 23\n4 2022-01-04 24\n5 2022-01-05 33\n6 2022-01-06 25"
},
{
"objectID": "presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html#in-r---occupancy",
@@ -648,7 +648,7 @@
"href": "presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html#longer-time-periods---flows",
"title": "System Dynamics in health and care",
"section": "Longer Time Periods - flows",
- "text": "Longer Time Periods - flows\nUse lubridate::floor_date to generate the date at start of week/month\n\nadmit_wk <- spell_dates |> \n mutate(week_start = floor_date(\n date_admit, unit = \"week\", week_start = 1 # start week on Monday\n )) |> \n count(week_start) # could add other parameters such as provider code, TFC etc\n\nhead(admit_wk)\n\n\n\n# A tibble: 6 × 2\n week_start n\n <date> <int>\n1 2021-12-27 54\n2 2022-01-03 177\n3 2022-01-10 184\n4 2022-01-17 180\n5 2022-01-24 178\n6 2022-01-31 172\n\n\n\nMight run SD model in weeks or months - e.g. months for care homes Use lubridate to create new variable with start date of week/month/year etc"
+ "text": "Longer Time Periods - flows\nUse lubridate::floor_date to generate the date at start of week/month\n\nadmit_wk <- spell_dates |> \n mutate(week_start = floor_date(\n date_admit, unit = \"week\", week_start = 1 # start week on Monday\n )) |> \n count(week_start) # could add other parameters such as provider code, TFC etc\n\nhead(admit_wk)\n\n\n\n# A tibble: 6 × 2\n week_start n\n <date> <int>\n1 2021-12-27 47\n2 2022-01-03 198\n3 2022-01-10 193\n4 2022-01-17 198\n5 2022-01-24 221\n6 2022-01-31 184\n\n\n\nMight run SD model in weeks or months - e.g. months for care homes Use lubridate to create new variable with start date of week/month/year etc"
},
{
"objectID": "presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html#longer-time-periods---occupancy",
@@ -1131,14 +1131,14 @@
"href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#and-create-our-first-test-5",
"title": "Unit testing in R",
"section": "… and create our first test",
- "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})\n\nTest passed 🌈"
+ "text": "… and create our first test\n\ntest_that(\"my_function correctly divides values\", {\n expect_equal(\n my_function(4, 2),\n 2\n )\n expect_equal(\n my_function(1, 4),\n 0.25\n )\n expect_equal(\n my_function(c(4, 1), c(2, 4)),\n c(2, 0.25)\n )\n})\n\nTest passed 😀"
},
{
"objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#other-expect_-functions",
"href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#other-expect_-functions",
"title": "Unit testing in R",
"section": "other expect_*() functions…",
- "text": "other expect_*() functions…\n\ntest_that(\"my_function correctly divides values\", {\n expect_lt(\n my_function(4, 2),\n 10\n )\n expect_gt(\n my_function(1, 4),\n 0.2\n )\n expect_length(\n my_function(c(4, 1), c(2, 4)),\n 2\n )\n})\n\nTest passed 🥳\n\n\n\n{testthat} documentation"
+ "text": "other expect_*() functions…\n\ntest_that(\"my_function correctly divides values\", {\n expect_lt(\n my_function(4, 2),\n 10\n )\n expect_gt(\n my_function(1, 4),\n 0.2\n )\n expect_length(\n my_function(c(4, 1), c(2, 4)),\n 2\n )\n})\n\nTest passed 🎊\n\n\n\n{testthat} documentation"
},
{
"objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#arrange-act-assert",
@@ -1187,7 +1187,7 @@
"href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#testing-edge-cases",
"title": "Unit testing in R",
"section": "Testing edge cases",
- "text": "Testing edge cases\n\n\nRemember the validation steps we built into our function to handle edge cases?\n\nLet’s write tests for these edge cases:\nwe expect errors\n\n\ntest_that(\"my_function works\", {\n expect_error(my_function(5, 0))\n expect_error(my_function(\"a\", 3))\n expect_error(my_function(3, \"a\"))\n expect_error(my_function(1:2, 4))\n})\n\nTest passed 🎉"
+ "text": "Testing edge cases\n\n\nRemember the validation steps we built into our function to handle edge cases?\n\nLet’s write tests for these edge cases:\nwe expect errors\n\n\ntest_that(\"my_function works\", {\n expect_error(my_function(5, 0))\n expect_error(my_function(\"a\", 3))\n expect_error(my_function(3, \"a\"))\n expect_error(my_function(1:2, 4))\n})\n\nTest passed 🥳"
},
{
"objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example",
@@ -1201,7 +1201,7 @@
"href": "presentations/2023-08-23_nhs-r_unit-testing/index.html#another-simple-example-1",
"title": "Unit testing in R",
"section": "Another (simple) example",
- "text": "Another (simple) example\n\nmy_new_function <- function(x, y) {\n if (x > y) {\n \"x\"\n } else {\n \"y\"\n }\n}\n\n\n\ntest_that(\"it returns 'x' if x is bigger than y\", {\n expect_equal(my_new_function(4, 3), \"x\")\n})\n\nTest passed 🥇\n\ntest_that(\"it returns 'y' if y is bigger than x\", {\n expect_equal(my_new_function(3, 4), \"y\")\n expect_equal(my_new_function(3, 3), \"y\")\n})\n\nTest passed 🎊"
+ "text": "Another (simple) example\n\nmy_new_function <- function(x, y) {\n if (x > y) {\n \"x\"\n } else {\n \"y\"\n }\n}\n\n\n\ntest_that(\"it returns 'x' if x is bigger than y\", {\n expect_equal(my_new_function(4, 3), \"x\")\n})\n\nTest passed 🎊\n\ntest_that(\"it returns 'y' if y is bigger than x\", {\n expect_equal(my_new_function(3, 4), \"y\")\n expect_equal(my_new_function(3, 3), \"y\")\n})\n\nTest passed 😀"
},
{
"objectID": "presentations/2023-08-23_nhs-r_unit-testing/index.html#how-to-design-good-tests",
@@ -1649,21 +1649,21 @@
"href": "style/project_structure.html",
"title": "Project Structure",
"section": "",
- "text": "as a package\nR/ for scripts\nsplit files into separate scripts\nuse {renv}\nuse {targets}\n\n\n\nRStudio projects are a great way to organise your analytical projects into discrete units that are easier to work on and share.\n\n\n\n\n\n\n\n\n\n\n\n\nOne of the most common issues you will face when using a project someone else has created, or you created previously, is maintaining the required packages to run the project. Knowing what packages are needed to run a particular project isn’t always obvious, and over time packages can change rendering code that once worked unusable.\n{renv} solves this problem by:\n\nkeeping track of the packages that are required for a particular project\nlogging the installed version of all of the packages\nmaintaining a per-project library of packages, so projects don’t interfere with one another\n\nIt’s a good idea to use {renv} for all projects."
+ "text": "Analytical projects should be self-contained and portable. This means that all the materials required for an analysis should be organised into a single folder that can be shared in its entirety and be re-run by other people, ideally via GitHub.\nWe recommend RStudio Projects as a system for creating standardised project structures that meet these goals. The {usethis} package contains a number of helper functions to help get you started, including usethis::create_project().\n\n\nOne of the most common issues you’ll face when using a project someone else has created, or you created previously, is maintaining the required packages to run the project. Knowing what packages are needed to run a particular project isn’t always obvious, and over time packages can change, rendering code that once worked unusable.\nThe {renv} R package helps solve this problem by:\n\nKeeping track of the packages that are required for a particular project.\nLogging the installed version of all of the packages.\nMaintaining a per-project library of packages, so projects don’t interfere with one another.\n\n\n\n\nIt’s helpful to split discrete analytical tasks into separate script files, which can make it easier to handle the codebase in context and provide an obvious order of operations. For example, 01_read.R, 02_wrangle.R, 03_model.R, etc.\nYou could still forget to re-run one of the numbered files, however, or it may take a long time to re-run all the steps again if you only make one small change to the code. This is where a workflow manager is useful.\nWe recommend the {targets} R package as a workflow manager. You write a series of steps and {targets} automatically recognises all the relationships between functions and objects as a graph. This means {targets} knows the order that things should be run and knows which bits of code need to be re-run if there are upstream changes. 
It’s a well-documented and supported package.\n\n\n\nIt’s beneficial to convert code into discrete functions where possible. This makes it easier to:\n\nreduce the chance of errors, because you’ll avoid repetitive and mistake-prone copy-pasting of code\nunderstand your scripts, because code can be condensed into simpler calls that are easier to read\nreuse your code, because functions allow you to consistently call the same code more than once and can be copied into other projects\ndebug, because the source of an error can be more easily traced and your code can be tested more easily\n\nConsider the DRY (Don’t Repeat Yourself) principle when deciding whether or not to convert some code into a function. It may be better to write a function if you’ve used the same piece of code more than once in an analysis, especially if it contains many lines.\nFunction names should be short but descriptive and should contain a verb that describes what the function does. For example, get_geospatial_data() may be better than the generic get_data(), which is certainly better than the uninformative data().\nIn a project, it’s conventional to put your functions in a folder called R in the project’s root directory. You can group functions into separate R scripts with meaningful names to make it easier to organise them (read-data.R, model.R, etc). You can then source() these function scripts into your analytical scripts as required.\n\n\n\n\nIt may be beneficial to gather your functions into a discrete package so that you and others can install and reuse them for other projects.\nThe {usethis} package has a number of shortcuts to help you set up a package. You can begin with usethis::create_package() to generate the basic structure and then usethis::use_r() and usethis::use_test() to add scripts and {testthat} tests into the correct folder structure.\nWe recommend you include a number of extra files in your package to make its purpose clear and to encourage collaboration. 
This includes:\n\na README file to describe the purpose of your package and provide some simple examples, which you can set up with usethis::use_readme_md() or usethis::use_readme_rmd() if it contains R code that you want to execute\na NEWS file with usethis::use_news_md(), which is used to communicate the latest changes to your package\na CODE_OF_CONDUCT file with usethis::use_code_of_conduct() to explain to collaborators how they should engage with your project\nvignettes with usethis::use_vignette(), which are short documents that let you mix code with prose to describe how to use the functions in your package\n\nWe recommend semantic versioning as you develop your package. In this system, the version number is composed of three numbers (like ‘1.2.3’) that are incremented, from left to right, when you make major breaking changes, minor changes, and patches or bug fixes. The usethis::use_version() function can help you to do this and to automatically update the DESCRIPTION and NEWS files.\nUse {pkgdown} to autogenerate a website from your package’s documentation. This lets people see your documentation rendered nicely on the internet, without the need to install the package. You can serve this site on the web and update it automatically using GitHub Pages and GitHub Actions."
},
{
- "objectID": "style/project_structure.html#use-rstudio-projects",
- "href": "style/project_structure.html#use-rstudio-projects",
+ "objectID": "style/project_structure.html#rstudio-projects",
+ "href": "style/project_structure.html#rstudio-projects",
"title": "Project Structure",
"section": "",
- "text": "RStudio projects are a great way to organise your analytical projects into discrete units that are easier to work on and share."
+ "text": "Analytical projects should be self-contained and portable. This means that all the materials required for an analysis should be organised into a single folder that can be shared in its entirety and be re-run by other people, ideally via GitHub.\nWe recommend RStudio Projects as a system for creating standardised project structures that meet these goals. The {usethis} package contains a number of helper functions to help get you started, including usethis::create_project().\n\n\nOne of the most common issues you’ll face when using a project someone else has created, or you created previously, is maintaining the required packages to run the project. Knowing what packages are needed to run a particular project isn’t always obvious, and over time packages can change, rendering code that once worked unusable.\nThe {renv} R package helps solve this problem by:\n\nKeeping track of the packages that are required for a particular project.\nLogging the installed version of all of the packages.\nMaintaining a per-project library of packages, so projects don’t interfere with one another.\n\n\n\n\nIt’s helpful to split discrete analytical tasks into separate script files, which can make it easier to handle the codebase in context and provide an obvious order of operations. For example, 01_read.R, 02_wrangle.R, 03_model.R, etc.\nYou could still forget to re-run one of the numbered files, however, or it may take a long time to re-run all the steps again if you only make one small change to the code. This is where a workflow manager is useful.\nWe recommend the {targets} R package as a workflow manager. You write a series of steps and {targets} automatically recognises all the relationships between functions and objects as a graph. This means {targets} knows the order that things should be run and knows which bits of code need to be re-run if there are upstream changes. 
It’s a well-documented and supported package.\n\n\n\nIt’s beneficial to convert code into discrete functions where possible. This makes it easier to:\n\nreduce the chance of errors, because you’ll avoid repetitive and mistake-prone copy-pasting of code\nunderstand your scripts, because code can be condensed into simpler calls that are easier to read\nreuse your code, because functions allow you to consistently call the same code more than once and can be copied into other projects\ndebug, because the source of an error can be more easily traced and your code can be tested more easily\n\nConsider the DRY (Don’t Repeat Yourself) principle when deciding whether or not to convert some code into a function. It may be better to write a function if you’ve used the same piece of code more than once in an analysis, especially if it contains many lines.\nFunction names should be short but descriptive and should contain a verb that describes what the function does. For example, get_geospatial_data() may be better than the generic get_data(), which is certainly better than the uninformative data().\nIn a project, it’s conventional to put your functions in a folder called R in the project’s root directory. You can group functions into separate R scripts with meaningful names to make it easier to organise them (read-data.R, model.R, etc). You can then source() these function scripts into your analytical scripts as required."
},
{
- "objectID": "style/project_structure.html#renv",
- "href": "style/project_structure.html#renv",
+ "objectID": "style/project_structure.html#packages",
+ "href": "style/project_structure.html#packages",
"title": "Project Structure",
"section": "",
- "text": "One of the most common issues you will face when using a project someone else has created, or you created previously, is maintaining the required packages to run the project. Knowing what packages are needed to run a particular project isn’t always obvious, and over time packages can change rendering code that once worked unusable.\n{renv} solves this problem by:\n\nkeeping track of the packages that are required for a particular project\nlogging the installed version of all of the packages\nmaintaining a per-project library of packages, so projects don’t interfere with one another\n\nIt’s a good idea to use {renv} for all projects."
+ "text": "It may be beneficial to gather your functions into a discrete package so that you and others can install and reuse them for other projects.\nThe {usethis} package has a number of shortcuts to help you set up a package. You can begin with usethis::create_package() to generate the basic structure and then usethis::use_r() and usethis::use_test() to add scripts and {testthat} tests into the correct folder structure.\nWe recommend you include a number of extra files in your package to make its purpose clear and to encourage collaboration. This includes:\n\na README file to describe the purpose of your package and provide some simple examples, which you can set up with usethis::use_readme_md() or usethis::use_readme_rmd() if it contains R code that you want to execute\na NEWS file with usethis::use_news_md(), which is used to communicate the latest changes to your package\na CODE_OF_CONDUCT file with usethis::use_code_of_conduct() to explain to collaborators how they should engage with your project\nvignettes with usethis::use_vignette(), which are short documents that let you mix code with prose to describe how to use the functions in your package\n\nWe recommend semantic versioning as you develop your package. In this system, the version number is composed of three numbers (like ‘1.2.3’) that are incremented, from left to right, when you make major breaking changes, minor changes, and patches or bug fixes. The usethis::use_version() function can help you to do this and to automatically update the DESCRIPTION and NEWS files.\nUse {pkgdown} to autogenerate a website from your package’s documentation. This lets people see your documentation rendered nicely on the internet, without the need to install the package. You can serve this site on the web and update it automatically using GitHub Pages and GitHub Actions."
},
{
"objectID": "blogs/index.html",
diff --git a/sitemap.xml b/sitemap.xml
index 4912fe3..db5916f 100644
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,110 +2,110 @@
https://the-strategy-unit.github.io/data_science/blogs/posts/2023-04-26_alternative_remotes.html
- 2024-01-17T12:17:18.252Z
+ 2024-01-18T10:04:36.426Z
https://the-strategy-unit.github.io/data_science/blogs/posts/2023-04-26-reinstalling-r-packages.html
- 2024-01-17T12:17:16.976Z
+ 2024-01-18T10:04:35.142Z
https://the-strategy-unit.github.io/data_science/blogs/posts/2023-03-24_hotfix-with-git.html
- 2024-01-17T12:16:58.748Z
+ 2024-01-18T10:04:16.898Z
https://the-strategy-unit.github.io/data_science/style/git_and_github.html
- 2024-01-17T12:16:56.092Z
+ 2024-01-18T10:04:14.258Z
https://the-strategy-unit.github.io/data_science/style/data_storage.html
- 2024-01-17T12:16:55.408Z
+ 2024-01-18T10:04:13.470Z
https://the-strategy-unit.github.io/data_science/presentations/2023-08-24_coffee-and-coding_geospatial/index.html
- 2024-01-17T12:16:54.724Z
+ 2024-01-18T10:04:12.786Z
https://the-strategy-unit.github.io/data_science/presentations/2023-10-17_conference-check-in-app/index.html
- 2024-01-17T12:16:31.948Z
+ 2024-01-18T10:03:58.466Z
https://the-strategy-unit.github.io/data_science/presentations/2023-07-11_haca-nhp-demand-model/index.html
- 2024-01-17T12:16:31.156Z
+ 2024-01-18T10:03:57.666Z
https://the-strategy-unit.github.io/data_science/presentations/2023-02-01_what-is-data-science/index.html
- 2024-01-17T12:16:27.688Z
+ 2024-01-18T10:03:53.826Z
https://the-strategy-unit.github.io/data_science/presentations/2023-10-09_nhs-r_conf_sd_in_health_social_care/index.html
- 2024-01-17T12:16:25.832Z
+ 2024-01-18T10:03:52.010Z
https://the-strategy-unit.github.io/data_science/presentations/2023-03-23_coffee-and-coding/index.html
- 2024-01-17T12:16:19.271Z
+ 2024-01-18T10:03:45.354Z
https://the-strategy-unit.github.io/data_science/presentations/2023-05-15_text-mining/index.html
- 2024-01-17T12:16:15.523Z
+ 2024-01-18T10:03:41.318Z
https://the-strategy-unit.github.io/data_science/presentations/2023-03-23_collaborative-working/index.html
- 2024-01-17T12:16:14.691Z
+ 2024-01-18T10:03:40.546Z
https://the-strategy-unit.github.io/data_science/about.html
- 2024-01-17T12:16:13.407Z
+ 2024-01-18T10:03:39.314Z
https://the-strategy-unit.github.io/data_science/index.html
- 2024-01-17T12:16:14.235Z
+ 2024-01-18T10:03:40.142Z
https://the-strategy-unit.github.io/data_science/presentations/2023-05-23_data-science-for-good/index.html
- 2024-01-17T12:16:15.175Z
+ 2024-01-18T10:03:40.982Z
https://the-strategy-unit.github.io/data_science/presentations/index.html
- 2024-01-17T12:16:18.083Z
+ 2024-01-18T10:03:44.166Z
https://the-strategy-unit.github.io/data_science/presentations/2023-08-23_nhs-r_unit-testing/index.html
- 2024-01-17T12:16:21.996Z
+ 2024-01-18T10:03:48.090Z
https://the-strategy-unit.github.io/data_science/presentations/2023-03-09_coffee-and-coding/index.html
- 2024-01-17T12:16:27.228Z
+ 2024-01-18T10:03:53.370Z
https://the-strategy-unit.github.io/data_science/presentations/2023-02-23_coffee-and-coding/index.html
- 2024-01-17T12:16:28.024Z
+ 2024-01-18T10:03:54.162Z
https://the-strategy-unit.github.io/data_science/presentations/2023-03-09_midlands-analyst-rap/index.html
- 2024-01-17T12:16:31.500Z
+ 2024-01-18T10:03:58.014Z
https://the-strategy-unit.github.io/data_science/presentations/2023-08-02_mlcsu-ksn-meeting/index.html
- 2024-01-17T12:16:32.372Z
+ 2024-01-18T10:03:58.954Z
https://the-strategy-unit.github.io/data_science/style/style_guide.html
- 2024-01-17T12:16:55.076Z
+ 2024-01-18T10:04:13.138Z
https://the-strategy-unit.github.io/data_science/style/project_structure.html
- 2024-01-17T12:16:55.692Z
+ 2024-01-18T10:04:13.858Z
https://the-strategy-unit.github.io/data_science/blogs/index.html
- 2024-01-17T12:16:56.712Z
+ 2024-01-18T10:04:14.890Z
https://the-strategy-unit.github.io/data_science/blogs/posts/2024-01-17_nearest_neighbour.html
- 2024-01-17T12:17:15.840Z
+ 2024-01-18T10:04:33.994Z
https://the-strategy-unit.github.io/data_science/blogs/posts/2024-01-10-advent-of-code-and-test-driven-development.html
- 2024-01-17T12:17:17.280Z
+ 2024-01-18T10:04:35.462Z
diff --git a/style/project_structure.html b/style/project_structure.html
index a14d62d..d82bcb2 100644
--- a/style/project_structure.html
+++ b/style/project_structure.html
@@ -188,12 +188,13 @@
On this page
@@ -205,39 +206,53 @@
On this page
Project Structure
-
-- as a package
-- R/ for scripts
-- split files into separate scripts
-- use
{renv}
-- use
{targets}
-
-
-Use RStudio Projects
-RStudio projects are a great way to organise your analytical projects into discrete units that are easier to work on and share.
-
-
-Separate scripts
+
+RStudio Projects
+Analytical projects should be self-contained and portable. This means that all the materials required for an analysis should be organised into a single folder that can be shared in its entirety and be re-run by other people, ideally via GitHub.
+We recommend RStudio Projects as a system for creating standardised project structures that meet these goals. The {usethis} package contains a number of helper functions to help get you started, including usethis::create_project()
.
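As a rough sketch of that helper in use: the folder name my-analysis is only a placeholder, and the interactive() guard is an assumption added here so that the call runs only when pasted into the console.

```r
# Placeholder location for the new project; change this to suit
project_path <- file.path("~", "my-analysis")

if (interactive() && requireNamespace("usethis", quietly = TRUE)) {
  # Create the project folder with a basic skeleton, without switching to it
  usethis::create_project(project_path, open = FALSE)
}
```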
+
+Dependency management
+One of the most common issues you’ll face when using a project someone else has created, or you created previously, is maintaining the required packages to run the project. Knowing what packages are needed to run a particular project isn’t always obvious, and over time packages can change, rendering code that once worked unusable.
+The {renv}
R package helps solve this problem by:
+
+- Keeping track of the packages that are required for a particular project.
+- Logging the installed version of all of the packages.
+- Maintaining a per-project library of packages, so projects don’t interfere with one another.
+
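A typical {renv} session might look something like the sketch below. The has_lockfile() helper is hypothetical and exists only for this illustration; renv::init(), renv::restore() and renv::snapshot() are the package's own functions. The code assumes it is run from the project's root directory.

```r
# Hypothetical helper: TRUE if the project already has an {renv} lockfile
has_lockfile <- function(project = ".") {
  file.exists(file.path(project, "renv.lock"))
}

if (interactive() && requireNamespace("renv", quietly = TRUE)) {
  if (!has_lockfile()) {
    renv::init()     # first use: set up the project library and renv.lock
  } else {
    renv::restore()  # existing project: install the versions in renv.lock
  }
}

# After installing or updating packages, renv::snapshot() records the
# new versions in renv.lock so collaborators can restore them.
```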
-
-Functions
+
+Workflow management
+It’s helpful to split discrete analytical tasks into separate script files, which can make it easier to handle the codebase in context and provide an obvious order of operations. For example, 01_read.R
, 02_wrangle.R
, 03_model.R
, etc.
+You could still forget to re-run one of the numbered files, however, or it may take a long time to re-run all the steps again if you only make one small change to the code. This is where a workflow manager is useful.
+We recommend the {targets} R package as a workflow manager. You write a series of steps and {targets} automatically recognises all the relationships between functions and objects as a graph. This means {targets} knows the order that things should be run and knows which bits of code need to be re-run if there are upstream changes. It’s a well-documented and supported package.
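To give a flavour of what such a pipeline can look like, here is a minimal _targets.R sketch. The file path data/raw.csv, the column names x and y, and the functions read_data() and fit_model() are all assumptions made up for this example; in a real project the functions would normally live in the R folder.

```r
# _targets.R, saved in the project's root directory
library(targets)

# Hypothetical functions, defined inline to keep the sketch self-contained
read_data <- function(path) read.csv(path)
fit_model <- function(data) lm(y ~ x, data = data)

# Each target names an object and the code that produces it; {targets}
# works out the dependencies from the names used in each command
list(
  tar_target(raw_file, "data/raw.csv", format = "file"),
  tar_target(raw_data, read_data(raw_file)),
  tar_target(model, fit_model(raw_data))
)
```

With a file like this in place, targets::tar_make() runs the pipeline and skips any target whose upstream code and data are unchanged, and targets::tar_visnetwork() draws the dependency graph.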
-
-Standardise Folder Structures
+
+Functions
+It’s beneficial to convert code into discrete functions where possible. This makes it easier to:
+
+- reduce the chance of errors, because you’ll avoid repetitive and mistake-prone copy-pasting of code
+- understand your scripts, because code can be condensed into simpler calls that are easier to read
+- reuse your code, because functions allow you to consistently call the same code more than once and can be copied into other projects
+- debug, because the source of an error can be more easily traced and your code can be tested more easily
+
+Consider the DRY (Don’t Repeat Yourself) principle when deciding whether or not to convert some code into a function. It may be better to write a function if you’ve used the same piece of code more than once in an analysis, especially if it contains many lines.
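As a small illustration of the DRY principle, the sketch below replaces two copy-pasted rescaling lines with one function. The data frame and the function name rescale_to_unit() are invented for this example.

```r
# Before: the same logic is copied for each column
# df$a <- (df$a - min(df$a)) / (max(df$a) - min(df$a))
# df$b <- (df$b - min(df$b)) / (max(df$b) - min(df$b))

# After: one function, called wherever it is needed
rescale_to_unit <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

df <- data.frame(a = c(1, 5, 9), b = c(10, 20, 40))
df[] <- lapply(df, rescale_to_unit)  # apply to every column, keep the data frame

df$a  # 0.0 0.5 1.0
```

A fix to the rescaling logic now only needs to be made in one place.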
+Function names should be short but descriptive and should contain a verb that describes what the function does. For example, get_geospatial_data()
may be better than the generic get_data()
, which is certainly better than the uninformative data()
.
+In a project, it’s conventional to put your functions in a folder called R
in the project’s root directory. You can group functions into separate R scripts with meaningful names to make it easier to organise them (read-data.R
, model.R
, etc). You can then source()
these function scripts into your analytical scripts as required.
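One possible way to load everything in that folder at the top of an analytical script is sketched below. The helper name source_functions() is an assumption for this example, not part of any package.

```r
# Hypothetical helper: source every .R script found in a folder
source_functions <- function(dir = "R") {
  function_files <- list.files(dir, pattern = "\\.[Rr]$", full.names = TRUE)
  for (file in function_files) {
    source(file)  # by default, objects are created in the global environment
  }
  invisible(function_files)
}

# At the top of an analytical script, run from the project root:
source_functions("R")
```

If the folder is missing or empty, the call does nothing, so it is safe to keep in place while the project is still being set up.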
-
-{renv}
-One of the most common issues you will face when using a project someone else has created, or you created previously, is maintaining the required packages to run the project. Knowing what packages are needed to run a particular project isn’t always obvious, and over time packages can change rendering code that once worked unusable.
-{renv}
solves this problem by:
-
-- keeping track of the packages that are required for a particular project
-- logging the installed version of all of the packages
-- maintaining a per-project library of packages, so projects don’t interfere with one another
-
-It’s a good idea to use {renv}
for all projects.
-
-{targets}
+
+Packages
+It may be beneficial to gather your functions into a discrete package so that you and others can install and reuse them for other projects.
+The {usethis} package has a number of shortcuts to help you set up a package. You can begin with usethis::create_package()
to generate the basic structure and then usethis::use_r()
and usethis::use_test()
to add scripts and {testthat} tests into the correct folder structure.
+We recommend you include a number of extra files in your package to make its purpose clear and to encourage collaboration. This includes:
+
+- a README file to describe the purpose of your package and provide some simple examples, which you can set up with
usethis::use_readme_md()
or usethis::use_readme_rmd()
if it contains R code that you want to execute
+- a NEWS file with
usethis::use_news_md()
, which is used to communicate the latest changes to your package
+- a CODE_OF_CONDUCT file with
usethis::use_code_of_conduct()
to explain to collaborators how they should engage with your project
+- vignettes with
usethis::use_vignette()
, which are short documents that let you mix code with prose to describe how to use the functions in your package
+
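Put together, the helpers named above might be used roughly as follows. The package path, script name, vignette name and contact address are all placeholders, and the use of usethis::with_project() to point the helpers at the new package is a choice made for this sketch; in practice you would more likely open the package project and run the helpers one at a time in the console.

```r
# Placeholder path for the new package; change this to suit
pkg_path <- file.path("~", "mypackage")

if (interactive() && requireNamespace("usethis", quietly = TRUE)) {
  usethis::create_package(pkg_path, open = FALSE)

  # Run the remaining helpers with the new package as the active project
  usethis::with_project(pkg_path, {
    usethis::use_r("read-data")     # adds R/read-data.R
    usethis::use_test("read-data")  # adds tests/testthat/test-read-data.R
    usethis::use_readme_md()        # adds README.md
    usethis::use_news_md()          # adds NEWS.md
    usethis::use_code_of_conduct(contact = "name@example.org")
    usethis::use_vignette("mypackage")
  })
}
```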
+We recommend semantic versioning as you develop your package. In this system, the version number is composed of three numbers (like ‘1.2.3’) that are incremented, from left to right, when you make major breaking changes, minor changes, and patches or bug fixes. The usethis::use_version()
function can help you to do this and to automatically update the DESCRIPTION and NEWS files.
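To make the numbering rule concrete, the sketch below shows how each kind of change affects a version string. bump_version() is a hypothetical helper written in base R for illustration only; in a real package, usethis::use_version(which = "minor") and its siblings do the equivalent job on the DESCRIPTION file.

```r
# Hypothetical helper illustrating how each component is incremented
bump_version <- function(version, which = c("patch", "minor", "major")) {
  which <- match.arg(which)
  parts <- as.integer(strsplit(version, ".", fixed = TRUE)[[1]])
  stopifnot(length(parts) == 3, !anyNA(parts))
  names(parts) <- c("major", "minor", "patch")

  parts[which] <- parts[which] + 1L
  # Components to the right of the bumped one reset to zero
  if (which == "major") parts[c("minor", "patch")] <- 0L
  if (which == "minor") parts["patch"] <- 0L

  paste(parts, collapse = ".")
}

bump_version("1.2.3", "patch")  # "1.2.4"
bump_version("1.2.3", "minor")  # "1.3.0"
bump_version("1.2.3", "major")  # "2.0.0"
```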
+Use {pkgdown} to autogenerate a website from your package’s documentation. This lets people see your documentation rendered nicely on the internet, without the need to install the package. You can serve this site on the web and update it automatically using GitHub Pages and GitHub Actions.