-
Notifications
You must be signed in to change notification settings - Fork 0
/
3.workflows-solutions.Rmd
60 lines (49 loc) · 2.03 KB
/
3.workflows-solutions.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
---
title: "Solutions to exercise: workflows, filtering and sorting"
author: "Mark Dunning and Matt Eldridge"
date: '`r format(Sys.time(), "Last modified: %d %b %Y")`'
output: html_document
---
## Part I -- Workflows using pipes
1. Read in the patients dataset and rewrite the following cleaning steps as a workflow using the `%>%` operator.
```{r eval = FALSE}
library(tidyverse)
patients <- read_tsv("patient-data.txt")
patients <- mutate(patients, Smokes = Smokes %in% c("TRUE", "Yes"))
patients <- mutate(patients, Height = as.numeric(str_remove(Height, pattern = "cm$")))
patients <- mutate(patients, Weight = as.numeric(str_remove(Weight, pattern = "kg$")))
patients <- mutate(patients, BMI = Weight / (Height / 100) ** 2)
patients <- mutate(patients, Overweight = BMI > 25)
```
```{r}
library(tidyverse)
patients <- read_tsv("patient-data.txt") %>%
mutate(Smokes = Smokes %in% c("TRUE", "Yes")) %>%
mutate(Height = as.numeric(str_remove(Height, pattern = "cm$"))) %>%
mutate(Weight = as.numeric(str_remove(Weight, pattern = "kg$"))) %>%
mutate(BMI = Weight / (Height / 100) ** 2) %>%
mutate(Overweight = BMI > 25)
patients
```
2. Add a step to the workflow to round the Height, Weight and BMI to 1 decimal place.
```{r}
patients <- read_tsv("patient-data.txt") %>%
mutate(Smokes = Smokes %in% c("TRUE", "Yes")) %>%
mutate(Height = as.numeric(str_remove(Height, pattern = "cm$"))) %>%
mutate(Weight = as.numeric(str_remove(Weight, pattern = "kg$"))) %>%
mutate(BMI = Weight / (Height / 100) ** 2) %>%
mutate(Overweight = BMI > 25) %>%
mutate_at(vars(Height, Weight, BMI), round, digits = 1)
patients
```
## Part II - Filtering rows
3. Filter for female patients from New York or New Jersey.
```{r}
filter(patients, Sex == "Female", State == "New York" | State == "New Jersey")
filter(patients, Sex == "Female", State %in% c("New York", "New Jersey"))
filter(patients, Sex == "Female", str_starts(State, "New "))
```
4. Filter for overweight smokers that are still alive.
```{r}
filter(patients, Overweight & Smokes == "Smoker" & !Died)
```