From 407ee966470e8cb7e066a560ec6c28110dabc3c4 Mon Sep 17 00:00:00 2001
From: Carl Boettiger
Date: Mon, 9 Oct 2017 21:15:46 -0700
Subject: [PATCH] Initial commit

---
 .gitignore                     |   5 +
 .travis.yml                    |   8 ++
 DESCRIPTION                    |   3 +
 README.md                      |  34 ++++++
 assignment/fish-assignment.Rmd | 211 +++++++++++++++++++++++++++++++++
 tests/render_rmds.R            |   5 +
 6 files changed, 266 insertions(+)
 create mode 100644 .gitignore
 create mode 100644 .travis.yml
 create mode 100644 DESCRIPTION
 create mode 100644 README.md
 create mode 100644 assignment/fish-assignment.Rmd
 create mode 100644 tests/render_rmds.R

diff --git a/.gitignore b/.gitignore
new file mode 100644
index 0000000..f732fe3
--- /dev/null
+++ b/.gitignore
@@ -0,0 +1,5 @@
+.Rproj.user
+.Rhistory
+.RData
+.Ruserdata
+.Rbuildignore
diff --git a/.travis.yml b/.travis.yml
new file mode 100644
index 0000000..948b12b
--- /dev/null
+++ b/.travis.yml
@@ -0,0 +1,8 @@
+# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r
+
+language: R
+sudo: false
+cache: packages
+script:
+  - R -f tests/render_rmds.R
+
diff --git a/DESCRIPTION b/DESCRIPTION
new file mode 100644
index 0000000..90c2dae
--- /dev/null
+++ b/DESCRIPTION
@@ -0,0 +1,3 @@
+Package: compendium
+Version: 0.1.0
+Depends: rmarkdown, tidyverse
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..58fb008
--- /dev/null
+++ b/README.md
@@ -0,0 +1,34 @@
+
+*add travis-ci badge here*
+
+## Team Members:
+
+- full name, github handle
+- full name, github handle
+
+This repository is a template for your team's repository.
+
+## assignment
+
+All work for this assignment should be in the `assignment` directory. You will work in the `.Rmd` notebook, and commit your rendered output files (`.md` and associated files) in the `assignment` directory as well.
+
+## Special files
+
+All team repositories will also include most of the special files found here:
+
+### Common files
+
+- `README.md` this file, a general overview of the repository in markdown format.
+- `.gitignore` Optional file, ignores common file types we don't want to accidentally commit to GitHub. Most projects should use this.
+- `.Rproj` Optional, an R-Project file created by RStudio for its own configuration. Some people prefer to `.gitignore` this file.
+
+
+### Infrastructure for Testing
+
+- `.travis.yml`: A configuration file for automatically running [continuous integration](https://travis-ci.com) checks to verify reproducibility of all `.Rmd` notebooks in the repo. If all `.Rmd` notebooks can render successfully, the "Build Status" badge above will be green (`build success`), otherwise it will be red (`build failure`).
+- `DESCRIPTION` a metadata file for the repository, based on the R package standard. Its main purpose here is as a place to list any additional R packages/libraries needed for any of the `.Rmd` files to run.
+- `tests/render_rmds.R` an R script that is run to execute the above described tests, rendering all `.Rmd` notebooks.
+
+
+
+
diff --git a/assignment/fish-assignment.Rmd b/assignment/fish-assignment.Rmd
new file mode 100644
index 0000000..2da7b35
--- /dev/null
+++ b/assignment/fish-assignment.Rmd
@@ -0,0 +1,211 @@
+---
+output: github_document
+---
+
+```{r include = FALSE}
+knitr::opts_chunk$set(message = FALSE)
+```
+# Unit 3: Fisheries Collapse Module
+
+This module will focus on understanding and replicating
+fisheries stock assessment data and fisheries collapse. Follow along with Carl,
+as he live codes the first exercise and then complete the second as your group assignment.
+
+## The Database
+
+We will use data from the [RAM Legacy Stock Assessment Database](http://ramlegacy.marinebiodiversity.ca/ram-legacy-stock-assessment-database)
+
+First, load in the necessary libraries. Note that this time we need a package we
+haven't used before: `readxl`. This package is useful for reading in .xls or
+.xlsx files. As always, if you want more info on a package, run `?readxl` after
+loading it.
+
+```{r message = FALSE}
+library("tidyverse")
+library("readxl")
+```
+
+## Reading in the tables
+
+```{r}
+#download.file("https://depts.washington.edu/ramlegac/wordpress/databaseVersions/RLSADB_v3.0_(assessment_data_only)_excel.zip",
+# "ramlegacy.zip")
+path <- unzip("ramlegacy.zip") #unzip the .xls files
+sheets <- readxl::excel_sheets(path) #use the readxl package to identify sheet names
+ram <- lapply(sheets, readxl::read_excel, path = path) #read the data from all 3 sheets into a list
+names(ram) <- sheets # give the list of datatables their assigned sheet names
+
+## check your names
+names(ram)
+
+## check your data
+head(ram$area)
+
+```
+
+
+
+# Exercise 1: Investigating the North-Atlantic Cod
+
+First, we seek to replicate the following figure from the Millennium Ecosystem Assessment Project using the RAM data.
+
+![](http://berkeley.carlboettiger.info/espm-88b/fish/img/codcollapse.jpg)
+
+
+## Task 1: Joining the necessary data
+
+To replicate this plot, we need a table with the following columns: `"country"`, `"ssb_unit"`, `"catch_landings_unit"`, `"scientificname"`, `"commonname"`, `"year"`, `"ssb"`, and `"TC"`.
+
+Using the `select()` and `join()` functions you were introduced to in Module 1,
+build a tidy table with the desired columns.
+
+```{r}
+
+```
+
+## Task 2: Mapping the Area table to marine regions
+
+In order to replicate the collapse of Atlantic Cod,
+we need to be able to map the area table from the RAM database to the marine regions.
+
+*As an aside, this database is unclear about what kind of areas the `area` table is using; they do not appear to be LMEs, EEZs, or any other obvious marine region classification. Regardless, we will use them to extract the North American cod stocks.*
+
+Write code to pull all marine areas (listed in `ram$area`) that contain a certain substring
+in their name -- e.g. "Georges Bank".
+Hint: you may want to consider the functions `filter()` or `grep()`
+
+```{r}
+
+```
+
+We are interested in mapping the data from just the areas where Atlantic Cod are found.
+Using the table you built above, pull out distinct areas that contain
+Atlantic Cod populations into a new tidy table.
+Hint: you may want to use functions like `filter()` or `distinct()`
+
+```{r}
+
+```
+
+## Task 3: Subsetting our data by regional id
+
+Using bracket notation and/or the `filter()` and `pull()` functions, try pulling
+certain subsets of ids from your table of cod areas, e.g. the first 8 ids, or the ids of areas just within a certain country.
+
+Create a vector of ids of areas with Atlantic Cod and in Canada.
+
+```{r}
+
+```
+
+
+## Task 4: Plotting Total Catch in Canada
+
+Calculate and plot the catch in million tons (MT) of Atlantic Cod from
+Canada using the data table and vector of ids you created above.
+Hint: you may want to use functions like `group_by()`, `filter()`, and/or `summarise()`
+
+```{r }
+
+
+```
+
+**Question:** How does this graph compare to the one presented above?
+
+------
+
+# Exercise 2: Group Assignment
+
+## Stock Collapses
+
+We seek to replicate the temporal trend in stock declines shown in [Worm et al 2006](http://doi.org/10.1126/science.1132294):
+
+![](http://berkeley.carlboettiger.info/espm-88b/img/worm2006.png)
+
+**Question 1:** What years does this plot include? What is it plotting?
+
+## Task 1: Plotting total taxa caught worldwide 1950-2006
+
+Adapting the table you created in the first exercise, select and
+manipulate the necessary columns to plot the number of total taxa caught each year
+from 1950 until 2006 using `geom_point()`.
+
+Hint: you may want to use functions like `group_by()` and `tally()`, and be sure to
+carefully consider how to handle or omit missing values.
+
+```{r}
+
+```
+
+## Task 2: Removing incomplete datasets
+
+Species can either have missing data (within a series) or a time range
+that just doesn't span the full interval. Grouping by stockid instead of year,
+build a character vector containing only those stockids that have data for the
+full range (1950-2006).
+
+
+```{r}
+
+```
+
+**Question 2:** How many taxa have data for the full range?
+
+```{r}
+
+```
+
+
+## Task 3: Which fisheries have collapsed?
+
+A fishery may be considered *collapsed* when total catch (TC) falls
+below 10% of its peak. For those stocks with complete data sets, create a new
+tidy table including columns: `stockid`, `TC`, `year`, `collapsed`, and `cumulative`,
+where `collapsed` is a logical (TRUE or FALSE) for whether or not that fishery could
+be considered collapsed in that year, and `cumulative` is the count of total years
+the fishery has been collapsed at that point in time.
+
+```{r}
+
+```
+
+## Task 4: Plotting total catch
+
+Using `geom_area()`, plot the TC per stockid across all years.
+```{r}
+
+```
+
+## Task 5: Calculating percent collapsed
+
+To replicate the original plot, we must calculate the percent of taxa
+collapsed over time.
+Using the `summarise()` function, and only the core stocks that have data
+across the full interval, build a new tidy table
+that gives the fraction of all stocks that are collapsed in each year and
+include a cumulative column that gives the fraction of all years (between 1950 and each year)
+that have experienced at least one collapse.
+
+Hint: when logical vectors are summed or converted to numerics, TRUE = 1 and FALSE = 0.
+
+```{r}
+
+```
+
+## Task 6: Plotting proportion collapsed over time.
+
+Using `geom_line()` twice to plot two individual lines (of different
+colors please), plot the cumulative number of collapsed fisheries through time
+and the fraction of collapsed fisheries through time on the same graph.
+
+Hint: try using `scale_y_reverse()` to flip the y axis for a different perspective
+on these fractions.
+
+```{r}
+
+
+```
+
+**Question 3:** What does this graph show us? How is it presenting information differently than the original graph for this exercise? Is it presenting the same information?
+
+
diff --git a/tests/render_rmds.R b/tests/render_rmds.R
new file mode 100644
index 0000000..4f5f686
--- /dev/null
+++ b/tests/render_rmds.R
@@ -0,0 +1,5 @@
+
+## call rmarkdown on all .Rmd files
+f <- list.files(recursive = TRUE)
+Rmds <- f[grepl("\\.Rmd$", f)]
+lapply(Rmds, rmarkdown::render)
\ No newline at end of file