update mancofibdata with area and work with fibmatrix and fibmatmap
fawda123 committed Oct 11, 2024
1 parent 6bf2756 commit e424fac
Showing 13 changed files with 219 additions and 46 deletions.
78 changes: 56 additions & 22 deletions R/anlz_fibmap.R
@@ -1,16 +1,22 @@
#' Assign threshold categories to Fecal Indicator Bacteria (FIB) data
#'
#' @param fibdata input FIB \code{data.frame} as returned by \code{\link{read_importfib}}
#' @param fibdata input FIB \code{data.frame} as returned by \code{\link{read_importfib}} or \code{\link{read_importwqp}}, see details
#' @param yrsel optional numeric value to filter output by years in \code{fibdata}
#' @param mosel optional numeric value to filter output by month in \code{fibdata}
#' @param areasel optional character string to filter output by stations in the \code{area} column of \code{fibdata}, see details
#' @param assf logical indicating if the data are further processed as a simple features object with additional columns for \code{\link{show_fibmap}}
#'
#' @details This function is used to create FIB categories for mapping using \code{\link{show_fibmap}}. Categories based on relevant thresholds are assigned to each observation. The categories are specific to E. coli or Enterococcus and are assigned based on the station class as freshwater (\code{class} as 1 or 3F) or marine (\code{class} as 2 or 3M), respectively. A station is categorized into one of four ranges defined by the thresholds as noted in the \code{cat} column of the output, with corresponding colors appropriate for each range as noted in the \code{col} column of the output.
#'
#' The \code{areasel} argument can indicate valid entries in the \code{area} column of \code{fibdata}. For example, use either \code{"Alafia River"} or \code{"Hillsborough River"} for the corresponding river basins, where rows in \code{fibdata} are filtered based on the the selection. All stations are returned if this argument is set as \code{NULL} (default). The Alafia River basin includes values in the \code{area} column of \code{fibdata} as \code{"Alafia River"} and \code{"Alafia River Tributary"}. The Hillsborough River basin includes values in the \code{area} column of \code{fibdat} as \code{"Hillsborough River"}, \code{"Hillsborough River Tributary"}, \code{"Lake Thonotosassa"}, \code{"Lake Thonotosassa Tributary"}, and \code{"Lake Roberta"}. Not all areas may be present based on the selection. All valid options for \code{areasel} include \code{"Alafia River"}, \code{"Hillsborough River"}, \code{"Big Bend"}, \code{"Cockroach Bay"}, \code{"East Lake Outfall"}, \code{"Hillsborough Bay"}, \code{"Little Manatee"}, \code{"Lower Tampa Bay"}, \code{"McKay Bay"}, \code{"Middle Tampa Bay"}, \code{"Old Tampa Bay"}, \code{"Palm River"}, \code{"Tampa Bypass Canal"}, or \code{"Valrico Lake"}. One to any of the options can be used.
#' Data from Manatee County (21FLMANA_WQX) returned by \code{\link{read_importwqp}} can be used with this function. Data from other organizations have not been tested.
#'
#' @return A \code{data.frame} if similar to \code{fibdata} if \code{assf = FALSE} with additional columns describing station categories and optionally filtered by arguments passed to the function. A \code{sf} object if \code{assf = TRUE} with additional columns for \code{\link{show_fibmap}}.
#' The \code{areasel} argument can indicate valid entries in the \code{area} column of \code{fibdata} (from \code{\link{read_importfib}}) or \code{mancofibdata} (from \code{\link{read_importwqp}}). For example, use either \code{"Alafia River"} or \code{"Hillsborough River"} for the corresponding river basins, where rows in \code{fibdata} are filtered based on the selection. All stations are returned if this argument is set as \code{NULL} (default). The Alafia River basin includes values in the \code{area} column of \code{fibdata} as \code{"Alafia River"} and \code{"Alafia River Tributary"}. The Hillsborough River basin includes values in the \code{area} column of \code{fibdata} as \code{"Hillsborough River"}, \code{"Hillsborough River Tributary"}, \code{"Lake Thonotosassa"}, \code{"Lake Thonotosassa Tributary"}, and \code{"Lake Roberta"}. Not all areas may be present based on the selection.
#'
#' All valid options for \code{areasel} for \code{fibdata} include \code{"Alafia River"}, \code{"Hillsborough River"}, \code{"Big Bend"}, \code{"Cockroach Bay"}, \code{"East Lake Outfall"}, \code{"Hillsborough Bay"}, \code{"Little Manatee"}, \code{"Lower Tampa Bay"}, \code{"McKay Bay"}, \code{"Middle Tampa Bay"}, \code{"Old Tampa Bay"}, \code{"Palm River"}, \code{"Tampa Bypass Canal"}, or \code{"Valrico Lake"}. One or more of the options can be used.
#'
#' Valid entries for \code{areasel} for \code{mancofibdata} include 'Big Slough', 'Bowlees Creek', 'Braden River', 'Bud Slough', 'Cedar Creek', 'Clay Gully', 'Cooper Creek', 'Curiosity Creek', 'Frog Creek', 'Gamble Creek', 'Gap Creek', 'Gates Creek', 'Gilley Creek', 'Hickory Hammock Creek', 'Lake Manatee', 'Little Manatee River', 'Lower Manatee River', 'Lower Tampa Bay', 'Manatee River Estuary', 'Mcmullen Creek', 'Mill Creek', 'Mud Lake Slough', 'Myakka River', 'Nonsense Creek', 'Palma Sola Bay', 'Piney Point Creek', 'Rattlesnake Slough', 'Sugarhouse Creek', 'Upper Manatee River', 'Ward Lake', or 'Williams Creek'. One or more of the options can be used.
#'
#' @return A \code{data.frame} similar to \code{fibdata} or \code{mancofibdata} if \code{assf = FALSE}, with additional columns describing station categories and optionally filtered by arguments passed to the function. An \code{sf} object if \code{assf = TRUE}, with additional columns for \code{\link{show_fibmap}}.
#'
#' @export
#'
@@ -31,24 +37,52 @@ anlz_fibmap <- function(fibdata, yrsel = NULL, mosel = NULL, areasel = NULL, ass

cols <- c('#2DC938', '#E9C318', '#EE7600', '#CC3231')

out <- fibdata %>%
select(area, epchc_station, class, yr, mo, Latitude, Longitude, ecoli, entero) %>%
dplyr::mutate(
ind = dplyr::case_when(
class %in% c('1', '3F') ~ 'E. coli',
class %in% c('2', '3M') ~ 'Enterococcus'
),
cat = dplyr::case_when(
ind == 'E. coli' ~ cut(ecoli, breaks = levs$ecolilev, right = F, labels = levs$ecolilbs),
ind == 'Enterococcus' ~ cut(entero, breaks = levs$enterolev, right = F, levs$enterolbs)
),
col = dplyr::case_when(
ind == 'E. coli' ~ cut(ecoli, breaks = levs$ecolilev, right = F, labels = cols),
ind == 'Enterococcus' ~ cut(entero, breaks = levs$enterolev, right = F, cols)
),
col = as.character(col)
)
# check if epchc data
isepchc <- exists("epchc_station", fibdata)

# check if manco data
ismanco <- exists("manco_station", fibdata)

if(isepchc)
out <- fibdata %>%
select(area, station = epchc_station, class, yr, mo, Latitude, Longitude, ecoli, entero) %>%
dplyr::mutate(
ind = dplyr::case_when(
class %in% c('1', '3F') ~ 'E. coli',
class %in% c('2', '3M') ~ 'Enterococcus'
),
cat = dplyr::case_when(
ind == 'E. coli' ~ cut(ecoli, breaks = levs$ecolilev, right = F, labels = levs$ecolilbs),
ind == 'Enterococcus' ~ cut(entero, breaks = levs$enterolev, right = F, levs$enterolbs)
),
col = dplyr::case_when(
ind == 'E. coli' ~ cut(ecoli, breaks = levs$ecolilev, right = F, labels = cols),
ind == 'Enterococcus' ~ cut(entero, breaks = levs$enterolev, right = F, cols)
),
col = as.character(col)
)

if(ismanco)
out <- fibdata %>%
dplyr::select(area, station = manco_station, class, yr, mo, Latitude, Longitude, var, val) %>%
dplyr::filter(var %in% c('ecoli', 'entero')) %>%
# reshape long to wide so ecoli and entero are separate columns
tidyr::pivot_wider(names_from = 'var', values_from = 'val') %>%
dplyr::mutate(
ind = dplyr::case_when(
class %in% 'Fresh' ~ 'E. coli',
class %in% 'Estuary' ~ 'Enterococcus'
),
cat = dplyr::case_when(
ind == 'E. coli' ~ cut(ecoli, breaks = levs$ecolilev, right = F, labels = levs$ecolilbs),
ind == 'Enterococcus' ~ cut(entero, breaks = levs$enterolev, right = F, levs$enterolbs)
),
col = dplyr::case_when(
ind == 'E. coli' ~ cut(ecoli, breaks = levs$ecolilev, right = F, labels = cols),
ind == 'Enterococcus' ~ cut(entero, breaks = levs$enterolev, right = F, cols)
),
col = as.character(col)
)
# filter by year
if(!is.null(yrsel)){
yrsel <- match.arg(as.character(yrsel), unique(out$yr))
@@ -64,7 +98,7 @@ anlz_fibmap <- function(fibdata, yrsel = NULL, mosel = NULL, areasel = NULL, ass
}

# filter by area
if(!is.null(areasel)){
if(!is.null(areasel) && isepchc){
areasls <- list(
`Alafia River` = c('Alafia River', 'Alafia River Tributary'),
`Hillsborough River` = c('Hillsborough River', 'Hillsborough River Tributary', 'Lake Thonotosassa',
@@ -133,7 +167,7 @@ anlz_fibmap <- function(fibdata, yrsel = NULL, mosel = NULL, areasel = NULL, ass
out <- tomap %>%
dplyr::mutate(
grp = factor(grp, levels = levs),
lab = paste0('<html>Station Number: ', epchc_station, '<br>Class: ', cls, ' (<i>', ind, '</i>)<br> Category: ', cat, ' (', conc, '/100mL)</html>')
lab = paste0('<html>Station Number: ', station, '<br>Class: ', cls, ' (<i>', ind, '</i>)<br> Category: ', cat, ' (', conc, '/100mL)</html>')
) %>%
dplyr::select(-colnm, -indnm)
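
The Manatee County branch of `anlz_fibmap` above reshapes the long-format data so each indicator has its own column, picks an indicator from the station class, and bins concentrations with left-closed intervals (`right = F`). A minimal base R sketch of that flow follows. It is an illustration only: the toy data and every threshold value are assumptions, because the `levs` object is not shown in this diff, and base `reshape()` stands in for the pivot step.

```r
# Hypothetical break points; the real values come from `levs`, not shown here
ecolilev <- c(-Inf, 126, 410, 1000, Inf)
enterolev <- c(-Inf, 35, 130, 1000, Inf)
cols <- c('#2DC938', '#E9C318', '#EE7600', '#CC3231')

# Toy long-format data shaped like the Manatee County input (made-up values)
long <- data.frame(
  station = c('BC1', 'BC1', 'UM1', 'UM1'),
  class = c('Estuary', 'Estuary', 'Fresh', 'Fresh'),
  var = c('ecoli', 'entero', 'ecoli', 'entero'),
  val = c(50, 20, 410, 30)
)

# long to wide: one column per indicator
wide <- reshape(long, idvar = c('station', 'class'), timevar = 'var',
  direction = 'wide')
names(wide) <- sub('^val\\.', '', names(wide))

# indicator by station class, NA for any other class
ind_lookup <- c(Fresh = 'E. coli', Estuary = 'Enterococcus')
wide$ind <- unname(ind_lookup[as.character(wide$class)])

# left-closed bins: a value equal to a break falls in the upper bin
bin <- function(x, breaks, labels)
  as.character(cut(x, breaks = breaks, right = FALSE, labels = labels))

isec <- wide$ind == 'E. coli'
wide$col <- ifelse(isec,
  bin(wide$ecoli, ecolilev, cols),
  bin(wide$entero, enterolev, cols)
)
```

With these assumed breaks, the freshwater value of 410 sits exactly on a break and is assigned to the bin that starts at 410, which is the effect of `right = FALSE`.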

40 changes: 34 additions & 6 deletions R/anlz_fibmatrix.R
@@ -2,7 +2,7 @@
#'
#' Analyze Fecal Indicator Bacteria categories over time by station or bay segment
#'
#' @param fibdata input data frame as returned by \code{\link{read_importfib}} or \code{\link{read_importentero}}
#' @param fibdata input data frame as returned by \code{\link{read_importfib}}, \code{\link{read_importentero}}, or \code{\link{read_importwqp}}, see details
#' @param yrrng numeric vector indicating min, max years to include, defaults to range of years in data, see details
#' @param stas optional vector of stations to include, see details
#' @param bay_segment optional vector of bay segment names to include, supersedes \code{stas} if provided, see details
@@ -25,6 +25,8 @@
#'
#' The default stations for fecal coliform data are those used in TBEP report #05-13 (\url{https://drive.google.com/file/d/1MZnK3cMzV7LRg6dTbCKX8AOZU0GNurJJ/view}) for the Hillsborough River Basin Management Action Plan (BMAP) subbasins if \code{bay_segment} is \code{NULL} and the input data are from \code{\link{read_importfib}}. These include Blackwater Creek (WBID 1482, EPC stations 143, 108), Baker Creek (WBID 1522C, EPC station 107), Lake Thonotosassa (WBID 1522B, EPC stations 135, 118), Flint Creek (WBID 1522A, EPC station 148), and the Lower Hillsborough River (WBID 1443E, EPC stations 105, 152, 137). Other stations can be plotted using the \code{stas} argument.
#'
#' Input from \code{\link{read_importwqp}} for Manatee County (21FLMANA_WQX) FIB data can also be used. The function has not been tested for other organizations.
#'
#' @export
#'
#' @importFrom dplyr "%>%"
@@ -47,9 +49,8 @@
#' anlz_fibmatrix(enterodata, indic = 'entero', lagyr = 1, subset_wetdry = "wet",
#' temporal_window = 2, wet_threshold = 0.5)
#'
#' # subset to only dry samples
#' anlz_fibmatrix(enterodata, indic = 'entero', lagyr = 1, subset_wetdry = "dry",
#' temporal_window = 2, wet_threshold = 0.5)
#' # Manatee County data
#' anlz_fibmatrix(mancofibdata, indic = 'fcolif', lagyr = 1)
anlz_fibmatrix <- function(fibdata, yrrng = NULL, stas = NULL, bay_segment = NULL, indic,
threshold = NULL, lagyr = 3, subset_wetdry = c("all", "wet", "dry"),
precipdata = NULL, temporal_window = NULL, wet_threshold = NULL,
@@ -63,6 +64,9 @@ anlz_fibmatrix <- function(fibdata, yrrng = NULL, stas = NULL, bay_segment = NUL
# check if epchc data
isepchc <- exists("epchc_station", fibdata)

# check if manco data
ismanco <- exists('manco_station', fibdata)

# checks for epc data
if(isepchc){

@@ -82,12 +86,36 @@ anlz_fibmatrix <- function(fibdata, yrrng = NULL, stas = NULL, bay_segment = NUL

}

# checks for manco data
if(ismanco){

# # assign default stations from TBEP report #05-13
# if(is.null(stas))
# stas <- c(143, 108, 107, 135, 118, 148, 105, 152, 137)

# error if subset_wetdry attempted with manco data
if(subset_wetdry %in% c('wet', 'dry'))
stop('Subset to wet or dry samples not supported for Manatee County data')

# error if user tries to subset by bay segment for manco
if(!is.null(bay_segment))
stop('Bay segment subsetting not applicable for Manatee County data')

fibdata <- fibdata %>%
dplyr::filter(!is.na(val)) %>%
dplyr::filter(var %in% indic) %>%
dplyr::rename(station = manco_station) %>%
dplyr::select(-qual, -uni, -Sample_Depth_m, -class, -var)
names(fibdata)[names(fibdata) == 'val'] <- indic

}
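
The Manatee County preprocessing in the block above can be sketched in base R as follows. This is an approximation under stated assumptions, not the package code: `prep_manco` is a hypothetical helper name, and the toy data frame (including its unit string) is made up to mirror the columns referenced in the diff.

```r
# Hypothetical helper mirroring the filter / rename / drop steps above
prep_manco <- function(fibdata, indic) {
  # keep non-missing values for the requested indicator only
  keep <- !is.na(fibdata$val) & fibdata$var %in% indic
  out <- fibdata[keep, , drop = FALSE]
  # use the generic station column name expected downstream
  names(out)[names(out) == 'manco_station'] <- 'station'
  # drop columns that are not needed after filtering
  drop <- c('qual', 'uni', 'Sample_Depth_m', 'class', 'var')
  out <- out[, !names(out) %in% drop, drop = FALSE]
  # the value column takes the name of the indicator
  names(out)[names(out) == 'val'] <- indic
  out
}

# Made-up rows with the same column names as the long-format input
toy <- data.frame(
  manco_station = c('BC1', 'BC1', 'UM1'),
  yr = c(2023, 2023, 2023),
  class = c('Estuary', 'Estuary', 'Fresh'),
  var = c('fcolif', 'entero', 'fcolif'),
  val = c(120, 40, NA),
  uni = 'cfu/100mL',
  qual = NA_character_,
  Sample_Depth_m = 0.5
)

res <- prep_manco(toy, 'fcolif')
```

Only the non-missing `fcolif` row survives, and its value ends up in a column named `fcolif`, which matches the wide layout the rest of the function appears to expect.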

# checks for non-epc data
if(!isepchc){
if(!isepchc && !ismanco){

# check if user tries indic fcolif for enterodata
if(indic == 'fcolif')
stop('fcolif not a valid indicator for non-epchc data')
stop('fcolif not a valid indicator for these data')

# check bay segments
if(!is.null(bay_segment)){
3 changes: 2 additions & 1 deletion R/mancofibdata.R
@@ -2,7 +2,7 @@
#'
#' Manatee County FIB data as of 20241011
#'
#' @format A data frame with 12765 rows and 12 variables:
#' @format A data frame with 12765 rows and 13 variables:
#' \describe{
#' \item{manco_station}{chr, Station name}
#' \item{SampleTime}{POSIXct, Date/time of sampling}
@@ -16,6 +16,7 @@
#' \item{val}{num, Value of variable}
#' \item{uni}{num, Units of variable}
#' \item{qual}{num, Qualifier code}
#' \item{area}{chr, Location name based on USF Water Atlas waterbody name}
#' }
#'
#' @concept data
32 changes: 31 additions & 1 deletion R/read_formwqp.R
@@ -8,7 +8,7 @@
#'
#' @return A data frame containing formatted water quality and station metadata
#'
#' @details This function is used by \code{\link{read_importwqp}} to combine, format, and process data (\code{res}) and station metadata (\code{sta}) obtained from the Water Quality Portal for the selected county and data type. The resulting data frame includes the date, time, station identifier, latitude, longitude, variable name, value, unit, and quality flag.
#' @details This function is used by \code{\link{read_importwqp}} to combine, format, and process data (\code{res}) and station metadata (\code{sta}) obtained from the Water Quality Portal for the selected county and data type. The resulting data frame includes the date, time, station identifier, latitude, longitude, variable name, value, unit, and quality flag. Manatee County FIB data (21FLMANA_WQX) will also include an \code{area} column indicating the waterbody name as used by the USF Water Atlas.
#'
#' @concept read
#'
@@ -190,6 +190,36 @@ read_formwqp <- function(res, sta, org, type, trace = F){
Sample_Depth_m, var, val, uni, qual) %>%
unique()

# add station areas if fib and manatee county
if(type == 'fib' && org == '21FLMANA_WQX'){

tomtch <- data.frame(
station = c("396", "BC1", "BC2", "BC41",
"BL01", "BL201", "BR1", "BR2", "BR3", "BU01A", "CC1", "CH1",
"D1", "D3", "ER1", "ER2", "FC1", "GA1", "GC1", "GC2", "GP", "LL1",
"LM3", "LM4", "LM5", "LM6", "MC1", "MC2", "MM", "MR1", "MR2",
"MS01", "MS02", "MY01", "MY02A", "MY04", "PP1", "SC1", "TS1",
"TS2", "TS3", "TS4", "TS5", "TS6", "TS7", "UM1", "UM2", "UM3",
"UM4", "WC1"),
area = c("Lower Tampa Bay", "Bowlees Creek", "Bowlees Creek",
"Bowlees Creek", "Big Slough", "Big Slough", "Braden River",
"Braden River", "Braden River", "Bud Slough", "Curiosity Creek",
"Palma Sola Bay", "Little Manatee River", "Little Manatee River",
"Ward Lake", "Ward Lake", "Frog Creek", "Gates Creek", "Gamble Creek",
"Gamble Creek", "Gap Creek", "Braden River", "Braden River",
"Manatee River Estuary", "Lower Manatee River", "Mill Creek",
"Mill Creek", "Mill Creek", "Mcmullen Creek", "Clay Gully", "Myakka River",
"Mud Lake Slough", "Mud Lake Slough", "Myakka River", "Myakka River",
"Myakka River", "Piney Point Creek", "Sugarhouse Creek", "Rattlesnake Slough",
"Cedar Creek", "Cooper Creek", "Cooper Creek", "Hickory Hammock Creek",
"Braden River", "Nonsense Creek", "Lower Manatee River", "Lake Manatee",
"Gilley Creek", "Upper Manatee River", "Williams Creek")
)

out <- dplyr::left_join(out, tomtch, by = 'station', relationship = 'many-to-one')

}

# rename station column based on org
names(out)[names(out) %in% 'station'] <- stanm
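
The station-to-area step above is a many-to-one lookup join. A small base R sketch of the same idea is shown here, using `match()` in place of `dplyr::left_join()`. The three lookup pairs are copied from the `tomtch` table in the diff; the `val` column and the station `ZZ9` are made up to show how an unmatched station behaves.

```r
# Subset of the station/area lookup from the diff
tomtch <- data.frame(
  station = c('396', 'BC1', 'BL01'),
  area = c('Lower Tampa Bay', 'Bowlees Creek', 'Big Slough')
)

# Toy results table; 'ZZ9' is a hypothetical station absent from the lookup
out <- data.frame(
  station = c('BC1', 'BC1', 'BL01', 'ZZ9'),
  val = c(10, 20, 30, 40)
)

# many-to-one requires each station to appear once in the lookup
stopifnot(!anyDuplicated(tomtch$station))

# left join: every row of `out` is kept, unmatched stations get NA
out$area <- tomtch$area[match(out$station, tomtch$station)]
```

As with a left join, stations missing from the lookup are retained with `NA` in `area`, so any later filtering on `area` would silently drop them.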
