---
title: 'Lab 5: Streamflow Analysis'
author: "Gabriella Zuccolotto"
date: "2022-11-11"
output:
  html_document: default
  pdf_document: default
---
```{r setup, include=FALSE}
# load libraries into R
library(tidyverse)
library(ggplot2)
library(lubridate)
library(dataRetrieval)
library(zoo)
library(viridis)
library(knitr)
library(rmarkdown)
```
```{r pre-lab, include=TRUE}
#####################################################################
### PRE-LAB: Load data and clean
#####################################################################
### Load daily river discharge data from the internet using the readNWISdv function
### from the dataRetrieval package. This function takes a USGS site number and a
### parameter code (here "00060", the code for discharge) and loads all available
### daily discharge values over the period of record. The column with discharge (ft3/s) data
### will be named X_00060_00003. These are the same sites from Lab #1, plus the Rio Grande.
### RUN THIS and finish the other two.
# Load French Creek near Wattsburg, PA
fr <- readNWISdv(siteNumbers = "03021350", parameterCd = "00060")
# Load Allegheny River near Natrona, PA
al <- readNWISdv(siteNumbers = "03049500", parameterCd = "00060")
# Load data for the Rio Grande
rio <- readNWISdv(siteNumbers = "08330000", parameterCd = "00060")
### Change the name of column X_00060_00003 to "Q", and add a column called "Name"
### holding the river name, so sites can be identified without the USGS site number. Do this for each site.
### (HINT: you could use the rename() function, and mutate() to add a column)
### FINISH THIS CODE
fr <- fr %>%
  rename(Q = X_00060_00003) %>%
  mutate(Name = "French Creek")
al <- al %>%
  rename(Q = X_00060_00003) %>%
  mutate(Name = "Allegheny River")
rio <- rio %>%
  rename(Q = X_00060_00003) %>%
  mutate(Name = "Rio Grande")
### Now we can merge the three sites and do all analyses on them at once.
### ADD a column for day of year (doy) AND a column for year.
### Let's also make sure there are no NAs in the Q data. The "!" means "is not", so
### !is.na() means "is not NA". We are filtering to only rows that have no NA in the Q column.
### FINISH code:
data <- bind_rows(fr, al, rio) %>%
  mutate(doy = lubridate::yday(Date)) %>%
  mutate(year = lubridate::year(Date)) %>%
  filter(!is.na(Q))
```
```{r analysis 1a, include=TRUE}
####################################################################################
### ANALYSIS 1: Flow-Duration Analysis
####################################################################################
### Plot 1a: Plot the entire time series of each river using ggplot and facet_wrap
ggplot(data = data, aes(x = Date, y = Q)) +
  labs(title = "Streamflow Comparisons",
       y = "Q (cfs)", x = "Date") +
  geom_line() +
  facet_wrap(~ Name, scales = "free")
```
##### QUESTION 1a: Make a few brief observations about the time series. Do you notice any patterns or trends in high flow, low flow, or anything else? If so, hypothesize why.
The Rio Grande, a large river in the southwestern United States, has a discharge profile more similar to that of French Creek than to that of the Allegheny River, even though French Creek is much smaller. This reflects the climate differences between the northeastern and southwestern United States. In the northeast, the Allegheny River and its small tributary, French Creek, both exhibit flashy, rain-driven discharge patterns. In the Rio Grande, discharge values are lower because of higher evaporation potential and generally drier conditions than in the Allegheny and French Creek watersheds. The smoother curve of the Rio Grande discharge data and its lack of flashy events compared with the northeastern rivers suggest that it may be a groundwater-driven, rather than rain-driven, system. Maximum discharge values in the Rio Grande also appear to be decreasing over time, which may indicate gradual drying.
```{r analysis 1b, include=TRUE}
### Calculate the return period and exceedance probability for all daily discharge data
### for all 3 sites. Use whatever approach you want, could use code from Lab 2
### HINT: Use the n() function to find the length of the data for each group
### FINISH THIS CODE
data_fdc <- data %>%
  group_by(Name) %>%
  arrange(desc(Q)) %>%               # largest flow first, so rank 1 = highest Q
  mutate(rank = 1:n()) %>%           # rank within each river
  mutate(Tr = (n() + 1) / rank) %>%
  mutate(prob = 1 / Tr) %>%          # exceedance probability
  ungroup()
### PLOT 1b: Make a Flow-Duration Curve of Q vs. exceedance probability for all 3 sites.
### You could use ggplot and facet_wrap to do both at the same time.
### Add nice x and y axis titles and put the y axis (e.g. Discharge) in log10, which is typical,
### but also look at the graph without a log10 transformation.
### scale_y_log10() keeps the axis labelled in cfs. Q = 0 has no finite log10 value,
### so expect a ggplot2 warning for zero-flow days.
ggplot(data = data_fdc, aes(x = prob, y = Q)) +
  labs(title = "Flow Duration Curves",
       y = "Discharge (cfs)", x = "Exceedance Probability") +
  geom_point() +
  scale_y_log10() +
  facet_wrap(~ Name, scales = "free")
```
##### QUESTION 1b: Interpret any differences in the FDCs between the three rivers, e.g., what could you infer just by looking at the shape of this graph?
Maximum flows are highest in the Allegheny, followed by the Rio Grande and then French Creek. The shape of the curves indicates that low flows occur frequently in the Rio Grande, which sometimes dries completely; these conditions are not reached in the other two systems, which flow year round. Because French Creek is a tributary of the Allegheny and the two experience the same climate conditions, their FDCs have very similar shapes but different magnitudes.
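The ranking behind the flow-duration curve can be checked by hand on a small vector. The sketch below is illustrative only: the five discharge values are made up, and it uses base R `rank()` instead of the dplyr pipeline, assuming the same plotting position, Tr = (n + 1) / rank.

```r
# Hypothetical daily discharge values (cfs), made up for illustration
q <- c(120, 45, 300, 80, 15)
n <- length(q)

# Rank flows from largest (rank 1) to smallest (rank n)
rnk <- rank(-q, ties.method = "first")

# Return period and exceedance probability, same formulas as the lab chunk
Tr <- (n + 1) / rnk
prob <- 1 / Tr

fdc_toy <- data.frame(Q = q, rank = rnk, Tr = Tr, prob = prob)
print(fdc_toy[order(fdc_toy$rank), ])
```

The largest flow (300 cfs) gets rank 1 and an exceedance probability of 1/6, while the smallest gets 5/6, so no flow is assigned a probability of exactly 0 or 1.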
```{r analysis 2, include=TRUE}
########################################################################
### ANALYSIS 2: Low Flow Analysis
#######################################################################
### The 1Q10 and 7Q10 are both hydrologically based design flows.
### The 1Q10 is the lowest 1-day average flow that occurs (on average) once every 10 years.
### The 7Q10 is the lowest 7-day average flow that occurs (on average) once every 10 years.
### Source: EPA https://www.epa.gov/ceam/definition-and-characteristics-low-flows#1Q10
### The 10 in 7Q10 means there is a 10 percent chance that the associated 7-day average flow
### or below will occur in any given year.
### To calculate the 7Q10 we need to calculate the 7 day (rolling) mean Q, and then
### find the minimum 7 day rolling mean Q that occurs within each year.
### So this is essentially the same as doing flow frequency analysis on peak flows
### to estimate floods, but instead we are looking at flow droughts.
### Fill in the missing arguments to calculate the 7 day rolling mean Q for each site
### look up the function by typing ?rollmean() into the console or searching it on the web
### FINISH THIS CODE
data_low <- data %>%
  group_by(Name) %>%
  mutate(xdaymean = rollmean(x = Q,
                             k = 7,
                             fill = NA,
                             na.rm = FALSE,
                             align = "right")) %>%
  ungroup()
### Let's make a quick plot for 1 year to see what a rolling mean looks like
### RUN THIS CODE
ggplot(data_low %>%
         filter(Date >= mdy("01-01-2018") & Date <= mdy("12-31-2018"))) +
  geom_line(aes(x = Date, y = Q), color = "black") +
  geom_line(aes(x = Date, y = xdaymean), color = "red") +
  facet_wrap(~ Name, scales = "free")
### CALCULATE the yearly minimum flow for each site and year. FILL IN the missing
### arguments
### FINISH THIS CODE
yearlyMin <- data_low %>%
  mutate(year = year(Date)) %>%
  group_by(Name, year) %>%
  summarize(minQ = min(xdaymean, na.rm = TRUE),
            lenDat = length(Q),
            lenNAs = sum(is.na(xdaymean))) %>%
  # keep only years with more than 328 daily values and under 10% missing rolling means
  filter(lenDat > 328 & lenNAs / lenDat < 0.1) %>%
  ungroup()
### CALCULATE the rank, return interval, and probability for the 7Q10
### for each site by grouping by name or site_no and using same calculations
### as before. For low flows, rank 1 is the LOWEST annual minimum, so prob is the
### chance that the annual minimum is at or below that flow in any given year.
### FINISH AND ADD NEW CODE
yearlyMin <- yearlyMin %>%
  group_by(Name) %>%
  mutate(rank = rank(minQ, ties.method = "first")) %>%
  mutate(Tr = (length(rank) + 1) / rank) %>%
  mutate(prob = 1 / Tr) %>%
  ungroup()
### PLOT 2: Plot the minimum Q vs. return interval for all three sites
ggplot(data = yearlyMin, aes(x = Tr, y = minQ)) +
  labs(y = "Minimum 7-day mean discharge (cfs)", x = "Return Interval (years)") +
  geom_point() +
  facet_wrap(~ Name, scales = "free")
```
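To see what a right-aligned 7-day rolling mean does, it can help to run it on a short series. This is a minimal sketch with made-up values. It uses base R `stats::filter()` as a stand-in for `zoo::rollmean(..., align = "right")`, which should give the same result on a series with no NAs.

```r
# Hypothetical 10-day discharge series (cfs), made up for illustration
q <- c(10, 20, 30, 40, 50, 60, 70, 80, 90, 100)

# Right-aligned 7-day mean: each value averages that day and the 6 days before it,
# so the first 6 entries have no complete window and are NA
roll7 <- as.numeric(stats::filter(q, rep(1 / 7, 7), sides = 1))
print(roll7)

# Lowest 7-day mean in the series, ignoring the leading NAs
min7 <- min(roll7, na.rm = TRUE)
print(min7)
```

The minimum of the rolling mean, not the minimum of the raw daily values, is what feeds the 7Q10 calculation.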
##### QUESTION 2a: Approximately what is the 7Q10 from looking at the graph of each river?
Allegheny River: 1500 cfs
French Creek: 5 cfs
Rio Grande: 2 cfs
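Reading the 7Q10 off a plot is approximate. One way to get a number is to interpolate the annual minimum flows at a return interval of 10 years. The sketch below uses 14 made-up annual minima, not the lab data, and linear interpolation with base R `approx()`, which is only one of several reasonable choices (fitting a distribution is another).

```r
# Hypothetical annual minimum 7-day flows (cfs) for 14 years, made up for illustration
minQ <- c(210, 100, 180, 130, 250, 160, 300, 190, 220, 170, 280, 150, 240, 200)
n <- length(minQ)

# For low flows, rank 1 = lowest annual minimum
rnk <- rank(minQ, ties.method = "first")
Tr <- (n + 1) / rnk

# Linearly interpolate the flow at a 10-year return interval
q7_10 <- approx(x = Tr, y = minQ, xout = 10)$y
print(q7_10)
```

In this toy series the 10-year value falls between the two lowest minima (100 and 130 cfs), whose return intervals are 15 and 7.5 years.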
##### QUESTION 2b: A company wants to build a brewery near the Allegheny River, and it will discharge about 1000 ft3/s of wastewater into the river at all times. The EPA rules state that this brewery discharge cannot make up more than 50% of the total river discharge at any given time. Looking at the plot of minimum 7 day Q vs. return interval, approximately what is the return interval for a river flow that is composed of 50% wastewater discharge? What is the probability the brewery will violate the rule during any given year? Would you give them the permit to discharge? Why or why not?
If the 1000 cfs of wastewater may make up at most 50% of the total discharge, the total discharge of the river cannot fall below 2000 cfs. From the plot, a minimum 7-day flow of 2000 cfs in the Allegheny River has a return interval of approximately 4 years. The probability that flow drops to this level in any given year is 1/T, where T is the return interval of 4 years, or 25%. I would not give them the permit to discharge, because a 25% chance of violating the limit each year is too high.
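The annual probability can also be extended to a multi-year horizon. This sketch takes the 4-year return interval read from the plot and assumes, as a simplification not made in the lab itself, that years are independent. The 10-year horizon is a hypothetical choice.

```r
T_return <- 4               # approximate return interval read from the plot (years)
p_annual <- 1 / T_return    # probability of a violation in any given year

horizon <- 10               # hypothetical planning horizon (years)
# Probability of at least one violation over the horizon, assuming independent years
p_any <- 1 - (1 - p_annual)^horizon

print(p_annual)
print(round(p_any, 3))
```

Under these assumptions, at least one violation within 10 years is very likely (about 94%), which is consistent with denying the permit.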
```{r analysis 3, include=TRUE}
######################################################################
### ANALYSIS 3: Flow regimes
######################################################################
### LOAD in some watershed characteristics data for the three sites
### click on the sites object and look at it.
### the drainage_area_va column is the watershed size in square miles (mi2).
### RUN THIS CODE
sites <- readNWISsite(siteNumbers = unique(data$site_no))
### PLOT 3: Make a plot of the annual flow hydrograph (Q vs day of year)
### where each year is color coded. Do this for each site.
### You could use ggplot and facet_wrap or whatever plotting method you prefer.
### Make any other plots or inspect the data that may help your interpretation
### WRITE NEW CODE
fun_color_range <- colorRampPalette(c("red", "yellow")) # Create color generating function
colorscheme <- fun_color_range(80)
# group = year draws one line per year instead of one zig-zag line through all years
ggp <- ggplot(data = data, aes(x = doy, y = Q, group = year, color = year)) +
  labs(y = "Discharge (cfs)", x = "Day of the Year") +
  geom_line() +
  facet_wrap(~ Name, scales = "free")
ggp + scale_colour_gradientn(colors = colorscheme)
```
##### QUESTION 3: MAKE some interpretations about the flow regimes for each river (e.g. does flow have a distinct seasonality or not? Why? When do the highest and lowest flows occur (if there is a norm), and why? How "flashy" are the flow regimes? Why is one river more flashy than another? etc.) Consider looking at these annual hydrographs and the watershed size in the sites dataframe to help your answers.
The Allegheny River and French Creek exhibit similar seasonality, with flows decreasing through the late summer months; in general, however, flashy rainfall-driven discharge events occur year round. The Rio Grande exhibits a much more snow-driven seasonality, with the highest flows building gradually through the spring and summer months as snowpack melts. The Rio Grande is much less flashy because its discharge events are driven more by gradual snowmelt from high-elevation headwaters than by rain events. All three rivers exhibit their lowest flows in the late summer months (July, August, September). Drying of the Rio Grande can also be seen in these hydrographs: the magnitude and duration of high flows in the spring and summer months have become smaller and shorter in recent years.