Merge pull request #58 from ronnyhdez/T57
initial steps to create plots with quality observations after filter
ronnyhdez authored Jul 28, 2023
2 parents 92f1721 + 83533cd commit 67103b1
Showing 10 changed files with 305 additions and 38 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -43,6 +43,7 @@ vignettes/*.pdf
_book/
data_ard
data
data_satellite_processed

# Folders created with outputs
chapter_2_brete_files/
1 change: 1 addition & 0 deletions R/calculate_indices.R
1 change: 1 addition & 0 deletions R/create_bit_string.R
1 change: 1 addition & 0 deletions R/filter_quality_pixels.R
1 change: 1 addition & 0 deletions R/scale_reflectance_bands.R
2 changes: 1 addition & 1 deletion abstract.qmd
@@ -1,3 +1,3 @@
# Abstract {.unnumbered}

Methods to quantify Gross Primary Production (GPP) are classified into two categories: Eddy Covariance techniques (EC) and satellite data-driven. EC techniques can measure carbon fluxes directly, albeit with spatial constraints. Satellite data-driven methods are promising because they overcome spatial constraints associated with EC techniques. However, there are challenges associated with an increase in uncertainty when estimating GPP from satellite-driven products such as mixed pixels, cloud cover, and the ability of the sensor to retrieve vegetation under saturation conditions in high biomass environments. Therefore, an effort to analyze and quantify the uncertainty of GPP products derived from satellite platforms is needed. Here we present how commonly used satellite vegetation indices (NDVI, EVI, fPAR, and NIRv) with different spatial resolutions can impact the uncertainty in the GPP estimation compared with direct methods such as eddy covariance measurements. We conduct this study on three different sites: University of Michigan Biological Station (USA), the Borden Forest Research Station flux-site (Canada) and Bartlett Experimental Forest (USA).
Methods to quantify Gross Primary Production (GPP) are classified into two categories: Eddy Covariance techniques (EC) and satellite data-driven. EC techniques can measure carbon fluxes directly, albeit with spatial constraints. Satellite data-driven methods are promising because they overcome spatial constraints associated with EC techniques. However, there are challenges associated with an increase in uncertainty when estimating GPP from satellite-driven products such as mixed pixels, cloud cover, and the ability of the sensor to retrieve vegetation under saturation conditions in high biomass environments. Therefore, an effort to analyze and quantify the uncertainty of GPP products derived from satellite platforms is needed. Here we present how commonly used satellite vegetation indices (NDVI, EVI, CCI, kNDVI, and NIRv) with different spatial resolutions can impact the uncertainty in the GPP estimation compared with direct methods such as eddy covariance measurements. We conduct this study on three different sites: University of Michigan Biological Station (USA), the Borden Forest Research Station flux-site (Canada) and Bartlett Experimental Forest (USA).
20 changes: 20 additions & 0 deletions appendices.qmd
@@ -165,3 +165,23 @@ tribble(
)
```

```{r}
#| label: fig-complete_quality_pixels
#| fig-cap: "Total number of observations (pixels) from MODIS classified as high quality (used in the analysis) or other quality (filtered out from the analysis)"
#| echo: false
#| message: false
#| warning: false
source("scripts/quality_observations.R")
all %>%
filter(quality == "high") %>%
mutate(year_mon = zoo::as.yearmon(date)) %>%
ggplot(aes(x = year_mon, fill = site)) +
geom_bar(position = "stack") +
scale_fill_viridis_d(begin = 0.2, end = 0.8) +
labs(x = "Date",
y = "Total observations (pixels)",
fill = "Site") +
theme_bw(base_size = 12)
```

130 changes: 97 additions & 33 deletions chapter_2_lm.qmd
@@ -47,9 +47,11 @@ conversion efficiency explains the amount of carbon that a specific type of
vegetation can fix per unit of solar radiation [@monteith_solar_1972]. The
Moderate Resolution Imaging Spectroradiometer (MODIS) GPP product uses an
algorithm based on this radiation conversion efficiency concept, relating the
absorbed photosynthetically active radiation (APAR) with the LUE term [@heinsch_evaluation_2006].
absorbed photosynthetically active radiation (APAR) with the LUE term [@heinsch_evaluation_2006], as shown in @eq-gpp.

GPP = PAR \* fAPAR \* LUE
$$
GPP = PAR \times fAPAR \times LUE
$$ {#eq-gpp}
Where PAR is the incident photosynthetically active radiation and fAPAR is the
fraction of the PAR that is effectively absorbed by plants. fAPAR can be used to
@@ -136,6 +138,8 @@ satellite vegetation indices with in-situ GPP data from the Bartlett
Experimental Forest (USA), the Borden Forest Research Station flux-site
(Canada), and the University of Michigan Biological Station (USA).
\newpage
## Methods
```{r libraries and sources}
@@ -239,18 +243,18 @@ borden_gpp_trends
bartlett_gpp_trends
```
The GPP for each of the sites can
<!-- The GPP for each of the sites can -->
- include description of the ONEFlux processing (gap filling, data quality,
gpp estimation, gpp uncertainty with DT and NT)
<!-- - include description of the ONEFlux processing (gap filling, data quality, -->
<!-- gpp estimation, gpp uncertainty with DT and NT) -->
### Satellite imagery
We used Google Earth Engine (GEE) to retrieve data from the Terra Moderate Resolution
Imaging Spectroradiometer (MODIS), specifically the collection MOD09GA Version
6.1 product. A square polygon with an area of 3km surrounding the EC tower was
defined for each site, and all pixel values within this polygon
were extracted for analysis.
We used Google Earth Engine (GEE) to retrieve data from the Terra Moderate
Resolution Imaging Spectroradiometer (MODIS), specifically the collection
MOD09GA Version 6.1 product. A square polygon with an area of 3km surrounding
the EC tower was defined for each site, and all pixel values within this
polygon were extracted for analysis.
The MODIS (MOD09GA Version 6.1 product) contains the surface spectral reflectance
from bands 1 through 7 with a spatial resolution of 500m, with corrections
@@ -264,9 +268,10 @@ were filtered using four bit-encoded variables: `state_1km`, `qc_500m`,
`q_scan`, and `g_flags`, along with the bit quality indicators of each band.
These variables provided information about the observation quality. The
bit-encoded variables were transformed into categorical strings, and only the
categories indicating the best quality were selected to filter the pixels. For
filtering, the `state_1km` and `qc_500m` variables were used, as `q_scan` was
not informative and `g_flags` had the same value for all observations. The
categories indicating the best quality were selected to filter the pixels (@fig-quality_pixels).
For filtering, the `state_1km` and `qc_500m` variables were used, as `q_scan`
was not informative and `g_flags` had the same value for all observations. The
specific bit strings selected for `state_1km` are shown in
@tbl-state_1km_bitstrings and for `qc_500m` in @tbl-qc_scan_bit_strings.
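
The bit-string filtering described above can be sketched in base R. This is only an illustration under stated assumptions: `int_to_bit_string()` is a hypothetical helper (it is not the contents of `R/create_bit_string.R`, which this diff does not show), the QA values are toy numbers, and the reading of the two least significant bits as cloud state with `00` meaning clear is an assumption that should be checked against the MOD09GA documentation.

```r
# Hypothetical helper: convert non-negative integer QA values into
# fixed-width bit strings, most significant bit first.
int_to_bit_string <- function(x, n_bits = 16L) {
  vapply(x, function(v) {
    bits <- (v %/% 2^((n_bits - 1L):0L)) %% 2
    paste(bits, collapse = "")
  }, character(1))
}

# Toy QA values; real values would come from the `state_1km` band.
qa <- data.frame(pixel = 1:4, state_1km = c(0, 1, 2, 1024))
qa$bit_string <- int_to_bit_string(qa$state_1km)

# Assumption: the two least significant bits (characters 15 and 16 of the
# string) encode cloud state, and "00" marks a clear observation.
cloud_bits <- substr(qa$bit_string, 15, 16)
high_quality <- qa[cloud_bits == "00", ]
print(high_quality)
```

Keeping the QA value as a string makes the selected categories easy to compare against tables of bit strings, at the cost of some speed compared with bitwise masks.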
@@ -279,20 +284,25 @@ consistency in the data.
According to the documentation, any scaled value that fell outside the range of
`0` to `1` was considered a fill value or uncorrected Level 1B data and was
subsequently discarded. These values were deemed unreliable or lacking
meaningful information for the analysis.
meaningful information for the analysis. The high-quality pixels selected over
the complete period for each site are shown in @fig-complete_quality_pixels.
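
As a rough sketch of the scaling and range check described above, the following base R function scales raw band values and sets anything outside `0` to `1` to `NA`. The function name is hypothetical (it is not the repository's `R/scale_reflectance_bands.R`), and the scale factor of `0.0001` is assumed from the MOD09GA product description rather than taken from this diff.

```r
# Hypothetical sketch: scale raw surface reflectance values and discard
# scaled values outside the 0 to 1 range.
# Assumption: the product scale factor is 0.0001.
scale_reflectance <- function(x, scale_factor = 1e-4) {
  scaled <- x * scale_factor
  out_of_range <- !is.na(scaled) & (scaled < 0 | scaled > 1)
  scaled[out_of_range] <- NA_real_
  scaled
}

# Toy raw values: the first and last fall outside the valid range.
raw_band <- c(-100, 0, 2500, 9800, 12000)
print(scale_reflectance(raw_band))
```

Marking discarded values as `NA` rather than dropping them keeps the vector aligned with the other bands of the same pixel.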
Once all the band values were scaled within the valid range the vegetation
indices such as `NDVI` (@eq-ndvi), `NIRv` (@eq-nirv), `EVI` (@eq-evi),
`kNDVI` (@eq-kndvi), and `CCI` (@eq-cci) were calculated. However, since the
MODIS product with a spatial resolution of 250m (`MODIS/061/MOD09A1`) does not
include the blue band EVI calculations were not performed for this dataset.
`kNDVI` (@eq-kndvi), and `CCI` (@eq-cci) were calculated and then matched with
the corresponding date in the flux datasets. The resulting values used for the
analysis are shown in @fig-quality_pixels.
<!-- However, since the -->
<!-- MODIS product with a spatial resolution of 250m (`MODIS/061/MOD09A1`) does not -->
<!-- include the blue band EVI calculations were not performed for this dataset. -->
$$
NDVI = \frac{B02 - B01}{B02 + B01}
NDVI = \frac{NIR - Red}{NIR + Red}
$$ {#eq-ndvi}
$$
NIRv = B02\times\frac{B02 - B01}{B02 + B01}
NIRv = NIR\times\frac{NIR - Red}{NIR + Red}
$$ {#eq-nirv}
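
The NDVI and NIRv definitions above translate directly into R. This is a minimal sketch with hypothetical function names and toy reflectance values; it assumes `nir` and `red` are already scaled to the `0` to `1` range.

```r
# NDVI and NIRv as defined in the equations above.
# `nir` and `red` are scaled reflectances (assumed to be in 0 to 1).
ndvi <- function(nir, red) {
  (nir - red) / (nir + red)
}

nirv <- function(nir, red) {
  nir * ndvi(nir, red)
}

# Toy reflectance values for two pixels
nir <- c(0.5, 0.4)
red <- c(0.1, 0.2)
print(ndvi(nir, red))
print(nirv(nir, red))
```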
$$
@@ -338,14 +348,14 @@ $$ {#eq-cci}
In the study, three datasets were prepared for each site: a daily dataset, a
weekly dataset, and a monthly dataset. These datasets were generated from the
satellite images and the ONEFluxprocess in order to capture variations in
vegetation indices (VI), band values, and Gross Primary Production (GPP) over
different time scales.
satellite imagery data with the selected high-quality pixels and the
ONEFlux process data in order to capture variations in vegetation indices (VI),
band values, and Gross Primary Production (GPP) over different time scales.
The daily dataset included the VI values, band values, and GPP measurements
collected on a daily basis. This dataset provided a high-resolution
representation of the variables, allowing for a detailed analysis of their daily
fluctuations.
The daily dataset included the VI and band values from the high-quality pixels
only, together with the GPP measurements derived from the ONEFlux process, all
on a daily basis. This dataset provided a high-resolution representation of the
variables, allowing for a detailed analysis of their daily fluctuations.
The weekly and monthly datasets were derived from the corresponding daily
dataset. These datasets contained summarized values of the VI's and band values,
@@ -372,7 +382,64 @@ insignificant in terms of representing meaningful vegetation productivity. GPP
values below this threshold were considered to be either below the detection
limit or not enough to contribute significantly to the overall analysis.
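
The thresholding and temporal summarizing steps can be sketched with base R. Everything here is illustrative: the data frame and its column names are made up, the threshold of `1` is taken from the figure caption in this diff ("GPP values higher than 1"), and the use of the mean as the summary statistic is an assumption.

```r
# Toy daily dataset: one row per site and date (values are invented).
daily <- data.frame(
  site = c("A", "A", "A", "B", "B"),
  date = as.Date(c("2020-06-01", "2020-06-15", "2020-07-01",
                   "2020-06-03", "2020-06-20")),
  ndvi = c(0.80, 0.84, 0.90, 0.70, 0.74),
  gpp  = c(6.0, 8.0, 0.5, 4.0, 5.0)
)

# Keep only observations above the GPP threshold (assumed to be 1).
daily_kept <- daily[daily$gpp > 1, ]

# Summarize to monthly means per site (mean is an assumed statistic).
daily_kept$year_mon <- format(daily_kept$date, "%Y-%m")
monthly <- aggregate(cbind(ndvi, gpp) ~ site + year_mon,
                     data = daily_kept, FUN = mean)
print(monthly)
```

A weekly dataset would follow the same pattern with a week identifier, for example `format(date, "%Y-%U")`, in place of `year_mon`.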
- Data matching of satellite data with flux data per day, week and month
```{r}
#| label: fig-quality_pixels
#| fig-cap: "Total number of observations (pixels) from MODIS classified as high quality (used in the analysis) or other quality (filtered out from the analysis) per site."
#| echo: false
#| message: false
#| warning: false
source("scripts/quality_observations.R")
all %>%
ggplot(aes(x = site, fill = quality)) +
geom_bar(position = "stack") +
scale_fill_viridis_d(begin = 0.34, end = 0.8) +
labs(x = "Site",
y = "Total observations (pixels)",
fill = "Quality") +
theme_bw(base_size = 12)
```
```{r}
#| label: fig-obs_used_analysis
#| fig-cap: "Monthly distribution of high-quality MODIS observations after joining with flux observations containing Gross Primary Productivity (GPP) values higher than 1."
#| echo: false
#| message: false
#| warning: false
# Dataset used to analyze the data
daily_plot_500 %>%
# The dataset has one row per index, so each date has 5 rows
filter(index == "ndvi_mean") %>%
group_by(site, date) %>%
tally() %>%
group_by(site) %>%
mutate(year_mon = zoo::as.yearmon(date)) %>%
ggplot(aes(x = year_mon, fill = site)) +
geom_bar(position = "stack") +
scale_fill_viridis_d(begin = 0.2, end = 0.8) +
labs(x = "Date",
y = "Total observations",
fill = "Site") +
theme_bw(base_size = 12)
# daily_plot_500 %>%
# pivot_wider(names_from = index, values_from = value) %>%
# select(date, total_obs, site) %>%
# mutate(year_mon = zoo::as.yearmon(date)) %>%
# group_by(site, year_mon) %>%
# tally() %>%
# ungroup() %>%
# mutate(year_mon = as.factor(year_mon)) %>%
# ggplot(aes(x = year_mon, y = n, fill = site)) +
# geom_bar(stat = "identity", position = "stack") +
# scale_fill_viridis_d(begin = 0.2, end = 0.8) +
# labs(x = "Date",
# y = "Total observations",
# fill = "Site") +
# theme_bw(base_size = 12) +
# theme(axis.text.x = element_text(angle = 90, h = 1))
```
### Data Analysis
@@ -399,9 +466,9 @@ values for both daily and weekly datasets, a Generalized Additive Model (GAM)
was developed to estimate Gross Primary Production (GPP)
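
A GAM of GPP against a single vegetation index could look like the sketch below. It assumes the `mgcv` package (the diff does not show which package the analysis uses), uses synthetic data, and picks an arbitrary basis dimension `k = 5`; none of these choices should be read as the model specification used in the chapter.

```r
library(mgcv)

# Synthetic, deterministic data: a saturating GPP response to NDVI
# plus a small wiggle. These values are made up for illustration.
d <- data.frame(ndvi = seq(0.2, 0.9, length.out = 30))
d$gpp <- 12 / (1 + exp(-10 * (d$ndvi - 0.5))) + 0.3 * sin(seq_len(30))

# Assumed specification: one smooth term on the index, fitted with REML.
fit <- gam(gpp ~ s(ndvi, k = 5), data = d, method = "REML")

# Predict GPP for a few new NDVI values
pred <- predict(fit, newdata = data.frame(ndvi = c(0.3, 0.5, 0.8)))
print(pred)
```

A smooth term lets the fitted curve flatten at high index values, which a straight line cannot do.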
```{r}
```{r gpp_vi_realtions_all_sites}
#| label: fig-gpp_vi_relation
#| fig-cap: "GPP VI's relation at different time scales per site"
#| fig-cap: "Relationship between MODIS 500m derived vegetation indices and GPP. Every observation corresponds to the observed GPP from a flux tower site (Borden Forest Research Station, Bartlett Experimental Forest, or U. Michigan Biological Station). The vegetation indices NDVI (Normalized Difference Vegetation Index), NIRv (Near Infrared vegetation), CCI (Chlorophyll/Carotenoid Index), kNDVI (Normalized Difference Vegetation Index based on kernel methods), and EVI (Enhanced Vegetation Index) are summarized per date (daily, weekly, and monthly). The number of pixels corresponds to the number of observations used to obtain the mean of the vegetation index. The red line indicates the Generalized Additive Model for the daily and weekly relations and the Linear Model for monthly values."
#| fig-width: 7
#| fig-height: 9
#| echo: false
@@ -501,8 +568,6 @@ plot_grid(daily_grid,
## Results
### Monthly GPP and VI's relations
The @tbl-lm_model_results provides a summary of linear models used for
@@ -656,10 +721,9 @@ cci_glance <- cci_lm %>%
```
```{r lm_monthly_table}
#| label: tbl-lm_model_results
#| tbl-cap: "Summary of Linear models for GPP estimation using the vegetation indices (Per site)"
#| tbl-cap: "Summary of Linear models for GPP estimation using the vegetation indices on a monthly basis (per site)."
#| echo: false
#| message: false
#| warning: false
8 changes: 4 additions & 4 deletions list_abbreviations.qmd
@@ -16,7 +16,7 @@ library(tibble)
library(gt)
tribble(
~"abb", ~"meaning",
~"Abbreviation", ~"Full phrase ",
"GPP" , "Gross Primary Production",
"EC", "Eddy Covariance",
"VI", "Vegetation Index",
@@ -42,9 +42,9 @@ tribble(
"CCI", "Chlorophyll/Carotenoid Index",
"kNDVI", "Normalized Difference Vegetation Index based on kernel methods"
) %>%
arrange(abb) %>%
gt()
arrange(Abbreviation) %>%
gt() %>%
tab_options(column_labels.hidden = TRUE)
```

