Skip to content

Commit

Permalink
dhmv: vignette
Browse files Browse the repository at this point in the history
  • Loading branch information
falkmielke committed Oct 11, 2024
1 parent 5cb2eda commit 074248c
Show file tree
Hide file tree
Showing 2 changed files with 397 additions and 0 deletions.
Binary file added man/figures/qgis_lead.avif
Binary file not shown.
397 changes: 397 additions & 0 deletions vignettes/spatial_dhmv_query.Rmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,397 @@
---
title: "Dissecting a WCS Query: DHMV Case Study."
output: rmarkdown::html_vignette
vignette: >
%\VignetteIndexEntry{spatial_dhmv_query}
%\VignetteEngine{knitr::rmarkdown}
%\VignetteEncoding{UTF-8}
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>"
)
```


<a id="org86b1211"></a>

# DHMV Data

Flemish government agencies host a lot of useful web services and data. Some of those are available via web services and can be found here:

- <https://www.vlaanderen.be/datavindplaats/catalogus>


This vignette demonstrates the **query of elevation data from the Digital Elevation Model**, "Digitaal HoogteModel Vlaanderen" (DHMV). We will do it "the complicated way", assembling a web api query. [In the end](#orgd59526e), you will see how to use `inbospatial` to get the same result in a single function call.

This vignette is somewhat related to, but goes beyond the use of Web Feature Services (WFS) for spatial data, [see this tutorial by Thierry Onkelinx](https://inbo.github.io/tutorials/tutorials/spatial_wfs_services). In particular, I will document some paths and workarounds to explore if errors occur in the default `ows4R` procedure.

**TL;DR:** see the `?inbospatial:get_coverage_wcs` function to query DHMV (or other WCS) data of a given point in Flanders. An example can be found below.


<a id="org75073fb"></a>

# Situation/Motivation

The purpose of the code below is to query elevation data for an arbitrary location in Flanders. It can be found here:

- <https://www.vlaanderen.be/datavindplaats/catalogus/wcs-digitaal-hoogtemodel-vlaanderen>

Unfortunately, documentation about the API is sparse, so let's hope it complies with given standards.

We will use Web Coverage Services (WCS), as standardized by the Open Geospatial Consortium (OGC). More info on these services can be found [in the "Geocomputation with R" book, for example](https://r.geocompx.org/read-write#geographic-web-services). The go-to R package for WCS is `ows4R` by Emmanuel Blondel and contributors, [available on github](https://eblondel.github.io/ows4R).

Unfortunately, neither the geocomputation book's example, nor the vignette in [the `ows4R` package](https://eblondel.github.io/ows4R/articles/wcs.html), nor [the INBO tutorial](https://inbo.github.io/tutorials/tutorials/spatial_wfs_services) could be transferred to the DHMV usecase. The code simply did not work.

Therefore, I ended up going back to the nitty-gritty aspects of WCS by **manually building a query**.


<a id="orge6d66b9"></a>

# Limits of `ows4R` Package

Before revealing [the solution](#org803f24d), I will document the error and my path to working around it: the core problem was not trivial to see, and you might run into similar issues.

Because, in fact, the `ows4R` package seems functional. In the tutorial mentioned above, [the `ows4R` package](https://cran.r-project.org/web/packages/ows4R/index.html) is used to query data from web services. However, `ows4R` cannot handle all niche server API's.

First, we need to load some packages.

```{r setup}
require("httr")
require("sf")
require("ows4R")
require("terra")
```

We can connect a WCS client.

```{r}
WCS <- WCSClient$new("https://geo.api.vlaanderen.be/DHMV/wcs", "2.0.1", logger = "INFO")
```


We can easily query the capabilities:

```{r}
caps <- WCS$getCapabilities()
```

This would grab the same info as you can find in this xml file: <https://geo.api.vlaanderen.be/DHMV/wcs?service=WCS&request=getcapabilities>

Note that the *capabilities* xml holds a lot of valuable information:

- First and foremost, search it for `CoverageID`; in this case, I was interested in `DHMVII_DTM_1m`: grid coverage of the LIDAR elevation data for Flanders.
- We can also find that the Coordinate Reference System (CRS) `crsSupported` is EPSG:31370, aka. [Belgian Lambert 72](https://epsg.io/31370). Always good to know.
- Finally, the capabilities xml provides a `version`: 2.0.1 at the time of writing. Yet keep that in mind for a bit.

The "capabilities" are, per OGC standards, your go-to place for finding API documentation.

In R, we can receive further information conveniently via `ows4R`. This works via the "coverage summary" &#x2026;

```{r}
feature <- "DHMVII_DTM_1m"
dtm_smry <- caps$findCoverageSummaryById(feature, exact = TRUE)
dtm_smry
```

&#x2026; and the coverage description.

```{r}
# Get description from the WCS client
dtm_des <- WCS$describeCoverage(feature)
dtm_des
```


Finally, it is important to know which dimensions are available: `LON/LAT`, `x/y`, or maybe `time`?

```{r}
dtm_dims <- dtm_smry$getDimensions()
dtm_dims
```


So far, so good.

Now, following the tutorials to get some real data:

```{r}
x_test <- 148600
y_test <- 208900
radius <- 1 # (+/- m)
hopo_boxed <- OWSUtils$toBBOX(
xmin = x_test - radius,
xmax = x_test + radius,
ymin = y_test - radius,
ymax = y_test + radius
)
```

I will write the file to disk:

```{r, eval=FALSE}
tryCatch({
dtm_data = dtm_smry$getCoverage(
bbox = hopo_boxed,
filename = tempfile(fileext = ".tif")
)
dtm_data
}, error = function (err) {message(err)})
```

&#x2026; or, maybe not. The function fails, somehow.

*Status quo:*

- I receive a `tif` file (because *I called it* "tif"), supposedly some kind of GeoTIFF image (because I *asked* for `image/tiff`).
- I also receive a couple of error messages and warnings:
- *"Start tag expected, '<' not found"*
- *"cannot open this file as SpatRaster:&#x2026;"*
- *"not recognized as being in a supported file format. (GDAL error 4)"*

Maybe most useful, `ows4R` briefly displays the URL it used to attempt data query: `https://geo.api.vlaanderen.be/DHMV/wcs?service=WCS&version=2.0.1&coverageId=DHMVII_DTM_1m&subset=x(148599,148601)&subset=y(208899,208901)&format=image/tiff&request=GetCoverage`

Let's break this URL down:

- the desired `service` is indeed `WCS`,
- `version` is `2.0.1` (I would guess)
- `request`, `coverageID`, and `format` seem to be as expected
- there are `subset` components for the respective dimensions.

Feel free to try yourself to paste the URL to a browser and download the file.

Still, the downloaded file is unreadable. Subtle symptom: even if one adjusts the `subset` (i.e. `bbox` argument, above), the received tif does not change in size; the download is bbox-agnostic, so to speak.

Opening the file with a text editor such as [vim](https://www.vim.org), we see an XML part, some generic binary stuff, some hints that this [should be a GML](https://en.wikipedia.org/wiki/Geography_Markup_Language).

There are [sample geotiffs online](https://github.com/mommermi/geotiff_sample/blob/master/sample.tif), and they look different, or maybe not. Gml is clearly not geotiff, except maybe for the binary part (which I tried to extract, to no avail).

The file does not download with an extension, and it is still unclear what it actually is.


<a id="orgfaa5a3d"></a>

# The Lead: QGIS

Well, there is advanced software to open GeoTIFF'esque files. Or geography markup. At least some software might be able to identify what we actually have here.

First, if R failed, Python might work.

```
import rasterio
import rasterio.plot
data_name = "/tmp/test_hopo.tif"
tiff = rasterio.open(data_name)
rasterio.plot.show(tiff, title = "will this work? no!")
```

&#x2026; but it doesn't recognize the file type as raster data.

How about **[QGIS](https://qgis.org)?** *"Spatial Without Compromise"*, they advertise. Unfortunately, that program also was unable to open the file downloaded in R.

However, QGIS can actually connect to WCS directly. This does not help much if you need the data in R. Yet, why not.

Look and behold: qgis can actually connect to <https://geo.api.vlaanderen.be/DHMV/wcs> and query elevation data for Flanders; good indication that the `geo.api` is, somehow, functional. I could zoom and move, export images. I could even export the data. I stopped exporting the data when it threatened to entirely filled my tiny system partition (the raster data is more than 100GB) - good confirmation that we want web services for this.

And, luckily, at some careless map movement, I received an error output from QGIS. With it, QGIS gave me a working URL to query DHMV:

- `https://geo.api.vlaanderen.be/DHMV/wcs?SERVICE=WCS&VERSION=1.0.0&REQUEST=GetCoverage&FORMAT=GeoTIFF&COVERAGE=DHMVII_DTM_1m&BBOX=162064,165735,170953,168882&CRS=EPSG:31370&RESPONSE_CRS=EPSG:31370&WIDTH=1622&HEIGHT=574`


```{r echo=FALSE, out.width='90%'}
knitr::include_graphics("../man/figures/qgis_lead.avif")
```

Let's dissect this URL.


<a id="org803f24d"></a>

# The Solution: Query Building

If you start at the question mark of the URL, and split at every ampersand, you can extract the following components from the (correct) QGIS URL:

| **correct:** | |
|---------------------- |--------------------------------- |
| SERVICE | WCS |
| VERSION | 1.0.0 |
| REQUEST | GetCoverage |
| FORMAT | GeoTIFF |
| COVERAGE | DHMVII_DTM_1m |
| BBOX | 162064,165735,170953,168882 |
| CRS | EPSG:31370 |
| RESPONSE _CRS | EPSG:31370 |
| WIDTH | 1622 |
| HEIGHT | 574 |

Compare this to the (unsuccessful) `ows4R` attempt:

| **non-functional:** | |
|------------------- |--------------------------------- |
| service | WCS |
| version | 2.0.1 |
| coverageId | DHMVII_DTM_1m |
| subset | x(148599,148601) |
| subset | y(208899,208901) |
| format | image/tiff |
| request | GetCoverage |

You can selectively shuffle and adjust components, generalizing the working query and learning about the WCS API. Which feels a bit like [fuzzing](https://en.wikipedia.org/wiki/Fuzzing) the `geo.api`.

Here is what I found:

- `service` (`WCS`) and `request` (`DHMVII_DTM_1m`) were correct; the latter could be swapped for different datasets
- `coverage` is the correct keyword for reading `DHMVII_DTM_1m`
- `version` must be `1.0.0`; it currently does not work with 2.0.1
- instead of subsetting, we can use a `bbox`&#x2026;
- &#x2026; but a `width` and `height` of the output image are mandatory; alternatively there are `resx` and `resy` (not shown)
- we must specify a `CRS` (`EPSG:31370`), optionally also a `response_crs`
- the `format` is `GeoTIFF`; slashes are weird in URLs anyways

After some more trial and error, I could construct working URLs in R and generalize this to a function.

```{r}
get_elevation_wcs <- function(x, y, radius = 1, file = NULL) {
# Please note that this function lacks a lot of
# assertions and documentation.
# get bbox
bbox = paste(x - radius, x + radius,
y - radius, y + radius,
sep = ",")
# the URL components
base_url = "https://geo.api.vlaanderen.be"
endpoint = "/DHMV/wcs"
# the query parameters which worked in QGIS
elevation_query = list(
service = "WCS",
version = "1.0.0",
request = "GetCoverage",
format = "GeoTIFF",
coverage = "DHMVII_DTM_1m",
bbox = bbox,
width = 2*radius,
height = 2*radius,
crs = "EPSG:31370"
)
# get wcs data
if (is.null(file)) {
file = tempfile(fileext = ".tiff")
}
res = GET(url = modify_url(base_url, path = endpoint),
query = elevation_query,
write_disk(file))
# note: saving this to a file is optional,
# but might prevent double download or loss of data
# re-read raster file
data = rast(file)
return(data)
}
```

Example usage:

```{r}
x_test <- 148600
y_test <- 208900
radius <- 10 # (+/- m)
test_raster <- get_elevation_wcs(x_test, y_test, radius = radius)
# we can extract a value at a point:
extract(test_raster, cbind(x_test, y_test))[[1]]
```


Voila! We can now get the elevation of a given location in Flanders via DHMV web services.


<a id="orgd59526e"></a>

# The "inbospatial" Way

My colleague [Hans Van Calster](https://github.com/hansvancalster) had been facing the same problems with related API's a year earlier, and he wrote a function for it in [the \`inbospatial\` package](https://github.com/inbo/inbospatial).

Here is how to use it, for the same case as above:

```{r}
require("sf")
require("inbospatial")
bbox <- sf::st_bbox(
c(xmin = 148000, xmax = 149000, ymin = 208000, ymax = 209000),
crs = sf::st_crs(31370)
)
hopo_raster <- get_coverage_wcs(
wcs = "DHMV",
bbox = bbox,
layername = "DHMVII_DTM_1m",
resolution = 1
)
plot(hopo_raster, col = gray.colors(256))
```

The function gives you quite some options of API's to query data from.


| **wcs** | **layername** | **reference** |
|------------------- |--------------------------------- |---- |
| dtm | `EL.GridCoverage.DTM` | [here](https://www.vlaanderen.be/datavindplaats/catalogus/hoogte-dtm) |
| dsm | `EL.GridCoverage.DSM` | [here](https://www.vlaanderen.be/datavindplaats/catalogus/hoogte-dsm) |
| dhmv | `DHMVI_DTM_5m` | [here](https://metadata.vlaanderen.be/srv/dut/catalog.search#/metadata/9b0f82c7-57c4-463a-8918-432e41a66355) |
| dhmv | `DHMVII_DTM_1m` | [here](https://metadata.vlaanderen.be/srv/dut/catalog.search#/metadata/f52b1a13-86bc-4b64-8256-88cc0d1a8735) |
| dhmv | `DHMVII_DSM_1m` | [here](https://metadata.vlaanderen.be/srv/dut/catalog.search#/metadata/0da2e5e4-6886-426b-bb82-c0ffe6faeff6) |

At a quick glance, we found that the `dtm` wcs seems to include canopy points, which is not according to definition; see here: <https://www.neonscience.org/resources/learning-hub/tutorials/chm-dsm-dtm> A general advice is to confirm your data for a region you know.

In addition, there are `oi-omz` and `oi-omw` for "zummer" and winter orthomosaic images.

The burden of choice!


<a id="org575ca41"></a>

# Post Mortem

The value of this vignette seems to be limited: in essence, it provides the specific query function arguments for elevation model queries with the `geo.api` of the Flemish digital services. Quite a niche use case.

However, this is also a story of tracing errors in existing tools, and scraping other tools for hints.

Can we actually trace the error back to a source? It was pure co-incidence that QGIS provided a working link, or are there general strategies?

In retrospective, the initial issue was worsened by unfavorable circumstances.

- `ows4R` retrieved a file, despite the wrong query; yet the error was only thrown downstream.
- Documentation for the `geo.api` is lacking (or I did not find it); in particularly I missed usage examples, vignettes, or tests.

The colleagues at "Digitaal Vlaanderen" or the DHMV group certainly had enough work on their table, yet more documentation would be desirable. Same for `ows4R`, judging by their [github issue list](https://github.com/eblondel/ows4R/issues) (see e.g. [this one](https://github.com/eblondel/ows4R/issues/114)). On the other hand, [QGIS is free- and open source](https://en.wikipedia.org/wiki/QGIS), its queries seem to be well maintained and are accessible, errors are transparent, I could have read the source code - a good example piece of software.

Yet in general, if you run into issues like this, do not hesitate to contact support or file a github issue.

You should not need to fuzz an API.

Thank you for reading, and may all your spatial api queries be successful!


# Table of Contents

- [DHMV Data](#org86b1211)
- [Situation/Motivation](#org75073fb)
- [Limits of `ows4R` Package](#orge6d66b9)
- [The Lead: QGIS](#orgfaa5a3d)
- [The Solution: Query Building](#org803f24d)
- [The "inbospatial" Way](#orgd59526e)
- [Post Mortem](#org575ca41)

0 comments on commit 074248c

Please sign in to comment.