Skip to content

Commit

Permalink
Create w05_dataarch.qmd
Browse files Browse the repository at this point in the history
  • Loading branch information
g-filomena committed Mar 12, 2024
1 parent cc60748 commit a480eef
Showing 1 changed file with 214 additions and 0 deletions.
214 changes: 214 additions & 0 deletions lectures/w05_dataarch.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,214 @@
---
title: "Geographic Data Science"
subtitle: "Point Patterns"
author: "Elisabetta Pietrostefani & Carmen Cabrera-Arnau"
format:
revealjs:
navigation-mode: grid
align-items: center;
---

# The *point* of points

# Points like polygons

- Points *can* represent "fixed" entities
- In this case, points are qualitatively similar to polygons/lines
- The goal here is, taking location fixed, to model other aspects of the data

# Points like polygons

Examples: - Cities (in most cases) - Buildings - Polygons represented as their centroid - ...

# When points are not polygons

Point data are not only a different geometry than polygons or lines... <br> ... Points can also represent a fundamentally different way to approach spatial analysis

# Points unlike polygons

# A few examples

#

<center>

<img alt="centered image" data-src="./figs/l09_crime.png" width="60%" height="60%"/>

<center>

#

<center>

<img alt="centered image" data-src="./figs/l09_trees.png" width="65%" height="65%"/>

<center>

#

<center>

<img alt="centered image" data-src="./figs/l09_mapbox.png" width="65%" height="65%"/>

<center>

# Points patterns

# Points patterns

Distribution of **points** over a portion of **space** Assumption is a point can happen anywhere on that space, but only happens in specific locations

- **Unmarked**: locations only
- **Marked**: values attached to each point

# Point Pattern Analysis

Describe, characterize, and explain point patterns, focusing on their **generating process**

- Visual exploration
- Clustering properties and clusters
- Statistical modeling of the underlying processes

# Visualization of Point Patterns

# Visualization of PPs

Four routes (today):

- One-to-one mapping -- "Scatter plot"
- Aggregate -- "Histogram"
- Smooth -- KDE
- Smooth -- Interpolation

# One-to-one

- Intuitive
- Effective in small datasets
- Limited as size increases until useless

# One-to-one

<center>

<img alt="centered image" data-src="./figs/l09_liv_pts.png" width="50%" height="50%"/>

<center>

# Aggregation

# Points meet polygons

- Use polygon boundaries and count points per area \[Insert your skills for choropleth mapping here!!!\]
- But, the polygons need to *"make sense"* (their delineation needs to relate to the point generating process)

#

<center>

<img data-src="./figs/l09_liv_pts.png" height="400"/>   <img data-src="./figs/l09_liv_cho.png" height="400"/>

<center>

# Hex-binning

If no polygon boundary seems like a good candidate for aggregation... ...draw a hexagonal (or squared) tesselation!!!

Hexagons...

- Are regular
- Exhaust the space (Unlike circles)
- Have many sides (minimize boundary problems)

#

<center>

<img data-src="./figs/l09_liv_pts.png" height="300"/>   <img data-src="./figs/l09_liv_hex_empty.png" height="300"/>    <img data-src="./figs/l09_liv_hex_filled.png" height="300"/>

<center>

# But

- (Arbitrary) aggregation may induce MAUP
- Points usually represent events that affect only part of the population and hence are best considered as rates

# Kernet Density Estimation (KDE)

# KDE

Estimate the **(continuous)** observed distribution of a variable

- Probability of finding an observation at a given point
- "Continuous histogram"
- Solves (much of) the MAUP problem, but not the underlying population issue

# Bivariate (spatial) KDE

Probability of finding observations at a given point in space

- **Bivariate** version: distribution of pairs of values
- In **space**: values are coordinates (XY), locations
- Continuous "version" of a choropleth

#

<center>

<img alt="centered image" data-src="./figs/l09_kde2d.png" width="75%" height="75%"/>

<center>

#

<center>

<img data-src="./figs/l09_liv_pts.png" height="350"/>   <img data-src="./figs/l09_liv_kde.png" height="350"/>

<center>

# Interpolation

- Estimating values spatially continuous variables for spatial locations where they **have not** been observed, based on observations.
- **Geostatistics**, is concerned with the modelling, prediction and simulation of spatially continuous phenomena.

# Inverse Distance Weighting (IDW)

- We observe a property of a phenomenon $Z(s)$ at a **limited** number of sample locations, and are interested in the property value at **all** locations.
- Have to predict it for unobserved locations.

# Kriging

If we were predicting prices

$$Price_i = \sum^N_{j=1} w_j * Price_j + \epsilon_i$$

- with $w_j = (\frac{1}{d_{ij}})^2$ for all $i$ and $j \neq i$
- $d$ the distance between $i$ and $j$.

#

<center>

<img alt="centered image" data-src="./figs/l09_idw.png" width="75%" height="75%"/>

<center>

# Parametres

- **Variable**: for example price
- **Nearest Neighbours** : the number of nearest observations that should be used
- **idp** : set inverse distance power to 2

A super useful link [here](https://gisgeography.com/inverse-distance-weighting-idw-interpolation/)

# Parametres

idp = 1 <img data-src="./figs/L09_pw1.png" height="250"/> <br> idp = 2 <img data-src="./figs/L09_pw2.png" height="250"/>

# Density-Based Spatial Clustering of Applications with Noise, or DBSCAN

# Questions

#

<a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/"><img src="https://i.creativecommons.org/l/by-sa/4.0/88x31.png" alt="Creative Commons License" style="border-width:0"/></a><br />[Geographic Data Science]{xmlns:dct="http://purl.org/dc/terms/" property="dct:title"} by <a xmlns:cc="http://creativecommons.org/ns#" href="http://pietrostefani.com" property="cc:attributionName" rel="cc:attributionURL">Elisabetta Pietrostefani</a> is licensed under a <a rel="license" href="http://creativecommons.org/licenses/by-sa/4.0/">Creative Commons Attribution-ShareAlike 4.0 International License</a>.

0 comments on commit a480eef

Please sign in to comment.