---
title: "MDS: Link Squared Distance Matrix and Gram Matrix in Practice"
author: "Lieven Clement"
date: "statOmics, Ghent University (https://statomics.github.io)"
---
```{r, child="_setup.Rmd"}
```
```{r echo=FALSE}
library(tidyverse)
```
# Data
We use a small part of the iris dataset: the first five observations and the four numeric variables.
```{r}
X <- iris[1:5,1:4] %>% as.matrix
X
```
# Centering
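The centering matrix for $n$ observations is
\[
\mathbf{H} = \mathbf{I} - \frac{1}{n} \mathbf{1}\mathbf{1}^T ,
\]
with $\mathbf{1}$ an $n \times 1$ vector of ones. Premultiplying the data matrix with $\mathbf{H}$ subtracts the column means:
\[
\mathbf{X}_c = \mathbf{H}\mathbf{X}
\]
Because $\mathbf{H}$ is idempotent ($\mathbf{H}\mathbf{H} = \mathbf{H}$), centering already centered data changes nothing:
\[
\mathbf{H}\mathbf{X}_c = \mathbf{X}_c
\]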
```{r}
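# Centering matrix H = I - (1/n) 1 1^T
H <- diag(nrow(X)) - matrix(1 / nrow(X), nrow = nrow(X), ncol = nrow(X))
Xc <- H %*% X
# Column means of the centered data are zero up to rounding error
colMeans(Xc)
# Centering twice changes nothing: the difference is zero up to rounding error
H %*% Xc - Xc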
```
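As a quick sanity check, the sketch below verifies the properties of $\mathbf{H}$ that the formulas above rely on: $\mathbf{H}$ is symmetric and idempotent, and $\mathbf{H}\mathbf{X}$ agrees with column-wise mean centering. Comparing against `scale()` with `all.equal()` is our own choice for illustration, not part of the original analysis, and the chunk redefines `X` so that it also runs on its own.

```{r}
# Standalone sketch: properties of the centering matrix
X <- as.matrix(iris[1:5, 1:4])
n <- nrow(X)
H <- diag(n) - matrix(1 / n, nrow = n, ncol = n)
Xc <- H %*% X

# H equals its transpose
sym_ok <- isTRUE(all.equal(H, t(H)))
# Applying H twice is the same as applying it once
idem_ok <- isTRUE(all.equal(H %*% H, H))
# H X matches mean centering of the columns (attributes are ignored)
center_ok <- isTRUE(all.equal(Xc, scale(X, center = TRUE, scale = FALSE),
                              check.attributes = FALSE))

c(symmetric = sym_ok, idempotent = idem_ok, centered = center_ok)
```

Using `all.equal()` rather than `==` avoids false alarms from floating point rounding.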
# Gram matrix
**Gram matrix**: $\mathbf{G}=\mathbf{X}\mathbf{X}^T$, the matrix of inner products between all pairs of observations.

Here we work with centered data, so $\mathbf{G}=\mathbf{X}_c\mathbf{X}_c^T$.
```{r}
G <- Xc%*%t(Xc)
```
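Two properties of the centered Gram matrix are used in the sections that follow, and the sketch below checks them numerically: the diagonal of $\mathbf{G}$ holds the squared norms of the centered observations, and the rows of $\mathbf{G}$ sum to zero. The tolerance `1e-12` is an arbitrary choice for this illustration, and the chunk rebuilds `X`, `H` and `Xc` so that it runs on its own.

```{r}
# Standalone sketch: structure of the centered Gram matrix
X <- as.matrix(iris[1:5, 1:4])
n <- nrow(X)
H <- diag(n) - matrix(1 / n, nrow = n, ncol = n)
Xc <- H %*% X
G <- Xc %*% t(Xc)

# Diagonal of G: squared Euclidean norm of each centered observation
diag_ok <- isTRUE(all.equal(unname(diag(G)), unname(rowSums(Xc^2))))
# The columns of Xc sum to zero, so the rows of G do as well
rowsum_ok <- all(abs(rowSums(G)) < 1e-12)

c(diagonal = diag_ok, row_sums_zero = rowsum_ok)
```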
# Squared Distance matrix
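\[
\mathbf{D}_X = \mathbf{N} - 2\mathbf{X}\mathbf{X}^T + \mathbf{N}^T
\]
Here every column of $\mathbf{N}$ contains the squared norms of the observations, $N_{ij} = \mathbf{x}_i^T\mathbf{x}_i$, so that element $(i,j)$ of $\mathbf{D}_X$ equals $\mathbf{x}_i^T\mathbf{x}_i - 2\mathbf{x}_i^T\mathbf{x}_j + \mathbf{x}_j^T\mathbf{x}_j = \Vert\mathbf{x}_i - \mathbf{x}_j\Vert^2$. Squared distances do not change when the data are centered, so below we build $\mathbf{N}$ from the diagonal of the centered Gram matrix $\mathbf{G}$.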
```{r}
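# N[i, j] = G[i, i]: every column holds the squared norms of the observations
N <- matrix(diag(G), nrow(Xc), nrow(Xc))
N
dist2 <- N - 2 * G + t(N)
dist2
# Reference: squared Euclidean distances computed with dist()
dist(Xc)^2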
```
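The printed matrices above can be compared by eye. The sketch below makes the comparison explicit and also checks that the formula gives the same result for uncentered and centered data. It is a standalone illustration: the names `D2_raw`, `D2_centered` and `D2_ref` and the helper `sq_dist_from_gram()` are our own and do not appear in the original analysis.

```{r}
# Standalone sketch: squared distances from a Gram matrix
X <- as.matrix(iris[1:5, 1:4])
n <- nrow(X)
H <- diag(n) - matrix(1 / n, nrow = n, ncol = n)
Xc <- H %*% X

# Hypothetical helper: D = N - 2 G + N^T with N[i, j] = G[i, i]
sq_dist_from_gram <- function(G) {
  N <- matrix(diag(G), nrow(G), nrow(G))
  N - 2 * G + t(N)
}

D2_raw <- sq_dist_from_gram(X %*% t(X))        # uncentered data
D2_centered <- sq_dist_from_gram(Xc %*% t(Xc)) # centered data
D2_ref <- as.matrix(dist(X))^2                 # reference from dist()

raw_ok <- isTRUE(all.equal(D2_raw, D2_ref, check.attributes = FALSE))
centered_ok <- isTRUE(all.equal(D2_centered, D2_ref, check.attributes = FALSE))
c(raw = raw_ok, centered = centered_ok)
```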
# Link Gram Matrix and Squared Distance Matrix
\[\mathbf{G} =\mathbf{X}_c\mathbf{X}_c^T=-\frac{1}{2}\mathbf{H}\mathbf{D}_X\mathbf{H}\]
```{r}
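G
# Double centering the squared distance matrix reproduces G
-1/2 * H %*% dist2 %*% H
# The difference is zero up to rounding error
-1/2 * H %*% dist2 %*% H - G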
```
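In practice this link matters because it lets us start from distances alone. The sketch below, which is an illustration under our own assumptions rather than part of the original analysis, rebuilds $\mathbf{G}$ from the output of `dist()` and then uses an eigendecomposition of $\mathbf{G}$ to obtain a configuration `Z` with the same pairwise distances as `X`. Truncating tiny negative eigenvalues to zero with `pmax()` is a numerical safeguard we add here.

```{r}
# Standalone sketch: from squared distances back to a configuration
X <- as.matrix(iris[1:5, 1:4])
n <- nrow(X)
H <- diag(n) - matrix(1 / n, nrow = n, ncol = n)

# Start from the squared distances only
D2 <- as.matrix(dist(X))^2
G_from_D <- -1 / 2 * H %*% D2 %*% H

# The double-centered matrix equals the centered Gram matrix
Xc <- H %*% X
link_ok <- isTRUE(all.equal(G_from_D, Xc %*% t(Xc), check.attributes = FALSE))

# Eigendecomposition G = V diag(lambda) V^T gives coordinates Z = V diag(sqrt(lambda))
eig <- eigen(G_from_D, symmetric = TRUE)
lambda <- pmax(eig$values, 0) # guard against rounding below zero
Z <- eig$vectors %*% diag(sqrt(lambda))

# Z reproduces the original pairwise distances
config_ok <- isTRUE(all.equal(as.matrix(dist(Z)), as.matrix(dist(X)),
                              check.attributes = FALSE))
c(link = link_ok, configuration = config_ok)
```

The coordinates in `Z` are only determined up to rotation and reflection, which is why the check compares distances rather than the coordinates themselves.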
# Show that N cancels out when multiplying with H
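Each row of $\mathbf{N}$ is constant, so $\mathbf{N} = \mathbf{g}\mathbf{1}^T$ with $\mathbf{g}$ the vector of squared norms. Because $\mathbf{1}^T\mathbf{H} = \mathbf{0}^T$ and $\mathbf{H}\mathbf{1} = \mathbf{0}$, we have $\mathbf{N}\mathbf{H} = \mathbf{0}$ and $\mathbf{H}\mathbf{N}^T = \mathbf{0}$, and therefore
\[ \mathbf{H N H} = \mathbf{0}\]
\[ \mathbf{H N}^T \mathbf{H} = \mathbf{0}\]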
```{r}
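# N H is zero up to rounding error, hence H N H = 0
N %*% H
# H N^T is zero up to rounding error, hence H N^T H = 0
H %*% t(N)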
```
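With these two identities the link between the Gram matrix and the squared distance matrix follows in a few lines. The derivation below is a sketch that uses only the definitions from the earlier sections, together with the symmetry of $\mathbf{H}$, so that $\mathbf{X}^T\mathbf{H} = (\mathbf{H}\mathbf{X})^T$:

\[
\begin{aligned}
-\frac{1}{2}\mathbf{H}\mathbf{D}_X\mathbf{H}
&= -\frac{1}{2}\mathbf{H}\left(\mathbf{N} - 2\mathbf{X}\mathbf{X}^T + \mathbf{N}^T\right)\mathbf{H} \\
&= -\frac{1}{2}\mathbf{H}\mathbf{N}\mathbf{H} + \mathbf{H}\mathbf{X}\mathbf{X}^T\mathbf{H} - \frac{1}{2}\mathbf{H}\mathbf{N}^T\mathbf{H} \\
&= \mathbf{H}\mathbf{X}\left(\mathbf{H}\mathbf{X}\right)^T \\
&= \mathbf{X}_c\mathbf{X}_c^T = \mathbf{G}
\end{aligned}
\]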
```{r, child="_session-info.Rmd"}
```