# Backpropagation & MLP
**Learning objectives:**
- Linear Models and ReLU
- Multi-Layer Perceptron (MLP) from scratch
## Linear Models and ReLU {-}
```{r}
library(tidyverse)
theme_set(theme_bw())

x <- seq(-2, 2, by = 0.01) + 1
y <- x

gg_line <- ggplot() +
  geom_line(aes(x = x, y = y)) +
  ggtitle("Example linear model for y given x")
gg_line
```
Using only lines, we want to draw a curve like the following:
```{r}
gg_line +
  geom_line(aes(x = x, y = -(x - 1)^2 + 2), col = "blue")
```
- We can't simply add a bunch of lines together: a sum of linear functions is itself linear, so we would just get another line back.
## Linear Models and ReLU {-}
What if we instead create piecewise-linear pieces built from the ReLU, $\max(0, x)$, like:
```{r}
# ifelse(x > 0, x, 0) is ReLU(x)
gg_line +
  geom_line(aes(x = x, y = -0.5 * ifelse(x > 0, x, 0) - 1), col = "red")
```
Then add them together (the line and the ReLU piece) to get:
```{r}
# the line y = x plus the scaled, shifted ReLU from above
re_lu1 <- x - 0.5 * ifelse(x > 0, x, 0) - 1
ggplot() +
  geom_line(aes(x = x, y = re_lu1)) +
  ggtitle("Line plus one ReLU: no longer a straight line")
```
And again, with a second ReLU whose kink sits at $x = 1$:
```{r}
gg_line +
  geom_line(aes(x = x, y = -0.5 * ifelse(x > 0, x, 0) - 1), col = "red") +
  geom_line(aes(x = x, y = -0.5 * ifelse(x - 1 > 0, x - 1, 0) - 1), col = "blue")
```
```{r}
# the line y = x plus the blue ReLU piece (kink at x = 1)
re_lu2 <- x - 0.5 * ifelse(x - 1 > 0, x - 1, 0) - 1
ggplot() +
  geom_line(aes(x = x, y = re_lu1), col = "red") +
  geom_line(aes(x = x, y = re_lu2), col = "blue") +
  ggtitle("Two line-plus-ReLU combinations")
```

With enough scaled and shifted ReLUs added together, we can bend the line as often as we like, which is how a stack of linear layers and ReLUs can approximate a curve.
## Multi-Layer Perceptron (MLP) from scratch {-}
Go to notebook
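A minimal sketch of the kind of model built there, with illustrative layer sizes; the helper names `lin`, `relu`, and `model` are assumptions, not necessarily the notebook's exact code:

```python
import torch

def lin(x, w, b):
    # one linear (affine) layer
    return x @ w + b

def relu(x):
    # zero out the negatives
    return x.clamp_min(0.)

def model(xb, w1, b1, w2, b2):
    # linear -> ReLU -> linear: a one-hidden-layer MLP
    return lin(relu(lin(xb, w1, b1)), w2, b2)

# illustrative shapes: 784 inputs, 50 hidden units, 1 output
w1 = torch.randn(784, 50); b1 = torch.zeros(50)
w2 = torch.randn(50, 1);   b2 = torch.zeros(1)
xb = torch.randn(64, 784)            # a batch of 64 examples
preds = model(xb, w1, b1, w2, b2)    # shape: (64, 1)
```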
## Loss function from scratch - Mean Squared Error (MSE) {-}
Go to notebook; show screenshot
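Roughly, the loss is just the mean of the squared differences; a sketch (the `squeeze` drops the model's trailing unit dimension so output and target shapes line up):

```python
def mse(output, targ):
    # mean of squared differences between predictions and targets
    return (output.squeeze(-1) - targ).pow(2).mean()
```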
## Gradients and backpropagation diagram {-}
Reminders about SGD:
- Move the parameters (weights and biases) in the negative direction of their gradients (see the update-step sketch after this list).
- The gradient (Jacobian) matrix records, for every change in an input, the resulting change in each output; there may be more than one output.
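A sketch of that update step, assuming (as in the notebook's from-scratch code) each parameter tensor stores its gradient in a `.g` attribute:

```python
lr = 0.5  # illustrative learning rate
for p in (w1, b1, w2, b2):
    p -= lr * p.g  # step against the gradient to reduce the loss
```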
## Matrix calculus resources {-}
[Matrix Calculus for Deep Learning](https://explained.ai/matrix-calculus/)
## Gradients and backpropagation code {-}
Go to notebook
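A sketch of the style of code there: each layer's backward function stores gradients on a `.g` attribute, working from the loss back toward the inputs (shapes and names are assumptions in the spirit of the notebook):

```python
def mse_grad(inp, targ):
    # d(mean((inp - targ)^2))/d(inp) = 2 * (inp - targ) / n
    inp.g = 2. * (inp.squeeze(-1) - targ).unsqueeze(-1) / inp.shape[0]

def relu_grad(inp, out):
    # gradient flows through only where the input was positive
    inp.g = (inp > 0).float() * out.g

def lin_grad(inp, out, w, b):
    # chain rule through out = inp @ w + b
    inp.g = out.g @ w.t()
    b.g = out.g.sum(0)
    w.g = inp.t() @ out.g
```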
## Chain rule visualized + how it applies {-}
[The Intuitive Notion of the Chain Rule](https://webspace.ship.edu/msrenault/geogebracalculus/derivative_intuitive_chain_rule.html)
Go through the chain rule for $3x^2 + 9$?
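For example, writing $y = 3u + 9$ with $u = x^2$:

$$
\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} = 3 \cdot 2x = 6x
$$

which matches differentiating $3x^2 + 9$ directly.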
## Using Python's built-in debugger {-}
- Add a breakpoint line to `lin_grad`
- Press `h` for help at the pdb prompt
- Explore the shapes of the objects
- einsum demo motivating this change:
```python
# original: per-example outer products, summed over the batch
w.g = (inp.unsqueeze(-1) * out.g.unsqueeze(1)).sum(0)
# the same contraction written as an einsum
w.g = torch.einsum('bi,bo->io', inp, out.g)
# which is just a matrix multiply
w.g = inp.t() @ out.g
```
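All three lines compute the same gradient for `w` (shape `n_in × n_out`); the matrix-multiply form is the most compact.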
## Refactoring the code {-}
- Create a class for each function (see the sketch after this list)
- `__call__`: lets an instance of the class be used like a function
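A sketch of that refactor in the spirit of the notebook: the forward pass runs via `__call__` and saves what the backward pass needs on the instance (the exact structure is an assumption):

```python
class Relu():
    def __call__(self, inp):
        # calling the instance runs the forward pass
        self.inp = inp                 # saved for backward
        self.out = inp.clamp_min(0.)
        return self.out

    def backward(self):
        # gradient flows through only where the input was positive
        self.inp.g = (self.inp > 0).float() * self.out.g

relu = Relu()
# out = relu(xb)   # the instance is used like a function
```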
## Minibatch Training {-}
- Refresher on cross-entropy using the Excel file
- Tricks with indexing into log-probabilities instead of multiplying by a one-hot encoding
- Refresher on log and exponent rules
- Numerically safe `logsumexp` by subtracting the max (see the sketch after this list)
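A sketch of the max-subtraction trick: since $\log\sum_i e^{x_i} = m + \log\sum_i e^{x_i - m}$ for any $m$, choosing $m = \max_i x_i$ keeps `exp` from overflowing. Function names follow the standard formulation, not necessarily the notebook's exact code:

```python
import torch

def logsumexp(x):
    # subtract the row-wise max before exponentiating, add it back after
    m = x.max(-1, keepdim=True)[0]
    return m + (x - m).exp().sum(-1, keepdim=True).log()

def log_softmax(x):
    # log-softmax written via the safe logsumexp
    return x - logsumexp(x)

def nll(log_probs, targ):
    # cross-entropy via indexing: pick out each row's target log-prob
    # instead of multiplying by a one-hot vector
    return -log_probs[range(targ.shape[0]), targ].mean()
```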
## Meeting Videos {-}
### Cohort 1 {-}
`r knitr::include_url("https://www.youtube.com/embed/URL")`
<details>
<summary> Meeting chat log </summary>
```
LOG
```
</details>