deploy: 43f7931
reveurmichael committed Oct 27, 2023
1 parent 7aaf8cb commit 2f53674
Showing 283 changed files with 5,534 additions and 3,166 deletions.
Binary file removed _images/01_def.png
Binary file removed _images/02_multioutput.png
Binary file removed _images/03_direct.png
Binary file removed _images/04_recursive.png
Binary file removed _images/05_dirrec.png
Binary file modified _images/kernel-method_11_0.png
Binary file modified _images/kernel-method_15_0.png
Binary file modified _images/kernel-method_19_0.png
Binary file modified _images/kernel-method_26_0.png
Binary file removed _images/linear-regression-from-scratch_12_1.png
Binary file added _images/linear-regression-from-scratch_19_0.png
Binary file removed _images/linear-regression-from-scratch_4_0.png
Binary file added _images/linear-regression-from-scratch_6_0.png
Binary file removed _images/linear-regression-from-scratch_6_1.png
Binary file added _images/linear-regression-from-scratch_8_1.png
File renamed without changes
Binary file added _images/time-series_11_0.png
Binary file added _images/time-series_19_0.png
Binary file added _images/time-series_25_0.png
Binary file added _images/time-series_27_0.png
Binary file added _images/time-series_37_1.png
Binary file added _images/time-series_41_1.png
Binary file added _images/time-series_7_0.png
Binary file modified _images/tools-of-the-trade_13_0.png
Binary file modified _images/visualization-relationships_12_0.png
Binary file modified _images/visualization-relationships_16_0.png
Binary file modified _images/visualization-relationships_18_1.png
Binary file modified _images/visualization-relationships_20_0.png
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "c796e548",
"id": "375b52a9",
"metadata": {},
"source": [
"# Create a regression model\n",
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "b274befa",
"id": "4073789d",
"metadata": {},
"source": [
"# Exploring visualizations\n",

Large diffs are not rendered by default.

@@ -32,22 +32,6 @@
"— Deep Learning, Ian Goodfellow, Yoshua Bengio, Aaron Courville"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Difference between a Loss Function and a Cost Function\n",
"\n",
"A loss function evaluates the error for a single training example, and it is occasionally referred to as an error function. In contrast, a cost function represents the **average loss** across the entire training dataset. Optimization strategies are designed to minimize this cost function.\n",
"\n",
"For a simple sample:\n",
"\n",
"The corresponding cost function of L1 Loss is the Mean of these Squared Errors (MSE).\n",
"You can see the difference of [Mathematical Expression](#regression-loss-functions)\n",
"\n",
"However, these terms are frequently used interchangeably in practical settings, they aren't precisely equivalent. From a definitional standpoint, the cost function represents an aggregation or average of the loss functions."
]
},
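{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal sketch of the distinction (assuming TensorFlow is imported as `tf`, as in the code cells below): the loss is computed per training example, and the cost averages those losses over the dataset."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y_true = tf.constant([3.0, -0.5, 2.0])\n",
"y_pred = tf.constant([2.5, 0.0, 2.0])\n",
"\n",
"per_example_loss = tf.square(y_true - y_pred)  # squared (L2) loss, one value per sample\n",
"cost = tf.reduce_mean(per_example_loss)  # MSE: the average of the per-sample losses\n",
"\n",
"print(per_example_loss.numpy())\n",
"print(cost.numpy())"
]
},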
{
"cell_type": "markdown",
"metadata": {},
@@ -89,81 +73,6 @@
"- L2 Regularization (Ridge): Curbs the unchecked growth of model parameters without nullifying them, ensuring the model remains generalized without undue complexity."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Empirical Risk and Structural Risk\n",
"\n",
"### Definition\n",
"\n",
"Perhaps you've heard of these two concepts before. In the realms of machine learning and statistics, the concepts of empirical risk and structural risk are intricately tied to loss functions. **However, these terms aren't directly categories of loss functions perse.** Let's first clarify these concepts:\n",
"\n",
"1. **Empirical Risk:** Refers to the average loss of a model over a given dataset. Minimizing empirical risk focuses on reducing errors explicitly on the training data.\n",
"2. **Structural Risk:** Introduces a regularization term in addition to empirical risk, aiming to prevent overfitting. Minimizing structural risk strikes a balance between the empirical risk and the complexity of the model.\n",
"\n",
"Given these definitions:\n",
"\n",
"- **Empirical Risk:** Loss functions directly related to dataset performance fall under this category. From the ones been listed, regression losses (e.g., MSE, MAE, Huber Loss, L1 Loss, L2 Loss, Smooth L1 Loss), classification losses (e.g., Cross Entropy Loss, Hinge Loss, Log Loss), and structured losses (e.g., CTC or Image Segmentation Loss) can be seen as manifestations of empirical risk.\n",
"\n",
"- **Structural Risk:** Regularization losses, like L1 and L2 regularization, form part of structural risk. They don't measure the model's performance on the data directly but rather serve to rein in model complexity.\n",
"\n",
"### A Detaphor\n",
"\n",
"Maybe it's still abtract. So, now imagine you're a tailor trying to make a dress for a client.\n",
"\n",
"- **Empirical Risk:** This is like ensuring the dress fits the client perfectly based on a single fitting session. You measure every contour and make the dress to match those exact measurements. The dress is a perfect fit for the client on that particular day.\n",
"\n",
"However, what if the client gains or loses a little weight or wants to move more comfortably? A dress tailored too tightly to the exact measurements might not be very adaptable or comfortable in various situations.\n",
"\n",
"- **Structural Risk:** Now, consider that you decide to allow a bit more flexibility in the dress. You make it slightly adjustable, perhaps with some elastic portions. This way, even if the client's measurements change a bit, the dress will still fit comfortably. You're sacrificing a tiny bit of the \"perfect\" fit for the adaptability and general comfort.\n",
"\n",
"In the context of machine learning:\n",
"\n",
"Relying solely on **Empirical Risk** would be like fitting the dress exactly to the client's measurements, risking overfitting. If the data changes slightly, the model might perform poorly.\n",
"\n",
"Factoring in **Structural Risk** ensures the model isn't overly tailored to the training data and can generalize well to new, unseen data. It's about ensuring a balance between a perfect fit and adaptability.\n",
"\n",
"### Mathematical Explanation\n",
"\n",
"Now you have a general understanding of the meaning of empirical risk and structural risk. Let's delve into a more mathematical perspective:\n",
"\n",
"Given a dataset $\\mathcal{D}$ comprising input-output pairs $(x_1, y_1)$, $(x_2, y_2)$, ... $(x_n, y_n)$ and a model $f$ parameterized by $\\theta$, the empirical risk and structural risk can be formally defined as follows:\n",
"\n",
"**Empirical Risk(Cost Function):**\n",
"$$\n",
"R_{emp}(f) = \\frac{1}{n} \\sum_{i=1}^{n} L(y_i, f(x_i; \\theta))\n",
"$$\n",
"Where:\n",
"\n",
"- **$L$ is the loss function**, measuring the discrepancy between the predicted value $f(x_i; \\theta)$ and the actual output $y_i$.\n",
"\n",
"Empirical risk quantifies how well the model fits the given dataset, representing the average loss of the model on the training data.\n",
"\n",
"**Structural Risk(Objective Function):**\n",
"$$\n",
"R_{struc}(f) = R_{emp}(f) + \\lambda R_{reg}(\\theta)\n",
"$$\n",
"Where:\n",
"- $R_{reg}(\\theta)$ is the regularization term, penalizing the complexity of the model.\n",
"- $\\lambda$ is a regularization coefficient determining the weight of the regularization term relative to the empirical risk.\n",
"\n",
"Structural risk is a combination of the empirical risk and a penalty for model complexity. It strikes a balance between fitting the training data (empirical risk) and ensuring the model isn't overly complex (which can lead to overfitting).\n",
"\n",
"**Differences and Relations:**\n",
"\n",
"1. **Empirical Risk** focuses solely on minimizing the error on the training data without considering model complexity or how it generalizes to unseen data.\n",
"2. **Structural Risk** takes into account both the empirical risk and the complexity of the model. By introducing a regularization term, it ensures that the model doesn't become overly complex and overfit the training data. Thus, it balances performance on training data with generalization to new data.\n",
"\n",
"In essence, while empirical risk aims for performance on the current dataset, structural risk aims for good performance on new data by penalizing overly complex models.\n",
"\n",
"### Cost Function and Objective Function\n",
"\n",
"The empirical risk and cost functions are in many cases the same and represent the average loss on the training data.\n",
"\n",
"Structural risk is often viewed as an objective function, especially when regularization is considered. But the term \"objective function\" is broader and is not limited to structural risk but can also include other optimization objectives."
]
},
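{
"cell_type": "markdown",
"metadata": {},
"source": [
"A short numerical sketch of the two risks for a toy linear model $f(x) = wx$ with an L2 penalty (the data and the value of $\\lambda$ here are illustrative only):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"x = tf.constant([1.0, 2.0, 3.0])\n",
"y = tf.constant([2.0, 4.1, 5.9])\n",
"w = tf.Variable(2.5)\n",
"lam = 0.1  # regularization coefficient lambda\n",
"\n",
"empirical_risk = tf.reduce_mean(tf.square(y - w * x))  # average loss on the data\n",
"structural_risk = empirical_risk + lam * tf.square(w)  # add the complexity penalty\n",
"\n",
"print('Empirical risk:', empirical_risk.numpy())\n",
"print('Structural risk:', structural_risk.numpy())"
]
},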
{
"cell_type": "markdown",
"metadata": {},
@@ -271,34 +180,14 @@
"metadata": {},
"source": [
"### Classification Loss Functions\n",
"\n",
"1. **Cross Entropy Loss**\n",
"\n",
"$$\n",
"L(y, p) = - \\sum_{i=1}^{C} y_i \\log(p_i)\n",
"$$\n",
"\n",
"Where $y_i$ is the actual label (0 or 1) and $p_i$ is the predicted probability for the respective class."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y_true = tf.constant([[0, 1], [1, 0], [1, 0]])\n",
"y_pred = tf.constant([[0.05, 0.95], [0.1, 0.9], [0.8, 0.2]])\n",
"loss = tf.keras.losses.CategoricalCrossentropy()(y_true, y_pred)\n",
"\n",
"print(loss.numpy())"
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2. **Hinge Loss**\n",
"1. **Hinge Loss**\n",
"\n",
"$$\n",
"L(y, \\hat{y}) = \\max(0, 1 - y \\cdot \\hat{y})\n",
@@ -320,205 +209,6 @@
"print(loss.numpy())"
]
},
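{
"cell_type": "markdown",
"metadata": {},
"source": [
"The body of the code cell above is collapsed in this diff. A minimal sketch of such an example, assuming the built-in `tf.keras.losses.Hinge`, which expects labels in {-1, 1}:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y_true = tf.constant([-1.0, 1.0, 1.0])\n",
"y_pred = tf.constant([-0.8, 0.9, 0.2])\n",
"loss = tf.keras.losses.Hinge()(y_true, y_pred)  # mean of max(0, 1 - y * y_hat)\n",
"\n",
"print(loss.numpy())"
]
},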
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3. **Binary Cross Entropy(Log Loss)**\n",
"\n",
"Mathematically, it is the preferred loss function under the inference framework of maximum likelihood. It is the loss function to be evaluated first and only changed if you have a good reason.\n",
"\n",
"Cross-entropy will calculate a score that summarizes the average difference between the actual and predicted probability distributions for predicting class 1. The score is minimized and a perfect cross-entropy value is 0.\n",
"\n",
"This YouTube video by Andrew Ng explains very well Binary Cross Entropy Loss (make sure that you have access to YouTube for this web page to render correctly):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from IPython.display import HTML\n",
"\n",
"display(HTML(\n",
" \"\"\"\n",
" <iframe src=\"https://www.youtube.com/embed/SHEPb1JHw5o\" allowfullscreen></iframe>\n",
" \"\"\"\n",
"))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y_true = tf.constant([0, 1, 0])\n",
"y_pred = tf.constant([0.05, 0.95, 0.1])\n",
"loss = tf.keras.losses.BinaryCrossentropy()(y_true, y_pred)\n",
"\n",
"print(loss.numpy())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"4. **Multi-Class Cross-Entropy Loss**\n",
"\n",
"Mathematically, it is the preferred loss function under the inference framework of maximum likelihood. It is the loss function to be evaluated first and only changed if you have a good reason.\n",
"\n",
"Cross-entropy will calculate a score that summarizes the average difference between the actual and predicted probability distributions for all classes in the problem. The score is minimized and a perfect cross-entropy value is 0."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y_true = [[1, 0, 0],\n",
" [0, 1, 0],\n",
" [0, 0, 1],\n",
" [1, 0, 0],\n",
" [0, 1, 0]]\n",
"\n",
"# Mock predicted probabilities from a model\n",
"y_pred = [[0.7, 0.2, 0.1],\n",
" [0.2, 0.5, 0.3],\n",
" [0.1, 0.2, 0.7],\n",
" [0.6, 0.3, 0.1],\n",
" [0.1, 0.6, 0.3]]\n",
"\n",
"y_true = tf.constant(y_true, dtype=tf.float32)\n",
"y_pred = tf.constant(y_pred, dtype=tf.float32)\n",
"\n",
"loss = tf.reduce_mean(-tf.reduce_sum(y_true * tf.math.log(y_pred), axis=1))\n",
"\n",
"print(\"Multi-Class Cross-Entropy Loss:\", loss.numpy())"
]
},
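{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a sanity check, the same value can be obtained from the built-in Keras loss class, whose default reduction also averages over the batch:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"builtin_loss = tf.keras.losses.CategoricalCrossentropy()(y_true, y_pred)\n",
"\n",
"print('Built-in loss:', builtin_loss.numpy())  # should match the manual computation above"
]
},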
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Structured Loss Functions\n",
"\n",
"1. **CTC Loss (Connectionist Temporal Classification)**\n",
"\n",
"Used for sequence-to-sequence problems, like speech recognition."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"y_true = np.array([[1, 2]]) # (batch, timesteps)\n",
"y_pred = np.array([[[0.1, 0.6, 0.3], [0.3, 0.1, 0.6]]]) # (batch, timesteps, num_classes)\n",
"logit_length = [2]\n",
"label_length = [2]\n",
"loss = tf.keras.backend.ctc_batch_cost(y_true, y_pred, logit_length, label_length)\n",
"\n",
"print(loss.numpy())"
]
},
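{
"cell_type": "markdown",
"metadata": {},
"source": [
"For completeness, the same probability tensor can be decoded back into a label sequence. A sketch using `tf.keras.backend.ctc_decode` (assumptions: greedy decoding, and `input_length` given as a 1-D tensor of per-sample sequence lengths):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"decoded, log_probs = tf.keras.backend.ctc_decode(\n",
"    tf.constant(y_pred, dtype=tf.float32), input_length=tf.constant([2]), greedy=True)\n",
"\n",
"print(decoded[0].numpy())  # most likely label sequence for each sample"
]
},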
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2. **Dice Loss, IoU Loss**\n",
"\n",
"Used for image segmentation tasks."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def dice_loss(y_true, y_pred):\n",
" numerator = 2 * tf.reduce_sum(y_true * y_pred, axis=-1)\n",
" denominator = tf.reduce_sum(y_true + y_pred, axis=-1)\n",
" return 1 - (numerator + 1) / (denominator + 1)\n",
"\n",
"y_true = tf.constant([[1, 0, 1], [0, 1, 0]])\n",
"y_pred = tf.constant([[0.8, 0.2, 0.6], [0.3, 0.7, 0.1]])\n",
"loss = dice_loss(y_true, y_pred)\n",
"\n",
"print(loss.numpy())"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def iou_loss(y_true, y_pred):\n",
" intersection = tf.reduce_sum(y_true * y_pred, axis=[1, 2, 3])\n",
" union = tf.reduce_sum(y_true, axis=[1, 2, 3]) + tf.reduce_sum(y_pred, axis=[1, 2, 3]) - intersection\n",
" return 1. - (intersection + 1) / (union + 1)\n",
"\n",
"# For simplicity, using 2D tensors. Typically, these are images (3D tensors).\n",
"y_true = tf.constant([[1, 0, 1], [0, 1, 0]])\n",
"y_pred = tf.constant([[0.8, 0.2, 0.6], [0.3, 0.7, 0.1]])\n",
"loss = iou_loss(y_true[tf.newaxis, ...], y_pred[tf.newaxis, ...]) # Add batch dimension\n",
"\n",
"print(loss.numpy())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Regularization\n",
"\n",
"1. **L1 Regularization (Lasso)**\n",
"\n",
"Produces sparse model parameters."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from keras.regularizers import l1\n",
"\n",
"model = tf.keras.models.Sequential([\n",
" tf.keras.layers.Dense(64, activation='relu', kernel_regularizer=l1(0.01), input_shape=(10,))\n",
"])"
]
},
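{
"cell_type": "markdown",
"metadata": {},
"source": [
"Keras collects the penalty term separately from the data loss and adds it during training. A quick way to inspect the structural-risk contribution of the model above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"_ = model(tf.zeros((1, 10)))  # run a dummy batch so the penalty tensor is materialized\n",
"print(model.losses)  # the L1 penalty on the Dense kernel, added to the data loss"
]
},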
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2. **L2 Regularization (Ridge)**\n",
"\n",
"Prevents model parameters from becoming too large but doesn't force them to become exactly zero."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from keras.regularizers import l2\n",
"\n",
"model = tf.keras.models.Sequential([\n",
" tf.keras.layers.Dense(64, activation='relu', kernel_regularizer=l2(0.01), input_shape=(10,))\n",
"])"
]
},
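{
"cell_type": "markdown",
"metadata": {},
"source": [
"The two penalties can also be combined (elastic-net style); a sketch using `l1_l2` from the same `keras.regularizers` module:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from keras.regularizers import l1_l2\n",
"\n",
"model = tf.keras.models.Sequential([\n",
"    tf.keras.layers.Dense(64, activation='relu', kernel_regularizer=l1_l2(l1=0.01, l2=0.01), input_shape=(10,))\n",
"])"
]
},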
{
"cell_type": "markdown",
"metadata": {},
2 changes: 1 addition & 1 deletion _sources/assignments/ml-fundamentals/parameter-play.ipynb
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "77832121",
"id": "c4f4e7c1",
"metadata": {},
"source": [
"# Parameter play\n",
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "304850dd",
"id": "be4b65ba",
"metadata": {},
"source": [
"# Regression with Scikit-learn\n",
@@ -2,7 +2,7 @@
"cells": [
{
"cell_type": "markdown",
"id": "7798e2d4",
"id": "46278fc3",
"metadata": {},
"source": [
"# Retrying some regression\n",