This project is inspired by the original Dive into Deep Learning book by Aston Zhang, Zachary C. Lipton, Mu Li, Alexander J. Smola, and all the community contributors. We have made an effort to modify the book and convert the MXNet code snippets into PyTorch.
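To give a feel for what the converted notebooks contain, here is a minimal PyTorch sketch in the spirit of the concise linear regression section (5.3). It is not taken verbatim from any notebook; the synthetic data, batch size, and learning rate are illustrative placeholders only.

```python
import torch
from torch import nn

# Synthetic data for y = Xw + b + noise (illustrative values only).
true_w = torch.tensor([2.0, -3.4])
true_b = 4.2
X = torch.randn(1000, 2)
y = X @ true_w + true_b + 0.01 * torch.randn(1000)

net = nn.Linear(2, 1)                        # a single fully connected layer
loss = nn.MSELoss()
trainer = torch.optim.SGD(net.parameters(), lr=0.03)

for epoch in range(3):
    for i in range(0, X.shape[0], 10):       # mini-batches of size 10
        Xb, yb = X[i:i + 10], y[i:i + 10]
        l = loss(net(Xb).squeeze(-1), yb)
        trainer.zero_grad()
        l.backward()
        trainer.step()
    with torch.no_grad():
        print(f'epoch {epoch + 1}, loss {loss(net(X).squeeze(-1), y).item():f}')
```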
Note: Some .ipynb notebooks may not render perfectly on GitHub. We suggest cloning the repo or using nbviewer to view the notebooks.
- Ch02 Installation
- Ch03 Introduction
- Ch04 The Preliminaries: A Crashcourse
- Ch05 Linear Neural Networks
    - 5.1 Linear Regression
    - 5.2 Linear Regression Implementation from Scratch
    - 5.3 Concise Implementation of Linear Regression
    - 5.4 Softmax Regression
    - 5.5 Image Classification Data (Fashion-MNIST)
    - 5.6 Implementation of Softmax Regression from Scratch
    - 5.7 Concise Implementation of Softmax Regression
- Ch06 Multilayer Perceptrons
    - 6.1 Multilayer Perceptron
    - 6.2 Implementation of Multilayer Perceptron from Scratch
    - 6.3 Concise Implementation of Multilayer Perceptron
    - 6.4 Model Selection Underfitting and Overfitting
    - 6.5 Weight Decay
    - 6.6 Dropout
    - 6.7 Forward Propagation Backward Propagation and Computational Graphs
    - 6.8 Numerical Stability and Initialization
    - 6.9 Considering the Environment
    - 6.10 Predicting House Prices on Kaggle
- Ch07 Deep Learning Computation
    - 7.1 Layers and Blocks
    - 7.2 Parameter Management
    - 7.3 Deferred Initialization
    - 7.4 Custom Layers
    - 7.5 File I/O
    - 7.6 GPUs
- Ch08 Convolutional Neural Networks
- Ch09 Modern Convolutional Networks
- Ch10 Recurrent Neural Networks
    - 10.1 Sequence Models
    - 10.2 Language Models
    - 10.3 Recurrent Neural Networks
    - 10.4 Text Preprocessing
    - 10.5 Implementation of Recurrent Neural Networks from Scratch
    - 10.6 Concise Implementation of Recurrent Neural Networks
    - 10.7 Backpropagation Through Time
    - 10.8 Gated Recurrent Units (GRU)
    - 10.9 Long Short Term Memory (LSTM)
    - 10.10 Deep Recurrent Neural Networks
    - 10.11 Bidirectional Recurrent Neural Networks
    - 10.12 Machine Translation and DataSets
    - 10.13 Encoder-Decoder Architecture
    - 10.14 Sequence to Sequence
    - 10.15 Beam Search
- Ch11 Attention Mechanism
    - 11.1 Attention Mechanism
    - 11.2 Sequence to Sequence with Attention Mechanism
    - 11.3 Transformer
- Ch12 Optimization Algorithms
    - 12.1 Optimization and Deep Learning
    - 12.2 Convexity
    - 12.3 Gradient Descent
    - 12.4 Stochastic Gradient Descent
    - 12.5 Mini-batch Stochastic Gradient Descent
    - 12.6 Momentum
    - 12.7 Adagrad
    - 12.8 RMSProp
    - 12.9 Adadelta
    - 12.10 Adam
- Please feel free to open a Pull Request to contribute a notebook in PyTorch for the rest of the chapters.
- Strictly follow the naming conventions for the IPython Notebooks and the subsections.
- Also, if you think any section requires more or better explanation, please open an issue on the issue tracker and let us know. We'll get back to you as soon as possible.
- Find some code that needs improvement and submit a pull request.
- Find a reference that we missed and submit a pull request.
- Try not to submit huge pull requests, since that makes them hard to understand and incorporate. It is better to send several smaller ones.
If you like this repo and find it useful, please consider (★) starring it, so that it can reach a broader audience.
[1] Original book: Dive into Deep Learning -> GitHub Repo
[2] Deep Learning - The Straight Dope
[3] PyTorch - MXNet Cheatsheet
If you use this work or code for your research, please cite the original book with the following BibTeX entry.
@book{zhang2019dive,
    title={Dive into Deep Learning},
    author={Aston Zhang and Zachary C. Lipton and Mu Li and Alexander J. Smola},
    note={\url{http://www.d2l.ai}},
    year={2019}
}