Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent 037e0a7, commit 29fc32d
Showing 106 changed files with 7,178 additions and 3 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,68 @@ | ||
# cutagi-doc | ||
Documentation Website for cuTAGI | ||
<!------------------------------------------------------------------- | ||
File: tutorial.md | ||
Description: FNN tutorial with 1D data | ||
Authors: Miquel Florensa & Luong-Ha Nguyen & James-A. Goulet | ||
Created: March 04, 2023 | ||
Updated: March 04, 2023 | ||
Contact: [email protected] & [email protected] & [email protected] | ||
Copyright (c) 2023 Miquel Florensa & Luong-Ha Nguyen & James-A. Goulet. Some rights reserved. | ||
--------------------------------------------------------------------> | ||
|
||
# py/cuTAGI Documentation | ||
|
||
> cuTAGI is an open-source Bayesian neural network library based on the Tractable Approximate Gaussian Inference (TAGI) theory. It supports various neural network architectures, including fully-connected, convolutional, and transpose convolutional layers, as well as skip connections, pooling, and normalization layers. cuTAGI can perform supervised, unsupervised, and reinforcement learning tasks. The library has a Python API called pyTAGI that gives users easy access to the C++ and CUDA libraries. | ||
|
||
## Getting Started | ||
|
||
To get started with using our library, check out our: | ||
|
||
- [installation guide](guide/install.md) for Windows, macOS, and Linux (CPU + GPU). | ||
- [quick tutorial](guide/quick-tutorial.md) for a 1D toy problem. | ||
|
||
## Examples | ||
|
||
In this section, you will find a series of [examples](examples/examples.md) for each available architecture that you can use as a starting point. | ||
|
||
## API | ||
|
||
Check out our [API reference](api/api.md) for a complete list of all the functions and classes in our library. | ||
|
||
## Modules | ||
|
||
pyTAGI includes a set of modules that let users build their own models. Check out our [modules reference](modules/modules.md) for a list of classes and functions. | ||
|
||
## Contributing | ||
|
||
We welcome contributions from the community. To contribute: 1) fork the project, 2) create a feature branch, and 3) commit your changes. | ||
|
||
## Support | ||
|
||
If you run into any issues or have any questions, please [open an issue](https://github.com/lhnguyen102/cuTAGI/issues) or contact us at *[email protected]* or *[email protected]*. | ||
|
||
## Citation | ||
|
||
``` | ||
@misc{cutagi2022, | ||
Author = {Luong-Ha Nguyen and James-A. Goulet}, | ||
Title = {cu{TAGI}: a {CUDA} library for {B}ayesian neural networks with Tractable Approximate {G}aussian Inference}, | ||
Year = {2022}, | ||
journal = {GitHub repository}, | ||
howpublished = {https://github.com/lhnguyen102/cuTAGI} | ||
} | ||
``` | ||
|
||
## References | ||
|
||
* [Tractable approximate Gaussian inference for Bayesian neural networks](https://www.jmlr.org/papers/volume22/20-1009/20-1009.pdf) (James-A. Goulet, Luong-Ha Nguyen, and Said Amiri. JMLR, 2021) | ||
* [Analytically tractable hidden-states inference in Bayesian neural networks](https://www.jmlr.org/papers/volume23/21-0758/21-0758.pdf) (Luong-Ha Nguyen and James-A. Goulet. JMLR, 2022) | ||
* [Analytically tractable inference in deep neural networks](https://arxiv.org/pdf/2103.05461.pdf) (Luong-Ha Nguyen and James-A. Goulet. ArXiv, 2021) | ||
* [Analytically tractable Bayesian deep Q-Learning](https://arxiv.org/pdf/2106.11086.pdf) (Luong-Ha Nguyen and James-A. Goulet. ArXiv, 2021) | ||
|
||
|
||
## License | ||
|
||
cuTAGI is licensed under the [MIT License](https://github.com/lhnguyen102/cuTAGI/blob/main/LICENSE). | ||
|
||
## Acknowledgement | ||
We would like to say a big thank you to Miquel Florensa, who wrote and put together this document all by himself, showing hard work and a commitment to sharing clear and detailed information. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
- [**Home**](/) | ||
- [About py/cuTAGI](about.md) | ||
- [Our Team](team.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
<!------------------------------------------------------------------- | ||
File: about.md | ||
Description: | ||
Authors: Miquel Florensa & Luong-Ha Nguyen & James-A. Goulet | ||
Created: March 04, 2023 | ||
Updated: March 04, 2023 | ||
Contact: [email protected] & [email protected] & [email protected] | ||
Copyright (c) 2023 Miquel Florensa & Luong-Ha Nguyen & James-A. Goulet. Some rights reserved. | ||
--------------------------------------------------------------------> | ||
|
||
# About py/cuTAGI | ||
|
||
The core developments of py/cuTAGI have been made by Luong-Ha Nguyen, building upon the theoretical work done at Polytechnique Montreal in collaboration with James-A. Goulet, Bhargob Deka, Van-Dai Vuong, and Miquel Florensa. The project started in 2018 when, drawing on our background in large-scale state-space models, we foresaw that it would be possible to perform analytical Bayesian inference in neural networks (see below our first attempt at what would become TAGI). | ||
<p align="center"> | ||
<img src="./images/TAGI_2018.png" width="40%" alt="TAGI initial trial in 2018"> | ||
</p> | ||
Following the early proofs of concept on small-scale examples with single-layer MLPs, we gradually extended TAGI to CNN, autoencoder, and GAN architectures. Then came proofs of concept on reinforcement learning toy problems, which led to full-scale applications on the Atari and MuJoCo benchmarks. The expansion of TAGI's applicability to new architectures continued with LSTM networks, along with unprecedented features such as analytical uncertainty quantification for Bayesian neural networks, analytical adversarial attacks, inference-based optimization, and general-purpose latent-space inference. | ||
|
||
Despite our repeated successes at leveraging analytical inference in neural networks, the key remaining limitation was the lack of an efficient and scalable library for TAGI: because the method relies on neither backpropagation nor gradient descent, it is incompatible with traditional libraries such as PyTorch or TensorFlow. In 2021, Luong-Ha Nguyen decided to lead the development of the new cuTAGI platform, and later the pyTAGI API, with the objective of opening the capabilities of TAGI to the entire community. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
- pyTAGI API | ||
|
||
- [Metrics](api/metrics.md) | ||
- [NetProp](api/netprop.md) | ||
- [Param](api/param.md) | ||
- [TAGI Network](api/network.md) | ||
- [TAGI Utils](api/utils.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
# API | ||
|
||
- [Metrics](api/metrics.md) | ||
- [NetProp](api/netprop.md) | ||
- [Param](api/param.md) | ||
- [TAGI Network](api/network.md) | ||
- [TAGI Utils](api/utils.md) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,72 @@ | ||
# metric.py | ||
|
||
Measure the accuracy of the prediction. | ||
|
||
<a href="https://github.com/miquelflorensa/cuTAGI/blob/main/pytagi/metric.py" class="github-link"> | ||
<div class="github-icon-container"> | ||
<img src="../images/GitHub-Mark.png" alt="GitHub" height="32" width="64"> | ||
</div> | ||
<div class="github-text-container"> | ||
GitHub source code | ||
</div> | ||
</a> | ||
|
||
|
||
## *mse* method | ||
|
||
```python | ||
def mse(prediction: np.ndarray, observation: np.ndarray) -> float: | ||
    """Mean squared error""" | ||
``` | ||
|
||
> Calculates the mean squared error between the prediction and observation arrays. | ||
|
||
**Parameters** | ||
- `prediction` (numpy.ndarray): Array containing the predicted values. | ||
- `observation` (numpy.ndarray): Array containing the observed values. | ||
|
||
**Returns** | ||
- `float`: Mean squared error. | ||
|
||
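The computation described above can be sketched with plain NumPy. This is a minimal illustration, not the pyTAGI source: the name `mse_sketch` is hypothetical, and the sketch assumes both arrays have the same shape.

```python
import numpy as np


def mse_sketch(prediction: np.ndarray, observation: np.ndarray) -> float:
    """Mean of the squared differences (illustrative, not the pyTAGI code)."""
    prediction = np.asarray(prediction, dtype=float)
    observation = np.asarray(observation, dtype=float)
    return float(np.mean((prediction - observation) ** 2))


pred = np.array([1.0, 2.0, 3.0])
obs = np.array([1.0, 2.0, 5.0])
# The squared errors are 0, 0, and 4, so the mean is 4/3.
print(mse_sketch(pred, obs))
```
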
## *log_likelihood* method | ||
|
||
```python | ||
def log_likelihood(prediction: np.ndarray, observation: np.ndarray, std: np.ndarray) -> float: | ||
    """Compute the averaged log-likelihood""" | ||
``` | ||
|
||
> Calculates the averaged log-likelihood between the prediction and observation arrays. | ||
|
||
**Parameters** | ||
- `prediction` (numpy.ndarray): Array containing the predicted values. | ||
- `observation` (numpy.ndarray): Array containing the observed values. | ||
- `std` (numpy.ndarray): Array containing the standard deviations. | ||
|
||
**Returns** | ||
- `float`: Averaged log-likelihood. | ||
|
||
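To make the role of `std` concrete, the sketch below averages the log-density of independent Gaussian predictions. It assumes a Gaussian likelihood for each observation; the exact expression used by pyTAGI is not shown on this page, and `gaussian_log_likelihood_sketch` is a hypothetical name.

```python
import numpy as np


def gaussian_log_likelihood_sketch(
    prediction: np.ndarray, observation: np.ndarray, std: np.ndarray
) -> float:
    """Average Gaussian log-density of the observations (illustrative only)."""
    variance = np.asarray(std, dtype=float) ** 2
    residual = np.asarray(observation, dtype=float) - np.asarray(prediction, dtype=float)
    log_density = -0.5 * np.log(2.0 * np.pi * variance) - 0.5 * residual**2 / variance
    return float(np.mean(log_density))


obs = np.array([3.0])
pred = np.array([0.0])
# Same prediction error, two different predicted standard deviations.
sharp = gaussian_log_likelihood_sketch(pred, obs, np.array([1.0]))
wide = gaussian_log_likelihood_sketch(pred, obs, np.array([3.0]))
print(wide > sharp)  # the overconfident (sharp) prediction scores lower
```

Unlike the mean squared error, this metric depends on the predicted uncertainty: a large error paired with a small standard deviation is penalized more than the same error paired with a standard deviation that covers it.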
## *rmse* method | ||
|
||
```python | ||
def rmse(prediction: np.ndarray, observation: np.ndarray) -> None: | ||
    """Root mean squared error""" | ||
``` | ||
|
||
> Calculates the root mean squared error between the prediction and observation arrays. | ||
|
||
**Parameters** | ||
- `prediction` (numpy.ndarray): Array containing the predicted values. | ||
- `observation` (numpy.ndarray): Array containing the observed values. | ||
|
||
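As a rough illustration, the root mean squared error is the square root of the mean squared error, which brings the value back to the units of the observations. The helper name `rmse_sketch` is hypothetical and the code is not taken from pyTAGI.

```python
import numpy as np


def rmse_sketch(prediction: np.ndarray, observation: np.ndarray) -> float:
    """Square root of the mean squared difference (illustrative only)."""
    difference = np.asarray(prediction, dtype=float) - np.asarray(observation, dtype=float)
    return float(np.sqrt(np.mean(difference**2)))


# Errors of 3 and 4 give squared errors of 9 and 16, whose mean is 12.5.
value = rmse_sketch(np.array([3.0, 4.0]), np.array([0.0, 0.0]))
print(value)
```
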
## *classification_error* method | ||
|
||
```python | ||
def classification_error(prediction: np.ndarray, label: np.ndarray) -> None: | ||
    """Compute the classification error""" | ||
``` | ||
|
||
> Computes the classification error between the prediction and label arrays. | ||
|
||
**Parameters** | ||
- `prediction` (numpy.ndarray): Array containing the predicted values. | ||
- `label` (numpy.ndarray): Array containing the true labels. |
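
One common way to compute a classification error is the fraction of samples whose predicted class differs from the true label. The sketch below assumes that `prediction` already holds predicted class indices (not probabilities); `classification_error_sketch` is a hypothetical name and may differ from the pyTAGI implementation.

```python
import numpy as np


def classification_error_sketch(prediction: np.ndarray, label: np.ndarray) -> float:
    """Fraction of misclassified samples (illustrative only)."""
    prediction = np.asarray(prediction).ravel()
    label = np.asarray(label).ravel()
    if prediction.shape != label.shape:
        raise ValueError("prediction and label must contain the same number of samples")
    return float(np.mean(prediction != label))


predicted_classes = np.array([0, 1, 2, 2])
true_labels = np.array([0, 1, 1, 2])
# One of the four samples is misclassified.
print(classification_error_sketch(predicted_classes, true_labels))
```
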
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
# The NetProp class | ||
|
||
The `NetProp` class is a base class for network properties defined in the backend C++/CUDA layer. It provides various attributes and methods for defining network architecture and properties. | ||
|
||
<a href="https://github.com/miquelflorensa/cuTAGI/blob/main/pytagi/tagi_network.py" class="github-link"> | ||
<div class="github-icon-container"> | ||
<img src="../images/GitHub-Mark.png" alt="GitHub" height="32" width="64"> | ||
</div> | ||
<div class="github-text-container"> | ||
GitHub source code | ||
</div> | ||
</a> | ||
|
||
## Attributes | ||
|
||
- `layers`: A list containing the [layer](api/netprop?id=layer-code) code of each layer in the network architecture. | ||
- `nodes`: A list containing the number of hidden units for each layer. | ||
- `kernels`: A list containing the kernel sizes for convolutional layers. | ||
- `strides`: A list containing the strides for convolutional layers. | ||
- `widths`: A list containing the widths of the images. | ||
- `heights`: A list containing the heights of the images. | ||
- `filters`: A list containing the number of filters (depth of image) for each layer. | ||
- `activations`: A list containing the [activation](api/netprop?id=activation-code) code for each layer. | ||
- `pads`: A list containing the padding applied to the images. | ||
- `pad_types`: A list containing the types of padding. | ||
- `shortcuts`: A list containing the layer indices for residual networks. | ||
- `mu_v2b`: A NumPy array representing the mean of the observation noise squared. | ||
- `sigma_v2b`: A NumPy array representing the standard deviation of the observation noise squared. | ||
- `sigma_v`: A float representing the observation noise. | ||
- `decay_factor_sigma_v`: A float representing the decaying factor for sigma v (default value: 0.99). | ||
- `sigma_v_min`: A float representing the minimum value of the observation noise (default value: 0.3). | ||
- `sigma_x`: A float representing the input noise. | ||
- `is_idx_ud`: A boolean indicating whether or not to update only hidden units in the output layers. | ||
- `is_output_ud`: A boolean indicating whether or not to update the output layer. | ||
- `last_backward_layer`: An integer representing the index of the last layer whose hidden states are updated. | ||
- `nye`: An integer representing the number of observations for hierarchical softmax. | ||
- `noise_gain`: A float representing the gain for the bias parameters associated with the noise's hidden states. | ||
- `noise_type`: A string indicating whether the noise is homoscedastic or heteroscedastic. | ||
- `batch_size`: An integer representing the number of observations in each batch of data. | ||
- `input_seq_len`: An integer representing the sequence length for LSTM inputs. | ||
- `output_seq_len`: An integer representing the sequence length for the outputs of the last layer. | ||
- `seq_stride`: An integer representing the spacing between sequences for the LSTM layer. | ||
- `multithreading`: A boolean indicating whether or not to run parallel computing using multiple threads. | ||
- `collect_derivative`: A boolean indicating whether to enable the derivative computation mode. | ||
- `is_full_cov`: A boolean indicating whether to enable the full covariance mode. | ||
- `init_method`: A string representing the initialization method, e.g., He or Xavier. | ||
- `device`: A string indicating either "cpu" or "cuda". | ||
- `ra_mt`: A float representing the momentum for the normalization layer. | ||
|
||
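The attributes `sigma_v`, `decay_factor_sigma_v`, and `sigma_v_min` suggest an observation noise that shrinks during training until it reaches a floor. The sketch below shows one plausible reading, a multiplicative decay applied once per epoch; this schedule and the name `decay_sigma_v` are assumptions, not the documented pyTAGI behavior. The default values 0.99 and 0.3 are the ones listed above.

```python
def decay_sigma_v(
    sigma_v: float, decay_factor: float = 0.99, sigma_v_min: float = 0.3
) -> float:
    """Shrink the observation noise by one step, never going below the floor.

    Hypothetical helper: assumes a multiplicative, per-epoch decay.
    """
    return max(sigma_v * decay_factor, sigma_v_min)


sigma_v = 2.0  # illustrative starting value, not a library default
history = []
for epoch in range(300):
    history.append(sigma_v)
    sigma_v = decay_sigma_v(sigma_v)

# After enough epochs the noise sits at the floor value.
print(sigma_v)
```
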
## Example | ||
|
||
```python | ||
from pytagi import NetProp | ||
|
||
class RegressionMLP(NetProp): | ||
    """Multi-layer perceptron for regression task""" | ||
|
||
    def __init__(self) -> None: | ||
        super().__init__() | ||
        self.layers = [1, 1, 1, 1]  # layer code 1: fully-connected | ||
        self.nodes = [13, 50, 50, 1]  # number of units in each layer | ||
        self.activations = [0, 4, 4, 0]  # 0: no activation, 4: ReLU | ||
        self.batch_size = 10 | ||
        self.sigma_v = 0.3  # observation noise | ||
        self.sigma_v_min: float = 0.3  # minimum value of the observation noise | ||
        self.device = "cpu" | ||
``` | ||
|
||
## Layer Code | ||
The following layer codes are used to represent different types of layers in the network: | ||
|
||
- 1: Fully-connected layer | ||
- 2: Convolutional layer | ||
- 21: Transpose convolutional layer | ||
- 3: Max pooling layer (currently not supported) | ||
- 4: Average pooling | ||
- 5: Layer normalization | ||
- 6: Batch normalization | ||
- 7: LSTM layer | ||
|
||
## Activation Code | ||
The following activation codes are used to represent different activation functions: | ||
|
||
- 0: No activation | ||
- 1: Tanh | ||
- 2: Sigmoid | ||
- 4: ReLU | ||
- 5: Softplus | ||
- 6: Leaky ReLU | ||
- 7: Mixture ReLU | ||
- 8: Mixture bounded ReLU | ||
- 9: Mixture sigmoid | ||
- 10: Softmax with local linearization | ||
- 11: Remax | ||
- 12: Hierarchical softmax |
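
Because architectures are written as lists of integer codes, a small lookup table can make them easier to read back. The sketch below copies the two code tables from this page into dictionaries; `LAYER_CODES`, `ACTIVATION_CODES`, and `describe_architecture` are hypothetical names and are not part of pyTAGI.

```python
from typing import List

# Code tables transcribed from the lists above.
LAYER_CODES = {
    1: "Fully-connected", 2: "Convolutional", 21: "Transpose convolutional",
    3: "Max pooling", 4: "Average pooling", 5: "Layer normalization",
    6: "Batch normalization", 7: "LSTM",
}
ACTIVATION_CODES = {
    0: "No activation", 1: "Tanh", 2: "Sigmoid", 4: "ReLU", 5: "Softplus",
    6: "Leaky ReLU", 7: "Mixture ReLU", 8: "Mixture bounded ReLU",
    9: "Mixture sigmoid", 10: "Softmax with local linearization",
    11: "Remax", 12: "Hierarchical softmax",
}


def describe_architecture(
    layers: List[int], nodes: List[int], activations: List[int]
) -> List[str]:
    """Return one readable line per layer (hypothetical helper)."""
    if not len(layers) == len(nodes) == len(activations):
        raise ValueError("layers, nodes, and activations must have the same length")
    return [
        f"{index}: {LAYER_CODES[layer]}, {units} units, {ACTIVATION_CODES[act]}"
        for index, (layer, units, act) in enumerate(zip(layers, nodes, activations))
    ]


# The lists used by the RegressionMLP example.
summary = describe_architecture([1, 1, 1, 1], [13, 50, 50, 1], [0, 4, 4, 0])
for line in summary:
    print(line)
```

An unknown code raises a `KeyError`, which is a cheap way to catch typos in an architecture definition before it is passed to the backend.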