
Commit

Init commit
lhnguyen102 committed Sep 9, 2023
1 parent 037e0a7 commit 29fc32d
Showing 106 changed files with 7,178 additions and 3 deletions.
Binary file added .DS_Store
2 changes: 1 addition & 1 deletion LICENSE
@@ -1,6 +1,6 @@
MIT License

-Copyright (c) 2023 CIVML
+Copyright (c) 2023 BayesWorks

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
70 changes: 68 additions & 2 deletions README.md
@@ -1,2 +1,68 @@
-# cutagi-doc
-Documentation Website for cuTAGI
<!-------------------------------------------------------------------
File: tutorial.md
Description: FNN tutorial with 1D data
Authors: Miquel Florensa & Luong-Ha Nguyen & James-A. Goulet
Created: March 04, 2023
Updated: March 04, 2023
Contact: [email protected] & [email protected] & [email protected]
Copyright (c) 2023 Miquel Florensa & Luong-Ha Nguyen & James-A. Goulet. Some rights reserved.
-------------------------------------------------------------------->

# py/cuTAGI Documentation

> cuTAGI is an open-source Bayesian neural network library based on the theory of Tractable Approximate Gaussian Inference (TAGI). It supports various neural network architectures such as fully-connected, convolutional, and transpose convolutional layers, as well as skip connections, pooling, and normalization layers. cuTAGI can perform different tasks such as supervised, unsupervised, and reinforcement learning. The library has a Python API, pyTAGI, that allows users to easily use the underlying C++ and CUDA libraries.

## Getting Started

To get started with using our library, check out our:

- [installation guide](guide/install.md) for Windows, MacOS, and Linux (CPU + GPU).
- [quick tutorial](guide/quick-tutorial.md) for a 1D toy problem.

## Examples

In this section, you will find a series of [examples](examples/examples.md) for each available architecture that you can use as a starting point.

## API

Check out our [API reference](api/api.md) for a complete list of all the functions and classes in our library.

## Modules

pyTAGI includes a set of modules that allow users to build their own models. Check out our [modules reference](modules/modules.md) for a list of classes and functions.

## Contributing

We welcome contributions from the community: 1) fork the project, 2) create a feature branch, and 3) commit your changes.

## Support

If you run into any issues or have any questions, please [open an issue](https://github.com/lhnguyen102/cuTAGI/issues) or contact us at *[email protected]* or *[email protected]*.

## Citation

```bibtex
@misc{cutagi2022,
  author       = {Luong-Ha Nguyen and James-A. Goulet},
  title        = {cu{TAGI}: a {CUDA} library for {B}ayesian neural networks with Tractable Approximate {G}aussian Inference},
  year         = {2022},
  journal      = {GitHub repository},
  howpublished = {\url{https://github.com/lhnguyen102/cuTAGI}}
}
```

## References

* [Tractable approximate Gaussian inference for Bayesian neural networks](https://www.jmlr.org/papers/volume22/20-1009/20-1009.pdf) (James-A. Goulet, Luong-Ha Nguyen, and Said Amiri. JMLR, 2021)
* [Analytically tractable hidden-states inference in Bayesian neural networks](https://www.jmlr.org/papers/volume23/21-0758/21-0758.pdf) (Luong-Ha Nguyen and James-A. Goulet. JMLR, 2022)
* [Analytically tractable inference in deep neural networks](https://arxiv.org/pdf/2103.05461.pdf) (Luong-Ha Nguyen and James-A. Goulet. arXiv, 2021)
* [Analytically tractable Bayesian deep Q-Learning](https://arxiv.org/pdf/2106.11086.pdf) (Luong-Ha Nguyen and James-A. Goulet. arXiv, 2021)


## License

cuTAGI is licensed under the [MIT License](https://github.com/lhnguyen102/cuTAGI/blob/main/LICENSE).

## Acknowledgement
We would like to thank Miquel Florensa, who wrote and assembled this documentation single-handedly; his hard work and commitment to sharing clear, detailed information are greatly appreciated.
3 changes: 3 additions & 0 deletions _sidebar.md
@@ -0,0 +1,3 @@
- [**Home**](/)
- [About py/cuTAGI](about.md)
- [Our Team](team.md)
19 changes: 19 additions & 0 deletions about.md
@@ -0,0 +1,19 @@
<!-------------------------------------------------------------------
File: about.md
Description:
Authors: Miquel Florensa & Luong-Ha Nguyen & James-A. Goulet
Created: March 04, 2023
Updated: March 04, 2023
Contact: [email protected] & [email protected] & [email protected]
Copyright (c) 2023 Miquel Florensa & Luong-Ha Nguyen & James-A. Goulet. Some rights reserved.
-------------------------------------------------------------------->

# About py/cuTAGI

The core development of py/cuTAGI has been carried out by Luong-Ha Nguyen, building upon the theoretical work done at Polytechnique Montreal in collaboration with James-A. Goulet, Bhargob Deka, Van-Dai Vuong, and Miquel Florensa. The project started in 2018 when, drawing on our background with large-scale state-space models, we foresaw that it would be possible to perform analytical Bayesian inference in neural networks (see below our first attempt at what would become TAGI).
<p align="center">
  <img src="./images/TAGI_2018.png" width="40%" alt="TAGI initial trial in 2018">
</p>
Following the early proofs of concept on small-scale examples with single-layer MLPs, we gradually expanded the development of TAGI to CNN, autoencoder, and GAN architectures. Then came proofs of concept on reinforcement-learning toy problems, which led to full-scale applications on the Atari and MuJoCo benchmarks. The expansion of TAGI's applicability to new architectures continued with LSTM networks, along with unprecedented features such as analytical uncertainty quantification for Bayesian neural networks, analytical adversarial attacks, inference-based optimization, and general-purpose latent-space inference.

Despite our repeated successes at leveraging analytical inference in neural networks, the key remaining limitation was the lack of an efficient and scalable library for TAGI; as the method relies on neither backpropagation nor gradient descent, it is incompatible with traditional libraries such as PyTorch or TensorFlow. In 2021, Luong-Ha Nguyen decided to lead the development of the new cuTAGI platform, and later the pyTAGI API, with the objective of opening the capabilities of TAGI to the entire community.
7 changes: 7 additions & 0 deletions api/_sidebar.md
@@ -0,0 +1,7 @@
- pyTAGI API

- [Metrics](api/metrics.md)
- [NetProp](api/netprop.md)
- [Param](api/param.md)
- [TAGI Network](api/network.md)
- [TAGI Utils](api/utils.md)
7 changes: 7 additions & 0 deletions api/api.md
@@ -0,0 +1,7 @@
# API

- [Metrics](api/metrics.md)
- [NetProp](api/netprop.md)
- [Param](api/param.md)
- [TAGI Network](api/network.md)
- [TAGI Utils](api/utils.md)
72 changes: 72 additions & 0 deletions api/metrics.md
@@ -0,0 +1,72 @@
# metric.py

Functions for measuring the accuracy of predictions.

<a href="https://github.com/miquelflorensa/cuTAGI/blob/main/pytagi/metric.py" class="github-link">
<div class="github-icon-container">
<img src="../images/GitHub-Mark.png" alt="GitHub" height="32" width="64">
</div>
<div class="github-text-container">
GitHub source code
</div>
</a>


## *mse* method

```python
def mse(prediction: np.ndarray, observation: np.ndarray) -> float:
"""Mean squared error"""
```

> Calculates the mean squared error between the prediction and observation arrays.

**Parameters**
- `prediction` (numpy.ndarray): Array containing the predicted values.
- `observation` (numpy.ndarray): Array containing the observed values.

**Returns**
- `float`: Mean squared error.

## *log_likelihood* method

```python
def log_likelihood(prediction: np.ndarray, observation: np.ndarray, std: np.ndarray) -> float:
"""Compute the averaged log-likelihood"""
```

> Calculates the averaged log-likelihood between the prediction and observation arrays.

**Parameters**
- `prediction` (numpy.ndarray): Array containing the predicted values.
- `observation` (numpy.ndarray): Array containing the observed values.
- `std` (numpy.ndarray): Array containing the standard deviations.

**Returns**
- `float`: Averaged log-likelihood.

## *rmse* method

```python
def rmse(prediction: np.ndarray, observation: np.ndarray) -> None:
"""Root mean squared error"""
```

> Calculates the root mean squared error between the prediction and observation arrays.

**Parameters**
- `prediction` (numpy.ndarray): Array containing the predicted values.
- `observation` (numpy.ndarray): Array containing the observed values.

## *classification_error* method

```python
def classification_error(prediction: np.ndarray, label: np.ndarray) -> None:
"""Compute the classification error"""
```

> Computes the classification error between the prediction and label arrays.

**Parameters**
- `prediction` (numpy.ndarray): Array containing the predicted values.
- `label` (numpy.ndarray): Array containing the true labels.
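
## Usage example

To illustrate how these functions are typically called, here is a minimal usage sketch; the import path `pytagi.metric` is an assumption based on the file name above, and the arrays are toy data:

```python
import numpy as np

import pytagi.metric as metric  # assumed import path for metric.py

# Toy predictions, observations, and predictive standard deviations
prediction = np.array([2.3, 1.9, 3.1])
observation = np.array([2.0, 2.0, 3.0])
std = np.full_like(prediction, 0.3)

print(metric.mse(prediction, observation))                  # mean squared error
print(metric.log_likelihood(prediction, observation, std))  # averaged log-likelihood
```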
95 changes: 95 additions & 0 deletions api/netprop.md
@@ -0,0 +1,95 @@
# The NetProp class

The `NetProp` class is a base class for network properties defined in the backend C++/CUDA layer. It provides various attributes and methods for defining network architecture and properties.

<a href="https://github.com/miquelflorensa/cuTAGI/blob/main/pytagi/tagi_network.py" class="github-link">
<div class="github-icon-container">
<img src="../images/GitHub-Mark.png" alt="GitHub" height="32" width="64">
</div>
<div class="github-text-container">
GitHub source code
</div>
</a>

## Attributes

- `layers`: A list containing different [layers](api/netprop?id=layer-code) of the network architecture.
- `nodes`: A list containing the number of hidden units for each layer.
- `kernels`: A list containing the kernel sizes for convolutional layers.
- `strides`: A list containing the strides for convolutional layers.
- `widths`: A list containing the widths of the images.
- `heights`: A list containing the heights of the images.
- `filters`: A list containing the number of filters (depth of image) for each layer.
- `activations`: A list containing the [activation](api/netprop?id=activation-code) function for each layer.
- `pads`: A list containing the padding applied to the images.
- `pad_types`: A list containing the types of padding.
- `shortcuts`: A list containing the layer indices for residual networks.
- `mu_v2b`: A NumPy array representing the mean of the observation noise squared.
- `sigma_v2b`: A NumPy array representing the standard deviation of the observation noise squared.
- `sigma_v`: A float representing the observation noise.
- `decay_factor_sigma_v`: A float representing the decay factor for `sigma_v` (default value: 0.99).
- `sigma_v_min`: A float representing the minimum value of the observation noise (default value: 0.3).
- `sigma_x`: A float representing the input noise.
- `is_idx_ud`: A boolean indicating whether or not to update only hidden units in the output layers.
- `is_output_ud`: A boolean indicating whether or not to update the output layer.
- `last_backward_layer`: An integer representing the index of the last layer whose hidden states are updated.
- `nye`: An integer representing the number of observations for hierarchical softmax.
- `noise_gain`: A float representing the gain for the bias parameters relating to the noise's hidden states.
- `noise_type`: A string indicating whether the noise is homoscedastic or heteroscedastic.
- `batch_size`: An integer representing the number of batches of data.
- `input_seq_len`: An integer representing the sequence length for LSTM inputs.
- `output_seq_len`: An integer representing the sequence length for the outputs of the last layer.
- `seq_stride`: An integer representing the spacing between sequences for the LSTM layer.
- `multithreading`: A boolean indicating whether or not to run parallel computing using multiple threads.
- `collect_derivative`: A boolean indicating whether to enable the derivative computation mode.
- `is_full_cov`: A boolean indicating whether to enable the full covariance mode.
- `init_method`: A string representing the initialization method, e.g., He and Xavier.
- `device`: A string indicating either "cpu" or "cuda".
- `ra_mt`: A float representing the momentum for the normalization layer.

## Example

```python
from pytagi import NetProp

class RegressionMLP(NetProp):
    """Multi-layer perceptron for regression task"""

    def __init__(self) -> None:
        super().__init__()
        self.layers = [1, 1, 1, 1]
        self.nodes = [13, 50, 50, 1]
        self.activations = [0, 4, 4, 0]
        self.batch_size = 10
        self.sigma_v = 0.3
        self.sigma_v_min: float = 0.3
        self.device = "cpu"
```

## Layer Code
The following layer codes are used to represent different types of layers in the network:

- 1: Fully-connected layer
- 2: Convolutional layer
- 21: Transpose convolutional layer
- 3: Max pooling layer (currently not supported)
- 4: Average pooling layer
- 5: Layer normalization
- 6: Batch normalization
- 7: LSTM layer

## Activation Code
The following activation codes are used to represent different activation functions:

- 0: No activation
- 1: Tanh
- 2: Sigmoid
- 4: ReLU
- 5: Softplus
- 6: LeakyReLU
- 7: Mixture ReLU
- 8: Mixture bounded ReLU
- 9: Mixture sigmoid
- 10: Softmax with local linearization
- 11: Remax
- 12: Hierarchical softmax
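
To tie these tables back to the `RegressionMLP` example above, here is a minimal sketch (assuming pyTAGI is installed and the class is defined as shown earlier):

```python
net = RegressionMLP()

# layers [1, 1, 1, 1]: every layer is fully-connected (layer code 1)
# activations [0, 4, 4, 0]: no activation on input/output, ReLU (code 4) on hidden layers
# nodes [13, 50, 50, 1]: 13 inputs, two 50-unit hidden layers, 1 output
print(net.layers, net.activations, net.nodes)
```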
