PyDPM is a Python library for constructing Deep Probabilistic Models (DPMs). It provides efficient distribution sampling functions on GPU and includes implementations of popular existing DPMs.
Documentation | Paper [Arxiv] | Tutorials | Benchmarks | Examples |
🔥A new version that does not depend on PyCUDA has been released.
🔥An abundance of professional learning materials on Deep Generative Models from Ermon's group at Stanford University (CS236, Fall 2021).
🔥A tutorial on DPMs has been uploaded by Prof. Wilker Aziz (University of Amsterdam).
The current version of PyDPM can be installed from PyPI on either Windows or Linux.
$ pip install pydpm
On Windows, we recommend installing Visual Studio 2019 as the compiler, together with the CUDA 11.5 toolkit; on Linux, we recommend installing the latest version of the CUDA toolkit.
The environment used for testing has been released so that our results can be reproduced easily.
$ conda env create -f enviroment.yaml
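Before running the examples, it can help to confirm that the package and a CUDA toolkit are visible from Python. The following is a minimal sketch using only the standard library; the helper name `check_environment` is hypothetical and not part of PyDPM, and looking for `nvcc` on the PATH is just one heuristic for detecting a CUDA toolkit.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether pydpm is importable and whether nvcc is on the PATH.

    Hypothetical helper for illustration; not part of the PyDPM API.
    """
    return {
        # find_spec returns None when the top-level package cannot be found
        "pydpm_installed": importlib.util.find_spec("pydpm") is not None,
        # nvcc ships with the CUDA toolkit; its absence from PATH is only a hint
        "nvcc_on_path": shutil.which("nvcc") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If `pydpm_installed` reports `no`, the `pip install pydpm` step probably ran in a different Python environment than the one executing this script.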
The PyDPM framework can be roughly split into four modules: Sampler, Model, Evaluation, and Example.
- The Sampler module includes both the basic Distribution Sampler and the more sophisticated Model Sampler, which together cover the sampling requirements of DPMs built on either CPU or GPU;
- The Model module contains a wide variety of classical and popular DPMs, which can be called directly as APIs in Python;
- The Evaluation module provides a DataLoader sub-module to process data samples in various forms, such as images, text, and graphs, and a Metric sub-module to comprehensively evaluate DPMs after training;
- The Example module provides, for each DPM included in the Model module, a corresponding code demo with a detailed explanation in the official docs.
The workflow of applying PyDPM to downstream tasks can be split into four steps:
- Device deployment: PyDPM can be deployed on a platform with either CPU or GPU;
- Mechanisms of model training and testing: either or both of Gibbs sampling and back propagation, implemented by pydpm.sampler and PyTorch respectively;
- Model categories: PyDPM mainly includes Bayesian Probabilistic Models, Deep-Learning Probabilistic Models, and Hybrid Probabilistic Models;
- Applications: DPMs have been applied to Natural Language Processing (NLP), Graph Neural Networks (GNN), Recommendation Systems (RS), and more.
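To make the Gibbs sampling mechanism mentioned above concrete, here is a toy sketch in plain Python. It is not PyDPM's sampler and does not use its API; it only illustrates the idea of drawing each variable in turn from its conditional distribution, using a standard bivariate normal with correlation `rho` as an assumed target. The function name `gibbs_bivariate_normal` is made up for this illustration.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=100, seed=0):
    """Toy Gibbs sampler for a standard bivariate normal with correlation rho.

    For this target, x | y ~ N(rho * y, 1 - rho^2) and symmetrically for y | x.
    Illustration only; unrelated to PyDPM's own implementation.
    """
    rng = random.Random(seed)
    cond_std = math.sqrt(1.0 - rho ** 2)
    x, y = 0.0, 0.0
    samples = []
    for step in range(burn_in + n_samples):
        # update each coordinate from its full conditional given the other
        x = rng.gauss(rho * y, cond_std)
        y = rng.gauss(rho * x, cond_std)
        if step >= burn_in:
            samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_samples=5000)
mean_x = sum(s[0] for s in samples) / len(samples)
print(f"kept {len(samples)} samples, mean of x = {mean_x:.3f}")
```

The models in PyDPM have far more variables than this two-dimensional toy, but the loop structure (visit each block of variables, sample it conditioned on the rest, repeat) is the same general pattern.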
The Model module in PyDPM includes a wide variety of popular DPMs, which can be roughly split into three categories: Bayesian Probabilistic Models, Deep-Learning Probabilistic Models, and Hybrid Probabilistic Models.
Bayesian Probabilistic Models:

Probabilistic Model Name | Abbreviation | Paper Link |
---|---|---|
Latent Dirichlet Allocation | LDA | Blei et al., 2003 |
Poisson Factor Analysis | PFA | Zhou et al., 2012 |
Poisson Gamma Belief Network | PGBN | Zhou et al., 2015 |
Convolutional Poisson Factor Analysis | CPFA | Wang et al., 2019 |
Convolutional Poisson Gamma Belief Network | CPGBN | Wang et al., 2019 |
Factor Analysis | FA | |
Gaussian Mixture Model | GMM | |
Poisson Gamma Dynamical Systems | PGDS | Zhou et al., 2016 |
Deep Poisson Gamma Dynamical Systems | DPGDS | Guo et al., 2018 |
Dirichlet Belief Networks | DirBN | Zhao et al., 2018 |
Deep Poisson Factor Analysis | DPFA | Gan et al., 2015 |
Word Embeddings Deep Topic Model | WEDTM | Zhao et al., 2018 |
Multimodal Poisson Gamma Belief Network | MPGBN | Wang et al., 2018 |
Graph Poisson Gamma Belief Network | GPGBN | Wang et al., 2020 |
Deep-Learning Probabilistic Models:

Probabilistic Model Name | Abbreviation | Paper Link |
---|---|---|
Restricted Boltzmann Machines | RBM | Hinton et al., 2010 |
Variational Autoencoder | VAE | Kingma et al., 2014 |
Generative Adversarial Network | GAN | Goodfellow et al., 2014 |
Density estimation using Real NVP | RealNVP (2d) | Dinh et al., 2017 |
Denoising Diffusion Probabilistic Models | DDPM | Ho et al., 2020 |
Density estimation using Real NVP | RealNVP (image) | Dinh et al., 2017 |
Conditional Variational Autoencoder | CVAE | Sohn et al., 2015 |
Deep Convolutional Generative Adversarial Networks | DCGAN | Radford et al., 2016 |
Wasserstein Generative Adversarial Networks | WGAN | Arjovsky et al., 2017 |
Information Maximizing Generative Adversarial Nets | InfoGAN | Chen et al., 2016 |
Hybrid Probabilistic Models:

Probabilistic Model Name | Abbreviation | Paper Link |
---|---|---|
Weibull Hybrid Autoencoding Inference | WHAI | Zhang et al., 2018 |
Weibull Graph Attention Autoencoder | WGAAE | Wang et al., 2020 |
Recurrent Gamma Belief Network | rGBN | Guo et al., 2020 |
Multimodal Weibull Variational Autoencoder | MWVAE | Wang et al., 2020 |
Sawtooth Embedding Topic Model | SawETM | Duan et al., 2021 |
TopicNet | TopicNet | Duan et al., 2021 |
Deep Coupling Embedding Topic Model | dc-ETM | Li et al., 2022 |
Topic Taxonomy Mining with Hyperbolic Embedding | HyperMiner | Xu et al., 2022 |
Knowledge Graph Embedding Topic Model | KG-ETM | Wang et al., 2022 |
Variational Edge Partition Model | VEPM | He et al., 2022 |
Generative Text Convolutional Neural Network | GTCNN | Wang et al., 2022 |
🔥Suggestions and contributions of classical or novel Deep Probabilistic Models are welcome.
Probabilistic Model Name | Abbreviation | Paper Link |
---|---|---|
Nouveau Variational Autoencoder | NVAE | Vahdat et al., 2020 |
flow-based Variational Autoencoder | f-VAE | Su et al., 2018 |
Score-Based Generative Models | SGM | Bortoli et al., 2022 |
Poisson Flow Generative Models | PFGM | Xu et al., 2022 |
Stable Diffusion | LDM | Rombach et al., 2022 |
Denoising Diffusion Implicit Models | DDIM | Song et al., 2022 |
Vector Quantized Diffusion | VQ-Diffusion | Tang et al., 2023 |
Vector Quantized Variational Autoencoder | VQ-VAE | van den Oord et al., 2017 |
Conditional Generative Adversarial Nets | cGAN | Mirza et al., 2014 |
Information Maximizing Variational Autoencoders | InfoVAE | Zhao et al., 2017 |
Generative Flow | Glow | Kingma et al., 2018 |
Structured Denoising Diffusion Models in Discrete State-Spaces | D3PM | Austin et al., 2021 |
Example: a few lines of code to quickly construct and evaluate a 3-layer Bayesian model, PGBN, on GPU.
from pydpm.model import PGBN
from pydpm.metric import ACC
# train_data, test_data, train_label, and test_label are assumed to be loaded beforehand
# create the model and deploy it on gpu or cpu
model = PGBN([128, 64, 32], device='gpu')
model.initial(train_data)
# train the model, then run it in test mode on both the training and test sets
train_local_params = model.train(train_data, iter_all=100)
train_local_params = model.test(train_data, iter_all=100)
test_local_params = model.test(test_data, iter_all=100)
# evaluate the model with classification accuracy, using Theta[0] as features
# the demo reaches an accuracy of 0.8549
results = ACC(train_local_params.Theta[0], test_local_params.Theta[0], train_label, test_label, 'SVM')
# save the model after training
model.save()
Example: a few lines of code to quickly deploy the distribution sampler of PyDPM on GPU.
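import numpy as np
from pydpm.sampler import Basic_Sampler
# create a sampler on gpu
sampler = Basic_Sampler('gpu')
# draw Gamma samples for a vector and for a matrix of shape parameters
a = sampler.gamma(np.ones(100)*5, 1, times=10)
b = sampler.gamma(np.ones([100, 100])*5, 1, times=10)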
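One way to sanity-check sampler output is to compare its sample mean and variance against the theoretical moments of the distribution. The sketch below does this on CPU with the standard library's `random.gammavariate`, assuming the second argument in the calls above is a scale parameter, so that the target is a Gamma distribution with shape 5 and scale 1 (mean and variance both equal to 5). The helper name `gamma_moments` is hypothetical.

```python
import random


def gamma_moments(shape, scale, n_samples=10000, seed=0):
    """Draw Gamma(shape, scale) samples on CPU and return (sample mean, sample variance).

    Hypothetical reference helper; it does not call PyDPM.
    """
    rng = random.Random(seed)
    draws = [rng.gammavariate(shape, scale) for _ in range(n_samples)]
    mean = sum(draws) / n_samples
    var = sum((d - mean) ** 2 for d in draws) / (n_samples - 1)
    return mean, var


shape, scale = 5.0, 1.0
mean, var = gamma_moments(shape, scale)
# theoretical moments of a Gamma distribution: mean = shape * scale, variance = shape * scale^2
print(f"sample mean     {mean:.3f} vs theory {shape * scale:.1f}")
print(f"sample variance {var:.3f} vs theory {shape * scale ** 2:.1f}")
```

The same comparison can be applied to arrays returned by a GPU sampler: if the sample moments drift far from `shape * scale` and `shape * scale ** 2`, the parameterization (scale versus rate) is the first thing to check.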
[Figure: distribution sampling efficiency of PyDPM compared with NumPy]
[Figure: distribution sampling efficiency of PyDPM compared with TensorFlow and PyTorch]
[Figure: distribution sampling efficiency of PyDPM compared with CuPy and PyCUDA (used by PyDPM v1.0)]
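A comparison of this kind can be approximated with a small timing harness. This is a sketch using only the standard library; `time_sampler` and `cpu_gamma_draws` are hypothetical helpers, and the stand-in workload uses `random.gammavariate` rather than any PyDPM call. To time a GPU sampler, one would pass a callable wrapping it instead, making sure the results are back on the host before the clock stops.

```python
import random
import time


def time_sampler(draw, repeats=5):
    """Return the best wall-clock time in seconds over several runs of draw()."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        draw()
        best = min(best, time.perf_counter() - start)
    return best


def cpu_gamma_draws(n=10000, shape=5.0, scale=1.0):
    """Stand-in workload: n Gamma draws on CPU with the standard library."""
    return [random.gammavariate(shape, scale) for _ in range(n)]


elapsed = time_sampler(cpu_gamma_draws)
print(f"best of 5 runs: {elapsed * 1e3:.2f} ms for 10000 Gamma draws")
```

Taking the minimum over several runs reduces the influence of one-off costs such as warm-up or background load, which matters when the compared libraries differ in start-up overhead.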
License: Apache License Version 2.0
Contact: Chaojie Wang, Wei Zhao, Xinyang Liu, Bufeng Ge, Jiawen Wu
Copyright (c), 2020, Chaojie Wang, Wei Zhao, Xinyang Liu, Jiawen Wu, Jie Ren, Yewen Li, Hao Zhang, Bo Chen and Mingyuan Zhou