
Implementation of variational autoencoders (VAEs) for generating a large dataset of molecular structures along a reaction coordinate, given a small training dataset.


kuntalg97/Variational-Autoencoder


Variational-Autoencoder

This code is a small illustration of generative AI, specifically variational autoencoders (VAEs) (https://www.ibm.com/think/topics/variational-autoencoder). A VAE is a deep learning model that encodes input data into a continuous, probabilistic latent space. Once its loss function has been optimized, the model can generate new data from that latent space: samples that are minor variations of the original dataset and closely resemble it.
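To make the "loss function" above concrete, here is a minimal sketch of the standard VAE objective: a reconstruction term plus a KL-divergence term that pulls the encoded distribution toward a standard normal. It is written in plain Python for readability and is not taken from this repository; the function names, the squared-error reconstruction term, and the `beta` weight are all assumptions made for illustration.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def reconstruction_error(x, x_hat):
    """Sum of squared differences between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction term plus a beta-weighted KL term (beta is an assumed knob)."""
    return reconstruction_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)


# Example: one latent sample for a 2-dimensional latent space.
z = reparameterize([0.0, 1.0], [0.0, 0.0], random.Random(0))
```

The reparameterization step is what keeps sampling differentiable with respect to `mu` and `log_var` in a real framework; here it only illustrates the arithmetic.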

In quantum chemical calculations, extensive sampling of the potential energy surface (PES) of a molecular system is critically important. However, such QM calculations are typically expensive. In this illustration, given a small dataset of structures for a reaction, a VAE is used to generate a much larger dataset that closely resembles the original structures. The original dataset consists of ~70 configurations of an SN2 reaction along a reaction coordinate. It was constructed using a combination of shell and Tcl scripting, followed by QM minimization in CP2K. Constructing ~10000 structures along the same reaction coordinate with these methods would be non-trivial and expensive. The VAE code instead encodes the original set of ~70 structures and, trained on the coordinates alone, generates ~10K structures at a fraction of the computational cost. Further refinement of the code and the incorporation of on-the-fly QM calculations are ongoing.
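The workflow described above, training on coordinates and then generating many structures, can be sketched roughly as follows. This is an illustration under stated assumptions, not the repository's implementation: each structure is assumed to be a list of `(x, y, z)` tuples, generation is assumed to sample latent vectors from a standard normal, and `toy_decode` is a hypothetical stand-in for a trained decoder network.

```python
import random


def flatten(coords):
    """Turn [(x, y, z), ...] for one structure into a flat feature vector."""
    return [c for atom in coords for c in atom]


def unflatten(vector):
    """Inverse of flatten: regroup a flat vector into (x, y, z) triples."""
    if len(vector) % 3 != 0:
        raise ValueError("vector length must be a multiple of 3")
    return [tuple(vector[i:i + 3]) for i in range(0, len(vector), 3)]


def generate_structures(decode, latent_dim, n_samples, seed=0):
    """Sample latent vectors from N(0, I) and decode each one into a structure."""
    rng = random.Random(seed)
    structures = []
    for _ in range(n_samples):
        z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
        structures.append(unflatten(decode(z)))
    return structures


def toy_decode(z):
    """Hypothetical decoder: stretches a two-atom reference geometry along x."""
    return [0.0, 0.0, 0.0, 1.5 + 0.1 * z[0], 0.0, 0.0]


# Generate 10000 structures from the stand-in decoder.
samples = generate_structures(toy_decode, latent_dim=1, n_samples=10000)
```

In a real run, `decode` would be the trained decoder, and the generated geometries would still need to be checked (for example by QM single-point calculations) before use.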
