# Summary

Quantum Monte-Carlo (QMC) simulations make it possible to compute the electronic structure of quantum systems with high accuracy and can be parallelized over large compute resources. QMC relies on the variational principle: a wave function ansatz is optimized to minimize the total energy of the quantum system. `QMCTorch` expresses this optimization process as a deep-learning problem in which the wave function ansatz is encoded in a physically motivated neural network. Using `PyTorch` as a backend allows the gradients of the total energy w.r.t. the variational parameters to be computed through automatic differentiation, and GPU computing to be used to accelerate the calculation. `QMCTorch` is interfaced with popular quantum chemistry packages, such as `pyscf` and `ADF`, to facilitate its use.


# Statement of need

`QMCTorch` is a Python package that uses `PyTorch` [@pytorch] as a backend to perform Quantum Monte-Carlo (QMC) simulations of molecular systems. Several software packages, such as `QMCPack` [@qmcpack], `QMC=Chem` [@qmcchem] and `CHAMP` [@champ], provide high-quality implementations of advanced QMC methodologies in low-level languages (C++/Fortran). Python implementations of QMC, such as `PAUXY` [@pauxy] and `PyQMC` [@pyqmc], have also been proposed to facilitate the use and development of QMC techniques. Large efforts have been made to leverage recent developments in deep learning for QMC simulations, and neural-network-based wave-function ansatzes have been proposed [@paulinet; @ferminet]. These advances lead to very accurate results but produce wave functions that are difficult to interpret. `QMCTorch` performs QMC simulations using physically motivated neural-network architectures that closely follow the wave function ansatz used by QMC practitioners. As such, it leverages automatic differentiation for the calculation of the gradients of the total energy w.r.t. the variational parameters, and the GPU capabilities offered by `PyTorch`, without losing the physical intuition behind the wave function ansatz. Parallelization over multiple computing nodes, each potentially using GPUs, is available via the `Horovod` library [@horovod].


# Wave Function Ansatz
The neural network used to encode the wave-function ansatz in `QMCTorch` is built from the physically motivated layers described below.

**Jastrow Factor** The `Jastrow` layer computes the sum of three components: an electron-electron term $K_{ee}$, an electron-nuclei term $K_{en}$, and a three-body electron-electron-nuclei term $K_{een}$. The sum is then exponentiated to give the Jastrow factor: $J(r_{ee}, r_{en}) = \exp\left( K_{ee}(r_{ee})+K_{en}(r_{en}) + K_{een}(r_{ee},r_{en})\right)$, where $r_{ee}$ and $r_{en}$ are the electron-electron and electron-nuclei distances. Several well-known functional forms of the Jastrow factor, for example the electron-electron Pade-Jastrow kernel $K(r_{ee}) = \frac{\omega_0 r_{ee}}{1 + \omega r_{ee}}$, where $\omega$ is a variational parameter, are already implemented and available for use. Users can also define their own functional forms for the different kernel functions $K$ and explore their effect on the resulting optimization.
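
As an illustration, the electron-electron Pade-Jastrow factor above can be written in a few lines of `PyTorch`. This is a self-contained sketch of the functional form, not QMCTorch's own `Jastrow` layer; the tensor shapes and default parameter values are assumptions.

```python
import torch

def pade_jastrow_kernel(r_ee, w0=0.5, w=1.0):
    """Electron-electron Pade-Jastrow kernel K(r_ee) = w0 * r_ee / (1 + w * r_ee)."""
    return w0 * r_ee / (1.0 + w * r_ee)

def jastrow_factor(r_ee, w0=0.5, w=1.0):
    """J = exp(sum of the kernel over all electron-electron pairs).

    r_ee: (nbatch, npairs) tensor of electron-electron distances.
    """
    return torch.exp(pade_jastrow_kernel(r_ee, w0, w).sum(dim=-1))

r_ee = torch.rand(10, 3)            # 10 walkers, 3 electron pairs
print(jastrow_factor(r_ee).shape)   # torch.Size([10])
```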

**Backflow Transformation** The backflow transformation layer, `BF`, creates quasi-particles by mixing the positions of the electrons: $q_i = r_i + \sum_{j\neq i} K_{BF}(r_{ij})(r_i-r_j)$ [@backflow]. Well-known transformations, such as the inverse kernel $K_{BF}(r_{ij}) = \frac{\omega}{r_{ij}}$, where $\omega$ is a variational parameter, are already implemented and ready to use. Users can also easily specify their own kernel for the backflow transformation, $K_{BF}$, to explore its impact on the wave function optimization.
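
A minimal sketch of this transformation with the inverse kernel is shown below; it illustrates the equation rather than QMCTorch's `BF` layer, and the tensor shapes are assumptions.

```python
import torch

def backflow_coordinates(r, omega=0.1):
    """Quasi-particles q_i = r_i + sum_{j != i} K_BF(r_ij) * (r_i - r_j),
    with the inverse kernel K_BF(r_ij) = omega / r_ij.

    r: (nbatch, nelec, 3) tensor of electron positions.
    """
    nelec = r.shape[1]
    diff = r.unsqueeze(2) - r.unsqueeze(1)                   # (nbatch, nelec, nelec, 3) = r_i - r_j
    dist = diff.norm(dim=-1, keepdim=True).clamp(min=1e-12)  # pairwise distances r_ij
    kernel = omega / dist
    mask = 1.0 - torch.eye(nelec).view(1, nelec, nelec, 1)   # remove the i == j self-term
    return r + (mask * kernel * diff).sum(dim=2)

r = torch.rand(5, 4, 3)   # 5 walkers, 4 electrons
q = backflow_coordinates(r)
```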

**Atomic Orbitals** The Atomic Orbital layer, `AO`, computes the values of the different atomic orbitals of the system at all the quasi-particle positions $q_e$. Both Slater-type orbitals (STOs) and Gaussian-type orbitals (GTOs) are supported. The initial parameters of the AOs are extracted from the popular quantum chemistry codes `pyscf` [@pyscf] and `ADF` [@adf]. During the optimization, the parameters of the AOs (exponents, coefficients) are variational parameters that can be adjusted to minimize the total energy. Since GTOs can introduce a significant amount of noise in the QMC simulations, `QMCTorch` offers the possibility to fit the GTOs to single-exponent STOs.
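
The idea of that fit can be illustrated with a simple least-squares match of a single-exponent STO to a contracted GTO on a radial grid. This is not QMCTorch's fitting routine; the contraction values below are the standard STO-3G parameters for the hydrogen 1s shell, primitive normalization is omitted for brevity, and the grid and initial guess are arbitrary.

```python
import numpy as np
from scipy.optimize import curve_fit

def gto(r, alpha):
    """Radial part of an s-type Gaussian primitive (unnormalized)."""
    return np.exp(-alpha * r**2)

def sto(r, zeta, c):
    """Single-exponent Slater-type orbital c * exp(-zeta * r)."""
    return c * np.exp(-zeta * r)

# Contracted GTO to be fitted (STO-3G-like contraction for the H 1s orbital).
r = np.linspace(1e-3, 10.0, 500)
alphas = [3.42525091, 0.62391373, 0.16885540]
coeffs = [0.15432897, 0.53532814, 0.44463454]
target = sum(c * gto(r, a) for c, a in zip(coeffs, alphas))

# Least-squares fit of the STO exponent and prefactor.
(zeta, c), _ = curve_fit(sto, r, target, p0=[1.0, 1.0])
print(f"fitted STO exponent: {zeta:.3f}")
```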

**Molecular Orbitals** The Molecular Orbital layer, `MO`, computes the values of all the MOs at the positions of the quasi-particles. The MO layer is a simple linear transformation defined by $\textnormal{MO} = \textnormal{AO} \times W^T_{SCF}$, where $W_{SCF}$ is the matrix of the MO coefficients expressed on the AO basis. The initial values of these coefficients are obtained from a Hartree-Fock (HF) or Density Functional Theory (DFT) calculation of the system via `pyscf` or `ADF`. These coefficients are then variational parameters that can be optimized to minimize the total energy of the system.
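
In `PyTorch` terms this corresponds to a bias-free linear layer whose weights are initialized with the SCF coefficients and then left trainable. The sketch below is a generic illustration with placeholder shapes and a random coefficient matrix, not the actual `MO` layer.

```python
import torch

nao, nmo = 7, 7                       # placeholder basis-set and MO dimensions
w_scf = torch.rand(nmo, nao)          # placeholder SCF MO-coefficient matrix

# MO = AO @ W_scf^T, implemented as a bias-free linear layer.
mo_layer = torch.nn.Linear(nao, nmo, bias=False)
with torch.no_grad():
    mo_layer.weight.copy_(w_scf)      # nn.Linear computes x @ weight.T

ao_values = torch.rand(10, 4, nao)    # (nwalkers, nelec, nao), placeholder AO values
mo_values = mo_layer(ao_values)       # (nwalkers, nelec, nmo); weights stay trainable
```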

**Slater Determinants** The Slater determinant layer, `SD`, extracts the spin-up/spin-down matrices of the different electronic configurations specified by the user. Users can freely define the number of electrons as well as the number and types of excitations they want to include in the definition of their wave function ansatz. The `SD` layer extracts the corresponding matrices, multiplies their determinants and sums all the terms. The `CI` coefficients of the sum can be freely initialized and optimized to minimize the total energy.
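
The final reduction of that layer amounts to a CI-weighted sum of products of spin-up and spin-down determinants, as in the sketch below (random matrices stand in for the extracted MO sub-matrices; the shapes and number of configurations are assumptions).

```python
import torch

nwalkers, nup, ndown, nconf = 10, 2, 2, 3

# Placeholder spin-up / spin-down MO sub-matrices for each configuration,
# shape (nwalkers, nconf, nelec_spin, nelec_spin).
mo_up = torch.rand(nwalkers, nconf, nup, nup)
mo_down = torch.rand(nwalkers, nconf, ndown, ndown)

ci_coeffs = torch.nn.Parameter(torch.tensor([1.0, 0.1, 0.1]))  # trainable CI coefficients

# Psi_SD = sum_c  c_CI[c] * det(D_up_c) * det(D_down_c)
dets = torch.linalg.det(mo_up) * torch.linalg.det(mo_down)     # (nwalkers, nconf)
psi_sd = (ci_coeffs * dets).sum(dim=-1)                        # (nwalkers,)
```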

The Jastrow factor and the sum of Slater determinants are then multiplied to yield the final value of the wave function at the electronic and atomic positions, $\Psi(R)$, with $R = \{r_e, R_{at}\}$.

QMC simulations use samples of the electronic density to approximate the total energy of the system.

Each sample, $R_i$, contains the positions of all the electrons of the system. The local energy is computed at each sampling point and these values are averaged to estimate the total energy of the system: $E \approx \frac{1}{N}\sum_i \frac{H\Psi(R_i)}{\Psi(R_i)}$, where $H$ is the Hamiltonian of the molecular system: $H = -\frac{1}{2}\sum_i \Delta_i + V_{ee} + V_{en}$, with $\Delta_i$ the Laplacian w.r.t. the $i$-th electron, $V_{ee}$ the Coulomb potential between the electrons and $V_{en}$ the electron-nuclei potential. The Laplacian of a determinant can be computed using automatic differentiation, but analytical expressions are often preferred as they are computationally more robust and less expensive [@jacobi_trace]. The gradients of the total energy w.r.t. the variational parameters of the wave function, i.e. $\frac{\partial E}{\partial \theta_i}$, are obtained directly via automatic differentiation. Thanks to this automatic differentiation, users can define new kernels for the backflow transformation and the Jastrow factor without having to derive analytical expressions of the energy gradients.
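
The toy example below shows both ingredients on a hydrogen atom with the trial wave function $\Psi = e^{-\alpha r}$: a Laplacian obtained by nested automatic differentiation and the derivative of the energy estimate w.r.t. the variational parameter $\alpha$. It is a self-contained `PyTorch` illustration only; the points are not drawn from $|\Psi|^2$ and the plain mean is not the proper VMC gradient estimator.

```python
import torch

def laplacian(psi_fn, r):
    """Sum of second derivatives of psi w.r.t. r, via nested autograd.
    r: (nbatch, 3) tensor with requires_grad=True."""
    psi = psi_fn(r)
    grad = torch.autograd.grad(psi.sum(), r, create_graph=True)[0]
    lap = torch.zeros(r.shape[0])
    for d in range(r.shape[1]):
        lap = lap + torch.autograd.grad(grad[:, d].sum(), r, create_graph=True)[0][:, d]
    return lap

alpha = torch.tensor(0.9, requires_grad=True)            # variational parameter
psi_fn = lambda r: torch.exp(-alpha * r.norm(dim=-1))    # hydrogen-like trial function

r = torch.randn(1000, 3, requires_grad=True)             # crude stand-in for MC samples
e_loc = -0.5 * laplacian(psi_fn, r) / psi_fn(r) - 1.0 / r.norm(dim=-1)
energy = e_loc.mean()                                    # at alpha = 1, e_loc = -0.5 exactly
grad_alpha, = torch.autograd.grad(energy, alpha)         # dE/dalpha by autograd
```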

Any optimizer included in `PyTorch` (or compatible with it) can then be used to optimize the wave function. This gives users access to a wide range of optimization techniques that they can freely explore for their own use cases. Users can also decide to freeze certain variational parameters or to define different learning rates for different layers. Note that the atomic positions are also variational parameters, so geometry optimization can be performed with `QMCTorch`. At the end of the optimization, all the information relative to the simulation is dumped in a dedicated HDF5 file to enhance reproducibility.
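
Freezing parameters and assigning per-layer learning rates both rely on standard `PyTorch` mechanisms, as sketched below; the parameter names are placeholders, not QMCTorch's actual attribute names.

```python
import torch

# Stand-in wave function with three illustrative groups of variational parameters.
wf = torch.nn.Module()
wf.jastrow_weight = torch.nn.Parameter(torch.tensor(1.0))
wf.ao_exponents = torch.nn.Parameter(torch.rand(5))
wf.ci_coeffs = torch.nn.Parameter(torch.tensor([1.0, 0.1]))

wf.ao_exponents.requires_grad = False   # freeze the atomic-orbital parameters

# Different learning rates per parameter group, handled by any PyTorch optimizer.
opt = torch.optim.Adam([
    {"params": [wf.jastrow_weight], "lr": 1e-2},
    {"params": [wf.ci_coeffs], "lr": 1e-3},
])
```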

# Example

![Left panel: snippet of code showing the use of `QMCTorch` to compute the electronic structure of H2. Right panel: optimization of the wave function of LiH and NH3 using atomic orbitals provided by `pyscf`, `ADF` and an STO fit of the `pyscf` atomic orbitals. \label{fig:results}](qmctorch_results.png)

The left panel of Fig. \ref{fig:results} shows a typical `QMCTorch` script. A `Molecule` object is first created by specifying the atomic positions and the calculator used to run the HF or DFT calculation (here `pyscf` with an `sto-3g` basis set). This molecule is then used to create a `SlaterJastrow` wave function ansatz. Other options, such as the desired Jastrow kernel, the active space and the use of GPUs, can also be specified here. A sampler and an optimizer are then defined and used, together with the wave function, to instantiate the solver. The solver is then used to optimize the variational parameters, here through 50 epochs.
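
For readers who cannot resolve the code in the figure, a sketch of that workflow is reproduced below. The module paths, class names and keyword arguments are written from memory of the QMCTorch documentation and may differ between releases (for instance, the solver class has been named `SolverSlaterJastrow` in some versions); treat them as assumptions and check the package documentation.

```python
import torch.optim as optim
from qmctorch.scf import Molecule
from qmctorch.wavefunction import SlaterJastrow
from qmctorch.sampler import Metropolis
from qmctorch.solver import Solver

# SCF calculation of H2 with pyscf in a minimal sto-3g basis.
mol = Molecule(atom='H 0. 0. 0.; H 0. 0. 0.74',
               calculator='pyscf', basis='sto-3g', unit='angs')

# Slater-Jastrow wave function ansatz built from the SCF solution.
wf = SlaterJastrow(mol)

# Metropolis-Hastings sampler of the electronic density and Adam optimizer.
sampler = Metropolis(nwalkers=500, nstep=200, step_size=0.25,
                     nelec=wf.nelec, init=mol.domain('atomic'))
opt = optim.Adam(wf.parameters(), lr=1e-3)

# Solver combining wave function, sampler and optimizer.
solver = Solver(wf=wf, sampler=sampler, optimizer=opt)
obs = solver.run(50)   # optimize the variational parameters over 50 epochs
```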

The right panel of Fig. \ref{fig:results} shows typical optimization runs for two molecular structures, LiH and NH3, using atomic orbitals provided by `pyscf`, `ADF` and an STO fit of the `pyscf` atomic orbitals. As seen in this figure, the variance of the local energies obtained with the GTOs provided by `pyscf` is a limiting factor for the optimization. A simple STO fit of these atomic orbitals leads to variances comparable to those obtained with the STOs of `ADF`.
