Merge pull request #137 from acfr/update-docs
Minor updates to docs
jclinton830 authored Nov 14, 2023
2 parents 669f373 + 26f0a5e commit 2f1301b
Showing 2 changed files with 3 additions and 3 deletions.
4 changes: 2 additions & 2 deletions docs/src/examples/rl.md
@@ -2,7 +2,7 @@

*Full example code can be found [here](https://github.com/acfr/RobustNeuralNetworks.jl/blob/main/examples/src/lbdn_rl.jl).*

- One of the original motivations for developing `RobustNeuralNetworks.jl` was to guarantee stability and robustness in learning-based control. Some of our recent research (eg: [Wang et al. (2022)](https://ieeexplore.ieee.org/abstract/document/9802667) and [Barbara, Wang & Manchester (2023)](https://doi.org/10.48550/arXiv.2304.06193)) has shown that, with the right controller architecture, we can learn over the space of all stabilising controllers for linear/nonlinear systems using standard reinforcement learning techniques, so long as our control policy is parameterised by a REN (see also [(Convex) Nonlinear Control with REN](@ref)).
+ One of the original motivations for developing `RobustNeuralNetworks.jl` was to guarantee stability and robustness in learning-based control. Some of our recent research (eg: [Wang et al. (2022)](https://ieeexplore.ieee.org/abstract/document/9802667) and [Barbara, Wang & Manchester (2023)](https://doi.org/10.48550/arXiv.2304.06193)) has shown that, with the right controller architecture, we can learn over a space of stabilising controllers for linear/nonlinear systems using standard reinforcement learning techniques, so long as our control policy is parameterised by a REN (see also [(Convex) Nonlinear Control with REN](@ref)).

In this example, we'll demonstrate how to train an LBDN controller with *Reinforcement Learning* (RL) for a simple nonlinear dynamical system. This controller will not have any stability guarantees. The purpose of this example is simply to showcase the steps required to set up RL experiments for more complex systems with RENs and LBDNs.

@@ -75,7 +75,7 @@ f(x::Matrix,u::Matrix) = [x[2:2,:]; (u[1:1,:] - k*x[1:1,:] - _visc(x[2:2,:]))/m]
fd(x::Matrix,u::Matrix) = x + dt*f(x,u)
```
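
The two definitions in the code block above rely on constants (`m`, `k`, `dt`) and a friction term `_visc` that sit outside the visible hunk. A self-contained sketch is below; the parameter values and the form of `_visc` are assumptions for illustration, not the example's actual values.

```julia
# Placeholder constants (assumed for illustration, not the example's values)
m, k, dt = 1.0, 5.0, 0.02

# Assumed nonlinear viscous-friction term acting on the velocity row
_visc(v::Matrix) = 0.5 .* v .* abs.(v)

# Dynamics and forward-Euler step as they appear in the diff context above
f(x::Matrix, u::Matrix)  = [x[2:2,:]; (u[1:1,:] - k*x[1:1,:] - _visc(x[2:2,:]))/m]
fd(x::Matrix, u::Matrix) = x + dt*f(x, u)

# Quick check on a batch of 4 states and controls
x, u = zeros(2, 4), ones(1, 4)
fd(x, u)    # returns a 2×4 matrix of next states
```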

- Reinforcement learning problems generally involve simulating the system over some time horizon and collecting a series of rewards or costs at each time step. Control policies are then trained using approximations of the cost gradient ``\nabla J_\theta`` because it is often difficult (or impossible) to compute the exact gradient. [ReinforcementLearning.jl](https://juliareinforcementlearning.org/) is the home of all things RL in Julia.
+ Reinforcement learning problems generally involve simulating the system over some time horizon and collecting a series of rewards or costs at each time step. Control policies are then trained using approximations of the cost gradient ``\nabla J_\theta`` because it is often difficult (or impossible) to compute the exact gradient. See [ReinforcementLearning.jl](https://juliareinforcementlearning.org/) for more RL in Julia.

For this simple example, we can just write a differentiable simulator of the dynamics. The simulator takes a batch of initial states, goal positions, and a controller `model` whose inputs are ``[x; q_\mathrm{ref}]``. It computes a batch of trajectories of states and controls ``z = \{[x_0;u_0], \ldots, [x_{T-1};u_{T-1}]\}`` for later use. To get around the well-known issue of [array mutation with auto-differentiation](https://fluxml.ai/Zygote.jl/stable/limitations/), we use a [Zygote.Buffer](https://fluxml.ai/Zygote.jl/stable/utils/#Zygote.Buffer) to iteratively store the outputs.
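
To make the `Zygote.Buffer` pattern concrete, here is a minimal rollout sketch rather than the example's actual simulator: it reuses `fd` from the snippet above, while the policy `model`, the horizon `T`, and the state/goal shapes are assumptions for illustration.

```julia
using Zygote

# Differentiable rollout sketch: `model` maps [x; qref] to a 1×batch control.
function rollout(model, x0::Matrix, qref::Matrix; T::Int = 100)
    z = Zygote.Buffer([x0], T)      # mutable store Zygote can differentiate through
    x = x0
    for t in 1:T
        u    = model([x; qref])     # policy input is [x; q_ref]
        z[t] = [x; u]               # store [x_t; u_t]
        x    = fd(x, u)             # step the assumed discrete dynamics
    end
    return copy(z)                  # Vector of [x; u] matrices, one per time step
end
```

Because the `Buffer` is only written inside the loop and copied once at the end, a cost built from the returned trajectories can be differentiated with `Zygote.gradient` with respect to the policy parameters, which is all the training loop needs.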

2 changes: 1 addition & 1 deletion docs/src/index.md
@@ -63,4 +63,4 @@ The REN parameterisation was extended to continuous-time systems in [yet to be i
See below for a collection of projects and papers using `RobustNeuralNetworks.jl`.

- > N. H. Barbara, R. Wang, and I. R. Manchester, "Learning Over All Contracting and Lipschitz Closed-Loops for Partially-Observed Nonlinear Systems," April 2023. doi: [https://doi.org/10.48550/arXiv.2304.06193](https://doi.org/10.48550/arXiv.2304.06193).
+ > N. H. Barbara, R. Wang, and I. R. Manchester, "Learning Over Contracting and Lipschitz Closed-Loops for Partially-Observed Nonlinear Systems," April 2023. doi: [https://doi.org/10.48550/arXiv.2304.06193](https://doi.org/10.48550/arXiv.2304.06193).
