+ Mathematical background
+ Consensus-based optimisation (CBO) is an approach to solve, for a
+ given (continuous) objective function $f\colon \mathbb{R}^d \to \mathbb{R}$,
+ the global minimisation problem
+ $$x^* = \operatorname*{argmin}_{x \in \mathbb{R}^d} f(x),$$
+ i.e., the task of finding the point $x^*$ where $f$ attains its
+ lowest value. Such problems arise in a variety of disciplines
+ including engineering, where $x$ might represent a vector of design
+ parameters for a structure and $f$ a function related to its cost and
+ structural integrity, or machine learning, where $x$ could comprise
+ the parameters of a neural network and $f$ the empirical loss, which
+ measures the discrepancy of the neural network prediction with the
+ observed data.
+ In some cases, so-called gradient-based methods
+ (those that involve updating a guess of $x^*$ by evaluating the
+ gradient $\nabla f$) achieve state-of-the-art performance in the
+ global minimisation problem. However, in scenarios where $f$ is
+ non-convex (when $f$ could have many local minima), where $f$ is
+ non-smooth (when $\nabla f$ is not well-defined), or where the
+ evaluation of $\nabla f$ is impractical due to cost or complexity,
+ derivative-free methods are needed. Numerous
+ techniques exist for derivative-free optimisation, such as
+ random or pattern search
+ (Friedman
+ & Savage, 1947;
+ Hooke
+ & Jeeves, 1961;
+ Rastrigin,
+ 1963), Bayesian optimisation
+ (Močkus,
+ 1975) or simulated annealing
+ (Henderson
+ et al., 2003). Here, we focus on particle-based
+ methods, specifically, consensus-based optimisation (CBO), as
+ proposed by Pinnau et al.
+ (2017),
+ and the consensus-based taxonomy of related techniques, which we term
+ CBX.
+ CBO uses a finite number $N$ of agents (particles),
+ $x_t = (x_t^1, \dots, x_t^N)$, dependent on time $t$, to explore the
+ landscape of $f$ without evaluating any of its derivatives (as do
+ other CBX methods). The agents evaluate the objective function at
+ their current position, $f(x_t^i)$, and define a consensus point
+ $c_\alpha$.
+ This point is an approximation of the global minimiser $x^*$, and is
+ constructed by weighing each agent’s position against the Gibbs-like
+ distribution $\exp(-\alpha f)$ (Boltzmann, 1868). More rigorously,
+ $$c_\alpha(x_t) = \frac{1}{\sum_{i=1}^N \omega_\alpha(x_t^i)} \sum_{i=1}^N x_t^i\, \omega_\alpha(x_t^i), \quad \text{where} \quad \omega_\alpha(\cdot) = \exp(-\alpha f(\cdot)),$$
+ for some $\alpha > 0$.
+ The exponential weights in the definition favour those points $x_t^i$
+ where $f(x_t^i)$ is lowest, and comparatively ignore the rest,
+ particularly for larger $\alpha$. If all the found values of the
+ objective function are approximately the same, $c_\alpha(x_t)$ is
+ roughly an arithmetic mean. Instead, if one particle is much better
+ than the rest, $c_\alpha(x_t)$ will be very close to its position.
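+ To make the definition concrete, the following is a minimal NumPy
+ sketch of the consensus point (an illustration, not the internals or
+ API of either package; the name consensus_point is ours). Shifting
+ the exponent by the best value found, a standard log-sum-exp trick,
+ leaves $c_\alpha$ unchanged while avoiding overflow for large $\alpha$:
+import numpy as np
+
+def consensus_point(x, f, alpha):
+    """Weighted mean of agents x, shape (N, d), under exp(-alpha * f)."""
+    fx = np.array([f(xi) for xi in x])    # evaluate objective per agent
+    w = np.exp(-alpha * (fx - fx.min()))  # shifted Gibbs-like weights
+    return (w[:, None] * x).sum(axis=0) / w.sum()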
+ Once the consensus point is computed, the particles evolve in time
+ following the stochastic differential equation (SDE)
+ $$\mathrm{d}x_t^i = \underbrace{-\lambda\, (x_t^i - c_\alpha(x_t))\, \mathrm{d}t}_{\text{consensus drift}} + \underbrace{\sigma\, \|x_t^i - c_\alpha(x_t)\|\, \mathrm{d}B_t^i}_{\text{scaled diffusion}},$$
+ where $\lambda$ and $\sigma$ are positive parameters, and where
+ $B_t^i$ are independent Brownian motions in $d$ dimensions. The
+ consensus drift is a deterministic term that drives each agent
+ towards the consensus point, with rate $\lambda$.
+ Meanwhile, the scaled diffusion is a stochastic term
+ that encourages exploration of the landscape. The scaling factor of
+ the diffusion is proportional to the distance of the particle to the
+ consensus point. Hence, whenever the position of a particle and the
+ location of the weighted mean coincide, the particle stops moving. On
+ the other hand, if the particle is far away from the consensus, its
+ evolution has a stronger exploratory behaviour. While both the agents’
+ positions and the consensus point evolve in time, it has been proven
+ that all agents eventually reach the same position and that the
+ consensus point $c_\alpha(x_t)$ is a good approximation of $x^*$
+ (Carrillo et al., 2018; Fornasier, Klock, et al., 2021). Other
+ variations of the method, such as
+ CBO with anisotropic noise
+ (Carrillo
+ et al., 2021), polarised CBO
+ (Bungert
+ et al., 2024), or consensus-based sampling
+ (CBS)
+ (Carrillo
+ et al., 2022) have also been proposed.
+ In practice, the solution to the SDE above cannot be found exactly.
+ Instead, an Euler–Maruyama scheme (Kloeden & Platen, 1992) is used to
+ update the position of the agents. The update is given by
+ $$x^i \gets x^i - \lambda\, \Delta t\, (x^i - c_\alpha(x)) + \sigma \sqrt{\Delta t}\, \|x^i - c_\alpha(x)\|\, \xi^i,$$
+ where $\Delta t > 0$ is the step size and $\xi^i \sim \mathcal{N}(0, I_d)$
+ are independent, identically distributed, standard normal random
+ vectors.
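+ A single iteration of this scheme can be sketched as follows (again
+ an illustration rather than package code; cbo_step is a hypothetical
+ name, the default parameter values are arbitrary choices for the
+ sketch, and consensus_point is reused from the sketch above):
+def cbo_step(x, f, alpha=30.0, lam=1.0, sigma=0.8, dt=0.1):
+    """One Euler-Maruyama step of isotropic CBO for agents x, shape (N, d)."""
+    c = consensus_point(x, f, alpha)                    # weighted ensemble mean
+    diff = x - c                                        # x^i - c_alpha(x)
+    dist = np.linalg.norm(diff, axis=1, keepdims=True)  # ||x^i - c_alpha(x)||
+    xi = np.random.standard_normal(x.shape)             # xi^i ~ N(0, I_d)
+    return x - lam * dt * diff + sigma * np.sqrt(dt) * dist * xi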
+ As a particle-based family of methods, CBX is conceptually related
+ to other optimisation approaches which take inspiration from biology,
+ like particle-swarm optimisation (PSO)
+ (Kennedy
+ & Eberhart, 1995), from physics, like simulated
+ annealing (SA)
+ (Henderson
+ et al., 2003), or from other heuristics
+ (Bayraktar
+ et al., 2013;
+ Chandra
+ Mohan & Baskaran, 2012;
+ Karaboga
+ et al., 2012;
+ Yang,
+ 2009). However, unlike many such methods, CBX has been designed
+ to be compatible with rigorous convergence analysis at the mean-field
+ level (the infinite-particle limit, see
+ Huang
+ & Qiu, 2022). Many convergence results have been shown:
+ in the original formulation
+ (Carrillo
+ et al., 2018;
+ Fornasier,
+ Klock, et al., 2021), for CBO with anisotropic noise
+ (Carrillo
+ et al., 2021;
+ Fornasier
+ et al., 2022), with memory effects
+ (Riedl,
+ 2023), with truncated noise
+ (Fornasier
+ et al., 2024), for polarised CBO
+ (Bungert
+ et al., 2024), and for PSO
+ (Huang
+ et al., 2023). The relation between CBO and stochastic
+ gradient descent has been recently established by Riedl et
+ al.
+ (2023),
+ which suggests a previously unknown yet fundamental connection between
+ derivative-free and gradient-based approaches.
+
+ [Figure: Typical evolution of a CBO method minimising the Ackley function (Ackley, 1987).]
+
+
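+ For reference, the Ackley function is a classical non-convex
+ benchmark with a single global minimum at the origin surrounded by
+ many local minima. Assuming the sketches above, a toy run might look
+ like this (parameter values are again illustrative):
+def ackley(z):
+    """Ackley function in d dimensions: f(0) = 0 is the global minimum."""
+    d = len(z)
+    s = -20.0 * np.exp(-0.2 * np.sqrt(np.sum(z**2) / d))
+    c = -np.exp(np.sum(np.cos(2.0 * np.pi * z)) / d)
+    return s + c + 20.0 + np.e
+
+x = 3.0 + np.random.standard_normal((50, 2))  # 50 agents, offset from minimum
+for _ in range(500):
+    x = cbo_step(x, ackley)                   # iterate the sketch above
+print(consensus_point(x, ackley, 30.0))       # should approach (0, 0)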
+ CBX methods have been successfully applied and extended to several
+ different settings, such as constrained optimisation problems
+ (Borghi
+ et al., 2023b;
+ Fornasier,
+ Huang, et al., 2021), multi-objective optimisation
+ (Borghi
+ et al., 2023a;
+ Klamroth
+ et al., 2024), saddle-point problems
+ (Huang
+ et al., 2024), federated learning tasks
+ (Carrillo
+ et al., 2023), uncertainty quantification
+ (Althaus
+ et al., 2023), or sampling
+ (Carrillo
+ et al., 2022).
+
+
+ Statement of need
+ To date, few implementations of CBO exist, and none have been
+ designed with the generality of the wider CBX family in mind. Here,
+ we summarise the related software:
+ Regarding Python, we refer to PyPop7
+ (Duan
+ et al., 2022) and scikit-opt
+ (Guo,
+ 2021) for a collection of various derivative-free optimisation
+ strategies. For packages connected to Bayesian optimisation, we refer
+ to BayesO
+ (Kim
+ & Choi, 2023), bayesian-optimization
+ (Nogueira,
+ 2014–), GPyOpt
+ (The
+ GPyOpt authors, 2016), GPflowOpt
+ (Knudde
+ et al., 2017), pyGPGO
+ (Jiménez
+ & Ginebra, 2017), PyBADS
+ (Singh
+ & Acerbi, 2024) and BoTorch
+ (Balandat
+ et al., 2020). Furthermore, CMA-ES
+ (Hansen
+ & Ostermeier, 1996) was implemented in
+ pycma
+ (Hansen
+ et al., 2019). To the best of our knowledge, the connection
+ between consensus-based methods and evolution strategies is not fully
+ understood, and is therefore an interesting future direction. PSO and
+ SA implementations are already available in
+ PySwarms
+ (Miranda,
+ 2018), scikit-opt
+ (Guo,
+ 2021), DEAP
+ (Fortin
+ et al., 2012) and pagmo
+ (Biscani
+ et al., 2017). They are widely used by the community and
+ provide a rich framework for the respective methods. However,
+ adjusting these implementations to CBO is not straightforward. The
+ first publicly available Python packages implementing CBX algorithms
+ were given by some of the authors together with collaborators. Tukh
+ & Riedl
+ (2022)
+ implement standard CBO
+ (Pinnau
+ et al., 2017), and the package PolarCBO
+ (Roith
+ et al., 2023) provides an implementation of polarised CBO
+ (Bungert
+ et al., 2024).
+ CBXPy
+ is a significant extension of the latter, which was tailored to the
+ polarised variant. The code architecture was generalised, which
+ allowed the implementation of the whole CBX family within a common
+ framework.
+ Regarding Julia, PSO and SA methods are, among others, implemented
+ in Optim.jl
+ (Mogensen
+ & Riseth, 2018), Metaheuristics.jl
+ (Mejı́a-de-Dios
+ & Mezura-Montes, 2022), and
+ Manopt.jl
+ (Bergmann,
+ 2022). PSO and SA are also included in the meta-library
+ Optimization.jl
+ (Dixit
+ & Rackauckas, 2023), as well as Nelder–Mead, which is a
+ direct search method. The latter is also implemented in
+ Manopt.jl
+ (Bergmann,
+ 2022), which further provides a manifold variant of CMA-ES
+ (Colutto
+ et al., 2009). One of the authors provided the first specific Julia
+ implementation of standard CBO, Consensus.jl
+ (Bailo,
+ 2023). That package has now been deprecated in favour of
+ ConsensusBasedX.jl,
+ which improves the performance of the CBO implementation with a
+ type-stable and allocation-free implementation. The package also adds
+ a CBS implementation, and overall presents a more general interface
+ that accomodates the wider CBX class of methods.
+
+
+ Features
+ CBXPy
+ and
+ ConsensusBasedX.jl
+ provide a lightweight and high-level interface. An existing function
+ can be optimised with just one call. Method selection, parameters,
+ different approaches to particle initialisation, and termination
+ criteria can be specified directly through this interface, offering a
+ flexible point of entry for the casual user. Some of the methods
+ provided are standard CBO
+ (Pinnau
+ et al., 2017), CBO with mini-batching
+ (Carrillo
+ et al., 2021), polarised CBO
+ (Bungert
+ et al., 2024), CBO with memory effects
+ (Grassi
+ & Pareschi, 2021;
+ Riedl,
+ 2023), and consensus-based sampling (CBS)
+ (Carrillo
+ et al., 2022). Parallelisation tools are available.
+ A more proficient user will benefit from the fully documented
+ interface, which allows the specification of advanced options (e.g.,
+ debug output, the noise model, or the numerical approach to the matrix
+ square root of the weighted ensemble covariance matrix). Both
+ libraries offer performance evaluation methods as well as
+ visualisation tools.
+ Finally, a low-level interface (including documentation and
+ full-code examples) is provided. Both libraries have been designed to
+ express common abstractions in the CBX family while allowing
+ customisation. Users can easily implement new CBX methods or modify
+ the behaviour of the existing implementation by strategically
+ overriding certain hooks. The stepping of the methods can also be
+ controlled manually.
+
+ CBXPy
+
+ [Figure: CBXPy logo.]
+
+
+ Most of the
+ CBXPy
+ implementation uses basic Python functionality, and the agents are
+ handled as an array-like structure. For certain specific features,
+ like broadcasting behaviour, array copying, and index selection, we
+ fall back to the numpy implementation
+ (Harris
+ et al., 2020). However, it should be noted that an adaptation
+ to other array or tensor libraries like PyTorch
+ (Paszke
+ et al., 2019) is straightforward. Compatibility with the
+ latter enables gradient-free deep learning directly on the GPU, as
+ demonstrated in the documentation.
+
+ The library is available on
+ GitHub
+ and can be installed via pip. It is licensed
+ under the MIT license. Below, we provide a short example of how to
+ optimise a function with CBXPy.
+ from cbx.dynamics import CBO # import the CBO class
+f = lambda x: x[0]**2 + x[1]**2 # define the function to minimise
+x = CBO(f, d=2).optimize() # run the optimisation
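+ Parameters can also be passed at construction. The keyword names in
+ the following variant (N for the number of agents, max_it for the
+ iteration budget) are assumptions on our part; the documentation
+ linked below is authoritative:
+dyn = CBO(f, d=2, N=100, max_it=500)  # assumed keywords; see the docs
+x = dyn.optimize()                    # run with the customised settings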
+ More examples and details on the implementation are available in
+ the
+ documentation.
+
+
+ ConsensusBasedX.jl
+
+ [Figure: ConsensusBasedX.jl logo.]
+
+
+ ConsensusBasedX.jl
+ has been almost entirely written in native Julia (with the exception
+ of a single call to LAPACK). The code has been developed with
+ performance in mind; thus, the critical routines are fully
+ type-stable and allocation-free. A specific tool is provided to
+ benchmark a typical method iteration, which can be used to detect
+ allocations. Through this tool, unit tests are in place to ensure
+ zero allocations in all the provided methods. The benchmarking tool
+ is also available to users, who can use it to test their
+ implementations of $f$, as well as any new CBX methods.
+ Basic function minimisation can be performed by running:
+ using ConsensusBasedX # load the ConsensusBasedX package
+f(x) = x[1]^2 + x[2]^2 # define the function to minimise
+x = minimise(f, D = 2) # run the minimisation
+ The library is available on
+ GitHub.
+ It has been registered in the
+ general
+ Julia registry, and therefore it can be installed by
+ running ]add ConsensusBasedX. It is licensed
+ under the MIT license. More examples and full instructions are
+ available in the
+ documentation.
+
+
+