Adding facility for separation of the log likelihood and log prior via a Joint struct #75

HarrisonWilde · 2021-06-08T14:59:39Z

Many of the packages that build upon AMCMC define models with a log density function. In general this is the joint density, i.e. the sum of the log prior and the log likelihood. For MCMCTempering and some other situations (@yebai made me aware of these; I cannot remember them off the top of my head, but was assured I might have some luck advocating for this, as I believe it has a reasonably wide use case), I propose we add something like the following to AMCMC:

struct Joint{Tℓprior, Tℓll} <: Function
    ℓprior      :: Tℓprior
    ℓlikelihood :: Tℓll
end

function (joint::Joint)(θ)
    return joint.ℓprior(θ) .+ joint.ℓlikelihood(θ)
end
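
A minimal sketch of how such a Joint might be used. The toy model (standard normal prior, unit-variance normal likelihood), the `data` vector and the `normal_logpdf` helper are assumptions made for illustration and are not part of the proposal; the struct is repeated so the snippet runs on its own, with plain `+` since the toy densities are scalar-valued.

```julia
# Joint as proposed above, repeated so that this snippet is self-contained.
struct Joint{Tℓprior, Tℓll} <: Function
    ℓprior      :: Tℓprior
    ℓlikelihood :: Tℓll
end

# Calling the struct evaluates the log joint density.
(joint::Joint)(θ) = joint.ℓprior(θ) + joint.ℓlikelihood(θ)

# Hypothetical toy model: θ ~ Normal(0, 1), yᵢ | θ ~ Normal(θ, 1).
data = [0.5, 1.5]
normal_logpdf(x, μ) = -0.5 * (log(2π) + (x - μ)^2)

ℓprior(θ) = normal_logpdf(θ, 0.0)
ℓlikelihood(θ) = sum(normal_logpdf(y, θ) for y in data)

joint = Joint(ℓprior, ℓlikelihood)

joint(1.0)              # log joint density at θ = 1
joint.ℓprior(1.0)       # the components stay individually accessible
joint.ℓlikelihood(1.0)
```

Because Joint subtypes Function and is callable, it can stand in wherever a plain log density function is expected, while still exposing the two components.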

The purpose is to allow users, if they wish, to define the log prior and log likelihood separately and pass this in to a log density model in AMH, as the log density function in AHMC, etc. In most cases there is little motivation to do this, as you would simply sum your components inside the density function. However, having them separable is critical for MCMCTempering to work, and in general it is a cheap and very simple interface for operating on the prior and likelihood individually.

I am open to discussion on the design of this, as it is potentially a slightly odd thing to expose to users unless they are using tempering or something similar. I would imagine it being documented and flagged as an option rather than explicitly encouraged. Perhaps other people have an idea for how the same functionality could be achieved with a nicer design.

In tempering, we take the user-defined Joint in the model and apply a temperature to it, resulting in the following TemperedJoint. Hopefully this illustrates what I mean:

struct TemperedJoint{Tℓprior, Tℓll, T<:AbstractFloat} <: Function
    ℓprior      :: Tℓprior
    ℓlikelihood :: Tℓll
    β           :: T
end

function (tj::TemperedJoint)(θ)
    return tj.ℓprior(θ) .+ (tj.ℓlikelihood(θ) .* tj.β)
end
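
To make the role of β concrete, here is a small sketch under stated assumptions: the `temper` helper and the toy unnormalised densities are hypothetical names introduced for illustration, and both structs are repeated so the snippet runs on its own. With β = 0 the tempered density reduces to the prior, and with β = 1 it recovers the full joint.

```julia
struct Joint{Tℓprior, Tℓll} <: Function
    ℓprior      :: Tℓprior
    ℓlikelihood :: Tℓll
end

struct TemperedJoint{Tℓprior, Tℓll, T<:AbstractFloat} <: Function
    ℓprior      :: Tℓprior
    ℓlikelihood :: Tℓll
    β           :: T
end

# Only the likelihood term is scaled by the inverse temperature β.
(tj::TemperedJoint)(θ) = tj.ℓprior(θ) + tj.β * tj.ℓlikelihood(θ)

# Hypothetical helper: build a TemperedJoint from a user-defined Joint.
temper(joint::Joint, β::AbstractFloat) =
    TemperedJoint(joint.ℓprior, joint.ℓlikelihood, β)

# Hypothetical unnormalised toy densities.
ℓprior(θ) = -0.5 * θ^2
ℓlikelihood(θ) = -0.5 * (2.0 - θ)^2

joint = Joint(ℓprior, ℓlikelihood)

temper(joint, 0.0)(1.0)  # -0.5, the log prior alone
temper(joint, 0.5)(1.0)  # -0.75, likelihood contribution halved
temper(joint, 1.0)(1.0)  # -1.0, the untempered log joint
```

This is only possible because the two components remain accessible after the model has been defined, which is the point of the proposal.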

Currently I need to add both Joint and TemperedJoint to any tempering implementation that depends on MCMCTempering (aside from Turing, which definitely works, as the user never defines the log density anyway). If Joint could go in AMCMC, I could add TemperedJoint to MCMCTempering, and the requirements for adding tempering to samplers would become even more trivial.

Would like to hear thoughts; it would be a relatively simple PR if people are happy to add it.

devmotion · 2021-06-08T17:41:16Z

I'm not particularly happy about enforcing a specific struct, it seems a bit restrictive. Also it seems difficult (impossible?) to evaluate both the prior and likelihood in a single execution of the model. What about the following design:

Use logprior(model, x), logjoint(model, x) (both already defined for DynamicPPL.Model in DynamicPPL) and Distributions.loglikelihood(model, x) (also already used in DynamicPPL) as part of a (not enforced) interface for AbstractModels in AbstractMCMC
Define logprior_loglikelihood (also part of the interface) with the fallback
```
logprior_loglikelihood(model, x) = logprior(model, x), loglikelihood(model, x)
```
for evaluating both the prior and the likelihood. If a model can evaluate both together more efficiently it can implement the method, otherwise it just works.

Use it to define

function logjoint(model, x)
    logprior, loglikelihood = logprior_loglikelihood(model, x)
    return logprior + loglikelihood
end

(of course, if there is a more efficient method to get the log joint, a model should implement it)

This would allow

models to only define logprior(model, x) and loglikelihood(model, x) and get the rest for free
to exploit possible efficiency gains when evaluating both logprior and loglikelihood

you to define something like

struct TemperedLogjoint{B}
    beta::B
end
function (f::TemperedLogjoint)(model, x)
    logprior, loglikelihood = logprior_loglikelihood(model, x)
    return logprior + f.beta * loglikelihood
end

(or whatever other design you want to use for the tempered log joint, e.g., with an additional argument beta instead)
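
The suggested interface can be sketched end to end as follows. `ToyModel` and its unnormalised densities are hypothetical, and `logprior`, `loglikelihood`, `logprior_loglikelihood` and `logjoint` are defined locally here as stand-ins, so this is an illustration of the idea rather than the AbstractMCMC or DynamicPPL API.

```julia
# Hypothetical toy model; any type would do, since the interface is not enforced.
struct ToyModel
    data::Vector{Float64}
end

# The only two functions a model author has to implement
# (unnormalised normal densities, for brevity).
logprior(model::ToyModel, x) = -0.5 * x^2
loglikelihood(model::ToyModel, x) = sum(-0.5 * (y - x)^2 for y in model.data)

# Generic fallback: evaluate the two components one after the other.
logprior_loglikelihood(model, x) = (logprior(model, x), loglikelihood(model, x))

# The log joint comes for free.
function logjoint(model, x)
    lp, ll = logprior_loglikelihood(model, x)
    return lp + ll
end

# The tempered log joint only needs the two components, not a specific struct.
struct TemperedLogjoint{B}
    beta::B
end

function (f::TemperedLogjoint)(model, x)
    lp, ll = logprior_loglikelihood(model, x)
    return lp + f.beta * ll
end

model = ToyModel([1.0, 3.0])

logjoint(model, 1.0)                # -2.5
TemperedLogjoint(0.5)(model, 1.0)   # -1.5
```

Nothing in the generic functions refers to `ToyModel`, so any other model type that defines the first two methods would work with them unchanged.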

devmotion · 2021-06-08T20:03:52Z

Just an additional note: of course, this suggestion would still allow you to use a struct for implementing logprior or logprior_loglikelihood etc. for a specific class of models. But the interface would not enforce it.

HarrisonWilde · 2021-06-09T08:23:38Z

> Also it seems difficult (impossible?) to evaluate both the prior and likelihood in a single execution of the model.

By this I presume you mean separately evaluating just the prior / likelihood? This is relatively simple; in the case of a DensityModel, say, you can do:

model = DensityModel(Joint(lprior, llikelihood))
# then, using the field names of the Joint struct:
model.logdensity.ℓprior(z), model.logdensity.ℓlikelihood(z)
# or to evaluate the joint
model.logdensity(z)

Of course here we are not quite conforming to the intended usage pattern, as we must access the logdensity as a component of the model rather than the usual logdensity(model, z).

As for your proposal, I think it sounds good. Just to clarify, this conceptual splitting of the joint is only necessary in AMH / AHMC, where DynamicPPL.Model is not used; as you have seen in the Turing PR, for DynamicPPL models we use a different approach via contexts. I think you know this, though, and I am still just getting my head around the idea. It is a tradeoff I suppose as I am stuck in a mindset of trying to make it as simple as possible for someone to support tempering where the only assumption is that they work off AMCMC.

The part I am not following, as a result, is what end-user usage would look like with this approach. Presumably they'd define a logprior, a loglikelihood, but then there would still be a step where in AMH say we would need to define a new model constructor that then combines the two, or expect the user to do this themselves? Could you give an example of what this would look like as to me it seems that regardless we end up back to a model where the interior density function is the sum of the two components like you said, but these components cannot be accessed retrospectively as is required for tempering.

devmotion · 2021-06-09T08:38:01Z

> By this I presume you mean separately evaluating just the prior / likelihood?

No, I mean executing e.g. a Turing model and then accumulating log prior and log likelihood in one run (these quantities are evaluated by executing the model, there's usually no closed form expression or function).
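
A rough sketch of what "accumulating both in one run" could look like for a model without a closed-form density. The `SinglePassModel` type and its `execute` function are hypothetical and only mimic the idea; this is not how DynamicPPL is implemented. The model specialises `logprior_loglikelihood` so that a single execution yields both quantities.

```julia
# Hypothetical model whose densities are only available by "running" it.
struct SinglePassModel
    data::Vector{Float64}
end

# One execution of the model accumulates both terms as it goes.
function execute(model::SinglePassModel, x)
    lp = -0.5 * x^2              # contribution of the prior statement
    ll = 0.0
    for y in model.data          # contributions of the observations
        ll += -0.5 * (y - x)^2
    end
    return (lp = lp, ll = ll)
end

# Each of these costs one full execution of the model.
logprior(model::SinglePassModel, x) = execute(model, x).lp
loglikelihood(model::SinglePassModel, x) = execute(model, x).ll

# Specialised method: one execution instead of two.
function logprior_loglikelihood(model::SinglePassModel, x)
    acc = execute(model, x)
    return acc.lp, acc.ll
end

model = SinglePassModel([1.0, 3.0])
logprior_loglikelihood(model, 1.0)   # (-0.5, -2.0)
```

A generic fallback that calls `logprior` and `loglikelihood` separately would give the same answer here, but at the cost of running the model twice.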

HarrisonWilde · 2021-06-09T08:40:34Z

Ah I see. Yeah, it is a funny one, because this whole approach is only required for anything external to DynamicPPL. In the model structs present in the packages I mention, we only gain functionality through being able to interact with the log prior and log likelihood post-definition, and wouldn't lose anything this way provided the struct is callable. That is really all that is needed here, I think.

devmotion · 2021-06-09T08:40:40Z

> It is a tradeoff I suppose as I am stuck in a mindset of trying to make it as simple as possible for someone to support tempering where the only assumption is that they work off AMCMC.

With the suggestion, users/developers would just have to implement logprior(model, x) and loglikelihood(model, x) for their models. Then tempering would just work (and also things such as logjoint(model, x) and logprior_loglikelihood(model, x)). It is up to the downstream packages how exactly they implement these functions - with a struct where they bundle some functions, by overloading the methods directly etc.

devmotion · 2021-06-09T08:55:28Z

> Presumably they'd define a logprior, a loglikelihood, but then there would still be a step where in AMH say we would need to define a new model constructor that then combines the two, or expect the user to do this themselves? Could you give an example of what this would look like as to me it seems that regardless we end up back to a model where the interior density function is the sum of the two components like you said, but these components cannot be accessed retrospectively as is required for tempering.

No, there would not necessarily be such a step as the DynamicPPL example shows.

Basically, overloading logprior(model, x) and loglikelihood(model, x) supports both the case

struct Model{F}
  execute::F
end

function logprior(model::Model, x)
    return model.execute(PriorContext(), x)
end
function loglikelihood(model::Model, x)
    return model.execute(LikelihoodContext(), x)
end

(similar to what e.g. DynamicPPL does) and

struct Model{P,L}
    prior::P
    likelihood::L
end

logprior(model::Model, x) = logpdf(model.prior, x)
loglikelihood(model::Model, x) = model.likelihood(x)

(basically what EllipticalSliceSampling does: https://github.com/TuringLang/EllipticalSliceSampling.jl/blob/d54630e397b15efdae9c0ef25af839f62e8f35c2/src/model.jl#L3-L13).

The logprior and loglikelihood functions allow you to define your model in whatever way is suitable for the task or sampler. For example, in EllipticalSliceSampling we are never interested in the log joint probability, only in the prior and the likelihood, so it seems quite unintuitive to demand that a Joint struct be implemented. With the logprior and loglikelihood functions, however, this does not seem necessary.
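
As a runnable illustration of the two model designs above sitting behind the same functions: the type names `ExecutableModel` and `SplitModel`, the empty context structs and the `tempered_logjoint` helper are all hypothetical. The second design stores a plain callable for the prior instead of a distribution evaluated with `logpdf`, to keep the sketch free of dependencies.

```julia
# Hypothetical stand-ins for the context types used in the first design.
struct PriorContext end
struct LikelihoodContext end

# Design 1: the model is executed under a context.
struct ExecutableModel{F}
    execute::F
end
logprior(model::ExecutableModel, x) = model.execute(PriorContext(), x)
loglikelihood(model::ExecutableModel, x) = model.execute(LikelihoodContext(), x)

# Design 2: the model stores the prior and the likelihood separately.
struct SplitModel{P,L}
    prior::P
    likelihood::L
end
logprior(model::SplitModel, x) = model.prior(x)
loglikelihood(model::SplitModel, x) = model.likelihood(x)

# Sampler-side code relies only on the two functions, not on the model layout.
tempered_logjoint(model, x, beta) =
    logprior(model, x) + beta * loglikelihood(model, x)

# The same toy densities expressed in both designs.
toy_execute(::PriorContext, x) = -0.5 * x^2
toy_execute(::LikelihoodContext, x) = -0.5 * (2.0 - x)^2

m1 = ExecutableModel(toy_execute)
m2 = SplitModel(x -> -0.5 * x^2, x -> -0.5 * (2.0 - x)^2)

tempered_logjoint(m1, 1.0, 0.5)   # -0.75
tempered_logjoint(m2, 1.0, 0.5)   # -0.75
```

Neither model type ever constructs a joint density object; tempering works from the two components alone.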

devmotion mentioned this issue Jul 7, 2021

Moving function logdensity into super-lightweight package for abstract densities? tpapp/LogDensityProblems.jl#78

Closed

devmotion mentioned this issue Jul 15, 2021

Add some default implementations of AbstractModel #81

Closed
