Update distribution shape inference to handle independent dims #402
Conversation
# The arguments to _infer_value_domain are the .output shapes of parameters,
# so any extra batch dimensions that aren't part of the instance event_shape
# must be broadcasted output dimensions by construction.
out_shape = instance.batch_shape + instance.event_shape
This change to _infer_value_domain is the conceptual meat of the PR.
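To illustrate the trick, here is a minimal sketch of the inference step for a single torch-backend distribution (a simplified, hypothetical helper; the real logic lives on funsor.distribution.Distribution and is generic over parameters):

```python
import torch
import torch.distributions as torch_dist
from funsor.domains import Real, Reals

def infer_normal_value_domain(loc_domain, scale_domain):
    # Build a throwaway backend instance from dummy tensors shaped like the
    # funsor parameters' .output shapes, and let the backend's own
    # broadcasting decide the value shape.
    instance = torch_dist.Normal(torch.zeros(loc_domain.shape),
                                 torch.ones(scale_domain.shape))
    # As in the diff above: any extra batch dims are broadcasted output dims.
    out_shape = tuple(instance.batch_shape) + tuple(instance.event_shape)
    return Reals[out_shape] if out_shape else Real
```

For example, `infer_normal_value_domain(Reals[2], Real)` would give `Reals[2]`, matching the `out_shape` computation in the excerpt above.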
params, value = params[:-1], params[-1]
params = params + (Variable("value", value.output),)
instance = reflect(cls, *params)
raw_dist, value_name, value_output, dim_to_name = instance._get_raw_dist()
I had to refactor eager_log_prob to use Distribution._get_raw_dist() to get the new tests to pass.
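For context, a plausible continuation of the excerpt above (a hedged sketch, not the exact diff): once _get_raw_dist() hands back the backend distribution and the dim/name bookkeeping, the log probability is computed on raw data and converted back to a funsor.

```python
# Hypothetical continuation of the excerpt above (not the exact diff):
name_to_dim = {name: dim for dim, name in dim_to_name.items()}
raw_value = to_data(value, name_to_dim=name_to_dim)   # funsor -> backend tensor
raw_log_prob = raw_dist.log_prob(raw_value)           # backend does the broadcasting
return to_funsor(raw_log_prob, output=Real, dim_to_name=dim_to_name)
```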
domains[k] = domain if domain is not None else to_funsor(v).output

# broadcast individual param domains with Funsor inputs
# this avoids .expand-ing underlying parameter tensors
What is the expected domain of scale for Normal(Reals[2], 1.) and Normal(Reals[2], torch.ones(2))? Currently, domains["scale"] will be Real in both cases. The second case will trigger an error at to_funsor(v, output=domains[k]) below.

In either case, I guess we need to rewrite eager_normal or eager_mvn to address a Reals[2] loc. Maybe there is some trick to avoid doing so. cc @fritzo
What is the expected domain of scale for Normal(Reals[2], 1.) and Normal(Reals[2], torch.ones(2))?

In the first case, it's Real, and in the second, it's Reals[2]. I guess I should add a second broadcasting condition below to handle the case where the parameter is a raw tensor:
if ops.is_numeric_array(v):  # at this point we know all of v's dims are output dims
    domains[k] = Reals[broadcast_shape(v.shape, domains[k].shape)]
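For concreteness, a hypothetical check of the two cases discussed (illustrative only, assuming the torch backend and that funsor's `Variable` and domains behave as above):

```python
import torch
from funsor.domains import Real, Reals
from funsor.terms import Variable
from funsor.torch.distributions import Normal

loc = Variable("loc", Reals[2])
d1 = Normal(loc, 1.)             # scalar scale: domains["scale"] should stay Real
d2 = Normal(loc, torch.ones(2))  # raw tensor scale: should broadcast to Reals[2]
```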
It looks like you just need to relax precision of …
@fehiepsi there are a ton of unrelated new failures on the last JAX build, any idea what's going on? Seems like a mix of weird new numerical errors and …
The logic to get expected outputs looks reasonable to me, pending further comments by @fritzo. I'll try to address the eager_normal and eager_mvn issues in a follow-up PR. Currently, those eager rules assume Real args, but we now have Reals or a mix of Real and Reals.
Hmm, I wonder if we'll have to do this for all the other eager distribution patterns in …
(thanks for your patience @eb8680)
funsor_event_shape = funsor_dist.value.output.shape

# attempt to generically infer the independent output dimensions
instance = funsor_dist.dist_class(**{
Beyond the scope of this PR, I'm concerned with the increasing overhead of shape computations that need to do tensor ops. I like @fehiepsi's recent suggestion of implementing .forward_event_shape() for transforms. I think it would be worthwhile to discuss and think about extensions to the Distribution interface that could replace all this need to create and throw away dummy distributions.

(Indeed, in theory an optimizing compiler could remove all this overhead, but in practice our tensor backends either incur super-linear compile-time cost or fail to cover the wide range of probabilistic models we would like to handle. And while these dummy tensor ops are cheap, they add noise to debugging efforts.)
Yes, I agree the repeated creation of distribution instances here is not ideal. Perhaps we could add counterparts of some of the shape inference methods from TFP (e.g. event_shape_tensor, param_shapes) upstream in torch.distributions.
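To make the suggestion concrete, a shape-only helper in the spirit of TFP's param_shapes might look like the following (a hypothetical API sketch, not an existing torch.distributions method, assuming a recent torch that provides torch.broadcast_shapes):

```python
import torch

class NormalShapes:
    """Hypothetical shape-only counterpart to torch.distributions.Normal."""

    @staticmethod
    def infer_shapes(loc_shape=(), scale_shape=()):
        # Pure shape arithmetic: no dummy tensors are allocated.
        batch_shape = torch.broadcast_shapes(torch.Size(loc_shape),
                                             torch.Size(scale_shape))
        event_shape = torch.Size(())  # Normal has scalar events
        return batch_shape, event_shape
```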
dist.Independent.has_rsample = property(lambda self: self.base_dist.has_rsample)
dist.Independent.rsample = dist.Independent.sample
dist.MaskedDistribution.has_rsample = property(lambda self: self.base_dist.has_rsample)
dist.MaskedDistribution.rsample = dist.MaskedDistribution.sample
dist.TransformedDistribution.has_rsample = property(lambda self: self.base_dist.has_rsample)
dist.TransformedDistribution.rsample = dist.TransformedDistribution.sample
Can you add a TODO pointing to a NumPyro issue to fix this bug, so we can delete this workaround once the bug is fixed? cc @fehiepsi
@neerajprad Should we add those new attributes to NumPyro distributions? We can provide default behaviors for them so that only a few changes to the code would be needed.
Sounds good. In NumPyro, for distributions that have reparametrized samplers available, both will be the same, so we can just add a default Distribution.rsample method which delegates to sample and raises a NotImplementedError when not available.
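A minimal sketch of that default, assuming a NumPyro-style sample(key, sample_shape) signature (hypothetical code, not NumPyro's actual base class):

```python
class Distribution:
    """Hypothetical base class illustrating the proposed default."""

    has_rsample = False  # subclasses with reparametrized samplers set this to True

    def sample(self, key, sample_shape=()):
        raise NotImplementedError

    def rsample(self, key, sample_shape=()):
        # When a reparametrized sampler exists, sample() already returns
        # reparametrized draws, so rsample() can simply delegate to it.
        if self.has_rsample:
            return self.sample(key, sample_shape)
        raise NotImplementedError("no reparametrized sampler available")
```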
@fritzo any other comments? I'd like to bump the Funsor dependency version in Pyro to the latest …
Thanks for the reminder and thanks for addressing nits! LGTM.
Addresses #386, pyro-ppl/pyro#2702
Consider the following snippet of Pyro code, simplified from a test in pyro-ppl/pyro#2702:
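(The original snippet is not preserved in this rendering; a minimal stand-in exhibiting the shapes described below, with hypothetical site names, might be:)

```python
import pyro
import pyro.distributions as dist

def model():
    x = pyro.sample("x", dist.Normal(0., 1.))
    # y has event_shape (3,), so its funsor .output is Reals[3]
    y = pyro.sample("y", dist.Normal(x, 1.).expand([3]).to_event(1))
    # the loc below is Reals[3]-valued, which _infer_value_domain did not expect
    z = pyro.sample("z", dist.Normal(y, 1.).to_event(1))
    return z
```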
Attempting to wrap this in a poutine.collapse() context fails when dist.Normal(y, 1) is incorrectly converted to a Funsor. In particular, there is a mismatch between y.output, which is Reals[3], and the .output value Real expected for the loc param of funsor.torch.distributions.Normal given by funsor.distribution.Distribution._infer_value_domain.

In cases like the above, however, it is always possible to infer the correct parameter and value .output shapes generically using the broadcasting logic in the underlying backend distribution. This PR implements this behavior in funsor.distribution.DistributionMeta.__call__ and distribution_to_data.
As a result, it is possible after this PR to represent Independent distributions without an intermediary funsor.Independent by passing a Variable with extra output dimensions as the value:
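(The original example is not preserved here; a plausible sketch under the PR's semantics, assuming the torch backend:)

```python
from funsor.domains import Reals
from funsor.terms import Variable
from funsor.torch.distributions import Normal

# A value Variable with an extra (3,)-dim output:
value = Variable("value", Reals[3])
d = Normal(0., 1., value)  # a single funsor term; no funsor.Independent wrapper
assert d.inputs["value"] == Reals[3]
```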
This should considerably simplify pattern-matching over Independent distributions.

To make this work, converting a funsor.distribution.Distribution to data via funsor.to_data will have to include a step handling nontrivial event shapes by calling .to_event and unsqueezing parameters.

More ambitiously, we could also attempt to handle to_funsor conversion of backend Independent distributions by substituting an appropriately shaped Variable for their value rather than resorting to lazy application of funsor.terms.Independent, which would make pattern-matching for collapse-ing multivariate distributions much easier, but it's not yet clear whether this can be done generically, especially in the presence of transforms. I have not attempted to do this in this PR.

Tasks:
- funsor.distribution.DistributionMeta
- distribution_to_data
- DirichletMultinomial broadcasting errors