
Commit

changes to docs to build.
iancze committed Dec 28, 2023
1 parent 2709d3b commit d8e9829
Showing 4 changed files with 33 additions and 31 deletions.
2 changes: 1 addition & 1 deletion docs/ci-tutorials/initializedirtyimage.md
@@ -212,7 +212,7 @@ rml.state_dict() # the now uninitialized parameters of the model (the ones we s
Here you can clearly see the ``state_dict`` is in its original state, before the training loop changed the parameters through the optimization function. Loading our saved dirty image state into the model is as simple as

```{code-cell}
rml.load_state_dict(torch.load("dirty_image_model.pt"))
rml.load_state_dict(torch.load("dirty_image_model.pt"), strict=False)
rml.state_dict() # the reloaded parameters of the model
```
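
Because `strict=False` tolerates parameter keys that do not line up between the saved file and the module, it can be worth checking whether anything was skipped. Here is a minimal sketch using the value returned by `load_state_dict` (standard PyTorch behaviour, not something specific to MPoL):

```{code-cell}
# load_state_dict reports the keys it could not match; with strict=False these
# come back in the return value instead of raising an error
incompatible = rml.load_state_dict(torch.load("dirty_image_model.pt"), strict=False)
print("missing keys:", incompatible.missing_keys)
print("unexpected keys:", incompatible.unexpected_keys)
```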

26 changes: 12 additions & 14 deletions docs/ci-tutorials/loose-visibilities.md
@@ -154,24 +154,22 @@ img_packed = utils.sky_gaussian_arcsec(
img_packed_cube = np.broadcast_to(img_packed, (nchan, coords.npix, coords.npix)).copy()
# convert img_packed to pytorch tensor
img_packed_tensor = torch.from_numpy(img_packed_cube)
-# insert into ImageCube layer
-icube = images.ImageCube(coords=coords, nchan=nchan, cube=img_packed_tensor)
```

## Producing model visibilities

The interesting part of the NuFFT is that it carries an image-plane model all the way to the Fourier plane as loose visibilities, producing a model visibility array with the same shape as the original visibility data.

```{code-cell}
-vis_model_loose = nufft(icube(), uu_chan, vv_chan)
+vis_model_loose = nufft(img_packed_tensor, uu_chan, vv_chan)
print("Loose model visibilities from the NuFFT have shape {:}".format(vis_model_loose.shape))
print("The original loose data visibilities have shape {:}".format(data.shape))
```

By comparison, the {class}`~mpol.gridding.Gridder` object puts the visibilities onto a grid and exports a {class}`~mpol.datasets.GriddedDataset` object. These gridded data visibilities have the same dimensionality as the gridded model visibilities produced by the {class}`~mpol.fourier.FourierCube` layer.

```{code-cell}
-vis_model_gridded = flayer(icube())
+vis_model_gridded = flayer(img_packed_tensor)
print("Gridded model visibilities from FourierCube have shape {:}".format(vis_model_gridded.shape))
print("Gridded data visibilities have shape {:}".format(gridded_dset.vis_gridded.shape))
```
@@ -273,18 +271,18 @@ Now we'll evaluate the likelihood function using both the loose visibilities pro
data_loose = torch.tensor(data)
weight_loose = torch.tensor(weight)
-chisquare = losses.chi_squared(vis_model_loose, data_loose, weight_loose)
+r_chisquare = losses.r_chi_squared(vis_model_loose, data_loose, weight_loose)
loglike = losses.log_likelihood(vis_model_loose, data_loose, weight_loose)
print("Chi squared", chisquare)
print("Reduced Chi squared", r_chisquare)
print("Log likelihood", loglike)
```

### Gridded visibility log likelihood

```{code-cell}
-chisquare_gridded = losses.chi_squared_gridded(vis_model_gridded, gridded_dset)
+r_chisquare_gridded = losses.r_chi_squared_gridded(vis_model_gridded, gridded_dset)
loglike_gridded = losses.log_likelihood_gridded(vis_model_gridded, gridded_dset)
print("Chi squared gridded", chisquare_gridded)
print("Reduced Chi squared gridded", r_chisquare_gridded)
print("Log likelihood gridded", loglike_gridded)
```

@@ -320,16 +318,16 @@ $$

This is the same reasoning that gives rise to the statistical apothegm "a well-fit model has a reduced $\chi^2_R \approx 1$" (one that has [many caveats and pitfalls](https://arxiv.org/abs/1012.3754)). In this case the extra factor of 2 in the denominator comes about because the visibility data and its noise are complex-valued.
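
Written out for $N$ complex visibilities $V_i$ with thermal weights $w_i = 1/\sigma_i^2$ and model visibilities $M_i(\boldsymbol{\theta})$, the normalization being discussed is, schematically (up to the exact conventions adopted by {func}`mpol.losses.r_chi_squared`),

$$
\chi^2_R(\boldsymbol{\theta}) = \frac{1}{2N} \sum_{i=1}^{N} w_i \left| V_i - M_i(\boldsymbol{\theta}) \right|^2,
$$

where the $2N$ in the denominator counts the real and imaginary components of each visibility separately.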

-The hope is that for many applications, the normalized negative log likelihood loss function will have a minimum value of $L(\hat{\boldsymbol{\theta}}) \approx 1$ for a well-fit model (regardless of the number of data points), making it easier to set the regularizer strengths *relative* to this value. Note that even this normalized loss won't be the same between an unbinned and binned dataset, though hopefully both will be on the order of $1$.
+The hope is that for many applications, the reduced $\chi^2_R$ loss function will have a minimum value of $\chi^2_R(\hat{\boldsymbol{\theta}}) \approx 1$ for a well-fit model (regardless of the number of data points), making it easier to set the regularizer strengths *relative* to this value. Note that even this normalized loss won't be the same between an unbinned and binned dataset, though hopefully both will be on the order of $1$.

-This loss function is implemented in {func}`mpol.losses.nll` and {func}`mpol.losses.nll_gridded`, and we can see the results here.
+This loss function is implemented in {func}`mpol.losses.r_chi_squared` and {func}`mpol.losses.r_chi_squared_gridded`, and we can see the results here.

```{code-cell}
-nll = losses.nll(vis_model_loose, data_loose, weight_loose)
-print("Normalized log likelihood", nll)
+r_chi2 = losses.r_chi_squared(vis_model_loose, data_loose, weight_loose)
+print("Reduced chi squared", r_chi2)
```

```{code-cell}
-nll_gridded = losses.nll_gridded(vis_model_gridded, gridded_dset)
-print("Normalized log likelihood gridded", nll_gridded)
+r_chi2_gridded = losses.r_chi_squared_gridded(vis_model_gridded, gridded_dset)
+print("Reduced chi squared gridded", r_chi2_gridded)
```
8 changes: 4 additions & 4 deletions docs/ci-tutorials/optimization.md
@@ -183,7 +183,7 @@ But, exciting things are about to happen! We can calculate the loss between thes

```{code-cell} ipython3
# calculate a loss
-loss = losses.nll_gridded(vis, dset)
+loss = losses.r_chi_squared_gridded(vis, dset)
print(loss.item())
```

@@ -255,7 +255,7 @@ for i in range(300):
vis = rml()
# calculate a loss
-loss = losses.nll_gridded(vis, dset)
+loss = losses.r_chi_squared_gridded(vis, dset)
loss_tracker.append(loss.item())
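
For orientation, here is a minimal sketch of how the surrounding training loop might look once this change is applied. It is only a sketch: it assumes `rml`, `dset`, and an `optimizer` were configured earlier in the tutorial, and it omits any regularizer terms.

```{code-cell} ipython3
# sketch only: assumes rml, dset, and optimizer exist from earlier in the tutorial
loss_tracker = []
for i in range(300):
    optimizer.zero_grad()
    vis = rml()
    # calculate a loss
    loss = losses.r_chi_squared_gridded(vis, dset)
    loss_tracker.append(loss.item())
    loss.backward()
    optimizer.step()
```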
@@ -318,8 +318,8 @@ nufft = fourier.NuFFT(coords=coords, nchan=dset.nchan)
and then we can calculate model visibilities corresponding to some model image (in this case, our optimal image). Since {meth}`mpol.fourier.NuFFT.forward` returns a Pytorch tensor, we'll need to detach it and convert it to numpy. We'll also remove the channel dimension.

```{code-cell} ipython3
-# note the NuFFT expects a "packed image cube," as output from ImageCube.forward()
-vis_model = nufft(rml.icube(), uu, vv)
+# note the NuFFT expects a "packed image cube," as stored in ImageCube.cube
+vis_model = nufft(rml.icube.cube, uu, vv)
# convert from Pytorch to numpy, remove channel dimension
vis_model = np.squeeze(vis_model.detach().numpy())
```
28 changes: 16 additions & 12 deletions src/mpol/losses.py
@@ -6,18 +6,22 @@
We expect you might use the loss functions in the following scenarios:
-- *optimizing a model with fixed data amplitudes and weights* this is the most common
-scenario for RML imaging. Use :meth:`~mpol.losses.r_chi_squared` for loose visibilities
-and :meth:`~mpol.losses.r_chi_squared_gridded` for gridded quantities
-(e.g., :class:`~mpol.datasets.GriddedDataset`).
-- *optimizing a model with potential adjustments to data amplitudes and weights*: you
-will need to do this in a self-calibration workflow. Use
-:meth:`~mpol.losses.neg_log_likelihood_avg` for loose visibilities. The nature of the
-adjustments makes gridding inefficient, so there is no gridded version of this loss.
-- *doing Bayesian inference*: if you are calculating a log likelihood to use with MCMC
-sampling, for example, use :meth:`~mpol.losses.log_likelihood` or
-:meth:`~mpol.losses.log_likelihood_gridded`. (Pyro defines its own likelihood function
-through the probabilistic programming language, so this is not needed in that case).
+* optimizing a model with fixed data amplitudes and weights:
+this is the most common
+scenario for RML imaging. Use :meth:`~mpol.losses.r_chi_squared` for loose
+visibilities and :meth:`~mpol.losses.r_chi_squared_gridded` for gridded quantities
+(e.g., :class:`~mpol.datasets.GriddedDataset`).
+* optimizing a model with potential adjustments to data amplitudes and weights:
+you will need to do this in a self-calibration workflow. Use
+:meth:`~mpol.losses.neg_log_likelihood_avg` for loose visibilities. The nature of
+the adjustments makes gridding inefficient, so there is no gridded version of this
+loss.
+* doing Bayesian inference:
+if you are calculating a log likelihood to use with MCMC
+sampling, for example, use :meth:`~mpol.losses.log_likelihood` or
+:meth:`~mpol.losses.log_likelihood_gridded`. (Pyro defines its own likelihood
+function through the probabilistic programming language, so this is not needed in
+that case).
"""

import numpy as np
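
To make the docstring's scenario list concrete, here is a minimal usage sketch (not part of `losses.py`); the variables `model_vis`, `data_vis`, `weight`, `model_vis_gridded`, and `gridded_dset` are assumed to exist already, e.g. as built in the tutorials above:

```python
from mpol import losses

# RML imaging with fixed data amplitudes and weights: reduced chi-squared
loss = losses.r_chi_squared(model_vis, data_vis, weight)
loss_gridded = losses.r_chi_squared_gridded(model_vis_gridded, gridded_dset)

# Bayesian inference (e.g., MCMC sampling): the full log likelihood
lnlike = losses.log_likelihood(model_vis, data_vis, weight)
```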
