Change Base unit from k$\lambda$ to $\lambda$ #249

Merged · 25 commits · Jan 5, 2024
Commits
bdc5c2d
GridCoords constructor with props and cached_props
kadri-nizam Apr 5, 2023
a8f7c6f
Merge branch 'MPoL-dev:main' into #152-coordinates
kadri-nizam Apr 5, 2023
8391981
GridCoords repr and fix coordinates test
kadri-nizam Apr 5, 2023
e8a42bd
Fix wrong variable
kadri-nizam Apr 5, 2023
99f4058
Fix missing bracket and add meshgrid vs tile test
kadri-nizam Apr 5, 2023
d648ac9
Update src/mpol/coordinates.py
kadri-nizam Apr 6, 2023
63b9b77
Update uv variable names
kadri-nizam Apr 6, 2023
411fa18
Merge branch '#152-coordinates' of github.com:kadri-nizam/MPol into #…
kadri-nizam Apr 6, 2023
d3f11b7
starting changelog for lambda.
iancze Dec 30, 2023
7cd7631
simplified tests to take only baselines where needed.
iancze Dec 31, 2023
f6c6927
added source baselines and img for mock data.
iancze Dec 31, 2023
53ec3ad
removed convert baselines closes #227
iancze Jan 1, 2024
7e55d0f
tests pass in intermediate fake dataset state.
iancze Jan 1, 2024
dda51cd
updated types and coverage.
iancze Jan 1, 2024
0de30ac
disabling residual plot for now.
iancze Jan 1, 2024
a4d8572
full mypy coverage for core routines.
iancze Jan 1, 2024
969c4b6
renamed SimpleNet to GriddedNet, tests pass locally.
iancze Jan 1, 2024
caa27cc
rewrote some mock data using butterfly image and passed some more tests.
iancze Jan 4, 2024
ed65754
commenting out TrainTest and CrossVal for now.
iancze Jan 5, 2024
1532251
tests pass with new mock data.
iancze Jan 5, 2024
ba2b39a
tests pass locally after base unit change.
iancze Jan 5, 2024
600b478
another spot to convert.
iancze Jan 5, 2024
8aa609f
updated to new resid funcitonality.
iancze Jan 5, 2024
e5e6d8b
resolved merge conflicts.
iancze Jan 5, 2024
acfbbd9
Merge pull request #250 from MPoL-dev/coordinates
iancze Jan 5, 2024
@@ -1,5 +1,5 @@
graph TD
subgraph SimpleNet
subgraph GriddedNet
bc(BaseCube) --> HannConvCube
HannConvCube --> ImageCube
ImageCube --> FourierLayer
27 changes: 24 additions & 3 deletions docs/changelog.md
@@ -3,9 +3,30 @@
# Changelog

## v0.3.0

- Standardized nomenclature of {class}`mpol.coordinates.GridCoords` and {class}`mpol.fourier.FourierCube` to use `sky_cube` for a normal image and `ground_cube` for a normal visibility cube (rather than `sky_` for visibility quantities). Routines use `packed_cube` instead of `cube` internally to be clear when packed format is preferred.
- Modified {class}`mpol.coordinates.GridCoords` object to use cached properties [#187](https://github.com/MPoL-dev/MPoL/pull/187).
- Changed the base spatial frequency unit from k$\lambda$ to $\lambda$, addressing [#223](https://github.com/MPoL-dev/MPoL/issues/223). This will affect most users' data-reading routines!
- Added the {meth}`mpol.gridding.DirtyImager.from_tensors` routine to cover the use case where one might want to use the {meth}`mpol.gridding.DirtyImager` to image residual visibilities. Otherwise, {meth}`mpol.gridding.DirtyImager` and {meth}`mpol.gridding.DataAverager` are the only notable routines that expect `np.ndarray` input arrays. This is because they are designed to work with data arrays directly after loading (say from a MeasurementSet or `.npy` file) and are implemented internally in numpy. If a routine requires data separately as `data_re` and `data_im`, that is a tell-tale sign that the routine works with numpy histogram routines internally.
- Changed name of {class}`mpol.precomposed.SimpleNet` to {class}`mpol.precomposed.GriddedNet` to more clearly indicate purpose. Updated documentation to make clear that this is just a convenience starter module, and users are encouraged to write their own `nn.Module`s.
- Changed internal instance attribute of {class}`mpol.images.ImageCube` from `cube` to `packed_cube` to more clearly indicate format.
- Removed `mpol.fourier.get_vis_residuals` and added `predict_loose_visibilities` to {class}`mpol.precomposed.SimpleNet`.
- Standardized treatment of numpy vs `torch.tensor`s, with preference for `torch.tensor` in many routines. This simplifies the internal logic of the routines and will make most operations run faster.
- Standardized the input types of {class}`mpol.fourier.NuFFT` and {class}`mpol.fourier.NuFFTCached` to expect {class}`torch.Tensor`s (removed support for numpy arrays). This simplifies the internal logic of the routines and will make most operations run faster.
- Changed {class}`mpol.fourier.make_fake_data` -> {class}`mpol.fourier.generate_fake_data`.
- Changed the base spatial frequency unit from k$\lambda$ to $\lambda$, closing issue [#223](https://github.com/MPoL-dev/MPoL/issues/223) and simplifying the internals of the codebase in numerous places. The following routines now expect inputs in units of $\lambda$ (a brief conversion sketch follows this changelog list):
- {class}`mpol.coordinates.GridCoords`
- {class}`mpol.coordinates.check_data_fit`
- {class}`mpol.datasets.GriddedDataset`
- {class}`mpol.fourier.NuFFT.forward`
- {class}`mpol.fourier.NuFFTCached`
- {class}`mpol.gridding.verify_no_hermitian_pairs`
- {class}`mpol.gridding.GridderBase`
- {class}`mpol.gridding.DataAverager`
- {class}`mpol.gridding.DirtyImager`
- Major documentation edits to be more concise with the objective of making the core package easier to develop and maintain. Some tutorials moved to the [MPoL-dev/examples](https://github.com/MPoL-dev/examples) repository.
- Added the {meth}`mpol.losses.neg_log_likelihood_avg` method to be used in point-estimate or optimization situations where data amplitudes or weights may be adjusted as part of the optimization (such as via self-calibration). Moved all documentation around loss functions into the [Losses API](api/losses.md).
- Renamed `mpol.losses.nll` -> {meth}`mpol.losses.r_chi_squared` and `mpol.losses.nll_gridded` -> {meth}`mpol.losses.r_chi_squared_gridded` because that is what those routines were previously calculating (see the {ref}`api-reference-label` for more details). ([#237](https://github.com/MPoL-dev/MPoL/issues/237)). Tutorials have also been updated to reflect the change.
- Renamed `mpol.losses.nll` -> {meth}`mpol.losses.r_chi_squared` and `mpol.losses.nll_gridded` -> {meth}`mpol.losses.r_chi_squared_gridded` because that is what those routines were previously calculating. ([#237](https://github.com/MPoL-dev/MPoL/issues/237)). Tutorials have also been updated to reflect the change.
- Fixed implementation and docstring of {meth}`mpol.losses.log_likelihood` ([#237](https://github.com/MPoL-dev/MPoL/issues/237)).
- Made some progress converting docstrings from "Google" style format to "NumPy" style format. Ian is now convinced that NumPy style format is more readable for the type of docstrings we write in MPoL. We usually require long type definitions and long argument descriptions, and the extra indentation required for Google makes these very scrunched.
- Made the `passthrough` behaviour of {class}`mpol.images.ImageCube` the default and removed this parameter entirely. Previously, it was possible to have {class}`mpol.images.ImageCube` act as a layer with `nn.Parameter`s. This functionality has effectively been replaced since the introduction of {class}`mpol.images.BaseCube`, which provides a more useful way to parameterize pixel values. If a one-to-one mapping (including negative pixels) from `nn.Parameter`s to the output tensor is desired, then one can specify `pixel_mapping=lambda x : x` when instantiating {class}`mpol.images.BaseCube` (a brief sketch follows this list). More details in [#246](https://github.com/MPoL-dev/MPoL/issues/246).
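
For the `passthrough`/`BaseCube` change in the bullet above, here is a minimal sketch of the identity pixel mapping it describes. The `cell_size` and `npix` values are illustrative only.

```python
from mpol import coordinates, images

coords = coordinates.GridCoords(cell_size=0.005, npix=800)

# Identity pixel mapping: nn.Parameters map one-to-one to output pixel values,
# so negative pixels are allowed (the default mapping enforces positivity).
bcube = images.BaseCube(coords=coords, nchan=1, pixel_mapping=lambda x: x)
packed_cube = bcube()  # forward pass: parameters -> packed image cube
```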
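
For the k$\lambda$ to $\lambda$ change referenced above, a minimal sketch of the data-reading update is shown below. It assumes the baselines were previously saved in k$\lambda$; the file name and array keys are illustrative, not part of the MPoL API.

```python
import numpy as np

# Hypothetical .npz file whose baselines were saved in kilolambda (the old convention).
d = np.load("my_baselines.npz")
uu = d["uu"] * 1e3  # klambda -> lambda
vv = d["vv"] * 1e3  # klambda -> lambda

# uu and vv can now be passed directly to the routines listed above
# (GridCoords, DataAverager, DirtyImager, NuFFT.forward, ...), which expect lambda.
```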
@@ -28,7 +49,7 @@
- TOML does not support adding keyed entries, so creating layered build environments of default, `docs`, `test`, and `dev` as we used to with `setup.py` is laborious and repetitive with `pyproject.toml`. We have simplified the list to be default (key dependencies), `test` (minimal necessary for test-suite), and `dev` (covering everything needed to build the docs and actively develop the package).
- Removed custom `spheroidal_gridding` routines, tests, and the `UVDataset` object that used them. These have been superseded by the TorchKbNuFFT package. For reference, the old routines (including the tricky `corrfun` math) are preserved in a Gist [here](https://gist.github.com/iancze/f3d2769005a9e2c6731ee6977f166a83).
- Changed API of {class}`~mpol.fourier.NuFFT`. The previous signature took `uu` and `vv` points at initialization (`__init__`), and the `.forward` method took only an image cube. This behaviour is preserved in a new class {class}`~mpol.fourier.NuFFTCached`. The updated signature of {class}`~mpol.fourier.NuFFT` *does not* take `uu` and `vv` at initialization. Rather, its `forward` method is modified to take an image cube and the `uu` and `vv` points. This allows an instance of this class to be used with new `uu` and `vv` points in each forward call. This follows the standard expectation of a layer (e.g., a linear regression function predicting at new `x`) and the pattern of the TorchKbNuFFT package itself. It is expected that the new `NuFFT` will be the default routine and `NuFFTCached` will only be used in specialized circumstances (and possibly deprecated/removed in future updates). A brief sketch of the new call pattern follows below. Changes implemented by [#232](https://github.com/MPoL-dev/MPoL/pull/232).
- Moved "Releasing a new version of MPoL" from the wiki to the Developer Documentation ({ref}`releasing-new-version-label`).
- Moved "Releasing a new version of MPoL" from the wiki to the Developer Documentation on the main docs.
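
A minimal sketch of the new `NuFFT` call pattern described in the bullet above. The argument order of `forward` (image cube, then `uu`, `vv`) is assumed from the changelog description, and the shapes and values are purely illustrative.

```python
import torch
from mpol import coordinates, fourier

coords = coordinates.GridCoords(cell_size=0.005, npix=800)
nufft = fourier.NuFFT(coords=coords, nchan=1)  # no uu/vv at initialization

packed_cube = torch.rand(1, 800, 800)  # stand-in for an image cube
uu = torch.rand(10_000) * 5e5          # baselines in lambda
vv = torch.rand(10_000) * 5e5

# Baselines are supplied per call, so the same layer can predict
# visibilities at new uu/vv points each forward pass.
vis = nufft(packed_cube, uu, vv)
```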

## v0.2.0

@@ -38,7 +59,7 @@
- Reorganized some of the docs API
- Expanded discussion and demonstration in `optimzation.md` tutorial
- Localized hardcoded Zenodo record reference to a single instance, and created new external Zenodo record from which to draw
- Added [Parametric inference with Pyro tutorial](large-tutorials/pyro.md)
- Added Parametric inference with Pyro tutorial
- Updated some discussion and notation in `rml_intro.md` tutorial
- Added `mypy` static type checks
- Added `frank` as a 'test' and 'analysis' extras dependency
6 changes: 2 additions & 4 deletions docs/ci-tutorials/crossvalidation.md
@@ -292,7 +292,7 @@ def cross_validate(config):
for k_fold, (train_dset, test_dset) in enumerate(k_fold_datasets):

# create a new model and optimizer for this k_fold
rml = precomposed.SimpleNet(coords=coords, nchan=train_dset.nchan)
rml = precomposed.GriddedNet(coords=coords, nchan=train_dset.nchan)
optimizer = torch.optim.Adam(rml.parameters(), lr=config["lr"])

# train for a while
@@ -310,7 +310,7 @@ Finally, we'll write one more function to train the model using the full dataset

```{code-cell}
def train_and_image(pars):
rml = precomposed.SimpleNet(coords=coords, nchan=dset.nchan)
rml = precomposed.GriddedNet(coords=coords, nchan=dset.nchan)
optimizer = torch.optim.Adam(rml.parameters(), lr=pars["lr"])
writer = SummaryWriter()
train(rml, dset, pars, optimizer, writer=writer)
@@ -324,8 +324,6 @@ def train_and_image(pars):
return fig, ax
```

All of the method presented here can be sped up using GPU acceleration on certain Nvidia GPUs. To learn more about this, please see the {ref}`GPU Setup Tutorial <gpu-reference-label>`.

+++

## Results
21 changes: 9 additions & 12 deletions docs/ci-tutorials/fakedata.md
@@ -288,9 +288,9 @@ fname = download_file(
# select the components for a single channel
chan = 4
d = np.load(fname)
uu = d["uu"][chan]
vv = d["vv"][chan]
weight = d["weight"][chan]
uu = torch.as_tensor(d["uu"][chan])
vv = torch.as_tensor(d["vv"][chan])
weight = torch.as_tensor(d["weight"][chan])
```

MPoL has a helper routine to calculate the maximum `cell_size` that can still Nyquist sample the highest spatial frequency in the baseline distribution.
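
As a rough cross-check on that helper routine, a minimal manual version of the same Nyquist criterion is sketched below, assuming `uu` and `vv` are the tensors loaded above (in units of $\lambda$); the factor 206265 converts radians to arcseconds.

```python
import torch

# Nyquist criterion: cell_size [radians] < 1 / (2 * max spatial frequency [lambda])
max_freq = torch.max(torch.abs(torch.cat([uu, vv])))  # lambda
max_cell_size = 1 / (2 * max_freq) * 206265.0          # radians -> arcsec
print(max_cell_size, "arcsec")
```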
@@ -306,12 +306,12 @@

## Making the mock dataset

With the {class}`~mpol.images.ImageCube`, $u,v$ and weight distributions now in hand, generating the mock visibilities is relatively straightforward using the {func}`mpol.fourier.make_fake_data` routine. This routine uses the {class}`~mpol.fourier.NuFFT` to produce loose visibilities at the $u,v$ locations and then adds random Gaussian noise to the visibilities, drawn from a probability distribution set by the value of the weights.
With the {class}`~mpol.images.ImageCube`, $u,v$ and weight distributions now in hand, generating the mock visibilities is relatively straightforward using the {func}`mpol.fourier.generate_fake_data` routine. This routine uses the {class}`~mpol.fourier.NuFFT` to produce loose visibilities at the $u,v$ locations and then adds random Gaussian noise to the visibilities, drawn from a probability distribution set by the value of the weights.

```{code-cell} ipython3
from mpol import fourier
# will have the same shape as the uu, vv, and weight inputs
data_noise, data_noiseless = fourier.make_fake_data(image, uu, vv, weight)
data_noise, data_noiseless = fourier.generate_fake_data(img_tensor_packed, coords, uu, vv, weight)

print(data_noise.shape)
print(data_noiseless.shape)
@@ -337,22 +337,19 @@ To make sure the whole process worked OK, we'll load the visibilities and then m
```{code-cell} ipython3
from mpol import coordinates, gridding

# well set the
coords = coordinates.GridCoords(cell_size=cell_size, npix=npix)

imager = gridding.DirtyImager(
imager = gridding.DirtyImager.from_tensors(
coords=coords,
uu=uu,
vv=vv,
weight=weight,
data_re=np.squeeze(np.real(data)),
data_im=np.squeeze(np.imag(data)),
)
data=data)
```

```{code-cell} ipython3
C = 1 / np.sum(weight)
noise_estimate = C * np.sqrt(np.sum(weight))
C = 1 / torch.sum(weight)
noise_estimate = C * torch.sqrt(torch.sum(weight))
print(noise_estimate, "Jy / dirty beam")
```

2 changes: 1 addition & 1 deletion docs/ci-tutorials/gridder.md
@@ -59,7 +59,7 @@ data_im = np.imag(data)

## Plotting the data

Following some of the exercises in the [visread documentation](https://mpol-dev.github.io/visread/tutorials/introduction_to_casatools.html), let's plot up the baseline distribution and get a rough look at the raw visibilities. For more information on these data types, we recommend you read the [Introduction to RML Imaging](../rml_intro.md).
Following some of the exercises in the [visread documentation](https://mpol-dev.github.io/visread/tutorials/introduction_to_casatools.html), let's plot up the baseline distribution and get a rough look at the raw visibilities.

Note that the `uu`, `vv`, `weight`, `data_re`, and `data_im` arrays are all two-dimensional numpy arrays of shape `(nchan, nvis)`. This is because MPoL has the capacity to image spectral line observations. MPoL will absolutely still work with single-channel continuum data; you will just need to work with 2D arrays of shape `(1, nvis)`.
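
If your continuum data are stored as 1D arrays of shape `(nvis,)`, a minimal sketch of the promotion to `(1, nvis)` is shown below; the variable names follow this tutorial, and whether you need this step depends on how your data were saved.

```python
import numpy as np

# Promote 1D continuum arrays of shape (nvis,) to the expected shape (1, nvis).
uu = np.atleast_2d(uu)
vv = np.atleast_2d(vv)
weight = np.atleast_2d(weight)
data_re = np.atleast_2d(data_re)
data_im = np.atleast_2d(data_im)
```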

4 changes: 2 additions & 2 deletions docs/ci-tutorials/initializedirtyimage.md
@@ -115,7 +115,7 @@ Here we set the optimizer and the image model (RML). If this is unfamiliar pleas

```{code-cell}
dirty_image = torch.tensor(img.copy()) # turns it into a pytorch tensor
rml = precomposed.SimpleNet(coords=coords, nchan=dset.nchan)
rml = precomposed.GriddedNet(coords=coords, nchan=dset.nchan)
optimizer = torch.optim.SGD(
rml.parameters(), lr=1000.0
) # multiple different possible optimizers
@@ -205,7 +205,7 @@ For more information on saving and loading models in PyTorch, please consult the
Now let's assume we're about to start an optimization loop in a new file, and we've just created a new model.

```{code-cell}
rml = precomposed.SimpleNet(coords=coords)
rml = precomposed.GriddedNet(coords=coords)
rml.state_dict() # the now uninitialized parameters of the model (the ones we started with)
```
