
place fourier adjoint sources collectively for each frequency #1821

Merged · 8 commits · Feb 10, 2022

Conversation

@mawc2019 (Contributor) commented Nov 8, 2021

The main features of this PR are

  1. The Fourier adjoint sources at each frequency are placed collectively, while in the current master branch the adjoint sources are placed point by point.
  2. The Fourier adjoint solver takes yee_grid as a user-specified argument, which can be True or False; the default is False (a usage sketch follows below).
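
A minimal usage sketch of the new argument (the simulation cell and monitor geometry below are placeholder values, not part of this PR):

    import meep as mp
    import meep.adjoint as mpa

    sim = mp.Simulation(cell_size=mp.Vector3(2, 4), resolution=20)

    # FourierFields objective with the DFT monitor registered on the Yee grid;
    # omitting yee_grid (or passing yee_grid=False) gives the default behavior
    nf = mpa.FourierFields(
        sim,
        mp.Volume(center=mp.Vector3(0, 1), size=mp.Vector3(1, 0)),
        mp.Ez,
        yee_grid=True,
    )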

This PR passes all the built-in tests except test_adjoint_jax.py, test_binary_partition_utils.py, and test_chunk_balancer.py; the current master branch also fails these tests.

@smartalecH (Collaborator)

How does this PR relate to #1547? Does that PR need to work/be merged first? Are you locally testing this PR on top of that PR?

@mawc2019 (Contributor, Author) commented Nov 9, 2021

How does this PR relate to #1547?

The PR in #1547 could improve the accuracy of this PR in some circumstances.

Does that PR need to work/be merged first?

No. This PR can work/be merged without the PR in #1547, but the PR in #1547 needs this PR to work/be merged first.

Are you locally testing this PR on top of that PR?

This PR has been tested separately from that PR. I also combined the two PRs, but the combined PR does not work at present, as described in #1547.

@smartalecH (Collaborator)

This PR passes all the built-in tests except test_adjoint_jax.py, test_binary_partition_utils.py, and test_chunk_balancer.py; the current master branch also fails these tests.

It looks like it's passing all tests online? Perhaps you need to rebase your local build to ensure they pass locally too.

@mawc2019 (Contributor, Author) commented Nov 9, 2021

It looks like it's passing all tests online? Perhaps you need to rebase your local build to ensure they pass locally too.

I ran the built-in tests only locally.

Comment on lines 251 to 254
thickness_indicator = self.sim.fields.indicate_thick_dims(self.volume.swigobj)
min_max_corners = self.sim.fields.get_corners(self._monitor.swigobj, self.component) # ivec values of the corners
raw_size = (min_max_corners[3:6]-min_max_corners[0:3])/2+np.ones(3)
true_size = np.array([raw_size[i]*(thickness_indicator[i]==1)+(thickness_indicator[i]==0) for i in range(3)],dtype=int) # monitor size
Collaborator:

It would probably be a good idea to refactor this piece into an external function call since we'll be using it in some of the other objective quantities too (eventually).

Something similar to ObjectiveQuantity._adj_src_scale(). Maybe ObjectiveQuantity._get_dft_src_shape()?

Contributor (Author):

Perhaps a name like _dft_monitor_size sounds more pertinent. The function would be

    def _dft_monitor_size(self, min_max_corners):
        # per-dimension flag indicating which dimensions of the volume have nonzero extent
        thickness_indicator = self.sim.fields.indicate_thick_dims(self.volume.swigobj)
        # convert the min/max ivec corners into a per-dimension extent in grid points
        raw_size = (min_max_corners[3:6] - min_max_corners[0:3]) / 2 + np.ones(3)
        # keep the extent along thick dimensions; collapse thin dimensions to size 1
        true_size = np.array([raw_size[i]*(thickness_indicator[i] == 1) + (thickness_indicator[i] == 0)
                              for i in range(3)], dtype=int)
        return true_size

Line 252, which computes min_max_corners, is not encapsulated here, because min_max_corners is needed by fourier_sourcedata later.
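
For reference, a sketch of how the helper would be used together with line 252 (names taken from the snippet above):

    # line 252 stays outside the helper so that min_max_corners
    # remains available for fourier_sourcedata afterwards
    min_max_corners = self.sim.fields.get_corners(self._monitor.swigobj, self.component)
    true_size = self._dft_monitor_size(min_max_corners)  # monitor size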

(Resolved review threads on python/adjoint/objective.py and src/dft.cpp.)
@codecov-commenter commented Nov 12, 2021

Codecov Report

Attention: Patch coverage is 78.26087% with 5 lines in your changes missing coverage. Please review.

Project coverage is 73.09%. Comparing base (61fae68) to head (5a39da6).
Report is 290 commits behind head on master.

Files with missing lines                  Patch %   Lines
python/adjoint/optimization_problem.py    0.00%     4 Missing ⚠️
python/adjoint/objective.py               94.73%    1 Missing ⚠️


Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1821      +/-   ##
==========================================
- Coverage   73.14%   73.09%   -0.05%     
==========================================
  Files          17       17              
  Lines        4923     4918       -5     
==========================================
- Hits         3601     3595       -6     
- Misses       1322     1323       +1     
Files with missing lines                  Coverage Δ
python/adjoint/objective.py               92.07% <94.73%> (+0.26%) ⬆️
python/adjoint/optimization_problem.py    56.18% <0.00%> (-0.59%) ⬇️

@smartalecH (Collaborator)

The failing test is the same failing test described here.

For reference, here are the values comparing the adjoint gradients with the finite differences:

Directional derivative -- adjoint solver: [-4.94904819e-06 -3.02577147e-04 -2.97732191e-04], FD: [-4.75876167e-07 -2.98877141e-04 -3.12129430e-04]

The finite-difference gradients seem to be off (again).

@mawc2019 (Contributor, Author)

The finite-difference gradients seem to be off (again).

It seems that this issue has been resolved. Should I rebuild this PR on the current master branch and upload again?

@smartalecH (Collaborator)

It seems that this issue has been resolved. Should I rebuild this PR on the current master branch and upload again?

I retriggered the tests, but they are still failing.

It's possible that the fix implemented in #1780 wasn't sufficient to resolve the issue with the new DFT adjoint sources. In that case, try building with single precision locally and play with the finite-difference tolerance to see if that really is the issue (as described above). Make sure you rebase onto master when building locally.

@mawc2019 (Contributor, Author) commented Nov 30, 2021

I rebased this PR onto the current master branch locally and built with --enable-single. The error comes from self.assertClose when it is invoked in test_damping of test_adjoint_solver.py. The problematic gradients are

Directional derivative -- adjoint solver: [1.13276692e-05 1.59813860e-05 1.32704957e-05], FD: [1.19334179e-05 1.85025667e-05 1.42896045e-05]

The tolerance at single precision is 0.06; if it is increased to 0.14, the test no longer fails.

For comparison, I also compiled this PR without --enable-single. The adjoint and finite-difference gradients there are

Directional derivative -- adjoint solver: [1.13276534e-05 1.59813124e-05 1.32704613e-05], FD: [1.17366255e-05 1.59434964e-05 1.29082464e-05]

which pass the test.

In both cases, Python 3.7.3 was used. The finite-difference gradients still seem to be off.

@smartalecH (Collaborator)

(@oskooi I retriggered the tests after merging #1835 -- unfortunately, the tolerance adjustments in that PR were not enough to make this PR pass).

@smartalecH (Collaborator) commented Dec 1, 2021

The finite-difference gradients still seem to be off.

The finite-difference gradients are a bit flaky, particularly when single-precision is enabled and the new subpixel smoothing routine is turned on.

One issue we discussed was that the new default for the MaterialGrid subpixel-smoothing routine is do_smoothing=True. This is a bit problematic if the underlying densities are not filtered (which appears to be the case in the failing test).

Maybe try filtering the "random" design field before running the finite-difference (and corresponding adjoint) simulations. In other words, change the line

S12_unperturbed = forward_simulation_damping(p, frequencies)

to something like

S12_unperturbed = forward_simulation_damping(mpa.conic_filter(p,<<put parameters here...>>), frequencies) 

and the same for the second part of the finite-difference calculation:

S12_perturbed = forward_simulation_damping(p+dp, frequencies)

Some other things we can try:

  • Do a center-difference calculation (a sketch follows this list)
  • Do some Richardson extrapolation
  • Don't use a random initial design; use an initial design that makes sense (e.g., a ring or something)
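
A sketch of the center-difference idea, reusing the helpers already defined in the test (p, dp, frequencies, and forward_simulation_damping):

    ## centered difference: compared against dp @ adjoint_gradient just like the
    ## existing forward difference, but with O(|dp|^2) instead of O(|dp|) error
    S12_plus = forward_simulation_damping(p + dp, frequencies)
    S12_minus = forward_simulation_damping(p - dp, frequencies)
    fd_grad_centered = (S12_plus - S12_minus) / 2

Richardson extrapolation (the second item) would then combine two such centered differences with step sizes dp and dp/2.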

@mawc2019 (Contributor, Author) commented Dec 4, 2021

The failing check at test_damping was not caused by this PR, because test_damping only involves the EigenmodeCoefficient adjoint solver. In fact, the previous master branch, which was released this Tuesday morning, did not pass this test either when compiled with --enable-single.

Maybe try filtering the "random" design field before running the finite-difference (and corresponding adjoint) simulations.

If we filter the design field in test_damping as follows, the check succeeds on the previous master branch as well as on this PR rebased onto it.

    def test_damping(self):
        print("*** TESTING CONDUCTIVITIES ***")

        for frequencies in [[1/1.58, fcen, 1/1.53]]:
            ## filter the design field
            filter_radius = 0.12
            p_filtered = mpa.conic_filter(
                np.reshape(p,(Nx,Ny)),filter_radius,design_region_size.x,design_region_size.y,design_region_resolution).flatten()
            p_plus_dp_filtered = mpa.conic_filter(
                np.reshape(p+dp,(Nx,Ny)),filter_radius,design_region_size.x,design_region_size.y,design_region_resolution).flatten()
            
            ## compute gradient using adjoint solver
            adjsol_obj, adjsol_grad = adjoint_solver_damping(p_filtered, frequencies)

            ## compute unperturbed S12
            S12_unperturbed = forward_simulation_damping(p_filtered, frequencies)

            ## compare objective results
            print("S12 -- adjoint solver: {}, traditional simulation: {}".format(adjsol_obj,S12_unperturbed))
            self.assertClose(adjsol_obj,S12_unperturbed,epsilon=1e-6)

            ## compute perturbed S12
            S12_perturbed = forward_simulation_damping(p_plus_dp_filtered, frequencies)

            ## compare gradients
            if adjsol_grad.ndim < 2:
                adjsol_grad = np.expand_dims(adjsol_grad,axis=1)
            adj_scale = (dp[None,:]@adjsol_grad).flatten()
            fd_grad = S12_perturbed-S12_unperturbed
            print("Directional derivative -- adjoint solver: {}, FD: {}".format(adj_scale,fd_grad))
            tol = 0.06 if mp.is_single_precision() else 0.03
            self.assertClose(adj_scale,fd_grad,epsilon=tol)

However, doing so leads to a failing check for double precision. Moreover, the current master branch also fails the checks in test_adjoint_solver.py, which may not have the same cause.

@smartalecH (Collaborator)

In fact, the previous master branch, which was released this Tuesday morning, did not pass this test either when compiled with --enable-single.

Hmm, I'm not sure I follow. The current master branch appears to be passing all tests (including the recent addition you mention). Are you saying that when you build master on your local machine (e.g. with --enable-single) that test fails for you?

Moreover, the current master branch also fails the checks in test_adjoint_solver.py, which may not have the same cause.

Can you be more specific? Again, master seems to be passing (and is passing for me locally).

@mawc2019 (Contributor, Author) commented Dec 7, 2021

Are you saying that when you build master on your local machine (e.g. with --enable-single) that test fails for you?

Yes, I built the master branch on a local machine with --enable-single, using Python 3.7.3. The tests fail in test_adjoint_solver.py and test_get_epsilon_grid.py, as I also mentioned in #1835. These failures already existed in a previous master branch; in other words, they are not caused by the most recent commit.

which may not have the same cause.

I need to correct this statement: the cause should be the same, because the error message from the failing check in test_adjoint_solver.py is unchanged across the recent versions of the master branch.

The failing check at test_damping was not caused by this PR, because test_damping only involves the EigenmodeCoefficient adjoint solver.

This failing check in test_adjoint_solver.py was caused by the commit on November 22nd: before that commit the check passed, and after it the check fails. The single-precision test failure was also mentioned in #1830.

@smartalecH (Collaborator)

Even after the force push it still seems to be failing. Maybe take a closer look at exactly what is failing, as it is most likely due to something new in this PR (and not an out-of-date conflicting commit).

(Note that you can download the Python logs from the CI to see exactly where things are failing.)

@mawc2019 (Contributor, Author) commented Dec 8, 2021

In the log, I saw that the failure happened in test_adjoint_solver.py during make distcheck, after configuring with single-precision floating point and SWIG threads. Can I get the details of that failure in test_adjoint_solver.py? There should be a test_adjoint_solver.log online.

@smartalecH (Collaborator) commented Dec 8, 2021

Yes, the details are here.

Note that the DFT tests are failing:

<<line 98>> Directional derivative -- adjoint solver: [-131.11312903], FD: [-0.00342687]

Also, the fields are blowing up during the complex EigenmodeCoefficient tests... Does your code somehow impact those at all?

@mawc2019 (Contributor, Author) commented Dec 8, 2021

Note that the DFT tests are failing

The version of Python seems to matter a lot. I used Python 3.7.3 locally, and there were no failing tests of the FourierFields adjoint solver.

Also, the fields are blowing up during the complex EigenmodeCoefficient tests...

It seems that the diverging field in the tests of complex fields is also caused by the FourierFields adjoint solver, and adjoint_solver_complex_fields only invokes the FourierFields adjoint solver.

Does your code somehow impact those at all?

I do not think this PR impacts EigenmodeCoefficient.

@smartalecH (Collaborator)

It seems that the diverging field in the tests of complex fields is also caused by the FourierFields adjoint solver, and adjoint_solver_complex_fields only invokes the FourierFields adjoint solver.

You're right, I misread the log. The eigenmode tests seem to be unaffected by this (as expected).

The version of Python seems to matter a lot

Are you sure it's the Python version and not the single-precision that's causing the issue? It seems odd that the python version would affect anything in this PR... It's especially odd that the adjoint run for the complex case blows up...

@mawc2019 (Contributor, Author) commented Dec 8, 2021

Are you sure it's the Python version and not the single-precision that's causing the issue?

I am not sure. I suspect the Python version matters only because those FourierFields failures do not appear locally, whether I compile Meep with or without --enable-single. The only failure I have seen locally comes from test_damping, which invokes EigenmodeCoefficient only. If the Python version does not matter, perhaps there are some differences between the local and online tests?

It's especially odd that the adjoint run for the complex case blows up...

Diverging results also appeared previously, as mentioned in #1676. In one example there, the problem also appears without using the adjoint solver. I do not know whether these problems have the same cause.

@smartalecH (Collaborator)

I am not sure. I suspect the Python version matters only because those FourierFields failures do not appear locally, whether I compile Meep with or without --enable-single. The only failure I have seen locally comes from test_damping, which invokes EigenmodeCoefficient only. If the Python version does not matter, perhaps there are some differences between the local and online tests?

Hmm, I'm guessing your local branch doesn't line up with this PR+master.

@mawc2019 (Contributor, Author) commented Dec 9, 2021

I'm guessing your local branch doesn't line up with this PR+master.

I just tested this PR locally again, rebased onto the current master branch. The check now fails at the same place as the most recent online check. By the way, I also checked the current master branch locally with single precision again, and it still fails test_damping.

@mawc2019 (Contributor, Author)

In addition to the diverging fields and the inconsistency between the gradients, the adjoint gradients may differ from run to run. These problems happen only when this PR is compiled with --enable-single; the Near2Far adjoint solver does not have these problems even when compiled with --enable-single.

The precision issue has already been handled in get_gradient, but are there any other places in this PR that still need to handle it? In addition, this PR does not introduce any randomness explicitly. Where could the randomness in the adjoint gradients come from?

@smartalecH (Collaborator)

Can you provide a MWE that illustrates the problem? Something simpler than the entire test suite.

I can then pull your branch and look into it locally.

@mawc2019 (Contributor, Author)

An example is as follows.

import meep as mp
import meep.adjoint as mpa
import numpy as np
from autograd import numpy as npa
Si, SiO2 = mp.Medium(index=3.4), mp.Medium(index=1.44)
print("Single precision? ",mp.is_single_precision())

resolution,design_resolution = 20,20
Lx,Ly,pml_size = 2,4,1
cell_size = mp.Vector3(Lx,Ly+2*pml_size)
pml_layers = [mp.PML(pml_size, direction=mp.Y)]
design_width,design_height = 1,0.6
monitor_width,monitor_height = 1,design_height+0.1

fcen,fwidth = 1/1.55,0.01/1.55
source_center,source_size = mp.Vector3(0,-Ly/4),mp.Vector3(Lx,0)

src = mp.GaussianSource(frequency=fcen,fwidth=fwidth)
source = [mp.Source(src,component=mp.Ez,center=source_center,size=source_size)]
Nx,Ny = int(round(design_resolution*design_width)),1

design_variables = mp.MaterialGrid(mp.Vector3(Nx,Ny),SiO2,Si,grid_type='U_MEAN')
design_region = mpa.DesignRegion(design_variables,volume=mp.Volume(center=mp.Vector3(), size=mp.Vector3(design_width, design_height)))

geometry = [mp.Block(center=design_region.center, size=design_region.size, material=design_variables)] # design region
sim = mp.Simulation(
    cell_size=cell_size,boundary_layers=pml_layers,geometry=geometry,
    sources=source,default_material=SiO2,resolution=resolution)

nf = mpa.FourierFields(sim,mp.Volume(center=mp.Vector3(0,monitor_height),size=mp.Vector3(monitor_width,0)),mp.Ez,yee_grid=True)
ob_list = [nf]

def J(near_data):
    return npa.abs(near_data[0,0])**2

opt = mpa.OptimizationProblem(
    simulation = sim,objective_functions = J,objective_arguments = ob_list,design_regions = [design_region],
    fcen=fcen,df = 0,nf = 1,decay_by = 1e-5)

x0 = 0.5*np.ones(Nx*Ny)
opt.update_design([x0])
f0, dJ_deps = opt()

db, choose = 1e-4, 5
np.random.seed(20211204)

g_discrete, idx = opt.calculate_fd_gradient(num_gradients=choose,db=db)
g_discrete = np.array(g_discrete).flatten()
g_adjoint = dJ_deps.flatten()

print("Randomly selected indices: ",idx)
print("Finite-difference gradient: ",g_discrete)
print("Adjoint gradient: ",g_adjoint[idx])

With double precision, the results are

Finite-difference gradient:  [206.65506121 -82.40772321  98.45018865   5.91237534  85.70581779]
Adjoint gradient:  [206.65504516 -82.38054598  98.43878076   5.91337638  85.71809238]

With single precision, the adjoint gradients vary among different runs and do not match the finite-difference gradients.
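
For a quick quantitative comparison, one could append a relative-error check to the script above (using the variables it already defines):

    rel_err = np.abs(g_adjoint[idx] - g_discrete) / np.abs(g_discrete)
    print("Maximum relative error: ", np.max(rel_err))  # a few 1e-4 for the double-precision run above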

@stevengj (Collaborator)

the adjoint run hangs at MPI_Allreduce in max_to_all, which is invoked in one of the new functions get_dims

This will happen if not all of the processes are calling max_to_all. Maybe add a printf("hello from process %d\n", my_rank()); to see which processes are calling fourier_sourcedata and hence get_dims?
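
For context, a minimal sketch of that debugging pattern (the wrapper name and the max_to_all overload here are assumptions; the point is that the reduction is collective):

    // Sketch: max_to_all wraps MPI_Allreduce, a collective call, so it deadlocks
    // unless every MPI process reaches it. Printing the rank on entry shows which
    // processes actually make it into get_dims.
    #include <cstdio>
    #include <meep.hpp>

    double get_dims_debug(double local_extent) {
      std::printf("hello from process %d\n", meep::my_rank()); // which ranks get here?
      return meep::max_to_all(local_extent);                   // all ranks must call this
    }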

(Resolved review thread on src/dft.cpp.)
frequencies. Because of linearity, we can sum across the
second frequency dimension to calculate a frequency
scale factor for each point (rather than a scale vector).
'''
Collaborator:

Keep this comment, it's helpful!

Collaborator:

@smartalecH, could you explain what is going on here? Why can you sum over frequency if the objective function is vector-valued?

(Resolved review threads on src/array_slice.cpp and src/dft.cpp.)
volume *where = &v; // use full volume of fields
bufsz = 0;
min_corner = gv.round_vec(where->get_max_corner()) + one_ivec(gv.dim);
max_corner = gv.round_vec(where->get_min_corner()) - one_ivec(gv.dim);
Collaborator:

You can just use v.get_max_corner() etcetera and eliminate where completely?
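
A sketch of the suggested simplification applied to the excerpt above:

    // use the full fields volume v directly instead of aliasing it through `where`
    bufsz = 0;
    min_corner = gv.round_vec(v.get_max_corner()) + one_ivec(gv.dim);
    max_corner = gv.round_vec(v.get_min_corner()) - one_ivec(gv.dim);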

(Additional review threads on src/dft.cpp.)
@stevengj merged commit f359a04 into NanoComp:master on Feb 10, 2022
@oskooi (Collaborator) commented Feb 11, 2022

Looks like this PR, particularly the contents of dft_fields::fourier_sourcedata, is causing failures in the address sanitizer related to the misuse of size_t (unsigned) and ptrdiff_t (signed).
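
For illustration only (generic C++, not the Meep code in question), this is the kind of pitfall the sanitizer flags when unsigned size_t arithmetic stands in for signed ptrdiff_t:

    #include <cstddef>
    #include <cstdio>

    int main() {
      std::size_t a = 2, b = 5;
      std::size_t bad = a - b;                                      // wraps to a huge unsigned value
      std::ptrdiff_t good = (std::ptrdiff_t)a - (std::ptrdiff_t)b;  // -3, as intended
      std::printf("%zu %td\n", bad, good);
      return 0;
    }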

@stevengj mentioned this pull request on Feb 11, 2022