evaluating the order of accuracy of the adjoint gradient #1854

Open · 2 tasks · oskooi opened this issue Dec 12, 2021 · 31 comments

@oskooi (Collaborator) commented Dec 12, 2021:

We have already demonstrated second-order accuracy for subpixel smoothing of the MaterialGrid in a "forward" simulation (see results in #1503), and #1780 added support for backpropagating the gradients from the Yee grid to the MaterialGrid. However, we have yet to validate the second-order accuracy of the gradients computed using an adjoint simulation.

ToDos

  • using subpixel smoothing, verify that the gradient via the directional derivative converges with second-order accuracy in the Meep grid resolution
  • using no smoothing, verify that the directional derivative converges with first-order accuracy

The following 2d test computes the directional derivative using the adjoint gradient over a design region consisting of a ring structure which connects an input and output port/waveguide. (This is a similar setup to the existing unit tests for the adjoint solver in which the design region contains random pixel values and the directional derivative is computed in a random direction.) The objective function is a single DFT field point in the output waveguide at a single frequency (the red dot in the schematic below). (We would also eventually like to adapt this test to the other two objective functions: mode coefficients and near-to-far fields.) The ring structure is a MaterialGrid at a fixed resolution of 640 pixels/μm while the Meep grid resolution is repeatedly doubled: 20, 40, 80, and 160 pixels/μm. The directional derivative should be converging to a fixed value with increasing resolution but this does not appear to be the case. The output from running the script below shows the resolution in the second column and directional derivative in the third column.

grad:, 20.0, -8.3007345310497
grad:, 40.0, -9.140322608657096
grad:, 80.0, -8.32203124442412
grad:, 160.0, -7.0622182341423985

Before we can validate the accuracy of the gradients, we need to ensure that the results are converging to a fixed value.

[figure: ring_matgrid_opt (schematic of the ring design region connecting the input/output waveguides; the red dot marks the DFT field point of the objective function)]

import meep as mp
import meep.adjoint as mpa
import numpy as np
from autograd import numpy as npa

silicon = mp.Medium(epsilon=12)

sxy = 5.0
cell_size = mp.Vector3(sxy,sxy,0)

dpml = 1.0
pml_xy = [mp.PML(thickness=dpml)]

eig_parity = mp.EVEN_Y + mp.ODD_Z

design_region_size = mp.Vector3(1.5,1.5)
design_region_resolution = 640
Nx = int(design_region_resolution*design_region_size.x) + 1
Ny = int(design_region_resolution*design_region_size.y) + 1

w = 1.0
waveguide_geometry = [mp.Block(material=silicon,
                               center=mp.Vector3(),
                               size=mp.Vector3(mp.inf,w,mp.inf))]

fcen = 1/1.55
df = 0.23*fcen
wvg_source = [mp.EigenModeSource(src=mp.GaussianSource(fcen,fwidth=df),
                                 center=mp.Vector3(-0.5*sxy+dpml,0),
                                 size=mp.Vector3(0,sxy-2*dpml),
                                 eig_parity=eig_parity)]

x = np.linspace(-0.5*design_region_size.x,0.5*design_region_size.x,Nx)
y = np.linspace(-0.5*design_region_size.y,0.5*design_region_size.y,Ny)
xv, yv = np.meshgrid(x,y)

rad = 0.538948295
wdt = 0.194838432
ring_weights = np.logical_and(np.sqrt(np.square(xv) + np.square(yv)) > rad,
                              np.sqrt(np.square(xv) + np.square(yv)) < rad+wdt,
                              dtype=np.double)
ring_weights = ring_weights.flatten()

def adjoint_solver(resolution):
    matgrid = mp.MaterialGrid(mp.Vector3(Nx,Ny),
                              mp.air,
                              silicon,
                              weights=np.ones((Nx,Ny)))

    matgrid_region = mpa.DesignRegion(matgrid,
                                      volume=mp.Volume(center=mp.Vector3(),
                                                       size=mp.Vector3(design_region_size.x,
                                                                       design_region_size.y,
                                                                       0)))

    matgrid_geometry = [mp.Block(center=matgrid_region.center,
                                 size=matgrid_region.size,
                                 material=matgrid)]

    geometry = waveguide_geometry + matgrid_geometry

    sim = mp.Simulation(resolution=resolution,
                        cell_size=cell_size,
                        boundary_layers=pml_xy,
                        sources=wvg_source,
                        geometry=geometry)

    obj_list = [mpa.FourierFields(sim,
                                  mp.Volume(center=mp.Vector3(1.25),
                                            size=mp.Vector3()),
                                  mp.Ez)]

    def J(mode_mon):
        return npa.power(npa.abs(mode_mon),2)

    opt = mpa.OptimizationProblem(
        simulation=sim,
        objective_functions=J,
        objective_arguments=obj_list,
        design_regions=[matgrid_region],
        frequencies=[fcen])

    f, dJ_du = opt([ring_weights])

    sim.reset_meep()

    return f, dJ_du



for r in [20., 40., 80., 160.]:
    adjsol_obj, adjsol_grad = adjoint_solver(r)
    print("grad:, {}, {}".format(r,np.sum(adjsol_grad)))
oskooi changed the title from "evaluating the error convergence of the gradient from the adjoint solver" to "evaluating the accuracy of the gradient from the adjoint solver" on Dec 12, 2021
@smartalecH (Collaborator) commented Dec 12, 2021:

Even though smoothing is "on", it's not going to work unless the input is smoothed and beta=np.inf.
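
A minimal sketch of this suggestion, reusing the names from the script above (the 16-pixel filter radius is an illustrative choice, not from the original comment): smooth the ring weights with a conic filter before flattening, and turn on projection inside the MaterialGrid.

filter_radius = 16/design_region_resolution  # illustrative choice (units of um)

ring_mask = np.logical_and(np.sqrt(np.square(xv) + np.square(yv)) > rad,
                           np.sqrt(np.square(xv) + np.square(yv)) < rad+wdt,
                           dtype=np.double)

# smooth the binary ring mask before passing it to the MaterialGrid
ring_weights = mpa.conic_filter(ring_mask,
                                filter_radius,
                                design_region_size.x,
                                design_region_size.y,
                                design_region_resolution).flatten()

matgrid = mp.MaterialGrid(mp.Vector3(Nx,Ny),
                          mp.air,
                          silicon,
                          weights=np.ones((Nx,Ny)),
                          do_averaging=True,  # subpixel smoothing
                          beta=np.inf)        # hard projection of the smoothed weights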

@ianwilliamson (Contributor) commented:

It looks like your design resolution (set to 640?) is much higher than even your highest simulation resolution (160). Is this intended? I thought the correct approach was to use a comparable or lower design resolution to ensure a slowly varying latent design over each Yee grid cell?

One issue with this approach is that the bilinear interpolation from a high-resolution design array to a relatively lower-resolution Yee grid will lead to a very sparse gradient, where many of the gradient values are zero (by definition of the bilinear interpolation). The locations of the pixels which are non-zero will probably shift significantly with changes to the Yee grid resolution, which could partially explain the behavior you're seeing. It would be useful to plot the gradient field to check this.
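
A short sketch of such a plot, assuming the adjoint_solver, Nx, and Ny definitions from the script above (the reshape assumes a single frequency, for which dJ_du has length Nx*Ny):

import matplotlib
matplotlib.use("agg")
import matplotlib.pyplot as plt

_, adjsol_grad = adjoint_solver(40.)

# for a single frequency, the gradient has one value per MaterialGrid pixel
grad_map = adjsol_grad.reshape(Nx,Ny)

plt.figure()
plt.imshow(grad_map, cmap="RdBu", interpolation="none")
plt.colorbar(label="dJ/du")
plt.savefig("adjoint_gradient_map.png", dpi=150)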

oskooi changed the title from "evaluating the accuracy of the gradient from the adjoint solver" to "evaluating the accuracy of the adjoint gradient" on Dec 13, 2021
@oskooi (Collaborator, Author) commented Dec 13, 2021:

It looks like your design resolution (set to 640?) is much higher than even your highest simulation resolution (160). Is this intended?

This is intentional because we want to ensure that, as part of the convergence study, the MaterialGrid is always being down-sampled to Meep's Yee grid. Note that in this example, the inner ring radius is ~0.5 μm and the ring width is ~0.2 μm while the largest Meep grid voxel dimension is 0.05 μm (at the smallest resolution of 20 pixels/μm). This means the MaterialGrid geometry should be resolvable on even the coarsest Yee grid used in the convergence study.

The locations of the pixels which are non-zero will probably shift significantly with changes to the Yee grid resolution, which could partially explain the behavior you're seeing.

The gradient for this geometry is indeed sparse (only the values at the dielectric interfaces are non-zero) which could therefore cause large changes in the directional derivative with "small" changes in the Meep grid resolution. Will post results on this topic shortly.

@smartalecH I did try smoothing the ring geometry of the MaterialGrid and using the built-in projection feature with beta=np.inf. However, the directional derivative still does not appear to be converging to a fixed value with increasing Meep grid resolution.

@smartalecH (Collaborator) commented Dec 13, 2021:

I thought the correct approach was to use a comparable or lower design resolution to ensure a slowly varying latent design over each Yee grid cell?

What really matters is that the first-order Taylor approximation used to define the interfaces of the new subpixel smoothing routine is satisfied. The intended way to do this is to use a smoothing filter (in fact, @stevengj derived this approach with the assumption that the input variables would be smoothed, as the original adjoint solver was built on a density-based framework).

One issue with this approach is that the bilinear interpolation from a high-resolution design array to a relatively lower resolution Yee grid will lead to a very sparse gradient

All the more reason to first smooth using a filter. Backpropagating through this filter is just a transposed convolution, which will "smear" the gradient energy to all surrounding pixels so that the resulting gradient is no longer sparse. The filter will also "regularize" the problem (although that doesn't matter here).

The gradient for this geometry is indeed sparse (only the values at the dielectric interfaces are non-zero) which could therefore cause large changes in the directional derivative with "small" changes in the Meep grid resolution. Will post results on this topic shortly.

This is actually a different kind of sparsity than what @ianwilliamson is referring to. This kind of sparsity is a result of levelset "shape optimization" methods (i.e. when beta=np.inf), where only the boundaries contain gradient information.

I did try smoothing the ring geometry of the MaterialGrid and using the built-in projection feature with beta=np.inf. However, the directional derivative still does not appear to be converging to a fixed value with increasing Meep grid resolution.

@oskooi can you post the code and the results? I don't see any filtering or projection in the above script.

How are you computing the directional derivative? The above code is just computing the sum of the gradient. You're not scaling with resolution (i.e. doing a proper integral). In that case, we shouldn't expect the sum to converge to a finite value with increasing resolution (although the integral should converge...I think...). Edit: although if the design region resolution is consistent, the sum should also work.

But the directional derivative should converge, assuming you're acting in the same direction with increasing resolution (which means your design grid resolution must remain the same throughout all runs).
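
A sketch of a directional derivative computed along a fixed direction, reusing adjoint_solver, Nx, and Ny from the script above; the seeded random direction, together with the fixed design-grid resolution, keeps the direction identical across all Meep resolutions:

rng = np.random.RandomState(59387385)  # fixed seed so the direction never changes
dp = rng.rand(Nx*Ny)                   # direction vector in the design space

for r in [20., 40., 80., 160.]:
    _, adjsol_grad = adjoint_solver(r)
    dirderiv = np.dot(adjsol_grad.flatten(), dp)
    print("dirderiv:, {}, {}".format(r, dirderiv))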

@ianwilliamson (Contributor) commented:

This is intentional because we want to ensure that, as part of the convergence study, the MaterialGrid is always being down-sampled to Meep's Yee grid.

This is what I was trying to point out. By definition, using bilinear interpolation to downsample from high resolution to low resolution will always create "sparsity" in the gradient. This is the correct gradient for the operation being performed, but I'm not sure if it's what you want for the test you're creating.

What really matters is that the first-order Taylor approximation used to define the interfaces of the new subpixel smoothing routine is satisfied. The intended way to do this is using a smoothing filter (in fact, @stevengj derived this approach with the assumption the input variables would be smoothed, as the original adjoint solver was built on a density-based framework).

I think even with a filter applied, a sufficiently high resolution design array will still have sparsity in the gradient. It should depend on the ratio of the resolutions and the size of the filter kernel. If your filter doesn't cover enough pixels in the high resolution design array, then some design pixels will just never influence the simulation and, as a result, will have zero gradient. Just want to point out that having a filter doesn't guarantee the intended behavior.

oskooi changed the title from "evaluating the accuracy of the adjoint gradient" to "evaluating the order of accuracy of the adjoint gradient" on Dec 13, 2021
@oskooi (Collaborator, Author) commented Dec 14, 2021:

Here are some results for the adjoint gradient map obtained using subpixel smoothing (script below) for a fixed MaterialGrid resolution (160 pixels/μm) and different combinations of (1) Meep resolution (40 and 80 pixels/μm), (2) the tanh projection bias β (1.0 and 10.0), and (3) the radius of the conic filter (16 and 64 pixels).

Some comments:

  • increasing the filter radius does seem to mitigate the sparsity even when β is large (compare each inset in the two figures below)
  • the larger the down sampling (i.e., the ratio of the MaterialGrid resolution to the Meep grid resolution), the greater the sparsity
  • increasing β increases the overall sparsity of the gradient map by forcing the gradient to be non-zero mainly at or near the dielectric interfaces

Based on the first observation above, the next thing to check is to redo the convergence study with a larger filter radius.

1. filter radius: 16 pixels
[figure: adjoint_grading_vary_resolution_fradius16 (adjoint gradient maps, filter radius of 16 pixels)]

2. filter radius: 64 pixels
[figure: adjoint_grading_vary_resolution_fradius64 (adjoint gradient maps, filter radius of 64 pixels)]

The modified portions of the script used for subpixel smoothing are as follows:

weights = np.logical_and(np.sqrt(np.square(xv) + np.square(yv)) > rad,
                         np.sqrt(np.square(xv) + np.square(yv)) < rad+wdt,
                         dtype=np.double)

filter_radius = 16/design_region_resolution
filtered_weights = mpa.conic_filter(weights,
                                    filter_radius,
                                    design_region_size.x,
                                    design_region_size.y,
                                    design_region_resolution)

ring_weights = filtered_weights.flatten()

def adjoint_solver(resolution):
    matgrid = mp.MaterialGrid(mp.Vector3(Nx,Ny),
                              mp.air,
                              silicon,
                              weights=np.ones((Nx,Ny)),
                              do_averaging=True,
                              beta=np.inf)

    # ... (the design region, geometry, Simulation, and OptimizationProblem
    #      setup are unchanged from the original script) ...

    f, dJ_du = opt([ring_weights])

@oskooi (Collaborator, Author) commented Dec 14, 2021:

Quick update: increasing the filter radius from 16 pixels to 32 and then 64 produces results for the directional derivative which appear to be converging to a fixed value. However, as shown below, the convergence rate appears to be linear (or slightly greater) but not quadratic. It's likely that the convergence rate is indeed quadratic and we just need to extend the data out to larger resolutions to see the trend more clearly...

[figure: adj_grad_vs_res (directional derivative vs. Meep grid resolution)]

@smartalecH (Collaborator) commented:

Are you backpropagating the gradient through the filter? How are you computing the directional derivative?

@oskooi (Collaborator, Author) commented Dec 17, 2021:

Based on recent discussions, I modified the test above to remove the input and output waveguides (since the terminations contain sharp corners and thus introduce field discontinuities which can affect the results) so there is now just the ring structure. I extended the Meep grid resolution range to 20, 40, 80, 160, and 320 using a MaterialGrid with a fixed resolution of 1280. The "exact" result is computed at a Meep grid resolution of 640. The gradient computed on the MaterialGrid is backpropagated through the conic filter using the vJp. The directional derivative is the sum of all the values of the gradient map (equivalent to a direction vector of [1, 1, 1, ... ]). The full script is here.
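
A hedged sketch of the vJp step described above, using autograd's tensor_jacobian_product to backpropagate the adjoint gradient (defined on the MaterialGrid) through the conic filter; it reuses the names from the earlier scripts, assumes a single frequency, and the filter radius is as defined earlier:

from autograd import tensor_jacobian_product

def mapping(x):
    # conic-filter smoothing of the raw (flattened) design weights
    filtered = mpa.conic_filter(x.reshape(Nx,Ny),
                                filter_radius,
                                design_region_size.x,
                                design_region_size.y,
                                design_region_resolution)
    return filtered.flatten()

f, dJ_du = opt([mapping(ring_weights)])

# vJp: backpropagate the gradient on the MaterialGrid through the filter
bp_grad = tensor_jacobian_product(mapping, 0)(ring_weights, dJ_du)

# directional derivative in the direction [1, 1, 1, ...]
dirderiv = np.sum(bp_grad)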

The figure below shows the results for three different filter radii. The convergence rate is (nearly) second order. Note that increasing the filter radius does not seem to significantly change the results. For completeness, we can try to continue extending the resolution range even further but these results seem sufficient.

Next steps are (1) to demonstrate that reducing the filter radius to zero will eventually lead to a first-order convergence rate and (2) show results for no smoothing.

[figure: adj_grad_vs_res_desres1280_b (convergence of the directional derivative for three filter radii)]

@oskooi (Collaborator, Author) commented Dec 20, 2021:

As expected, gradually decreasing the filter radius toward zero causes the convergence rate to switch from quadratic (for large filter radii) to linear. This is shown in the figure below for the smallest filter radius of 2 pixels.

[figure: adj_grad_vs_res_desres1280 (convergence of the directional derivative for a filter radius of 2 pixels)]

@stevengj (Collaborator) commented:

Maybe a clearer way to present this would be to plot |adjoint gradient - FD gradient| vs. resolution, showing that the adjoint gradient always matches the forward solver.

Ideally, this error should be small regardless of the resolution. That way, if the forward solver is converging quadratically, the gradient necessarily is as well.
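
A hedged sketch of this comparison, building on the ring script from earlier in the thread; adjoint_solver(res) is the wrapper defined there, and forward_objective(res, weights) is a hypothetical helper (not in the original scripts) that runs only a forward simulation and returns the objective value:

rng = np.random.RandomState(9861548)
dp = 1e-5*rng.rand(Nx*Ny)  # fixed random perturbation of the design weights

for res in [25., 50., 100., 200.]:
    f0, adj_grad = adjoint_solver(res)              # forward + adjoint run
    f1 = forward_objective(res, ring_weights + dp)  # perturbed forward run (hypothetical helper)
    fd_dirderiv = f1 - f0                           # finite-difference directional derivative
    adj_dirderiv = np.dot(adj_grad.flatten(), dp)   # adjoint directional derivative
    rel_err = abs((fd_dirderiv - adj_dirderiv)/fd_dirderiv)
    print("deriv:, {}, {}, {}, {}".format(res, fd_dirderiv, adj_dirderiv, rel_err))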

@oskooi (Collaborator, Author) commented Jan 25, 2022:

Based on @stevengj's suggestion above, here are results comparing the adjoint gradient vs. the finite-difference gradient (via the directional derivative) as a function of the Meep grid resolution (25, 50, 100, and 200 pixels/μm) for a ring structure (fixed MaterialGrid resolution of 256 pixels/μm, smoothed using a conic filter with a radius of 20 pixels) with and without subpixel smoothing. The results were generated using this script. The data consists of four columns: resolution, directional derivative via the finite-difference gradient, directional derivative via the adjoint gradient, and relative error of the directional derivative.

There are three different test cases. The two test cases with beta=0 (i.e., no projection) of the MaterialGrid produce the correct result: the directional derivative is nearly the same for the finite-difference and adjoint gradients at all resolutions. However, the third case involving subpixel smoothing and beta=meep.inf shows that the results do not match. In fact, the results for the adjoint gradient are nearly two orders of magnitude smaller than the finite-difference gradient. This could be an indication of a possible bug in the gradient calculation using subpixel smoothing. I also verified that #1855 produces nearly identical results (the results below were generated using the master branch).

  1. do_averaging=False and beta=0 [MATCHING]
 25, 0.0001647893063382866, 0.00016478545812634394, 2.3352315924872847e-05
 50, 8.401643947411608e-05, 8.402562535793222e-05, 0.0001093343621039143
 100, 3.906795224262183e-05, 3.9082547485223304e-05, 0.00037358606642179664
 200, 2.1522901334170008e-05, 2.154429340699264e-05, 0.0009939214277152168
  2. do_averaging=True and beta=0 [MATCHING]
 25, 0.0001649242139017404, 0.00016478547941764323, 0.0008412014270980213
 50, 8.408429453657384e-05, 8.402562334214373e-05, 0.0006977663873316858
 100, 3.910189194772373e-05, 3.908253600774202e-05, 0.0004950128757859302
 200, 2.1542667606067045e-05, 2.1544279353341334e-05, 7.481651315249934e-05
  3. do_averaging=True and beta=meep.inf [NOT MATCHING]
 25, -4.076376569521756, -0.012698227454176274, 0.9968849228628388
 50, -3.5236862010826115, -0.003103849127757297, 0.9991191471230316
 100, -3.0848543411635454, -0.0132913744432552, 0.9956914093913938
 200, -3.0830687882494914, -0.02861448086511751, 0.9907188315180718

@smartalecH (Collaborator) commented Jan 25, 2022:

Why aren't you using the same convergence criteria between your finite-difference solves and your adjoint gradient solves?

The adjoint solver, of course, checks the relative change in dft fields. Your FD solves use a fixed runtime, however:

sim.run(until_after_sources=2000)

I see that you have a max_runtime in your adjoint solver, but that's not the same thing. In order for the gradient comparisons to be apples-to-apples, they need to use the same runtimes (and consequently the same computed values for the FourierFields).

Another thing to check is the finite_difference_step parameter used to recombine the fields and compute the gradient (the inner product step, λᵀAᵤx). IIRC, this parameter is a bit sensitive at higher values of beta. The default (1e-3) probably won't cut it. Maybe try a few different values (e.g. 1e-5, 1e-6, 1e-7, ...) at the lower resolutions first to see if the gradients converge.
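
A minimal sketch of such a sweep, reusing sim, J, obj_list, matgrid_region, fcen, and ring_weights from the first script in this thread; finite_difference_step is the OptimizationProblem parameter referred to above:

for fd_step in [1e-5, 1e-6, 1e-7]:
    opt = mpa.OptimizationProblem(simulation=sim,
                                  objective_functions=J,
                                  objective_arguments=obj_list,
                                  design_regions=[matgrid_region],
                                  frequencies=[fcen],
                                  finite_difference_step=fd_step)
    f, dJ_du = opt([ring_weights])
    sim.reset_meep()  # fresh fields for the next sweep point
    print("fd_step:, {}, {}".format(fd_step, np.sum(dJ_du)))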

Using this example, I can quickly get about 10% error (at beta=np.inf) without much effort -- I'm sure we could do much better. The next step, of course, would be to implement the analytic gradient of the system matrix (Aᵤ) -- it's a rather complicated task, however, given all the steps in the forward pass.

Another hack would be to use Richardson extrapolation -- this would increase the computation time, however.
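
For reference, a generic (non-Meep-specific) sketch of Richardson extrapolation applied to a central-difference derivative estimate; combining step sizes h and h/2 cancels the leading O(h^2) error term:

import numpy as np

def central_diff(f, x, h):
    return (f(x + h) - f(x - h))/(2*h)

def richardson_diff(f, x, h):
    # central difference has error c*h^2 + O(h^4), so this combination is O(h^4)
    return (4*central_diff(f, x, h/2) - central_diff(f, x, h))/3

# example: d/dx sin(x) at x = 1 should be cos(1)
print(richardson_diff(np.sin, 1.0, 1e-2), np.cos(1.0))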

@oskooi (Collaborator, Author) commented Jan 25, 2022:

I see that you have a max_runtime in your adjoint solver

The choice to use a fixed runtime of until_after_sources=2000, which is large enough to ensure that the fields have decayed away, is intentional. I experimented with using the stop_when_dft_decayed feature, but unfortunately this convergence criterion does not seem to work correctly for this particular setup because the DFT monitor is a small region located behind the ring structure on the opposite side from the line source. This means the fields appear to decay away far sooner than they actually have.

Another thing to check is the finite_difference_step parameter used to recombine the fields and compute the gradient

The test setup that I am using applies a small, random perturbation to every pixel of the MaterialGrid for computing the directional derivative. This is similar to the existing unit tests in meep/python/tests/test_adjoint_solver.py. The results shown above are using a "step size" of 1e-5:

# random epsilon perturbation for design region 
deps = 1e-5
dp = deps*rng.rand(Nx*Ny)

I also tried changing deps to 1e-3 and 1e-6 but the results are mostly unchanged.

@smartalecH (Collaborator) commented Jan 25, 2022:

I also tried changing deps to 1e-3 and 1e-6 but the results are mostly unchanged.

We're talking about two different things. I'm referring to the step size used to compute the derivative of the operator (again, the finite_difference_step parameter in OptimizationProblem).

The choice to use a fixed runtime of until_after_sources=2000 which is large enough to ensure that the fields have decayed away is intentional.

This is wrong. They need to converge at the same point, otherwise it's not an apples to apples comparison.

This means the fields appear to decay away far sooner than they actually have.

Correct. See my comment here.

but unfortunately this convergence criteria does not seem to work correctly for this particular setup because the DFT monitor is a small region located behind the ring structure on the opposite side from the line source.

For your FD calculations, make sure you use the same OptimizationProblem object. This way the dft regions are stored in the design region, and stop_when_dft_decayed should be 100% consistent between the FD and adjoint calculations. Currently, you only have dft regions in your objective function region, not your design region.

@oskooi (Collaborator, Author) commented Jan 26, 2022:

I tried reducing the finite-difference step size used to compute the adjoint gradient to 1e-6 and 1e-8 by manually editing:

# default finite difference step size when calculating Aᵤ
FD_DEFAULT = 1e-3

However, the results for the adjoint gradient for beta=meep.inf seem to get worse with decreasing step size:

  1. FD_DEFAULT=1e-6
 25, -4.076372021733257, -5.858128042396337, 0.43709357516036634
 50, -3.523689286196941, -3.7280615583270516, 0.05799951571515724
 100, -3.0848559819242443, -11.495548169275693, 2.726445654719058
 200, -3.083068860087986, -26.823114992606026, 7.700134901249184
  2. FD_DEFAULT=1e-8
 25, -4.076372021733257, -576.6552679233561, 140.4628657170904
 50, -3.523689286196941, -238.49296288738134, 66.68274484972619
 100, -3.0848559819242443, -1099.3486597532478, 355.3695246050047

As for the fixed runtime, the key point is that it is chosen to be sufficiently long to ensure that the DFT fields are converged for both the forward and adjoint simulations. There is no requirement that the runtime be exactly the same in order to compare the finite-difference gradient from the forward simulations with the adjoint gradient from the adjoint simulation.

@smartalecH (Collaborator) commented:

... by manually editing:...

Note, as I mentioned above, you can just use the finite_difference_step parameter (i.e. no need to rebuild with a hardcoded default each time...)

finite_difference_step=utils.FD_DEFAULT

The results are disappointing, but I'm still not convinced this is a "bug" in the recombination step. Rather, I think the issue is a combination of choosing the right finite-difference step and ensuring that the field convergence is consistent between the FD and adjoint gradients.

As for the fixed runtime, the key point is that it is chosen to be sufficiently long to ensure that the DFT fields are converged for both the forward and adjoint simulations. There is no requirement that the runtime needs to be exactly the same to compare the finite-difference gradient from the forward simulations to the adjoint gradient from the adjoint simulation.

You need to be careful here. Yes, as long as the fields converge for both the adjoint gradient calculation and the FD gradient calculation, the results should be consistent. But it's hard to ensure that -- especially when you have resonant structures.

So the next best thing is just to make sure the same dft fields converge to the same threshold, which is why I suggested my fix above.

@oskooi (Collaborator, Author) commented Jan 26, 2022:

Following @smartalecH's suggestion, I used the same OptimizationProblem object for the two forward simulations of the finite-difference gradient as for the adjoint simulation, and also replaced the fixed-runtime convergence criterion with stop_when_dft_decayed (script). The results are unchanged:

  1. do_averaging=True and beta=0 [MATCHING]
 25.0, 0.00014467420362052064, 0.0001444867192581499, 0.0012959073399325654
 50.0, 0.00016882941995499667, 0.00016861908936993518, 0.0012458171396760183
 100.0, 0.00017910894225070528, 0.00017886748394149965, 0.0013481086213309125
  2. do_averaging=True and beta=meep.inf [NOT MATCHING]
 25.0, -1.7848244175389105e-05, 4.930893875602062e-06, 1.2762677284750092
 50.0, -9.402149873938281e-06, -1.9155904094497948e-05, 1.0373961648490622

These results are not surprising since the fixed runtime I was using previously already ensured that the DFT fields were converged. We therefore need to look elsewhere for the cause of the discrepancy in the gradients.

@oskooi (Collaborator, Author) commented Apr 3, 2022:

Confirming that #1951 produces matching gradients (finite difference vs. adjoint solver) computed for do_averaging=True and beta=meep.inf for the test case linked to in the previous comment.

The four columns of output are: resolution, gradient (finite difference), gradient (adjoint solver), relative error.

25.0, 8.062159396859947e-06, 8.026619860084367e-06, 0.004408190786878007
50.0, -5.8634390464856745e-05, -5.8572580323187495e-05, 0.0010541619206614952

The results above are generated using double-precision floating point for the fields and materials arrays. I also confirmed that #1951 is working using single-precision floating point:

25.0, 7.331371307373047e-06, 8.026554402081342e-06, 0.09482306454853275
50.0, -6.105005741119385e-05, -5.855535233965434e-05, 0.040863271507457964

Note that the relative error for single precision is larger than double precision, as expected.

@smartalecH (Collaborator) commented:

@oskooi this test case is using Ez. Do you have results where you use the in-plane mode (Hz)?

@oskooi (Collaborator, Author) commented Aug 17, 2022:

this test case is using Ez. Do you have results where you use the in-plane mode (Hz)?

During today's discussion, I was referring to the example in #1503 which uses an $H_z$ source.

@stevengj (Collaborator) commented Apr 6, 2023:

Let me suggest an experiment:

  1. Do a topology optimization of some problem at some resolution r. Obtain the design d and the predicted objective function f.
  2. Take the same design d and evaluate it at a resolution 2r (interpolate the ρ to resolution 2r, then smooth and project; a sketch of this step follows the list), and find the "high-resolution" objective. Compute the difference Δf between this and the original objective.
  3. Repeat this process for different resolutions, to plot |Δf| vs r. Ideally it should go as 1/r^2. (To get a smoother plot, you probably want to make sure you don't jump to an entirely different structure when you increase r, so maybe interpolate the structure at one r as the starting guess for the next-biggest r, or vice versa.)
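
A hedged sketch of the interpolate-then-smooth-then-project step in item 2 (the helper name and the use of scipy.ndimage.zoom are illustrative, not from this thread; the shape bookkeeping between zoom and conic_filter may need adjustment depending on how the array dimensions are derived from the physical size and resolution):

import numpy as np
from scipy.ndimage import zoom
import meep.adjoint as mpa

def upsample_smooth_project(rho, Nx_hi, Ny_hi, filter_radius, Lx, Ly, res_hi, beta, eta=0.5):
    # bilinear interpolation of the latent design rho onto the finer design grid
    rho_hi = np.clip(zoom(rho, (Nx_hi/rho.shape[0], Ny_hi/rho.shape[1]), order=1), 0, 1)
    smoothed = mpa.conic_filter(rho_hi, filter_radius, Lx, Ly, res_hi)
    # use a large finite beta if tanh_projection does not accept np.inf
    return mpa.tanh_projection(smoothed, beta, eta)

The returned weights would then be passed to an OptimizationProblem set up at resolution 2r to obtain the "high-resolution" objective.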

@smartalecH (Collaborator) commented:

I'm pulling together the results of this experiment now. I have a few follow-up questions:

Compute the difference Δf between this and the original objective.

Is there a reason we don't want to plot the relative change in FOM? For example, $\frac{|f_1-f_2|}{|f_2|}$ where $f_1$ is the low resolution result and $f_2$ is the high resolution result?

interpolate the ρ to resolution 2r then smooth and project

Would it be ok to instead smooth ρ, then interpolate, then project? This is precisely how meep does it under the hood... so it's more convenient.

@smartalecH (Collaborator) commented:

Ok so I computed |Δf| as suggested (it turns out there's not much of a difference between this and the relative error) and I get the following:

[figure: |Δf| vs. resolution]

Not the second-order convergence we were hoping for.

@smartalecH (Collaborator) commented May 4, 2023:

Ok here are the normal and interpolated images of the devices at resolution=160 for smoothing=True:

[figures: normal and interpolated device images, smoothing=True]

and smoothing=False:

[figures: normal and interpolated device images, smoothing=False]

@stevengj (Collaborator) commented May 4, 2023:

It's concerning that it doesn't seem to be converging. If this is a real effect (not just oscillatory convergence and you happen to be unlucky at ending on a peak), what is limiting the accuracy?

Might be worth doing the TO with a material grid at 4x the Meep resolution, so that when you double the Meep resolution it is still 2x.

(Since you are doing the Ez polarization, it's not surprising that no smoothing does fine. You should see more of a difference for the Hz polarization.)

Also, make sure the lengthscale constraint is satisfied; sharp corners will definitely hamper convergence.

@stevengj (Collaborator) commented May 4, 2023:

Maybe try a smaller test problem, where it is easier to crank the resolution.

For example, suppose you have PML in x and periodic boundaries in y, with a cell consisting of air + design + air. Launch a planewave with a line source, and maximize |Hz|^2 in the air on the other side of the design region (i.e. maximize a near-field focus). Make sure the period in y is not an integer number of wavelengths, to avoid scattering into tangential waves (e.g. maybe just period = 0.9λ).

As noted above, you may need a lengthscale constraint to prevent sharp corners from forming. Don't put the near-field focus too close to the design region, e.g. λ/2 away, to avoid favoring corner singularities.
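
A rough sketch of this test problem (all dimensions, resolutions, and materials below are illustrative choices, not taken from this thread):

import meep as mp
import meep.adjoint as mpa
import numpy as np
from autograd import numpy as npa

wvl = 1.0
fcen = 1/wvl
sy = 0.9*wvl                       # period in y: not an integer number of wavelengths
dpml = 1.0
dair = 1.5
ddes = 1.0                         # design-region thickness in x
sx = dpml + dair + ddes + dair + dpml
cell_size = mp.Vector3(sx,sy,0)

boundary_layers = [mp.PML(thickness=dpml,direction=mp.X)]  # PML in x only
k_point = mp.Vector3()                                     # periodic boundaries in y

# planewave with Hz (in-plane) polarization from a line source
sources = [mp.Source(mp.GaussianSource(fcen,fwidth=0.2*fcen),
                     component=mp.Hz,
                     center=mp.Vector3(-0.5*sx+dpml,0),
                     size=mp.Vector3(0,sy))]

design_region_resolution = 200
Nx = int(design_region_resolution*ddes) + 1
Ny = int(design_region_resolution*sy) + 1
matgrid = mp.MaterialGrid(mp.Vector3(Nx,Ny),
                          mp.air,
                          mp.Medium(index=3.5),
                          weights=np.ones((Nx,Ny)),
                          do_averaging=True,
                          beta=np.inf)
matgrid_region = mpa.DesignRegion(matgrid,
                                  volume=mp.Volume(center=mp.Vector3(),
                                                   size=mp.Vector3(ddes,sy,0)))
geometry = [mp.Block(center=matgrid_region.center,
                     size=matgrid_region.size,
                     material=matgrid)]

sim = mp.Simulation(resolution=50,
                    cell_size=cell_size,
                    boundary_layers=boundary_layers,
                    sources=sources,
                    geometry=geometry,
                    k_point=k_point)

# objective: |Hz|^2 at a near-field focus ~ lambda/2 beyond the design region
focus_pt = mp.Vector3(0.5*ddes+0.5*wvl,0)
obj_list = [mpa.FourierFields(sim,
                              mp.Volume(center=focus_pt,size=mp.Vector3()),
                              mp.Hz)]

def J(hz):
    return npa.power(npa.abs(hz),2)

opt = mpa.OptimizationProblem(simulation=sim,
                              objective_functions=J,
                              objective_arguments=obj_list,
                              design_regions=[matgrid_region],
                              frequencies=[fcen])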

@smartalecH (Collaborator) commented:

Ok here are the latest results for a problem that looks like this:

[figure: schematic of the near-field focusing test problem]

We use an $H_z$ source (the case where smoothing really matters). The results are:

[figure: |Δf| vs. resolution with and without smoothing]

A little hard to say... for one, it looks like the no-smoothing results almost have second-order convergence, while the smoothing results don't. I can run it at a few higher-resolution points over the long weekend and repost...

@stevengj (Collaborator) commented:

Note that the convergence might be more irregular if you are converging to a completely different structure at each resolution; it might be better to use the previous resolution as a starting guess.

Also, make sure you are imposing the lengthscale constraints at the boundary of the design region, to ensure that it doesn't converge towards a structure with a sharp corner (i.e. pad the design region before you impose constraints).
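
A hedged sketch of the padding idea (the helper is hypothetical, not Meep API): pad the design weights before filtering so that the filter and lengthscale constraints also act on the design-region boundary, then crop back.

import numpy as np
import meep.adjoint as mpa

def filter_with_padding(weights_2d, filter_radius, Lx, Ly, design_res, pad_value=0.0):
    npad = int(np.ceil(filter_radius*design_res))  # pad by one filter radius
    padded = np.pad(weights_2d, npad, mode="constant", constant_values=pad_value)
    # (exact array-size bookkeeping depends on how conic_filter derives its
    #  grid from Lx, Ly, and design_res)
    filtered = mpa.conic_filter(padded,
                                filter_radius,
                                Lx + 2*npad/design_res,  # enlarged physical size
                                Ly + 2*npad/design_res,
                                design_res)
    return filtered[npad:-npad,npad:-npad]  # crop back to the original design region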

@smartalecH (Collaborator) commented:

Ok finally got around to this...

I now use the previous structure as a starting guess. Here are the final geometries for each resolution:

[figure: final geometries at each resolution]

There is still some potential for sharp corners; I need to modify my filtering. But it still looks first order:

[figure: |Δf| vs. resolution]

@stevengj (Collaborator) commented:

Note that you need to avoid sharp corners at the edges of the design region (e.g. by padding the edges when you do the filtering/lengthscale constraints).
