Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Computing the gradient of a level-set function #2235

Closed
oskooi opened this issue Sep 14, 2022 · 19 comments · Fixed by #2792
Closed

Computing the gradient of a level-set function #2235

oskooi opened this issue Sep 14, 2022 · 19 comments · Fixed by #2792

Comments

@oskooi
Copy link
Collaborator

oskooi commented Sep 14, 2022

Related to #2011.

This is an initial attempt to use the new subpixel smoothing feature of the adjoint solver to compute the gradient of a structure parameterized by its geometry (a level set) as an alternative to the conventional density-based (MaterialGrid) representation. The example involves computing the gradient of the diffraction efficiency of the m=+1 transmitted order of a 1D binary grating with respect to its height and fill factor. The results are to be validated by the usual method of the brute-force gradient computed using a finite difference. Once everything is working correctly, this demonstration will be converted into a tutorial for the user manual and a unit test.

The main feature of this calculation is a differentiable wrapper function (grating_matgrid in the script below) that takes the grating height and fill factor as arguments and returns the weights of a MaterialGrid for the binary grating. These weights are then passed to the adjoint solver via a DesignRegion object in the usual way and the gradient which is computed is then backpropagated through the wrapper function.

An image of the geometry created using this approach confirms that the structure is set up correctly. However, it seems there is a bug in the current setup because the final gradient is zero. Also, there is an important warning from autograd:

/.local/lib/python3/site-packages/autograd/tracer.py:14: UserWarning: Output seems independent of input.
  warnings.warn("Output seems independent of input.")

This message suggests there is a likely bug in the backpropagation step.

binary_grating_matgrid

output of running the script below

-----------
Initializing structure...
Halving computational cell along direction y
Splitting into 2 chunks by voxels
time for choose_chunkdivision = 0.000121082 s
Working in 2D dimensions.
Computational cell is 8.5 x 0.78 x 0 with resolution 100
time for set_epsilon = 0.024535 s
-----------
field decay(t = 50.005): 0.04304565716016541 / 0.04304565716016541 = 1.0
field decay(t = 100.01): 0.11354452052348654 / 0.11354452052348654 = 1.0
on time step 29645 (time=148.225), 0.000134933 s/step
field decay(t = 150.01500000000001): 1.3975259335832059e-08 / 0.11354452052348654 = 1.2308175922008755e-07
field decay(t = 200.02): 5.861816753051918e-16 / 0.11354452052348654 = 5.162571232875486e-15
run 0 finished at t = 200.02 (40004 timesteps)
Starting forward run...
-----------
Initializing structure...
Warning: grid volume is not an integer number of pixels; cell size will be rounded to nearest pixel.
Halving computational cell along direction y
Splitting into 2 chunks by voxels
time for choose_chunkdivision = 0.000153305 s
Working in 2D dimensions.
Computational cell is 8.5 x 0.78 x 0 with resolution 100
     block, center = (-2.25,0,0)
          size (4,1e+20,1e+20)
          axes (1,0,0), (0,1,0), (0,0,1)
          dielectric constant epsilon diagonal = (2.25,2.25,2.25)
     block, center = (1.5,0,0)
          size (3.5,0.777862,0)
          axes (1,0,0), (0,1,0), (0,0,1)
time for set_epsilon = 0.10944 s
-----------
run 0 finished at t = 160.28 (32056 timesteps)
Warning: grid volume is not an integer number of pixels; cell size will be rounded to nearest pixel.
MPB solved for frequency_1(1.53209,0,0) = 1.53209 after 11 iters
MPB solved for frequency_1(2,0,0) = 2 after 1 iters
Dominant planewave for band 1: (2.000000,-0.000000,0.000000)
Starting adjoint run...
Warning: grid volume is not an integer number of pixels; cell size will be rounded to nearest pixel.
MPB solved for frequency_1(-1.53209,0,0) = 1.53209 after 11 iters
MPB solved for frequency_1(-2,0,0) = 2 after 1 iters
run 1 finished at t = 106.055 (21211 timesteps)
Calculating gradient...
objective:, [0.1019033]
Warning: grid volume is not an integer number of pixels; cell size will be rounded to nearest pixel.
     block, center = (-2.25,0,0)
          size (4,1e+20,1e+20)
          axes (1,0,0), (0,1,0), (0,0,1)
          dielectric constant epsilon diagonal = (2.25,2.25,2.25)
     block, center = (1.5,0,0)
          size (3.5,0.777862,0)
          axes (1,0,0), (0,1,0), (0,0,1)
/.local/lib/python3/site-packages/autograd/tracer.py:14: UserWarning: Output seems independent of input.
  warnings.warn("Output seems independent of input.")
gradient:, 0.0

Elapsed run time = 15.9436 s
import math
import cmath
from enum import Enum

import numpy as np
import matplotlib
matplotlib.use('agg')
import matplotlib.pyplot as plt
from autograd import numpy as npa
from autograd import tensor_jacobian_product

import meep as mp
import meep.adjoint as mpa


dpml = 1.0 # PML thickness                                                                                         
dsub = 3.0 # substrate thickness                                                                                   
dpad = 3.0 # padding between PML and grating                                                                       

ng = 1.5
glass = mp.Medium(index=ng)

# angle of transmtted order                                                                                        
# 0 is +x with CCW rotation around z axis                                                                          
theta_tran = math.radians(40.0)

boundary_layers = [mp.PML(thickness=dpml,direction=mp.X)]

Polarization = Enum('Polarization','S P')


def tran_order(res: float, wvl: float, theta_pw: float, pol: Polarization,
               m: int, gp: float, gh: float, gff: float):
  """Computes the gradient of the diffraction efficiency of the m'th                                               
     transmitted order of a 1D binary grating with respect to the grating                                          
     height and fill factor using the adjoint solver.                                                              
                                                                                                                   
     Args:                                                                                                         
        res: resolution (pixels/μm)                                                                                
        wvl: wavelength of the diffraction order.                                                                  
        theta_pw: angle (degrees) of incident planewave.                                                           
                  0° is +x with CCW rotation around z axis.                                                        
                  plane of incidence is xy.                                                                        
        pol: polarization of the incident planewave.                                                               
        m: diffraction order in y direction.                                                                       
        gp: grating period.                                                                                        
        gh: grating height.                                                                                        
        gff: grating fill factor.                                                                                  
  """
  sx = dpml+dsub+gh+dpad+dpml
  sy = gp
  cell_size = mp.Vector3(sx,sy,0)

  design_region_size = mp.Vector3(gh+dpad,sy,0)
  design_region_resolution = int(2*res)
  Nx = int(design_region_size.x*design_region_resolution)
  Ny = int(design_region_size.y*design_region_resolution)

  matgrid = mp.MaterialGrid(mp.Vector3(Nx,Ny),
                            mp.air,
                            glass,
                            weights=np.ones((Nx,Ny)))

  matgrid_region = mpa.DesignRegion(
      matgrid,
      volume=mp.Volume(
          center=mp.Vector3(0.5*sx-dpml-0.5*(gh+dpad),0),
          size=mp.Vector3(design_region_size.x,
                          design_region_size.y)
      )
  )

  matgrid_geometry = [mp.Block(center=matgrid_region.center,
                               size=matgrid_region.size,
                               material=matgrid)]

  if pol.name == 'S':
    src_cmpt = mp.Ez
    mode_parity = mp.ODD_Z
  elif pol.name == 'P':
    src_cmpt = mp.Hz
    mode_parity = mp.EVEN_Z
  else:
    raise ValueError("pol must be S or P, only.")

  fcen = 1/wvl

  # wavevector of transmitted order in air                                                                         
  kdiff = mp.Vector3((fcen**2-(m/sy)**2)**0.5,m/sy,0)

  if theta_pw == 0:
    k = mp.Vector3()
    symmetries = [mp.Mirror(direction=mp.Y,
                            phase=1. if pol.name == 'S' else -1.)]
  else:
    k = mp.Vector3(ng*fcen,0,0).rotate(mp.Vector3(0,0,1),math.radians(theta_pw))
    symmetries = []

  def pw_amp(k,x0):
    def _pw_amp(x):
      return cmath.exp(1j*2*math.pi*k.dot(x+x0))
    return _pw_amp

  src_pt = mp.Vector3(-0.5*sx+dpml,0,0)
  sources = [mp.Source(mp.GaussianSource(fcen,fwidth=0.05*fcen),
                       component=src_cmpt,
                       center=src_pt,
                       size=mp.Vector3(0,sy,0),
                       amp_func=pw_amp(k,src_pt))]

  sim = mp.Simulation(resolution=res,
                      cell_size=cell_size,
                      default_material=glass,
                      symmetries=symmetries,
                      boundary_layers=boundary_layers,
                      k_point=k,
                      sources=sources)

  tran_pt = mp.Vector3(0.5*sx-dpml,0)
  tran_mon = sim.add_flux(fcen,
                          0,
                          1,
                          mp.FluxRegion(center=tran_pt,
                                        size=mp.Vector3(0,sy,0)))

  sim.run(
      until_after_sources=mp.stop_when_fields_decayed(
          50,
          src_cmpt,
          tran_pt,
          1e-9
      )
  )

  input_flux = mp.get_fluxes(tran_mon)[0]

  sim.reset_meep()

  substrate = [mp.Block(material=glass,
                        size=mp.Vector3(dpml+dsub,mp.inf,mp.inf),
                        center=mp.Vector3(-0.5*sx+0.5*(dpml+dsub),0,0))]

  geometry = substrate + matgrid_geometry

  sim = mp.Simulation(resolution=res,
                      cell_size=cell_size,
                      symmetries=symmetries,
                      boundary_layers=boundary_layers,
                      k_point=k,
                      sources=sources,
                      geometry=geometry)

  obj_list = [mpa.EigenmodeCoefficient(
      sim,
      mp.Volume(center=tran_pt,
                size=mp.Vector3(0,sy,0)),
      1,
      eig_parity=mode_parity,
      kpoint_func=lambda *not_used: kdiff
  )]

  def J(mode_mon):
    return npa.power(npa.abs(mode_mon),2) / input_flux

  opt = mpa.OptimizationProblem(simulation=sim,
                                objective_functions=J,
                                objective_arguments=obj_list,
                                design_regions=[matgrid_region],
                                frequencies=[fcen])

  def grating_matgrid(h, ff):
    """Generates the weights for a `MaterialGrid`                                                                  
       based on the grating height and fill factor.                                                                
    """
    xcoord = npa.linspace(-0.5*design_region_size.x,
                          +0.5*design_region_size.x,
                          Nx)
    ycoord = npa.linspace(-0.5*design_region_size.y,
                          +0.5*design_region_size.y,
                          Ny)
    xv, yv = npa.meshgrid(xcoord,ycoord)
    p = npa.where(npa.abs(yv) <= 0.5*ff*gp,
                  npa.where(xv <= xcoord[0] + h,
                            1.,
                            0.),
                  0.)
    p = p.flatten(order='F')
    return p

  design_params = grating_matgrid(gh, gff)
  f, dJ_du = opt([design_params])
  print(f"objective:, {f}")

  opt.plot2D()
  if mp.am_master():
    plt.savefig('binary_grating_matgrid.png',dpi=150,bbox_inches='tight')

  backprop_dJ_du = tensor_jacobian_product(grating_matgrid, 0)(gh, gff, dJ_du)
  sim.reset_meep()

  return backprop_dJ_du


if __name__ == '__main__':
  res = 100

  wvl = 0.5  # wavelength                                                                                          
  inc_ang = 0.
  m = +1

  pol = Polarization.S

  gp = wvl/math.sin(theta_tran) # grating period                                                                   
  gh = 0.5  # grating height                                                                                       
  gff = 0.5 # grating fill factor                                                                                  

  diff_eff_grad = tran_order(res, wvl, inc_ang, pol, m, gp, gh, gff)
  print(f"gradient:, {diff_eff_grad}")
@smartalecH
Copy link
Collaborator

smartalecH commented Sep 14, 2022

Recall that our level set requires a filter step (or that you transform your grid into a signed distance function) in order to work properly. Then you must internally set β=∞.

I may have missed it, but it looks like you are simply populating the grid with a binary array.

@smartalecH
Copy link
Collaborator

See #2067 for a SDF.

@oskooi
Copy link
Collaborator Author

oskooi commented Oct 8, 2022

Unfortunately, work on this issue is blocked by the bug in #2226 (comment) involving an incorrect mode (an oblique planewave) as computed by the Newton-method solver via MPB in fields::get_eigenmode.

@smartalecH
Copy link
Collaborator

smartalecH commented Oct 24, 2022

@oskooi check out this comment too when you are back to looking at this. It describes essentially what you want to do (as a basic example).

@oskooi
Copy link
Collaborator Author

oskooi commented Jan 31, 2024

We are finally able to return to this issue with the use of #2285 and #2767. I have modified the example from #2054 (comment) of computing the adjoint gradient of a 1D grating to specify the grating structure as a level set using the signed-distance function from #2067 (map_to_signed_distance_function in the script below). We are also using subpixel smoothing of the MaterialGrid via do_averaging=True and beta=np.inf.

@smartalecH: to validate the adjoint gradient of the level set using the brute-force finite difference, should we just perturb one of the geometric parameters of the grating such as the height or duty cycle? This would be a more useful validation than simply applying a random perturbation to the entire design region as we have been doing in all of our previous examples involving a density-based representation of the design region.

from enum import Enum
from typing import Tuple

from autograd.extend import primitive, defvjp
from autograd import numpy as anp
from autograd import tensor_jacobian_product
import matplotlib.pyplot as plt
import meep as mp
import meep.adjoint as mpa


RESOLUTION_UM = 100
WAVELENGTH_UM = 0.5
GRATING_PERIOD_UM = 2.5
PADDING_UM = 3.0

Polarization = Enum("Polarization", "S P")


def mapping(
        weights: anp.ndarray,
        eta: float,
        beta: float,
        filter_radius_um: float,
        design_region_size: mp.Vector3,
        design_region_resolution: float
) -> anp.ndarray:
    """A differentiable mapping function for the design weights."""

    # Note: weights should be a 2d array. If it is 1d, it will be reshaped                                        
    #       to 2d by conic_filter. There is no need to reshape it here.                                           
    filtered_field = mpa.conic_filter(
        weights,
        filter_radius_um,
        design_region_size.x,
        design_region_size.y,
        design_region_resolution,
    )

    if beta == 0:
        return filtered_field.flatten()
    else:
        projected_field = mpa.smoothed_projection(
            filtered_field,
            beta,
            eta,
            design_region_resolution,
        )
        return projected_field.flatten()


def design_region_to_meshgrid(
        design_region_size: mp.Vector3,
        nx: int,
        ny: int
) -> Tuple[anp.ndarray, anp.ndarray]:
    """Returns the meshgrid coordinate arrays for the design region."""

    xcoord = anp.linspace(
        -0.5*design_region_size.x,
        +0.5*design_region_size.x,
        nx
    )
    ycoord = anp.linspace(
        -0.5*design_region_size.y,
        +0.5*design_region_size.y,
        ny
    )
    xv, yv = anp.meshgrid(xcoord, ycoord, indexing='ij')

    return xv, yv


@primitive
def grating_levelset(
        grating_height_um: float,
        grating_duty_cycle: float,
        design_region_size: mp.Vector3,
        nx: int,
        ny: int
) -> anp.ndarray:
    """Returns the weights for the grating as a 1D array."""

    xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)

    weights = anp.where(
        anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
        anp.where(
            xv <= xv[0][0] + grating_height_um,
            1,
            0
        ),
        0
    )

    return weights.flatten()


def grating_levelset_vjp(
        ans,
        grating_height_um: float,
        grating_duty_cycle: float,
        design_region_size: mp.Vector3,
        nx: int,
        ny: int
) -> anp.ndarray:
    """Returns the vector-Jacobian product."""

    xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)

    # pixel dimensions                                                                                            
    delta_x = design_region_size.x / (nx - 1)
    delta_y = design_region_size.y / (ny - 1)

    # Gradient of the level set with respect to the grating height.                                               
    # The gradient is obtained using a finite difference. The gradient is                                         
    # therefore a zero matrix with non-zero entries only at the locations                                         
    # of the height boundary. Since the height boundary is normal to the                                          
    # x axis, these non-zero matrix elements have a value of 1 / Δx.                                              
    weights = anp.where(
        anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
        anp.where(
            xv <= xv[0][0] + grating_height_um + 0.5 * delta_x,
            anp.where(
                xv >= xv[0][0] + grating_height_um - 0.5 * delta_x,
                1 / delta_x,
                0
            ),
            0
        ),
        0
    )

    jacobian = weights.flatten()

    return lambda g: anp.tensordot(g, jacobian, axes=1)


def grating_1d(
        pol: Polarization,
        grating_height_um: float,
        design_region_size: mp.Vector3,
        nx: int,
        ny: int
) -> mpa.OptimizationProblem:
    """Sets up the adjoint optimization of a 1D grating."""

    frequency = 1 / WAVELENGTH_UM
    substrate_um = 3.0
    pml_um = 1.0
    pml_layers = [mp.PML(direction=mp.X, thickness=pml_um)]

    n_glass = 1.5
    glass = mp.Medium(index=n_glass)

    size_x_um = pml_um + substrate_um + grating_height_um + PADDING_UM + pml_um
    size_y_um = GRATING_PERIOD_UM
    cell_size = mp.Vector3(size_x_um, size_y_um, 0)

    k_point = mp.Vector3()

    if pol.name == "S":
        eig_parity = mp.ODD_Z
        src_cmpt = mp.Ez
    else:
        eig_parity = mp.EVEN_Z
        src_cmpt = mp.Hz

    src_pt = mp.Vector3(-0.5 * size_x_um + pml_um, 0, 0)
    sources = [
        mp.Source(
            mp.GaussianSource(frequency, fwidth=0.1 * frequency),
            component=src_cmpt,
            center=src_pt,
            size=mp.Vector3(0, size_y_um, 0),
        )
    ]

    matgrid = mp.MaterialGrid(
        mp.Vector3(nx, ny),
        mp.air,
        glass,
        weights=anp.ones((nx, ny)),
        do_averaging=False,
    )

    matgrid_region = mpa.DesignRegion(
        matgrid,
        volume=mp.Volume(
            center=mp.Vector3(
                (-0.5 * size_x_um + pml_um + substrate_um +
                 0.5 * design_region_size.x),
                0,
                0
            ),
            size=design_region_size,
        ),
    )

    geometry = [
        mp.Block(
            material=glass,
            size=mp.Vector3(pml_um + substrate_um, mp.inf, mp.inf),
            center=mp.Vector3(
                -0.5 * size_x_um + 0.5 * (pml_um + substrate_um), 0, 0
            ),
        ),
        mp.Block(
            material=matgrid,
            size=matgrid_region.size,
            center=matgrid_region.center,
        ),
    ]

    sim = mp.Simulation(
        resolution=RESOLUTION_UM,
        cell_size=cell_size,
        boundary_layers=pml_layers,
        k_point=k_point,
        sources=sources,
        geometry=geometry,
    )

    tran_pt = mp.Vector3(0.5 * size_x_um - pml_um, 0, 0)

    order_y = 1
    kdiff = mp.Vector3(
        (frequency**2 - (order_y / GRATING_PERIOD_UM)**2)**0.5,
        order_y / GRATING_PERIOD_UM,
        0
    )
    print(f"kdiff = ({kdiff.x:.5f}, {kdiff.y:.5f}, {kdiff.z:.5f})")

    obj_list = [
        mpa.EigenmodeCoefficient(
            sim,
            mp.Volume(
                center=tran_pt,
                size=mp.Vector3(0, size_y_um, 0),
            ),
            mode=1,
            kpoint_func=lambda *not_used: kdiff,
            eig_parity=eig_parity,
            eig_vol=mp.Volume(
                center=tran_pt,
                size=mp.Vector3(0, 1 / RESOLUTION_UM, 0)
            ),
        ),
    ]

    def J(tran_mon):
        return anp.abs(tran_mon)**2

    opt = mpa.OptimizationProblem(
        simulation=sim,
        objective_functions=J,
        objective_arguments=obj_list,
        design_regions=[matgrid_region],
        frequencies=[frequency],
    )

    return opt


if __name__ == "__main__":
    grating_height_um = 0.5
    grating_duty_cycle = 0.5
    height_perturbation_um = 1 / RESOLUTION_UM

    design_region_size = mp.Vector3(
        grating_height_um + PADDING_UM,
        GRATING_PERIOD_UM,
        0
    )
    design_region_resolution_um = int(2 * RESOLUTION_UM)
    nx = int(design_region_size.x * design_region_resolution_um) + 1
    ny = int(design_region_size.y * design_region_resolution_um) + 1

    # Properties of the convolution filter and projection operator.                                               
    beta = 0.2
    eta_intrinsic = 0.5
    eta_erosion = 0.75
    min_feature_size_um = 0.5
    filter_radius_um = mpa.get_conic_radius_from_eta_e(
        min_feature_size_um,
        eta_erosion
    )

    opt = grating_1d(
        Polarization.P,
        grating_height_um,
        design_region_size,
        nx,
        ny,
    )

    unperturbed_design_weights = grating_levelset(
        grating_height_um,
        grating_duty_cycle,
        design_region_size,
        nx,
        ny,
    )
    mapped_design_weights = mapping(
        unperturbed_design_weights,
        eta_intrinsic,
        beta,
        filter_radius_um,
        design_region_size,
        design_region_resolution_um,
    )
    obj_value_unperturbed, grad_unperturbed = opt(
        [mapped_design_weights],
        need_gradient=True,
    )

    # Backpropagate the adjoint gradient through two functions in the correct order.                              
    grad_backprop = tensor_jacobian_product(mapping, 0)(
        mapped_design_weights,
        eta_intrinsic,
        beta,
        filter_radius_um,
        design_region_size,
        design_region_resolution_um,
        grad_unperturbed,
    )

    defvjp(grating_levelset, grating_levelset_vjp)
    grad_backprop = tensor_jacobian_product(grating_levelset, 0)(
        grating_height_um,
        grating_duty_cycle,
        design_region_size,
        nx,
        ny,
        grad_backprop,
    )

    fig, ax = plt.subplots()
    opt.plot2D(init_opt=False, ax=ax)
    if mp.am_master():
        fig.savefig(
            'grating_1d_plot2D.png', dpi=150, bbox_inches='tight'
        )

    perturbed_design_weights = grating_levelset(
        grating_height_um + height_perturbation_um,
        grating_duty_cycle,
        design_region_size,
        nx,
        ny,
    )
    mapped_design_weights = mapping(
        perturbed_design_weights,
        eta_intrinsic,
        beta,
        filter_radius_um,
        design_region_size,
        design_region_resolution_um,
    )
    obj_value_perturbed, _ = opt(
        [mapped_design_weights],
        need_gradient=False
    )

    adj_dirderiv = height_perturbation_um * grad_backprop
    fnd_dirderiv = obj_value_perturbed[0] - obj_value_unperturbed[0]
    rel_err = abs(adj_dirderiv - fnd_dirderiv) / fnd_dirderiv
    print(
        f"dirderiv:, {fnd_dirderiv} (finite difference), "
        f"{adj_dirderiv} (adjoint), {rel_err:.6f} (error)"
    )

grating_1d_plot2D

@smartalecH
Copy link
Collaborator

We are also using subpixel smoothing of the MaterialGrid via do_averaging=True and beta=np.inf.

Frankly, I would use the smoothed projection feature we recently implemented in python (rather than what we tried to implement in c++). We know there are bugs in 3D, I imagine there are bugs in 2D too.

should we just perturb one of the geometric parameters of the grating such as the height or duty cycle?

Yes that should work fine. You may have to play with how big the step needs to be.

@oskooi
Copy link
Collaborator Author

oskooi commented Jan 31, 2024

Frankly, I would use the smoothed projection feature we recently implemented in python

Why is it necessary to apply a smoothing projection to the design weights when they are already smoothed using the signed-distance function?

Separately, I have updated the script in my previous comment to compute the adjoint gradient of the diffraction efficiency with respect to the height of the grating. However, the adjoint gradient back propagated through the signed-distance function using Autograd's tensor_jacobian_product is zero which is incorrect:

directional-deriv:, [-0.01232668] (finite difference), 0.0 (adjoint)

I suspect this is because the signed-distance function implemented in the function map_to_signed_distance_function uses SciPy's skfmm.distance function which is not differentiable using Autograd. Do we therefore need to implement our own signed-distance function using Autograd to make this work?

@smartalecH
Copy link
Collaborator

Why is it necessary to apply a smoothing projection to the design weights when they are already smoothed using the signed-distance function?

So keep in mind that the underlying framework using the material grid assumes a density based approach to TO. In order to turn it into a level set, you need to do subpixel smoothing, and our modified projection function lets us do that easily without leaving the realm of density-based TO.

If you already have a level-set function that does smoothing for you, and you can backprop through it, then you don't need any extra smoothing (and you should turn off both the smoothing and set beta=0 in the material grid).

The challenge is that we don't always have level set functions available to us. For trivial geometries, like squares or circles it's easy. But if you are trying to evolve something more difficult, (like starting with a simple waveguide crossing and doing shape optimization) then using the smoothed projection function makes this easy.

which is not differentiable using Autograd.

I would just write a custom level set for this. Rectangles are easy and you can do it in native autograd. Lots of resources online.

@oskooi
Copy link
Collaborator Author

oskooi commented Feb 1, 2024

If you already have a level-set function that does smoothing for you, and you can backprop through it, then you don't need any extra smoothing (and you should turn off both the smoothing and set beta=0 in the material grid).

I have modified the script to disable the internal subpixel smoothing in MaterialGrid and instead apply a filter and projection via conic_filter and the new smoothing_projection functions to the level set directly:

1. unperturbed grating

   # Obtain the grating as a level set.
    unperturbed_design_weights = grating_weights(
        grating_height_um,
        grating_duty_cycle,
        design_region_size,
        nx,
        ny,
    )
   # Apply a filter and projection to the level set.
    mapped_unperturbed_design_weights = mapping(
        unperturbed_design_weights,
        eta_intrinsic,
        npa.inf,
        filter_radius_um,
        design_region_size,
        design_region_resolution_um,
    )

    obj_value_unperturbed, grad_unperturbed = opt(
        [mapped_unperturbed_design_weights],
        need_gradient=True,
    )

2. grating with perturbed height

   # Obtain the grating as a level set.
    height_perturbation_um = 1e-2
    perturbed_design_weights = grating_weights(
        grating_height_um + height_perturbation_um,
        grating_duty_cycle,
        design_region_size,
        nx,
        ny,
    )
   # Apply a filter and projection to the level set.
    mapped_perturbed_design_weights = mapping(
        perturbed_design_weights,
        eta_intrinsic,
        npa.inf,
        filter_radius_um,
        design_region_size,
        design_region_resolution_um,
    )

    obj_value_perturbed, _ = opt(
        [mapped_perturbed_design_weights],
        need_gradient=False
    )

Now it seems we just need to backpropagate twice: the first time through the mapping function and then through the grating_weights function:

def mapping(
        weights: npa.ndarray,
        eta: float,
        beta: float,
        filter_radius_um: float,
        design_region_size: mp.Vector3,
        design_region_resolution: float
) -> np.ndarray:
    """A differentiable mapping function for the design weights."""

    filtered_field = mpa.conic_filter(
        weights,
        filter_radius_um,
	design_region_size.x,
        design_region_size.y,
        design_region_resolution,
    )

    if beta == 0:
        return filtered_field.flatten()

    else:
        projected_field = mpa.smoothed_projection(
            filtered_field,
            beta,
            eta,
            design_region_resolution,
        )

        return projected_field.flatten()

Is this the correct approach?

@stevengj
Copy link
Collaborator

stevengj commented Feb 1, 2024

You want to skip the conic filter, which moves your level set.

@oskooi
Copy link
Collaborator Author

oskooi commented Feb 2, 2024

You want to skip the conic filter, which moves your level set.

There is a comment in the docstrings of the smoothed_projection function which states that the input to this function must have been already filtered:

In order for this to work, the input array must already be smooth (e.g. by
filtering).

@smartalecH: what will happen if we remove the conic_filter step in the mapping function and thus only apply smoothed_projection to the level set?

@smartalecH
Copy link
Collaborator

@oskooi the only assumption for this new projection function is that you have a level set and that it's smoothly varying.

You don't appear to have that. You just have binary shapes derived from fundamental parameters.

So in your case, you do need to filter. But you would also need to map this ad hoc levelset to your new shape using eg a marching squares algorithm.

It would be much easier to simply use a signed distance function of a rectangle. Lots of resources online like this one. Offset the zero levelset to be 0.5 (our standard eta value) and use the new smoothed projection function to do the proper smoothing. You should be able to code it up in autograd.

@stevengj
Copy link
Collaborator

stevengj commented Feb 8, 2024

(To be clear, the main application of this smoothed projection is for topology optimization, i.e. when the density/level-set function itself comprises the degrees of freedom.)

@oskooi
Copy link
Collaborator Author

oskooi commented Feb 14, 2024

As a test to verify that the gradient calculation has been set up correctly, I would like to demonstrate that we can compute a non-zero gradient of the diffraction efficiency with respect to the grating height. (The gradient is incorrect until we have implemented the signed-distance function using Autograd and used it to replace the conic_filter in the mapping function.)

For now, applying a mapping consisting of a conic_filter followed by a smoothed_projection to a levelset (full script is above) and then trying to back propagate the gradient through the functions mapping and then grating_levelset always produces a zero gradient along with this warning from autograd:

/python3.10/site-packages/autograd/tracer.py:14: UserWarning: Output seems independent of input.
  warnings.warn("Output seems independent of input.")
grad_backprop (grating_levelset):, 0.0, ()
directional-deriv:, [-0.00056425] (finite difference), 0.0 (adjoint)

It is unclear why the gradient of the levelset function is always zero. @smartalecH, any thoughts?

@oskooi
Copy link
Collaborator Author

oskooi commented Feb 14, 2024

Looks like the problem is that Autograd does not support gradients of numpy.meshgrid (HIPS/autograd#457) which is currently used by the levelset construction function:

def grating_levelset(
        grating_height_um: float,
        grating_duty_cycle: float,
        design_region_size: mp.Vector3,
        nx: int,
        ny: int
) -> npa.ndarray:
    """Returns the weights for the grating as a 1D array."""

    xcoord = npa.linspace(
        -0.5*design_region_size.x,
        +0.5*design_region_size.x,
        nx
    )
    ycoord = npa.linspace(
        -0.5*design_region_size.y,
        +0.5*design_region_size.y,
        ny
    )
    xv, yv = npa.meshgrid(xcoord, ycoord, indexing='ij')
    weights = npa.where(
        npa.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
        npa.where(
            xv <= xcoord[0] + grating_height_um,
            1,
            0
        ),
        0
    )

    return weights.flatten()

Also, it does not seem to be an issue but Autograd does support gradients of np.abs which is not normally differentiable.

The fix is that we therefore just need to rewrite grating_levelset and replace np.meshgrid.

@stevengj
Copy link
Collaborator

Here is an easier approach:

  1. write a function f(p) that generates the binary image from your geometric parameters p
  2. call fast-marching or similar to generate a "smooth/signed-distance" level set from f(p), call this F(f(p))
  3. write a custom vJp routine for F(f(p)) that just uses finite differences to obtain the Jacobian matrix.

This is feasible because your number of parameters p is very small, and F(f(p)) is very fast (no PDE solve), so finite differences are cheap. Just make sure that the image is high enough resolution that finite differences are sufficiently accurate.

@stevengj
Copy link
Collaborator

(Your f(p) function above, i.e. grating_levelset, has a derivative zero almost everywhere, because an infinitesimal perturbation doesn't change the image. To fix that you would need to do subpixel smoothing within that function. However, with my suggested finite-difference approach, you can just use finite differences that change the image by at least one pixel, and the problem goes away.)

@oskooi
Copy link
Collaborator Author

oskooi commented Feb 19, 2024

write a custom vJp routine for F(f(p)) that just uses finite differences to obtain the Jacobian matrix.

This seems to be working now after I added a custom VJP function for the levelset construction function (shown below). We can now demonstrate good agreement in the directional derivative computed using the adjoint gradient and finite differences:

dirderiv:, 0.00161151 (finite difference), 0.00191024 (adjoint), 0.18537272 (error)

These results were obtained using a resolution of 200 pixels/μm and a runtime of nearly 1.5 hours using a single Xeon 4.2 GHz. Unfortunately, using finite differences to approximate the Jacobian necessarily requires large resolutions to obtain accurate results. This feature has already been noted in the comments above:

Just make sure that the image is high enough resolution that finite differences are sufficiently accurate.

As a check, I cranked up the resolution to 400 pixels/μm and the relative error decreased. However, this run took nearly 12 hours with 4 Xeon cores.

dirderiv:, 0.00110571 (finite difference), 0.00118250 (adjoint), 0.06944858 (error)

Details of the calculation involving composition of the objective function and computing the Jacobian matrix using finite differences are summarized in the slide below. The current calculation does not use subpixel smoothing (which will be added next by extending this approach to the signed-distance function in order to replace the conic filter in the mapping function).

backpropagation_levelset

def grating_levelset_vjp(
        ans,
        grating_height_um: float,
        grating_duty_cycle: float,
        design_region_size: mp.Vector3,
        nx: int,
        ny: int
) -> np.ndarray:
    """Returns the vector-Jacobian product."""

    xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)

    # pixel dimensions
    delta_x = design_region_size.x / (nx - 1)
    delta_y = design_region_size.y / (ny - 1)

    # Gradient of the level set with respect to the grating height.
    # The gradient is obtained using a finite difference. The gradient is
    # therefore a zero matrix with non-zero entries only at the locations
    # of the height boundary. Since the height boundary is normal to the
    # x axis, these non-zero matrix elements have a value of 1 / Δx.
    weights = np.where(
        np.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
        np.where(
            xv <= xv[0][0] + grating_height_um + 0.5 * delta_x,
            np.where(
                xv >= xv[0][0] + grating_height_um - 0.5 * delta_x,
                1 / delta_x,
                0
            ),
            0
        ),
        0
    )

    jacobian = weights.flatten()

    return lambda g: np.tensordot(g, jacobian, axes=1)

@oskooi
Copy link
Collaborator Author

oskooi commented Feb 21, 2024

Using the same approach involving a custom VJP described in my previous comment applied to the signed-distance function via the module scikit-fmm and computing the adjoint gradient of a level-set representation of the 1D grating using the subpixel-smoothing feature of the MaterialGrid class (rather than the alternative approach proposed by @smartalecH of replacing the tanh_projection operation of the mapping function with smoothed_projection) seems to produce the expected agreement with the brute-force result obtained using finite differences.

Here are the results comparing the directional derivatives obtained using the two methods (finite difference vs. adjoint gradient) for the $\mathcal{S}$ and $\mathcal{P}$ polarization at a resolution of 200 pixels/μm.

S polarization

directional-deriv:, -0.00146690 (finite difference), -0.00128812 (adjoint), 0.121877 (error)

P polarization

directional-deriv:, 0.00021602 (finite difference), 0.00136521 (adjoint), 5.319892 (error)

There is good agreement for the $S$ polarization but that is the "easy" case for a 2D simulation because the electric field ($E_z$) is always parallel to the material interfaces. The relative error for the $\mathcal{P}$-polarization case, however, is more than an order of magnitude larger than the $\mathcal{S}$-polarization case. The accuracy of the $\mathcal{P}$ polarization but not $\mathcal{S}$ is improved by increasing the grid resolution to 400 pixels/μm:

S polarization

directional-deriv:, -0.00058979 (finite difference), -0.00043476 (adjoint), 0.262865 (error)

P polarization

directional-deriv:, 0.00056589 (finite difference), 0.00098577 (adjoint), 0.741962 (error)

The script used to generate the results is available in this gist. It contains two key parts:

  1. Use of subpixel smoothing via the MaterialGrid object and $\beta = \infty$.
  2. The level-set construction function which smoothes the grating boundaries using the signed-distance function and its custom VJP.
    matgrid = mp.MaterialGrid(
        mp.Vector3(nx, ny),
        mp.air,
        glass,
        weights=anp.ones((nx, ny)),
        do_averaging=True,
        beta=anp.inf
    )
def signed_distance(weights: anp.ndarray) -> anp.ndarray:
    """Maps the 2D weights using a signed-distance function."""

    # Create signed distance function.
    sd = skfmm.distance(weights - 0.5)

    # Linear interpolation of zero-levelset onto 0.5-levelset.
    xp = [anp.min(sd.flatten()), 0, anp.max(sd.flatten())]
    yp = [0, 0.5001, 1]

    return anp.interp(sd.flatten(), xp, yp)
  
  
@primitive
def levelset_and_smoothing(
        grating_height_um: float,
        grating_duty_cycle: float,
        design_region_size: mp.Vector3,
        nx: int,
        ny: int
) -> anp.ndarray:
    """Generates the grating as a levelset with signed-distance smoothing."""

    xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)

    weights = anp.where(
        anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
        anp.where(
            xv <= xv[0][0] + grating_height_um,
            1,
            0
        ),
        0
    )

    return signed_distance(weights)


def levelset_and_smoothing_vjp(
        ans: anp.ndarray,
        grating_height_um: float,
        grating_duty_cycle: float,
        design_region_size: mp.Vector3,
        nx: int,
        ny: int
) -> anp.ndarray:
    """Computes the vector-Jacobian product."""

    xv, yv = design_region_to_meshgrid(design_region_size, nx, ny)

    # pixel dimensions 
    delta_x = design_region_size.x / (nx - 1)

    weights = anp.where(
        anp.abs(yv) <= 0.5 * grating_duty_cycle * GRATING_PERIOD_UM,
        anp.where(
            xv <= xv[0][0] + grating_height_um + delta_x,
            1,
            0
        ),
        0
    )

    jacobian = (signed_distance(weights) - ans) / delta_x

    return lambda g: anp.tensordot(g, jacobian, axes=1)

This figure shows the smoothed grating obtained by applying the signed-distance function to its level-set representation.

grating_weights

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants