Catalyst v0.4.0
New features
-
Catalyst is now accessible directly within the PennyLane user interface, once Catalyst is installed, allowing easy access to Catalyst just-in-time functionality.
Through the use of the
qml.qjit
decorator, entire workflows can be JIT compiled down to a machine binary on first-function execution, including both quantum and classical processing. Subsequent calls to the compiled function will execute the previously-compiled binary, resulting in significant performance improvements.import pennylane as qml dev = qml.device("lightning.qubit", wires=2) @qml.qjit @qml.qnode(dev) def circuit(theta): qml.Hadamard(wires=0) qml.RX(theta, wires=1) qml.CNOT(wires=[0, 1]) return qml.expval(qml.PauliZ(wires=1))
>>> circuit(0.5) # the first call, compilation occurs here array(0.) >>> circuit(0.5) # the precompiled quantum function is called array(0.)
Currently, PennyLane supports the Catalyst hybrid compiler with the
qml.qjit
decorator, which directly aliases Catalyst'scatalyst.qjit
.In addition to the above
qml.qjit
integration, the following native PennyLane functions can now be used with theqjit
decorator:qml.adjoint
,qml.ctrl
,qml.grad
,qml.jacobian
,qml.vjp
,qml.jvp
, andqml.adjoint
,qml.while_loop
,qml.for_loop
,qml.cond
. These will alias to the corresponding Catalyst functions when used within aqjit
context.For more details on these functions, please refer to the PennyLane compiler documentation and compiler module documentation.
-
Just-in-time compiled functions now support asynchronous execution of QNodes. (#374) (#381) (#420) (#424) (#433)
Simply specify
async_qnodes=True
when using the@qjit
decorator to enable the async execution of QNodes. Currently, asynchronous execution is only supported bylightning.qubit
andlightning.kokkos
.Asynchronous execution will be most beneficial for just-in-time compiled functions that contain --- or generate --- multiple QNodes.
For example,
dev = qml.device("lightning.qubit", wires=2) @qml.qnode(device=dev) def circuit(params): qml.RX(params[0], wires=0) qml.RY(params[1], wires=1) qml.CNOT(wires=[0, 1]) return qml.expval(qml.PauliZ(wires=0)) @qjit(async_qnodes=True) def multiple_qnodes(params): x = jnp.sin(params) y = jnp.cos(params) z = jnp.array([circuit(x), circuit(y)]) # will be executed in parallel return circuit(z)
>>> func(jnp.array([1.0, 2.0])) 1.0
Here, the first two circuit executions will occur in parallel across multiple threads, as their execution can occur independently.
-
Preliminary support for PennyLane transforms has been added. (#280)
@qjit @qml.transforms.split_non_commuting @qml.qnode(dev) def circuit(x): qml.RX(x,wires=0) return [qml.expval(qml.PauliY(0)), qml.expval(qml.PauliZ(0))]
>>> circuit(0.4) [array(-0.51413599), array(0.85770868)]
Currently, most PennyLane transforms will work with Catalyst as long as:
-
The circuit does not include any Catalyst-specific features, such
as Catalyst control flow or measurement, -
The QNode returns only lists of measurement processes,
-
AutoGraph is disabled, and
-
The transformation does not require or depend on the numeric value of
dynamic variables.
-
-
Catalyst now supports just-in-time compilation of dynamically-shaped arrays. (#366) (#386) (#390) (#411)
The
@qjit
decorator can now be used to compile functions that accepts or contain tensors whose dimensions are not known at compile time; runtime execution with different shapes is supported without recompilation.In addition, standard tensor initialization functions
jax.numpy.ones
,jnp.zeros
, andjnp.empty
now accept dynamic variables (where the value is only known at runtime).@qjit def func(size: int): return jax.numpy.ones([size, size], dtype=float)
>>> func(3) [[1. 1. 1.] [1. 1. 1.] [1. 1. 1.]]
When passing tensors as arguments to compiled functions, the
abstracted_axes
keyword argument to the@qjit
decorator can be used to specify which axes of the input arguments should be treated as abstract (and thus avoid recompilation).For example, without specifying
abstracted_axes
, the followingsum
function would recompile each time an array of different size is passed as an argument:>>> @qjit >>> def sum_fn(x): >>> return jnp.sum(x) >>> sum_fn(jnp.array([1])) # Compilation happens here. >>> sum_fn(jnp.array([1, 1])) # And here!
By passing
abstracted_axes
, we can specify that the first axes of the first argument is to be treated as dynamic during initial compilation:>>> @qjit(abstracted_axes={0: "n"}) >>> def sum_fn(x): >>> return jnp.sum(x) >>> sum_fn(jnp.array([1])) # Compilation happens here. >>> sum_fn(jnp.array([1, 1])) # No need to recompile.
Note that support for dynamic arrays in control-flow primitives (such as loops), is not yet supported.
-
Error mitigation using the zero-noise extrapolation method is now available through the
catalyst.mitigate_with_zne
transform. (#324) (#414)For example, given a noisy device (such as noisy hardware available through Amazon Braket):
dev = qml.device("noisy.device", wires=2) @qml.qnode(device=dev) def circuit(x, n): @for_loop(0, n, 1) def loop_rx(i): qml.RX(x, wires=0) loop_rx() qml.Hadamard(wires=0) qml.RZ(x, wires=0) loop_rx() qml.RZ(x, wires=0) qml.CNOT(wires=[1, 0]) qml.Hadamard(wires=1) return qml.expval(qml.PauliY(wires=0)) @qjit def mitigated_circuit(args, n): s = jax.numpy.array([1, 2, 3]) return mitigate_with_zne(circuit, scale_factors=s)(args, n)
>>> mitigated_circuit(0.2, 5) 0.5655341100116512
In addition, a mitigation dialect has been added to the MLIR layer of Catalyst. It contains a Zero Noise Extrapolation (ZNE) operation, with a lowering to a global folded circuit.
Improvements
-
The three backend devices provided with Catalyst,
lightning.qubit
,lightning.kokkos
, andbraket.aws
, are now dynamically loaded at runtime. (#343) (#400)This takes advantage of the new backend plugin system provided in Catalyst v0.3.2, and allows the devices to be packaged separately from the runtime CAPI. Provided backend devices are now loaded at runtime, instead of being linked at compile time.
For more details on the backend plugin system, see the custom devices documentation.
-
Finite-shot measurement statistics (
expval
,var
, andprobs
) are now supported for thelightning.qubit
andlightning.kokkos
devices. Previously, exact statistics were returned even when finite shots were specified. (#392) (#410)>>> dev = qml.device("lightning.qubit", wires=2, shots=100) >>> @qjit >>> @qml.qnode(dev) >>> def circuit(x): >>> qml.RX(x, wires=0) >>> return qml.probs(wires=0) >>> circuit(0.54) array([0.94, 0.06]) >>> circuit(0.54) array([0.93, 0.07])
-
Catalyst gradient functions
grad
,jacobian
,jvp
, andvjp
can now be invoked from outside a@qjit
context. (#375)This simplifies the process of writing functions where compilation can be turned on and off easily by adding or removing the decorator. The functions dispatch to their JAX equivalents when the compilation is turned off.
dev = qml.device("lightning.qubit", wires=2) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) return qml.expval(qml.PauliZ(0))
>>> grad(circuit)(0.54) # dispatches to jax.grad Array(-0.51413599, dtype=float64, weak_type=True) >>> qjit(grad(circuit))(0.54). # differentiates using Catalyst array(-0.51413599)
-
New
lightning.qubit
configuration options are now supported via theqml.device
loader, including Markov Chain Monte Carlo sampling support. (#369)dev = qml.device("lightning.qubit", wires=2, shots=1000, mcmc=True) @qml.qnode(dev) def circuit(x): qml.RX(x, wires=0) return qml.expval(qml.PauliZ(0))
>>> circuit(0.54) array(0.856)
-
Improvements have been made to the runtime and quantum MLIR dialect in order to support asynchronous execution.
-
The runtime now supports multiple active devices managed via a device pool. The new
RTDevice
data-class andRTDeviceStatus
along with thethread_local
device instance pointer enable the runtime to better scope the lifetime of device instances concurrently. With these changes, one can create multiple active devices and execute multiple programs in a multithreaded environment. (#381) -
The ability to dynamically release devices has been added via
DeviceReleaseOp
in the Quantum MLIR dialect. This is lowered to the__quantum__rt__device_release()
runtime instruction, which updates the status of the device instance fromActive
toInactive
. The runtime will reuse this deactivated instance instead of creating a new one automatically at runtime in a multi-QNode workflow when another device with identical specifications is requested. (#381) -
The
DeviceOp
definition in the Quantum MLIR dialect has been updated to lower a tuple of device information('lib', 'name', 'kwargs')
to a single device initialization call__quantum__rt__device_init(int8_t *, int8_t *, int8_t *)
. This allows the runtime to initialize device instances without keeping partial information of the device. (#396)
-
-
The quantum adjoint compiler routine has been extended to support function calls that affect the quantum state within an adjoint region. Note that the function may only provide a single result consisting of the quantum register. By itself this provides no user-facing changes, but compiler pass developers may now generate quantum adjoint operations around a block of code containing function calls as well as quantum operations and control flow operations. (#353)
-
The allocation and deallocation operations in MLIR (
AllocOp
,DeallocOp
) now follow simple value semantics for qubit register values, instead of modelling memory in the MLIR trait system. Similarly, the frontend generates proper value semantics by deallocating the final register value.The change enables functions at the MLIR level to accept and return quantum register values, which would otherwise not be correctly identified as aliases of existing register values by the bufferization system. (#360)
Breaking changes
- Third party devices must now provide a configuration TOML file, in order to specify their supported operations, measurements, and features for Catalyst compatibility. For more information please visit the Custom Devices section in our documentation. (#369)
Bug fixes
-
Resolves a bug in the compiler's differentiation engine that results in a segmentation fault when attempting to differentiate non-differentiable quantum operations. The fix ensures that all existing quantum operation types are removed during gradient passes that extract classical code from a QNode function. It also adds a verification step that will raise an error if a gradient pass cannot successfully eliminate all quantum operations for such functions. (#397)
-
Resolves a bug that caused unpredictable behaviour when printing string values with the
debug.print
function. The issue was caused by non-null-terminated strings. (#418)
Contributors
This release contains contributions from (in alphabetical order):
Ali Asadi,
David Ittah,
Romain Moyard,
Sergei Mironov,
Erick Ochoa Lopez,
Shuli Shu.