# Optimize Performance with Parallel Execution
pyrauli is designed for high-performance simulation, and its C++ backend
includes a parallel execution engine powered by OpenMP. This guide shows how
to leverage this feature to significantly speed up simulations, especially
when working with many observables.
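Because the engine is built on OpenMP, you can usually cap the number of worker threads with the standard `OMP_NUM_THREADS` environment variable. This is an OpenMP convention rather than a pyrauli-specific setting, and it must be set before the C++ backend initializes its runtime:

```python
import os

# OpenMP reads OMP_NUM_THREADS when its runtime initializes, so set the
# variable before importing the module that loads the C++ backend.
os.environ["OMP_NUM_THREADS"] = "4"

# import pyrauli  # imported only after the thread cap is in place

print(os.environ["OMP_NUM_THREADS"])  # → 4
```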
## The RuntimePolicy

The execution strategy is controlled by a `RuntimePolicy` object.
pyrauli provides two policies:

- `pyrauli.seq`: a sequential (single-threaded) execution policy.
- `pyrauli.par`: a parallel (multi-threaded) execution policy.

You can pass either policy to the `runtime` parameter of the simulation methods.
## Running a Single Observable in Parallel
You can achieve a performance boost even when simulating just a single observable by using the parallel runtime. The internal operations for evolving the observable—especially during splitting gates like Rz—are parallelized across multiple CPU cores.
While this is faster than sequential execution, it is not as efficient as parallel batching. The overhead of managing threads is better amortized when distributed across many independent observables.
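The amortization argument can be seen in a plain-Python sketch. This uses `concurrent.futures` purely as a stand-in: pyrauli's real parallelism lives in OpenMP threads inside the C++ engine, and `evaluate` below is a hypothetical placeholder for the per-observable work:

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate(observable):
    # Hypothetical stand-in for one observable's expectation-value work.
    return observable.count("Z")

observables = ["ZIII", "IZII", "IIZI", "IIIZ"]

# One pool, one batch: the fixed cost of spinning up worker threads is
# paid once and amortized over all four observables.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(evaluate, observables))

print(results)  # → [1, 1, 1, 1]
```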
```python
import pyrauli

circuit = pyrauli.Circuit(4)
circuit.h(0)
circuit.cx(0, 1)
circuit.cx(1, 2)
circuit.rz(2, 0.5)
circuit.cx(2, 3)

observable = pyrauli.Observable("ZIII")

# Run the simulation for one observable using the parallel policy
ev, err = circuit.expectation_value(observable, runtime=pyrauli.par)
print(f"Expectation value (parallel): {ev}")
```
## Batching Observables for Maximum Throughput
The greatest performance benefit of the parallel engine comes from batching—that is,
running the circuit simulation for a list of observables in a single call.
pyrauli will automatically distribute the work across multiple CPU cores.
The example below computes the expectation values for four observables in one parallel call.
```python
import pyrauli

circuit = pyrauli.Circuit(4)
circuit.h(0)
circuit.cx(0, 1)
circuit.cx(1, 2)
circuit.rz(2, 0.5)
circuit.cx(2, 3)

observables = [
    pyrauli.Observable("ZIII"),
    pyrauli.Observable("IZII"),
    pyrauli.Observable("IIZI"),
    pyrauli.Observable("IIIZ"),
]

# Run the simulation for the batch of observables in parallel
results = circuit.expectation_value(observables, runtime=pyrauli.par)
for i, (ev, err) in enumerate(results):
    print(f"Observable {i}: EV = {ev:.4f}, Error = {err:.4f}")
```
> **Tip:** Always prefer passing a list of observables to a single `.expectation_value()` or `.run()` call over looping in Python. The performance gain from processing the batch in C++ is substantial.
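To make the tip concrete, here is a toy mock (not pyrauli's actual entry point) that counts how many Python-to-backend calls each pattern makes:

```python
calls = {"count": 0}

def expectation_value(obs):
    # Hypothetical mock of a backend entry point: each call represents
    # one Python-to-C++ boundary crossing.
    calls["count"] += 1
    return [0.0] * len(obs) if isinstance(obs, list) else 0.0

observables = ["ZIII", "IZII", "IIZI", "IIIZ"]

# Pattern 1: looping in Python, one crossing per observable.
calls["count"] = 0
loop_results = [expectation_value(o) for o in observables]
loop_calls = calls["count"]

# Pattern 2: batching, a single crossing; the parallel fan-out happens
# on the other side of the boundary.
calls["count"] = 0
batch_results = expectation_value(observables)
batch_calls = calls["count"]

print(loop_calls, batch_calls)  # → 4 1
```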
## Enabling Parallelism in Qiskit
When using pyrauli with Qiskit, you can enable parallel execution by default
by passing the policy during the backend or estimator’s initialization.
```python
from qiskit import QuantumCircuit
from qiskit.quantum_info import SparsePauliOp

import pyrauli
from pyrauli import PBackend

# Initialize the backend with the parallel execution policy as the default
backend = PBackend(runtime=pyrauli.par)

# All jobs run on this backend will now use the parallel engine by default
qc = QuantumCircuit(2)
qc.h(0)
qc.cx(0, 1)

observables = [SparsePauliOp("ZI"), SparsePauliOp("IZ")]

job = backend.run([(qc, observables)])
result = job.result()
print(f"Expectation values: {result[0].data.evs}")
```