1. Basic Example Notebook

Hello and welcome to PauliPropagation.jl - a Julia package for Pauli propagation simulation of quantum circuits and quantum systems.

This package can be used to estimate quantities like $\text{Tr}[\rho \ \mathcal{E}(O)]$ where $\rho$ is an initial state, $\mathcal{E}$ is a quantum channel that is not necessarily unitary, and $O$ is an observable preferrably sparse in Pauli basis.

In the following we are going to introduce the basic concepts and datatypes of the package.

using PauliPropagation

Define the number of qubits in the simulation.

nq = 64
64

Now define the observable $O = Z_{32} = I_1 \otimes ... \otimes I_{31} \otimes Z_{32} \otimes I_{33}... \otimes I_{64}$.

The PauliString type is our high-level way of handling Pauli strings. This above constructor defines a PauliString that has the only non-Identity Pauli at position 32.

This library uses encodes Paulis into the bits of Integers for performance reasons, and also actions of gates onto Paulis are defined as bit operations. You can definitely get by just sticking to the high level interface. Check out some other example notebooks for more involved code.

pstr = PauliString(nq, :Z, 32)
PauliString(nqubits: 64, 1.0 * IIIIIIIIIIIIIIIIIIII...)

As another example, this what it prints for a 4-qubit example with a different coefficient:

PauliString(4, [:X, :Y, :Z], [2, 3, 4], 2.0)
PauliString(nqubits: 4, 2.0 * IXYZ)

Under the hood everything works with the PauliSum data type. It can also be passed to propagate(), and you can create it like this:

# input_psum = PauliSum(nq)
# add!(input_psum, :Z, 32)
# add!(input_psum, [:X, :X], [1, 64])
# add!(input_psum, [:X, :Y, :Z], [1, 4, 7], 1.3)

Now we specify the circuit that we want to run. We provide some simple constructors for common circuit ansätze. Here we use a generic circuit ansatz called the hardwareefficientcircuit. It consists of repeated RX- RZ - RX Pauli rotation gates intertwined with RYY entangling gates.

For everything to work out, we need to specify the number of circuit layers and the entangling topology. By default we use the so-called 1D bricklayertopology.

nl = 4
topology = bricklayertopology(nq; periodic=false)
circuit = hardwareefficientcircuit(nq, nl; topology=topology)
nparams = countparameters(circuit)
1020

But you can use other predefined topologies in the library or customize a topology via an array or tuples like [(1, 2), (1, 3), ...].

# rectopology = rectangletopology(8, 8; periodic=false) 

# hardcoded_custom_topology = [(1, 2), (1, 15), (2, 3), (3, 4), (4, 5), (5, 6), (5, 16), (6, 7)]

# function centralspintopology(nqubits)
#     return [(1, ii) for ii in 2:nqubits]
# end

# centralspin_topology = centralspintopology(nq)

There are also several other inbuilt circuits in the library and you can define your own custom circuits. Note that both circuits, and the angles they take, are defined in the Schrodinger picture. For example:

# circuit = tfitrottercircuit(nq, nl; topology=topology, start_with_ZZ=true);
# circuit = tiltedtfitrottercircuit(nq, nl; topology=topology)
# circuit = heisenbergtrottercircuit(nq, nl; topology=topology)
# function myheisenbergtrottercircuit(nqubits::Integer, nlayers::Integer; topology=nothing)
#     circuit::Vector{Gate} = []

#     if isnothing(topology)
#         topology = bricklayertopology(nqubits)
#     end

#     for _ in 1:nlayers
#         rxxlayer!(circuit, topology)
#         ryylayer!(circuit, topology)
#         rzzlayer!(circuit, topology)
#     end

#     return circuit
# end

# circuit = myheisenbergtrottercircuit(nq, nl; topology=topology);
# nparams = countparameters(circuit)

# Jxx = 0.5
# Jyy = 0.25
# Jzz = 0.25
# dt = 0.2

# thetaxx_indices = getparameterindices(circuit, PauliRotation, [:X, :X])
# thetayy_indices = getparameterindices(circuit, PauliRotation, [:Y, :Y])
# thetazz_indices = getparameterindices(circuit, PauliRotation, [:Z, :Z])

# thetas = zeros(nparams)
# thetas[thetaxx_indices] .= 2 * Jxx * dt
# thetas[thetayy_indices] .= 2 * Jyy * dt
# thetas[thetazz_indices] .= 2 * Jzz * dt

Let us draw some random parameters from a narrow-ish Gaussian around zero. These parameters are 1-to-1 assigned to the parametrized gates in the circuit, for the beginning to the end. Internally we will reverse both and apply the gates with their corresponding parameter in the Heisenberg picture.

using Random
Random.seed!(42)
thetas = randn(nparams) * 0.5

Now we get to the interesting part. The propagation of the Observable pstr through circuit. In other words, we apply the circuit onto the Pauli string. First, let us do it exactly.

@time psum = propagate(circuit, pstr, thetas)
  3.745413 seconds (2.39 M allocations: 130.737 MiB, 1.06% gc time, 81.55% compilation time)





PauliSum(nqubits: 64, 109556 Pauli terms:
 2.7344e-9 * IIIIIIIIIIIIIIIIIIII...
 -5.3891e-7 * IIIIIIIIIIIIIIIIIIII...
 1.4518e-8 * IIIIIIIIIIIIIIIIIIII...
 4.5131e-9 * IIIIIIIIIIIIIIIIIIII...
 -6.8804e-9 * IIIIIIIIIIIIIIIIIIII...
 2.6186e-10 * IIIIIIIIIIIIIIIIIIII...
 -1.6283e-7 * IIIIIIIIIIIIIIIIIIII...
 -3.5821e-7 * IIIIIIIIIIIIIIIIIIII...
 8.9712e-6 * IIIIIIIIIIIIIIIIIIII...
 -1.5098e-9 * IIIIIIIIIIIIIIIIIIII...
 -1.8325e-10 * IIIIIIIIIIIIIIIIIIII...
 6.9084e-6 * IIIIIIIIIIIIIIIIIIII...
 -2.8954e-9 * IIIIIIIIIIIIIIIIIIII...
 -1.9562e-9 * IIIIIIIIIIIIIIIIIIII...
 -1.2302e-9 * IIIIIIIIIIIIIIIIIIII...
 7.0029e-10 * IIIIIIIIIIIIIIIIIIII...
 -1.3435e-10 * IIIIIIIIIIIIIIIIIIII...
 1.2432e-7 * IIIIIIIIIIIIIIIIIIII...
 1.2141e-8 * IIIIIIIIIIIIIIIIIIII...
 1.8277e-7 * IIIIIIIIIIIIIIIIIIII...
  ⋮)

Just to run it again after Julia's jit (just-in-time)-compilation, which compiles the code and makes subsequent calls faster:

@time propagate(circuit, pstr, thetas)
  0.710525 seconds (6.27 k allocations: 14.802 MiB)

How is it possible that we can exactly run 64-qubit circuits? We leverage a concept referred to as the entanglement lightcone. But we don't do it manually, the Pauli propagation framework does it naturally. Still, we created over 160k Pauli strings in an object of type PauliSum. As the name suggests, this is a sum of Pauli strings. Be very mindful of the fact that the runtime will initially scale exponentially with the circuit depth! The psum object may already be of interest of you. But here is a quick and easy way how to evaluate the expectation value if initial state is the zero-state, i.e. $\rho = |0\rangle\langle 0|$.

overlapwithzero(psum)
0.5609238336860108

Remember, this was an exact calculation. Well actually, our library has a default truncation threshold of Pauli strings with coefficients smaller than machine precision eps ~ 2e-16. Here is how you control the minimum absolute coefficient threshold min_abs_coeff. (Warning - if you are now using your own custom circuit / topology this may time out).

@time psum_exact = propagate(circuit, pstr, thetas; min_abs_coeff=0)
  2.471811 seconds (362.85 k allocations: 50.886 MiB, 1.91% gc time, 36.15% compilation time)





PauliSum(nqubits: 64, 173055 Pauli terms:
 -2.8798e-11 * IIIIIIIIIIIIIIIIIIII...
 5.7179e-14 * IIIIIIIIIIIIIIIIIIII...
 -5.3891e-7 * IIIIIIIIIIIIIIIIIIII...
 4.5148e-9 * IIIIIIIIIIIIIIIIIIII...
 -6.8698e-9 * IIIIIIIIIIIIIIIIIIII...
 1.7871e-11 * IIIIIIIIIIIIIIIIIIII...
 1.9305e-11 * IIIIIIIIIIIIIIIIIIII...
 4.2701e-14 * IIIIIIIIIIIIIIIIIIII...
 -1.5036e-9 * IIIIIIIIIIIIIIIIIIII...
 -2.5691e-10 * IIIIIIIIIIIIIIIIIIII...
 6.9084e-6 * IIIIIIIIIIIIIIIIIIII...
 -2.8937e-9 * IIIIIIIIIIIIIIIIIIII...
 4.4664e-11 * IIIIIIIIIIIIIIIIIIII...
 -1.9572e-9 * IIIIIIIIIIIIIIIIIIII...
 3.1811e-11 * IIIIIIIIIIIIIIIIIIII...
 -1.4641e-10 * IIIIIIIIIIIIIIIIIIII...
 4.1297e-9 * IIIIIIIIIIIIIIIIIIII...
 -4.0042e-13 * IIIIIIIIIIIIIIIIIIII...
 -6.1652e-9 * IIIIIIIIIIIIIIIIIIII...
 -8.3748e-9 * IIIIIIIIIIIIIIIIIIII...
  ⋮)

And the expectation value is pretty much the same.

print("The error with our default truncation of $(eps()) is ", abs(overlapwithzero(psum_exact) - overlapwithzero(psum)))
The error with our default truncation of 2.220446049250313e-16 is 7.416897096490516e-10

How about higher truncation thresholds?

min_abs_coeff = 1e-3
@time psum_coeff = propagate(circuit, pstr, thetas; min_abs_coeff = min_abs_coeff)
  0.033702 seconds (14.82 k allocations: 794.938 KiB, 73.96% compilation time)





PauliSum(nqubits: 64, 415 Pauli terms:
 0.0029408 * IIIIIIIIIIIIIIIIIIII...
 -0.011275 * IIIIIIIIIIIIIIIIIIII...
 0.0059204 * IIIIIIIIIIIIIIIIIIII...
 0.0074633 * IIIIIIIIIIIIIIIIIIII...
 0.0045557 * IIIIIIIIIIIIIIIIIIII...
 -0.0032597 * IIIIIIIIIIIIIIIIIIII...
 0.0030926 * IIIIIIIIIIIIIIIIIIII...
 -0.0031138 * IIIIIIIIIIIIIIIIIIII...
 -0.0012633 * IIIIIIIIIIIIIIIIIIII...
 -0.0038215 * IIIIIIIIIIIIIIIIIIII...
 -0.03089 * IIIIIIIIIIIIIIIIIIII...
 0.0045263 * IIIIIIIIIIIIIIIIIIII...
 0.012333 * IIIIIIIIIIIIIIIIIIII...
 0.0019774 * IIIIIIIIIIIIIIIIIIII...
 -0.10911 * IIIIIIIIIIIIIIIIIIII...
 -0.0012252 * IIIIIIIIIIIIIIIIIIII...
 -0.0022773 * IIIIIIIIIIIIIIIIIIII...
 -0.0036414 * IIIIIIIIIIIIIIIIIIII...
 -0.0020843 * IIIIIIIIIIIIIIIIIIII...
 0.0075567 * IIIIIIIIIIIIIIIIIIII...
  ⋮)
print("The error with `min_abs_coeff = $min_abs_coeff` is ", abs(overlapwithzero(psum_exact) - overlapwithzero(psum_coeff)))
The error with `min_abs_coeff = 0.001` is 0.0008314830754397873

The expectation value is still surprisingly precise. As a side-note, something that can be very useful and efficient, is to change a PauliSum in-place. For example:

# input_psum = PauliSum(nq)
# add!(input_psum, :Z, 32)
# add!(input_psum, [:X, :X], [1, 64])
# add!(input_psum, [:X, :Y, :Z], [1, 4, 7], 1.3)

# @time propagate!(circuit, input_psum, thetas; min_abs_coeff=1e-3)
# @time propagate!(circuit, input_psum, thetas; min_abs_coeff=1e-3) # and again
# input_psum

Another very general but powerful truncation is one based on Pauli weight. We can truncate Pauli strings that have many non-Identity Paulis, which has been proven to be a valid truncation for random circuits, but it also works well in practice. We can pass this as the keyword argument max_weight.

max_weight = 5
@time psum_weight = propagate(circuit, pstr, thetas; max_weight = max_weight)
  1.055857 seconds (358.64 k allocations: 19.698 MiB, 84.26% compilation time)





PauliSum(nqubits: 64, 12764 Pauli terms:
 1.5982e-7 * IIIIIIIIIIIIIIIIIIII...
 -0.00039531 * IIIIIIIIIIIIIIIIIIII...
 7.9886e-7 * IIIIIIIIIIIIIIIIIIII...
 1.3274e-7 * IIIIIIIIIIIIIIIIIIII...
 4.6574e-9 * IIIIIIIIIIIIIIIIIIII...
 -6.9675e-6 * IIIIIIIIIIIIIIIIIIII...
 2.2851e-10 * IIIIIIIIIIIIIIIIIIII...
 -6.5901e-6 * IIIIIIIIIIIIIIIIIIII...
 1.7211e-5 * IIIIIIIIIIIIIIIIIIII...
 -0.00036713 * IIIIIIIIIIIIIIIIIIII...
 3.7909e-7 * IIIIIIIIIIIIIIIIIIII...
 -6.2404e-6 * IIIIIIIIIIIIIIIIIIII...
 3.7205e-7 * IIIIIIIIIIIIIIIIIIII...
 -6.4898e-9 * IIIIIIIIIIIIIIIIIIII...
 5.9261e-5 * IIIIIIIIIIIIIIIIIIII...
 -1.402e-9 * IIIIIIIIIIIIIIIIIIII...
 -0.0042916 * IIIIIIIIIIIIIIIIIIII...
 7.3402e-6 * IIIIIIIIIIIIIIIIIIII...
 0.00051561 * IIIIIIIIIIIIIIIIIIII...
 -7.2552e-7 * IIIIIIIIIIIIIIIIIIII...
  ⋮)
print("The error with `max_weight = $max_weight` is ", abs(overlapwithzero(psum_exact) - overlapwithzero(psum_weight)))
The error with `max_weight = 5` is 9.446910668420294e-5

One mindset to adopt when using this package (or any other computational physics package for that matter), is that exact computation will quickly be infeasible. Truncations introduce a necessary trade-off between computational cost and accuracy.

Time vs Accuracy

Truncations are absolutely vital for Pauli propagation. Choose the truncations too tight and the results will be systematically wrong, choose the truncations too loose and the computation will be too hard.

Here we show how varying the minimal coefficient threshold min_abs_coeff affects both the error and the runtime.

# using Pkg; Pkg.add("Plots")

using Plots
Plots.scalefontsizes(1.2)
# the min_abs_coeff to check
trunc_coeffs = 10.0 .^ (-1:-1:-10)

# One more exact computation. Record the expectation value and time.
res = @timed propagate(circuit, pstr, thetas; min_abs_coeff=0)
exact_expectation = overlapwithzero(res.value)
exact_time = res.time

# Sweep over the truncation values
expectations = zeros(length(trunc_coeffs))
times = zeros(length(trunc_coeffs))
for (ii, min_abs_coeff) in enumerate(trunc_coeffs)
    res = @timed propagate(circuit, pstr, thetas; min_abs_coeff=min_abs_coeff)
    expectations[ii] = overlapwithzero(res.value)
    times[ii] = res.time
end

# calculate the absolute error to the exact expectation value
abs_errors = abs.(expectations .- exact_expectation)

A plot of truncation threshold vs error (lower is better):

plot(trunc_coeffs, abs_errors, xscale=:log10, yscale=:log10, marker=:o, label="With coefficient truncation", xticks=sort(trunc_coeffs), xlabel="Truncation Threshold", ylabel="Error", yticks=10.0 .^ (-10:1:0), legend=:topleft)

svg

A plot of truncation threshold vs runtime in seconds (lower is better):

plot(trunc_coeffs, times, xscale=:log10, yscale=:log10, marker=:o, label="With coefficient truncation", xticks=sort(trunc_coeffs), ylabel="Runtime [s]", xlabel="Truncation Threshold", yticks=10.0 .^ (-4:1:0), legend=:bottomleft)
hline!([exact_time], color="grey", label="Exact computation", linestyle=:dash, lw=2)

svg

And finally, plot the error against the runtime. This tells you, if you need the result to x accuraccy, you will need y time.

plot(abs_errors, times, xscale=:log10, marker=:o, yscale=:log10, label="With coefficient truncation", xticks=10.0 .^ (-10:2:0), xlabel="Error", ylabel="Runtime [s]", yticks=10.0 .^ (-4:1:0), legend=:bottomleft)
hline!([exact_time], color="grey", label="Exact computation", linestyle=:dash, lw=2)

svg

This highlights a very important concept: There is a time vs accuracy trade-off where one can get away with orders of magnitude less computing resources compared to exact results if one is happy with a certain error. The problem: You may never know what the exact result is, and thus you won't truly know how accurate the results are. You have to check for "convergence", which is a stabilization of the result as you lower the truncation. See for example this plot of the expectation value without the exact value for reference:

plot(trunc_coeffs, expectations, xscale=:log10, marker=:o, label="", xticks=sort(trunc_coeffs), xlabel="Truncation Threshold", ylabel="Expectation Value", legend=:topleft)

svg

An important note about this notebook: The runtime and errors for different combinations of observable, circuit, and initial state vary tremedously. Truly challenging simulations may require min_abs_coefficient below 1e-6 or max_weight significantly above 10. This will require heavy compute and perhaps new truncation rules or Pauli merging strategies (but this is left for another tutorial).