Clifford-Based Circuit Cutting For Quantum Simulation
Kaitlin N. Smith§, Michael A. Perlin§, Pranav Gokhale, Paige Frederick,
David Owusu-Antwi, Richard Rines, Victory Omole, and Frederic T. Chong
Super.tech, a division of Infleqtion
§ These authors contributed equally.
arXiv:2303.10788v1 [quant-ph] 19 Mar 2023

Abstract: Quantum computing has the potential to provide exponential speedups over classical computing for many important applications. However, today’s quantum computers are in their early stages, and hardware quality issues hinder the scale of program execution. Benchmarking and simulation of quantum circuits on classical computers is therefore essential to advance the understanding of how quantum computers and programs operate, enabling both algorithm discovery that leads to high-impact quantum computation and engineering improvements that deliver more powerful quantum systems. Unfortunately, the nature of quantum information causes simulation complexity to scale exponentially with problem size.

In this paper, we debut Super.tech’s SuperSim framework, a new approach for high-fidelity and scalable quantum circuit simulation. SuperSim employs two key techniques for accelerated quantum circuit simulation: Clifford-based simulation and circuit cutting. Through the isolation of Clifford subcircuit fragments within a larger non-Clifford circuit, resource-efficient Clifford simulation can be invoked, leading to significant reductions in runtime. After fragments are independently executed, circuit cutting and recombination procedures allow the final output of the original circuit to be reconstructed from fragment execution results. Through the combination of these two state-of-the-art techniques, SuperSim is a product for quantum practitioners that allows quantum circuit evaluation to scale beyond the frontiers of current simulators. Our results show that Clifford-based circuit cutting accelerates the simulation of near-Clifford circuits, allowing hundreds of qubits to be evaluated with modest runtimes.

I. INTRODUCTION

Quantum computing is an emerging information processing paradigm that shows great theoretical promise for applications such as chemistry [24], optimization [33], cryptography [45], and machine learning [4], among others. Because quantum bits, or qubits, have the ability to demonstrate quantum superposition, interference, and entanglement, quantum algorithms enable significant speedups when applied toward certain classes of problems, usually those characterized by a large search space in which the “optimal” solution lives.

Quantum circuit simulation with classical hardware offers an effective means to quantify performance and troubleshoot potential issues within quantum programs without needing direct access to quantum hardware. Through quantum circuit simulation, the computational power of quantum and classical computers can be distinguished, enabling the discovery of applications that have non-trivial quantum advantage. Additionally, quantum circuit simulators that faithfully reflect real machine noise are a viable pathway to identify quantum computer (QC) hardware features critical to the success of quantum algorithms, leading to engineering improvements that deliver more powerful quantum systems. Unfortunately, state-of-the-art classical methods for quantum circuit simulation suffer from scaling challenges: resource requirements grow exponentially with the size of the quantum problem. While this inability to easily translate many quantum computations into a classical equivalent enables QCs to demonstrate advantage when applied toward certain domains, it also hinders the understanding of large-scale quantum processing. Fortunately, not all classical simulation of quantum circuits is characterized by intractable scaling. If a circuit is expressed with a special family of operations, known as Clifford operations, quantum circuit simulation becomes efficient on classical hardware, with a classical simulation complexity that grows quadratically with quantum system size.

In this industry-track paper, we describe Super.tech’s development of SuperSim, the first simulator to unite two techniques: Clifford circuit simulation and circuit cutting. Clifford circuit simulation is the efficient simulation of an important class of quantum circuits found in many critical quantum applications. Circuit cutting is a divide-and-conquer framework for simulating large quantum circuits by cutting them into smaller subcircuits that can be independently executed on smaller QCs. By uniting these techniques, SuperSim expands the set of quantum applications for which classical simulation is tractable and helpful.

The class of circuits that SuperSim addresses is referred to as near-Clifford circuits. Prior work has also addressed near-Clifford circuits. However, our approach is the first to leverage the recent development of quantum circuit cutting. We also benefit from the recent emergence of new applications of near-Clifford circuits to solving chemistry [42] and optimization [32] problems.

SuperSim is not a panacea for simulating near-Clifford circuits. Although our specific setting has some computationally advantageous factors over generic circuit cutting (see Section IX), SuperSim still shares the limitations of generic circuit cutting. Specifically, if a circuit requires many (k) cuts to partition into Clifford and non-Clifford subcircuits, the reconstruction time (scaling as 4^k) can be intractable. Despite this asymptotic limitation in the limit of a large number of cuts, we show in Section VII that in two example applications, SuperSim offers orders-of-magnitude faster simulation over other state-of-the-art approaches. We hypothesize that
additional near-Clifford quantum circuits with potential for improved classical simulation via SuperSim have yet to be discovered. These advantages are apparent even in a straightforward, naive implementation of SuperSim; we discuss how the performance of SuperSim can be further improved by leveraging parallelization and GPU acceleration in Section X.

The launch of SuperSim will encompass the release of the open-source codebase, as well as accompanying documentation, tutorials, and product framework that are currently under development. Our hope is that SuperSim will be an invaluable tool for architects and quantum practitioners, who will apply it to study near-Clifford circuits pertinent to error correction design and to end-user applications. Moreover, we will release SuperSim under an open-source license to encourage developers to continuously improve the codebase and expand SuperSim’s applications. From an industry standpoint, this open-source model aligns with our strategic goals: we anticipate commercial benefit if and when users go beyond local simulators to running on classical accelerators (e.g., GPUs, TPUs) and real quantum hardware.

This paper provides an evaluation of the SuperSim framework: a new software solution that accelerates the classical simulation of near-Clifford quantum circuits. As SuperSim is in its alpha version, we plan to release upcoming improvements in terms of infrastructure, verification, and benchmarking that are part of Super.tech’s product roadmap. The remainder of this paper is structured as follows. Section II provides background information on quantum computation that aids paper comprehension. Section III motivates the development of the SuperSim framework by describing the limitations involved with the classical simulation of quantum circuits, the benefits of Clifford simulation, and the basics of quantum circuit cutting. Section IV explains the importance of near-Clifford circuits and provides example applications for their simulation. Section V introduces SuperSim itself. Section VI describes our methodology to evaluate SuperSim when targeted toward select applications described in Section IV. Section VIII includes a discussion of the implications of this paper’s preliminary SuperSim evaluations as well as related work. Section IX discusses in-progress framework optimizations that are specific to SuperSim’s Clifford-based cutting techniques. Section X presents other general SuperSim performance improvements outlined in our development roadmap. Section XII describes lessons learned and how we plan to improve the evaluation of SuperSim. Finally, Section XIII summarizes this work’s findings and provides conclusions.

II. QUANTUM COMPUTING FUNDAMENTALS

This section contains an overview of quantum computing fundamentals that are essential to our study. The subsections here are not all-inclusive. The authors encourage the curious reader to find a more detailed discussion of the following topics in Ref. [34].

A. Quantum Information

An isolated qubit can hold a complex-valued linear combination (superposition) of the basis states |0⟩ and |1⟩, which are identified with the standard (“computational”) basis of a 2-dimensional vector space. Upon measurement in the computational basis {|0⟩, |1⟩}, a qubit in the state |ψ⟩ = α|0⟩ + β|1⟩ collapses onto a classical outcome of either |0⟩ or |1⟩, respectively, with probabilities |α|^2 and |β|^2. The magnitude of a quantum state vector is thereby always equal to one: | |ψ⟩ |^2 = |α|^2 + |β|^2 = 1. A state of many qubits can be represented by tensor products of single-qubit state vectors, as well as linear combinations of such tensor products. Quantum operations are represented by matrices U that transform state vectors; these matrices are unitary, meaning that the inverse of U is its conjugate transpose, U^{-1} = U†, so that U U† = U† U = 1. A consequence of unitarity is that the numbers of input and output qubits of a quantum circuit are equal.

Single-qubit operations frequently used in quantum computation include the bit-flip operator X, which exchanges |0⟩ with |1⟩, the phase-flip operator Z, which takes α|0⟩ + β|1⟩ to α|0⟩ − β|1⟩, and the combined operation Y = −iZ · X. These primitives, along with the identity gate I, are referred to as Pauli gates. When a Pauli operation has a control qubit (CX, CY, CZ), the two-qubit operation transforms the state of the target qubit with a Pauli gate depending on the state of the control.

B. Quantum Applications

In the near term, among the most promising applications for QCs are variational quantum algorithms (VQAs). In a VQA, a cost function is defined and encoded into a parameterized quantum circuit, referred to as an ansatz. VQAs can be thought of as hybrid algorithms: a classical optimizer tunes the ansatz parameters based on circuit outcomes over many QC evaluations, working toward a solution, the quantum state prepared by the circuit, that either minimizes or maximizes the target cost function. VQAs have a wide range of applications including quantum chemistry [41] and optimization [33]. Two key types of VQAs are the variational quantum eigensolver (VQE) and the quantum approximate optimization algorithm (QAOA). These algorithms are well-poised for near-term demonstrations of quantum advantage because of their ability to adapt to the intrinsic noise profile of a QC.

Current quantum hardware is limited in that it (a) assigns each logical qubit of an algorithm to a single physical qubit in hardware, and (b) suffers from noise (i.e., hardware errors). Indeed, many quantum algorithms, such as Shor’s algorithm for quantum factoring [45] and quantum algorithms that accelerate solving linear equations [23], are extremely sensitive to noise, which can irreversibly corrupt the result of a computation. A central goal in quantum hardware design is therefore to achieve physical error rates below the thresholds required for fault-tolerant quantum error correction (QEC). In fault-tolerant QEC, a logical qubit is encoded into the state of many physical qubits, in such a way that physical errors can be diagnosed and corrected without corrupting the state of the logical qubit.
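The single-qubit conventions of Section II-A can be checked numerically. The following is a minimal sketch in plain Python (nested lists, no external libraries); the helper names `matmul`, `dagger`, `scale`, `apply_gate`, `close`, and `norm_squared`, and the example amplitudes, are illustrative choices and not part of SuperSim. It builds Y from the relation Y = −iZ · X, verifies that each Pauli gate is unitary, and confirms that applying a gate preserves the normalization |α|^2 + |β|^2 = 1.

```python
def matmul(a, b):
    """Multiply two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose of a square matrix."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def scale(c, a):
    """Multiply every matrix entry by the scalar c."""
    return [[c * x for x in row] for row in a]

def apply_gate(u, psi):
    """Apply matrix u to state vector psi."""
    return [sum(u[i][k] * psi[k] for k in range(len(psi))) for i in range(len(u))]

def close(a, b, tol=1e-12):
    """Entry-wise comparison of two matrices up to a tolerance."""
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def norm_squared(psi):
    """Sum of squared amplitude magnitudes, i.e. total probability."""
    return sum(abs(amp) ** 2 for amp in psi)

I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]            # bit flip: exchanges |0> and |1>
Z = [[1, 0], [0, -1]]           # phase flip: beta -> -beta
Y = scale(-1j, matmul(Z, X))    # combined operation Y = -i Z X

# Example state alpha|0> + beta|1> with |alpha|^2 + |beta|^2 = 1 (arbitrary choice).
psi = [0.6, 0.8j]

for name, gate in (("I", I2), ("X", X), ("Y", Y), ("Z", Z)):
    unitary = close(matmul(gate, dagger(gate)), I2)
    preserved = abs(norm_squared(apply_gate(gate, psi)) - 1.0) < 1e-12
    print(name, "unitary:", unitary, "norm preserved:", preserved)
```

Because every gate here is unitary, the measurement probabilities of the transformed state still sum to one, which is the property the text relies on.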
Current QC error rates are rapidly approaching, and have in some cases surpassed, the theoretical thresholds required to run the surface code (∼1%) [15]. Going forward, achieving useful, fault-tolerant, error-corrected quantum computation requires further reducing physical error rates while simultaneously scaling up qubit numbers.

C. Clifford Circuits

The Clifford group is a set of quantum computing operations that transform Pauli strings (i.e., tensor products of single-qubit Pauli operators) into Pauli strings by conjugation; that is, if P is the set of Pauli strings and C the Clifford group, then c p c^{-1} ∈ P for all p ∈ P and c ∈ C [20]. The Clifford group consists of single-qubit Pauli gates, the square roots of these gates (√X, √Y, √Z), the Hadamard gate H = (Z + X)/√2, the two-qubit controlled-X gate, and all operations that can be obtained by composing these gates. Smaller qubit rotations that allow fine-grained control of qubit state, such as Z^{1/4} = T, are non-Clifford. The Clifford group does not provide a universal set of quantum gates and cannot be used for arbitrary quantum computation [19]. However, there are important quantum domains that have applications focused on the Clifford space, including quantum networks [53], error-correcting codes [43], teleportation [21], and error mitigation [8], [46].

A striking and important property of the Clifford group, summarized by the Gottesman-Knill theorem, is that quantum circuits consisting solely of Clifford operations are efficiently simulable on a classical computer [19]. This insight has been applied toward quantum circuit optimization such as dynamical decoupling scheduling [10] and improved ansatz initialization for variational quantum algorithms (VQAs) [42]. While using only Clifford gates provides the advantage of tractable simulation, the use of a non-universal gate set during quantum computation prohibits the quantum system from exploring the full richness of the quantum state space.

III. PRIOR WORK AND MOTIVATION

In this section, we present state-of-the-art techniques for quantum circuit simulation with classical hardware. We also describe the theory that enables quantum circuit cutting. Both of these techniques provide inspiration for the SuperSim framework.

A. Classical Simulation of Quantum Circuits

Quantum circuit simulation with classical hardware offers an effective means to quantify performance and troubleshoot potential issues within quantum algorithms without needing direct access to quantum hardware. Additionally, quantum circuit simulators that faithfully reflect real machine noise are a viable pathway to identify QC hardware features critical to the success of quantum algorithms. Unfortunately, state-of-the-art classical methods for quantum circuit simulation suffer from scaling challenges. As algorithms increase in qubit count n (“circuit width”), the memory requirements to store a quantum state vector grow exponentially, as 2^n. Even using high-performance supercomputers, state vector simulation thus rapidly becomes intractable for growing qubit numbers. Current classical quantum circuit simulations appear to be bounded at 61 qubits for quantum application circuits [55] and, conditionally, 100 qubits for random quantum circuits [29]. In addition, noisy simulation that attempts to model the coherence and gate characteristics of real quantum hardware escalates the complexity of classical quantum circuit simulation even further.

Improving the viability of quantum circuit simulation on classical hardware is an active area of research. Many approaches have been proposed, such as those based on state vector representation [12], Feynman path integrals [27], decision diagrams [22], ZX calculus [26], and tensor networks [30], [31], [37], [44], [54]. Exact quantum simulators, such as statevector simulators, have especially poor scaling because they faithfully capture quantum state in its entirety. Approximate simulators, like tensor network simulators based on matrix product states [56], can exchange accuracy for scalability, producing outcomes with error margins. Approximate methods can be exceedingly accurate in restricted scenarios, such as low or geometrically local entanglement, but these methods can fail catastrophically outside their limited regimes of applicability.

Prior work estimates that at minimum, millions of high-quality physical qubits will be required for fault-tolerant quantum computation [35]. These qubits will implement error correcting codes, and since QEC circuits primarily consist of Clifford operations, classical simulation of n qubits executing QEC code cycles is possible in time poly(m, n), where m is the number of gates in the circuit [25]. Quantum circuits including tens of thousands of qubits and millions of operations can be evaluated with Clifford simulation frameworks such as Stim [17]. In a Clifford simulator, the noise model can only include Pauli channels, which can be thought of as probabilistic Pauli operations interspersed throughout a circuit. Clifford simulators are incapable of representing more realistic non-Clifford errors that occur in real quantum hardware.

Stim is a high performance simulation framework for quantum stabilizer circuits [17]. Fig. 1 provides insight into the runtime differences between Clifford and (naive) non-Clifford simulation with randomly generated circuits that range from 2 to 20 qubits in size. The depth of these circuits is equal to their width (qubit number), and results are averaged over 100 randomly generated circuits. The Clifford circuit simulator, Stim, runs significantly faster than the statevector simulator. The significant performance improvements of the Stim simulator motivate its integration within SuperSim. However, additional methods are required to handle non-Clifford circuit elements during SuperSim simulation.

When the number of qubits in a circuit is large but the number of non-Clifford operations comparatively small, the classical simulation complexity of quantum circuits scales polynomially in the number of qubits and Clifford gates but exponentially with the addition of each non-Clifford gate. Thus, recent work has explored how to efficiently simulate near-Clifford circuits, or quantum circuits containing a few non-Clifford gates, typically chosen to be T gates. Examples
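To make the contrast between 2^n statevector memory and polynomial stabilizer memory concrete, here is a rough back-of-the-envelope sketch. It rests on two assumptions that are not taken from this paper: each amplitude is stored as a 16-byte double-precision complex number, and a stabilizer state is stored as a binary tableau of 2n rows by 2n + 1 columns. The function names are illustrative, and real simulators carry additional overhead, so the numbers are order-of-magnitude only.

```python
def statevector_bytes(num_qubits, bytes_per_amplitude=16):
    """Memory for a dense state vector: one complex amplitude per basis state.

    Assumes 16 bytes per amplitude (double-precision complex); this is an
    illustrative assumption, not a figure from the paper.
    """
    return bytes_per_amplitude * 2 ** num_qubits

def tableau_bytes(num_qubits):
    """Memory for an assumed 2n x (2n + 1) binary stabilizer tableau."""
    bits = 2 * num_qubits * (2 * num_qubits + 1)
    return (bits + 7) // 8  # round up to whole bytes

def human_readable(num_bytes):
    """Format a byte count with binary prefixes."""
    units = ["B", "KiB", "MiB", "GiB", "TiB", "PiB", "EiB", "ZiB", "YiB"]
    value = float(num_bytes)
    for unit in units[:-1]:
        if value < 1024:
            return f"{value:.1f} {unit}"
        value /= 1024
    return f"{value:.1f} {units[-1]}"

for n in (20, 30, 40, 61, 100):
    print(f"n={n:3d}  statevector: {human_readable(statevector_bytes(n)):>14}"
          f"  tableau: {human_readable(tableau_bytes(n))}")
```

Each added qubit doubles the statevector footprint, while the tableau grows only quadratically, which is why Clifford simulation remains practical at qubit counts far beyond the reach of exact methods.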
of a fragment, the corresponding qubit must be measured in
below are different in nature and pertain to recently discovered end-user applications of near-Clifford quantum circuits. In these contexts, SuperSim will enable end users of quantum computation to benchmark the capabilities of quantum hardware.

A. Simulations for Quantum Error Correction (QEC)

The best known application of simulating near-Clifford circuits is toward estimating the performance of a QEC circuit against a non-Clifford noise model. In particular, in order to numerically validate a fault-tolerant QEC code, one must run repeated simulations of error correction circuits with stochastic errors injected according to a physical noise model. Even with sophisticated tensor network methods, direct state vector or density matrix simulations have classical computing costs (as measured by both memory footprint and the number of floating-point operations) that scale exponentially with the number of qubits, precluding the numerical investigation of error-correcting codes that involve more than a few dozen qubits.

The dominant approach is therefore to perform stabilizer simulations. While these simulations are compatible with many typical error correcting codes (in that all gates are Clifford gates), the only errors that are supported in this model are Pauli errors, which are highly idealized and cannot capture many physically relevant sources of error in a quantum computer. For example, prevalent noise channels such as amplitude damping and overrotation cannot be captured by stabilizer simulations [3]. Stabilizer-compatible approximations to these noise channels have been shown to substantially underestimate the impact of noise [9]. For example, in [9], it was observed that for a width-5 surface code lattice modeled with a systematic 1° overrotation noise, the estimated logical error rate is 10^{-16} based on a Pauli noise approximation, whereas the actual logical error rate is 10^{-6}, ten orders of magnitude greater.

Simulation approaches for QEC are required that can scale to hundreds of qubits without needing an approximated variant of the noise. In the case of Clifford QEC circuits, even adding a few non-Clifford gates to create a near-Clifford QEC description could improve modeling, creating a more granular picture of the impact error has on quantum computation. A primary motivation for the development of SuperSim was delivering a near-Clifford quantum circuit simulator that helps fill the gap that exists between approximate and realistic noisy QEC simulation.

B. Variational Optimization with CAFQA

As introduced in Section II-B, VQAs are hybrid quantum-classical algorithms in which a classical optimization subroutine makes queries to a QC in the form of gate parameters. The information the QC sends as a response then helps inform the classical optimizer’s next quantum query. An ansatz is the parameterized quantum circuit used within the aforementioned feedback loop, and there are many ansatz structures that can be used within a VQA.

VQE is a type of VQA often applied in quantum chemistry applications. There are a variety of different types of circuit structures for VQE, and the hardware efficient ansatz (HWEA) is particularly promising as it is customizable to a targeted QC, making it low depth and well-suited for near-term quantum hardware [24]. At its core, the HWEA consists of single-qubit rotation gates (the rotation angles are the tunable parameters), a layer of two-qubit entangling gates, and a final layer of single-qubit rotation gates. Alternatively, QAOA is often applied toward optimization problems, such as MaxCut. Here, we study the Sherrington-Kirkpatrick (SK) model for MaxCut on complete graphs with edge weights randomly drawn from {−1, +1} [14]. Our QAOA ansatz matches the SK model exactly, meaning that it requires all-to-all connectivity in the quantum circuit structure.

The expressiveness of a QC’s gate set allows VQAs to be more precise. In addition, more layers in the ansatz (i.e., single-qubit rotations, entangling operations, and single-qubit rotations) help boost VQA accuracy. A challenging problem for both VQE and QAOA is how to initialize the parameters at the start of VQA execution. Parameters can be randomly initialized, but finding approximately optimal angles for the single-qubit gates can help the VQA more rapidly and accurately converge on a desired answer. As a solution, the work in [42] presents the Clifford Ansatz for Quantum Accuracy (CAFQA) that uses an initial ansatz containing only Clifford operations, optimizing ansatz parameters with efficient Clifford simulation. This Clifford-based optimization for VQA demonstrated promising results: convergence to the desired VQA solution was found to be quick and accurate. However, [42] recognizes that the addition of just a single T gate allows the ansatz tuning space to become richer, opening doors for greater VQA accuracy. As discussed and preliminarily evaluated in [42], better VQA initialization results if a few non-Clifford gates are permitted in the VQA ansatz initialization via classical simulation. Unfortunately, the addition of a T gate to a Clifford circuit removes opportunities for tractable scaling, motivating the search for practical Clifford+T simulation solutions. With SuperSim, new opportunities exist for near-Clifford variational optimization with a near-CAFQA ansatz initialization framework.

C. Generative Modeling

In [2] and [16], it was rigorously proven that Clifford circuits exhibit unconditional quantum advantages for generative modeling machine learning tasks. Typically, this advantage is quadratic, which coincides with the fact that classically simulating an n-qubit Clifford circuit requires tracking O(n^2) stabilizer bits. Interestingly however, this quadratic advantage in memory can sometimes be lifted to a quartic O(n^4) advantage in time [2], which has motivated deeper study of Clifford-based quantum speedups.

However, a challenge in both [2] and [16] is to train these machine learning models, which requires non-Clifford gates. SuperSim is ideally matched to this context of primarily Clifford gates, with non-Clifford gates to enable gradient descent
in the parameter landscape. We have initiated collaboration with the authors of [2] and [16] to validate this approach.

D. Fingerprinting

Finally, we envision applying SuperSim to SupercheQ [18], a protocol for quantum speedup in fingerprinting that we released last year. The task of fingerprinting involves validating that two files are equal, against a worst-case adversarial model (which obviates even cryptographic hashes to avoid hash collisions). With just Clifford circuits, SupercheQ’s Incremental Encoding (IE) variant asymptotically matches the best possible classical protocol in space, while attaining a quantum advantage in terms of incrementality. We also show that SupercheQ’s Efficient Encoding (EE) variant, which is entirely non-Clifford, attains an exponential advantage over the best possible classical protocol, though it lacks incrementality. We thus propose using SuperSim to investigate the middle ground, in particular by enriching the Clifford-only space of SupercheQ-EE with a few non-Clifford gates. Work towards studying this is under way by other members of the Super.tech team.

V. SUPERSIM: A CIRCUIT-CUTTING QUANTUM SIMULATION FRAMEWORK

Motivated by the need for scalable quantum circuit simulation solutions and inspired by the performance of Clifford simulation, we developed the SuperSim simulation framework based on Clifford-based circuit cutting. SuperSim is part of the Super.tech family of quantum software solutions, nicely complementing the SuperstaQ optimized “write once, run anywhere” compiler [50] and the SupermarQ suite of application-centric benchmarks [51]. SuperSim is on track to be an open-source and consumer-ready product with a launch planned for summer 2023. The launch will encompass the release of the open-source codebase along with accompanying documentation, tutorials, and product framework that are currently under development.

The SuperSim framework is written in Python. Its three critical components are the circuit cutting algorithm, the fragment evaluator, and the probability distribution reconstruction algorithm. SuperSim uses Cirq [13] for representing and manipulating quantum circuits. The Clifford circuit evaluation invokes the Stim simulation backend while non-Clifford fragments use the Qsim statevector simulator by default [49], although the Cirq statevector simulator is also supported. In the rest of this section, the three main elements of SuperSim will be described.

A. SuperSim Circuit Cutter

The circuit cutter begins the SuperSim flow: it is the first transformer that an input quantum circuit encounters. The cutter is fundamentally a circuit parser that identifies the Clifford and non-Clifford elements within a circuit. As input circuits are intended to be near-Clifford, locations for circuit cuts isolate the non-Clifford operations. After developing a set of cut locations, the original input circuit is separated into disjoint subcircuits, which are referred to as circuit fragments. The circuit fragments are then sent to the SuperSim fragment evaluation module.

B. SuperSim Fragment Evaluator

All circuit fragments are tagged as either Clifford or non-Clifford. Clifford fragments are then passed to a Clifford circuit simulator (Stim), and non-Clifford fragments are passed to an exact (e.g., statevector) simulator. Each fragment induces several fragment variants, where one variant of a fragment corresponds to the fragment with some additional operations attached at the locations of cuts incident to that fragment. Fragment inputs and outputs are designated as follows:
• Circuit Input: An input of the original, uncut circuit. All circuit input qubits are initialized in the |0⟩ state, so no additional operations are associated with circuit inputs.
• Circuit Output: An output of the original, uncut circuit. All circuit output qubits are measured in the computational basis {|0⟩, |1⟩}, so no additional operations are required prior to measuring the circuit outputs on fragments.
• Quantum Input: An input of the fragment, but not of the original circuit. Every quantum input is downstream of a cut in the original circuit. Additional operations are required to prepare different states at quantum inputs (see Fig. 2).
• Quantum Output: An output of the fragment, but not of the original circuit. Every quantum output is upstream of a cut in the original circuit. Additional operations are required to measure quantum output qubits in different bases (see Fig. 2).

After generating a batch of variants for each Clifford fragment, where one variant corresponds to a fixed choice of initial states and final measurement bases for that fragment, a simulator determines the probability distribution over measurement outcomes at the circuit outputs of each variant. These probability distributions, tagged by associated choices of initial states and measurement bases, are later classically postprocessed to reconstruct a probability distribution over measurement outcomes for the original circuit [40]. While the current version of SuperSim simulates fragment variants sequentially, future iterations of SuperSim will support parallel simulation strategies that will yield significant reductions in runtime [39].

In principle, tensor network methods can be used to collect all necessary non-Clifford fragment data directly without needing to simulate different fragment variants. In practice, the non-Clifford fragments in the applications considered in this work are so small that this optimization would at best result in negligible performance improvements. Nonetheless, to accommodate unforeseen applications for which a significant fraction of classical computing resources is devoted to non-Clifford fragment simulation, future iterations of SuperSim will use a tensor network simulator for non-Clifford fragments.
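The tagging and variant-generation steps of the fragment evaluator can be sketched in a few lines of plain Python. This is a hypothetical illustration, not SuperSim's code: the gate-name strings, the function names `choose_backend` and `fragment_variants`, and in particular the choice of four preparation states and three measurement bases per cut are assumptions made here for concreteness (the text defers the actual choices to Fig. 2 and [40]).

```python
from itertools import product

# Assumed gate labels for Clifford operations (Paulis, their square roots,
# Hadamard, and controlled Paulis). Labels are illustrative only.
CLIFFORD_GATES = frozenset(
    {"I", "X", "Y", "Z", "H", "S", "SX", "SY", "CX", "CY", "CZ"}
)

# Assumed state preparations at quantum inputs and measurement bases at
# quantum outputs; the real sets used by SuperSim may differ.
PREP_STATES = ("Z+", "Z-", "X+", "Y+")
MEAS_BASES = ("X", "Y", "Z")

def choose_backend(gate_names):
    """Tag a fragment: all-Clifford fragments go to a stabilizer simulator,
    anything else goes to an exact (statevector) simulator."""
    if all(name in CLIFFORD_GATES for name in gate_names):
        return "clifford"
    return "statevector"

def fragment_variants(num_quantum_inputs, num_quantum_outputs):
    """Yield one (preparations, bases) pair per fragment variant.

    Circuit inputs and circuit outputs need no extra operations, so only
    quantum inputs and quantum outputs multiply the number of variants.
    """
    for preps in product(PREP_STATES, repeat=num_quantum_inputs):
        for bases in product(MEAS_BASES, repeat=num_quantum_outputs):
            yield preps, bases

# Example: a fragment downstream of one cut and upstream of two cuts.
variants = list(fragment_variants(num_quantum_inputs=1, num_quantum_outputs=2))
print(choose_backend(["H", "CX", "S"]), choose_backend(["H", "T"]))
print(len(variants), "variants; first:", variants[0])
```

Under these assumptions the number of variants grows exponentially with the number of cuts incident to a fragment, which is one reason simulating the variants in parallel is an attractive optimization.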
C. SuperSim Distribution Builder (and any Z a operation) is included in the basis gate set. Up
The final key component of SuperSim combines fragment to 63 qubits are supported in a circuit.
simulation data to build, or reconstruct, the probability distri- Qiskit Matrix Product State Simulator: A tensor-network
bution outcomes of the original, uncut circuit using the mea- simulator that relies on Matrix Product States (MPSs). Tensor-
surement data of the fragment variants. The theoretical details network simulators in general are able to scale to larger
of fragment data postprocessing and combination are provided quantum systems as compared to statevector simulators at the
in [40]. In a nutshell, postprocessing fragment data first cost of depth. Thus this simulator can support more qubits
involves applying maximum-likelihood corrections to build than statevector and extended stabilizer simulators as long as
fragment models that are self-consistent, thereby mitigating entanglement between qubits is low and critical path is small.
the effect of sampling error that is unavoidable in probabilistic A large gate set is supported.
Clifford simulation methods, as well as circuit executions Statevector (SV) Simulator: statevector simulation, such
on real quantum hardware. After constructing self-consistent as with the Qsim simulator, uses exact methods to represent
fragment models, individual probabilities for measurement all 2n state amplitudes.
outcomes in the original circuit can be expressed as the result B. Benchmark Circuits
of a tensor network contraction, with one tensor per fragment.
This tensor network has a number of edges equal to the number Simulator performance is currently evaluated with three
of cuts in the original circuit, resulting in an overall benchmark circuit types, with varying numbers of qubits.
computational cost of contraction that is roughly exponential in the These circuits were chosen according to the applications
number of cuts. Altogether, the current version of SuperSim proposed in Section IV. In the future, we will sharpen these benchmarks,
uses the codes provided in [40] for postprocessing fragment and additionally benchmark SuperSim against other simulators
data and building a probability distribution over measurement in a wider array of test circuits, as discussed in Section XII.
outcomes at the end of a recombined circuit. Future iterations Near-Clifford VQE HWEA: As discussed in Section IV-B,
of SuperSim will leverage additional techniques to parallelize the near-Clifford HWEA is a helpful tool for VQE optimiza-
the post-processing of fragment data [39], which is currently tion. The base structure for this quantum circuit (or a single
implemented in a sequential fashion. round) is a Clifford HWEA consisting of a single layer of
As a final point, we emphasize that SuperSim does not single-qubit gates, a layer of entangling stages, and final layer
rely on any approximations; its only source of inaccuracy is of single-qubit gates. Together, these three layers of gates
statistical error from sampling the outputs of Clifford circuit constitute a single layer of the HWEA. Circuit width (number
fragments. SuperSim is thereby guaranteed to be at least as ac- of qubits), depth (total number of layers), and non-Clifford
curate as the evaluation of a circuit on real quantum hardware, gate injections are varied during experimentation.
with which sampling error is fundamental and unavoidable. In Near-Clifford MaxCut QAOA: Similar to the near-Clifford
addition, the SuperSim framework is readily amenable to VQE HWEA, this benchmark is used for optimization of
so-called “strong simulation” of a quantum circuit, whereby the QAOA initialization. Since the ansatz matches the SK model
probability to observe a particular bitstring at the output of the exactly, each round requires a significant amount of qubit-qubit
circuit can be computed to machine precision without added connectivity.
computational overheads. Phase Flip Repetition Code: The repetition code is a
classical error correcting code that constitutes an important
VI. METHODS precursor to QEC. In our preliminary experiments to evaluate
In this section, we describe our experimental environment SuperSim performance for QEC applications, we use a single
used to analyze SuperSim performance. This includes the round of the phase code implementation included in Super-
simulators compared against SuperSim, the chosen benchmark marQ [51].
circuits, and the experimental platform.
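As a rough illustration of the detection-only behavior of the phase flip repetition code benchmark, the following is a minimal classical toy model, not the SupermarQ circuit itself. It assumes each data qubit is tracked only by its sign in the X basis (0 for |+>, 1 for |->), so that a phase flip toggles the sign and neighboring-pair parities play the role of the syndrome. The function names are hypothetical.

```python
def apply_phase_flips(signs, flipped_qubits):
    """Toggle the X-basis sign (0 for |+>, 1 for |->) of each listed qubit."""
    out = list(signs)
    for q in flipped_qubits:
        out[q] ^= 1
    return out


def syndrome(signs):
    """Parity of each neighboring pair; a 1 marks a detected disagreement."""
    return [signs[i] ^ signs[i + 1] for i in range(len(signs) - 1)]


# Five data qubits, all prepared in |+>, with one phase flip on qubit 2.
clean = [0, 0, 0, 0, 0]
noisy = apply_phase_flips(clean, [2])
print(syndrome(clean))  # no parity check fires
print(syndrome(noisy))  # the two checks adjacent to qubit 2 fire
```

The syndrome only flags where neighboring signs disagree; no correction is applied, mirroring the detection-only benchmark.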
C. Experiment Details
A. Included Simulators SuperSim will be a product targeted for a wide range of
Performance of the SuperSim framework is compared to end users, from students in an academic setting to users in
the simulators available in Qiskit [47], IBM’s open-source industry. As such, results of this work were generated using
quantum software development kit, as well as to statevector both a MacBook Pro with a 2.4 GHz i9 Intel Processor (32
simulation. As a note, the SuperSim framework builds a GB of RAM) as well as a Linux server with up to 16 vCPU
probability distribution from the fragment simulation results. (32 GB of RAM).
As a result, we used all of the below simulators as samplers, To quantify accuracy, we frequently include simulation
using 5000 shots to build output distributions. fidelity along with runtime. On dense distributions, like those
Qiskit Extended Stabilizer Simulator: A Clifford+T sim- often seen in VQAs, we use the Hellinger fidelity of marginal
ulator that is an implementation of [5]. The extended stabi- probability distributions for measurement outcomes on indi-
lizer uses ranked-stabilizer decomposition and is approximate. vidual qubits, as compared to ideal statevector simulations.
Complexity scales with the number of non-Clifford gates as This is motivated by emerging research that supports training
non-Cliffords set the number of stabilizer terms. The T gate VQAs on a sum of local cost functions instead of a single
global cost function for scalability [6]. On sparse distributions takes considerably less time than if a T gate is cut out of the
(i.e. ones with one or few measured observables), we use middle of a single large Clifford circuit.
Hellinger fidelity on the complete distribution.
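As a minimal sketch of the two accuracy metrics described above, the following assumes the common definition of Hellinger fidelity, F(p, q) = (sum_x sqrt(p(x) q(x)))^2, and assumes that per-qubit marginal fidelities are averaged; neither the formula nor the aggregation rule is spelled out in this paper. All function names are hypothetical.

```python
from math import sqrt


def hellinger_fidelity(p, q):
    """Hellinger fidelity of two distributions given as outcome->probability dicts."""
    outcomes = set(p) | set(q)
    overlap = sum(sqrt(p.get(x, 0.0) * q.get(x, 0.0)) for x in outcomes)
    return overlap ** 2


def single_qubit_marginal(dist, qubit):
    """Marginal distribution of one qubit from a bitstring->probability dict."""
    marginal = {"0": 0.0, "1": 0.0}
    for bits, prob in dist.items():
        marginal[bits[qubit]] += prob
    return marginal


def mean_marginal_fidelity(p, q, num_qubits):
    """Average (an assumption) of the per-qubit marginal Hellinger fidelities."""
    fidelities = [
        hellinger_fidelity(single_qubit_marginal(p, i), single_qubit_marginal(q, i))
        for i in range(num_qubits)
    ]
    return sum(fidelities) / num_qubits


ideal = {"00": 0.5, "11": 0.5}
uniform = {"00": 0.25, "01": 0.25, "10": 0.25, "11": 0.25}
print(hellinger_fidelity(ideal, uniform))         # fidelity on the complete distribution
print(mean_marginal_fidelity(ideal, uniform, 2))  # fidelity on single-qubit marginals
```

In this example the two distributions have identical single-qubit marginals, so the marginal metric cannot distinguish them while the complete-distribution metric can.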
B. QAOA Application Evaluation
VII. RESULTS In this section, we analyze SuperSim’s performance on
the near-Clifford MaxCut QAOA benchmark. This evaluation,
A. VQE Application Evaluation
seen in Fig. 6, was completed on server-based compute
Classical simulation of a near-Clifford HWEA can be applied resources, and experiment timeout was set to 30 minutes. Once
toward the near-CAFQA framework proposed in Section again, we analyze the relationship between simulation runtime
IV-B. To begin, we analyze the relationships between simulation and circuit width. Accuracy is included in this analysis - all
runtime and circuit width. We generate VQE HWEA data points had a fidelity higher than ∼ 0.99.
circuits that range from two to 38 qubits in size. Each circuit The QAOA benchmark is generated with sizes ranging from
has five layers of entangling operations sandwiched between three to 26 qubits. One round of QAOA is implemented that
single-qubit parameterized (Clifford) operations. The results features all-to-all connectivity among circuit qubits, and one
for time vs. qubits for the HWEA benchmark with five HWEA T gate is randomly injected into the circuit. To generate
layers and one randomly injected T gate are provided in each datapoint in Fig. 6, five experiments are executed, and
Fig. 3. Results average multiple simulation experiments (five) results are averaged to create each runtime datapoint / fidelity
for a generated benchmark and randomly placed T gate. The outcome. Similar to the results in Fig. 5, we initially see MPS
SuperSim Clifford cut simulator, Qiskit extended stabilizer and the statevector simulator outperforming SuperSim, but a
simulator, and Qiskit MPS simulator are compared to stat- point of crossover is eventually reached. These results further
evector simulation. Accuracy is included in this analysis - all reinforce the viability of SuperSim in VQA optimization.
data points had a fidelity higher than ∼ 0.99 when compared
to statevector outcomes. For smaller circuits, the runtime of C. QEC Application Evaluation
SuperSim is larger than that of all other simulators by nearly an order This section describes preliminary results for SuperSim
of magnitude, with the exception of Qiskit extended stabilizer. performance in the domain of QEC. Here, we use the SupermarQ
However, at about 25 qubits, we observe a crossover where implementation of the phase repetition code and study runtime
SuperSim becomes the most efficient simulator in terms of scaling with increasing qubit number.
execution time. These results were generated using server- We note that our use of the phase code is an artificial
based resources. “proxy” benchmark that provides merely preliminary results
In the next analysis based on the near-Clifford VQE HWEA for QEC circuit simulation - this benchmark only addresses
benchmark, we focus on how SuperSim and MPS simulator phase flip errors, not bit flip errors, and it merely detects errors
performance relates to circuit depth. To this end, Fig. 4 without correcting them. Fig. 7 shows runtime as a function
shows SuperSim and MPS runtime scaling for the near-Clifford of qubit number for one round of the phase repetition code
HWEA with a fixed width of 20 qubits but a depth that with one randomly injected T gate. On the plot, points are
varies from one to 10 layers of the HWEA, in every case annotated where fidelity to statevector simulation results is
with one injected T gate. We observe that SuperSim begins to less than 1.00. In other words, if a point on the curve does not
demonstrate a strong advantage over MPS at around 6 layers have a label (as seen in Figs. 5, 6), its fidelity is greater than
of the HWEA. These results were generated using a laptop 0.99. In this experiment, a significant disparity was noted in
computer. the accuracy of the extended stabilizer simulator.
For our third experiment with the near-Clifford VQE This initial step toward QEC provides insight toward the
HWEA benchmark, we study large-scale SuperSim scaling - benefits of SuperSim: our framework scales favorably with
up to 300 qubits in system size. Fig. 5 shows the scalability qubit number, while most other simulators have poor runtime
of the SuperSim framework based on Clifford circuit cutting. scaling, taking longer than SuperSim past 25 qubits. Moreover
Results show simulation time as a function of qubit number for the extended stabilizer simulator quickly becomes useless in
HWEA with 5 rounds, up to 300 qubits, and one randomly terms of accuracy for the phase-repetition code.
injected T gate. Fig. 5 was produced with a laptop, and these A clear exception to the favorability of SuperSim over other
results show that SuperSim can simulate circuits that are much methods in Figure 7 is the MPS simulator, which outperforms
larger than are feasible with statevector and extended stabilizer SuperSim for all qubit numbers provided. This exception is an
simulators. Furthermore, the runtimes included in Fig. 5 are artifact of the fact that the circuit for a phase repetition code
generally smaller than the runtimes of other Fig. 5 methods cycle generates very little entanglement between the qubits
past 30 qubits. We note that the runtimes in Fig. 5 do not involved. Whereas SuperSim performance is unaffected by the
increase monotonically with qubit number. This behavior is an degree of entanglement generated in a Clifford circuit, MPS
artifact of simulating, for each data point in Fig. 5, is well-known to fail catastrophically in terms of runtime,
only a single circuit with one randomly injected T gate. If memory footprint, accuracy, or a combination of all three
a T gate happens to cleanly split a circuit into two Clifford (depending on the precise implementation of MPS) in the
sub-circuits, for example, the resulting SuperSim simulation regime of high and nonlocal (“volume-law”) entanglement [7],
[Figure 3 plot: "VQE HWEA, 5 Rounds, 1 Non-Clifford Gate"]
Fig. 3. Averaged simulation time as a function of qubit number for the VQE HWEA with 5 rounds and one randomly injected T gate for a variety of
simulators specified in the legend. Experiments were set to time out at 30 minutes; thus the SV simulator curve is truncated at 28 qubits. SuperSim outperforms
all other simulators past 26 qubits, even as SuperSim continues to have modest runtimes with hundreds of qubits (see Figure 5).
[Figure 4 and Figure 5 plots: y-axis "Simulation Time (s)"]
Fig. 4. Simulation time as a function of VQE HWEA rounds for 20 qubit HWEA base circuit. In every case, a single T gate is injected into the circuit
at a random location. This plot demonstrates SuperSim Clifford cut scalability with circuit depth and entanglement growth, in contrast to the Qiskit MPS
Simulator. Note that SuperSim simulation time in this experiment is insensitive to HWEA rounds because it spends the bulk of its time postprocessing
fragment simulation data, rather than simulating fragments themselves.
Fig. 5. Simulation time as a function of qubit number for the VQE HWEA with 5 rounds and one randomly injected T gate. These results are essentially
the same experiment as in Figure 3, but for qubit numbers up to 300, demonstrating that SuperSim can simulate circuits that are much larger
than is possible with statevector and extended stabilizer simulators. The “noisy” dependence of runtime on qubit number in this figure is an artifact
of simulating, for each qubit number, only a single circuit with one randomly injected T gate. Generally speaking, T gate location can significantly affect
runtime for SuperSim simulation of near-Clifford circuits, especially as the location of the T gate changes the number of fragments that need to be
simulated and recombined.
[54]. Near-future benchmarks that are more faithful to QEC
simulation performance will involve codes that also correct bit-
flip errors, thereby requiring the simulation of highly entangled
states that are simply out of reach for MPS.
gates into benchmarks, the SuperSim circuit cutter is able to
VIII. DISCUSSION identify, isolate, and create fragments for other non-Clifford
The alpha version of the SuperSim framework was gates, such as arbitrary Z^a operations.
motivated, presented, and evaluated in this paper. The novelty of The benefits of quantum circuit simulation via SuperSim
this framework is that it combines the benefits of quantum have potential to unlock improvements in the debugging,
circuit cutting and Clifford quantum circuit simulation while performance quantification, and noise tolerance evaluation
knowing that both of these techniques suffer from their unique of quantum algorithms. Further, SuperSim will help guide
constraints. Circuit cutting scales poorly, and we note that hardware development by pushing the limits of circuit
the number of cuts in SuperSim is bounded by twice the number simulation, which is essential for co-design, assessing architectural
of non-Clifford gates, giving an overall simulation cost that choices in hardware, and pinpointing hardware features that
is exponential in the number of non-Cliffords. Further, the are critical to the success of algorithms. We have ongoing
Clifford-only application domain is limited. Nonetheless, we efforts to optimize and improve the SuperSim framework, but
identify near-Clifford quantum circuits as a target for acceler- it already shows potential as a domain-specific quantum circuit
ated simulation that balances tradeoffs with complexity. As a simulator. Our initial evaluation of SuperSim targeted VQA
note, although our evaluation focused on injecting T (= Z^(1/4)) optimization and QEC applications, and our results highlight
[Figure 6 plot: "QAOA, 1 Round, 1 Non-Clifford Gate"; x-axis: Qubits (5 to 25); legend: SuperSim Clifford cut, SV simulator, Qiskit MPS, Qiskit extended stabilizer; numeric annotations give fidelity where it is below 0.99]
Fig. 6. Averaged simulation time as a function of qubit number for QAOA MaxCut with one round and one randomly injected T gate for a variety of
simulators specified in the legend. Experiments were set to time out at 30 minutes.
[Figure 7 plot: "Repetition Code Simulation, 1 T gate (Phase Code)"; y-axis: Simulation Time (s); x-axis: Qubits (3 to 31); legend: SuperSim Clifford cut, SV simulator, Qiskit MPS, Qiskit extended stabilizer; numeric annotations give fidelity where it is below 0.99]
Fig. 7. Simulation time as a function of qubit number in the simulation of a single phase-correcting repetition code cycle for a variety of simulators specified
in the figure legend. Points are annotated where the fidelity of simulation results compared to exact statevector simulations is at or below 0.99. Note that the
MPS simulator outperforms SuperSim in this benchmark because the repetition code circuit generates very little entanglement between qubits. Future
benchmarks of SuperSim will include fully quantum error correcting codes in high entanglement regimes that are known to be out of reach for MPS [7],
[54].
instances where SuperSim shows advantage over alternative outperforms SuperSim. This finding is not surprising given
open source quantum circuit simulation solutions. For in- the nature of benchmarks chosen, as MPS is an approximate
stance, the experiments in Section VII show that compared simulation scheme that is well-known to perform well in
to the statevector simulation, SuperSim demonstrates signifi- (and was, in fact, designed for) low-entanglement regimes,
cantly lower runtimes as the VQE HWEA, QAOA MaxCut, while failing to provide accurate results for the simulation
and repetition code benchmarks approach 30 qubits in size. of highly entangled states [7], [54]. This fact is seen clearly
Based on the discussion of Section III-A, this is an expected as the HWEA circuit increases in depth in Fig. 4 and MPS
result. However, in both the Fig. 5 and Fig. 7 experiments, simulation time grows exponentially. In future work, we will
SuperSim outperforms the Qiskit extended stabilizer simulator, investigate more near-Clifford quantum circuits that are highly
which is significant as this simulator is based on state-of- entangled, including fully quantum error correcting circuits (as
art Clifford+T simulation methods [5]. We highlight that in opposed to classical repetition codes), as candidate SuperSim
some cases, SuperSim demonstrates runtime improvements of applications. We also plan to explore relationships between
around 100× as compared to the Qiskit extended stabilizer. the number of non-Clifford gates within a near-Clifford circuit
Further, the results in Fig. 7 show that SuperSim outshines and simulation runtime.
Qiskit extended stabilizer in accuracy.
IX. ONGOING: CLIFFORD-SPECIFIC CUTTING OPTIMIZATIONS
SuperSim can scale to hundreds of qubits (Fig. 5) and
simulate these large quantum circuits in tens of seconds. We
must highlight, though, that our benchmarks also have While the circuit cutting procedure described so far
large-qubit regimes (Fig. 7) in which the Qiskit MPS simulator effectively leverages fast Clifford simulation at a per-subcircuit
level, the procedure for stitching together results has thus far Parallelizing these operations has been shown to reduce sim-
been agnostic to whether subcircuits are Clifford or not. As it ulation costs for large circuits [39]. We intend to further im-
turns out, breaking this abstraction barrier leads to substantial prove the efficiency of the SuperSim framework by exploring
performance improvements. As a concrete example, we parallelization methods and leveraging the parallel processing
consider a Clifford subcircuit C_i upon which we must measure power of GPUs and multi-core computing clusters.
a cut qubit in multiple bases (for instance, C_0 in Fig. 2).
Following the standard circuit cutting procedure, we must XI. ONGOING: SUPPORT OF ADDITIONAL FRAGMENT EVALUATORS
make Pauli observable measurements ⟨P⟩ after running C_i. We
next describe two Clifford-specific cutting optimizations: (1) SuperSim is currently a classical simulator of quantum
significantly fewer requisite shots and (2) fewer downstream circuits based on Cirq and Stim simulation backends. Future
stitching calculations. versions of SuperSim will support additional fragment evalua-
The first optimization—fewer requisite shots—stems from tion backends, producing composite probability distributions
the observation that for Clifford circuits, ⟨P⟩ is either −1, 0, or that consist of results from a variety of technologies that
+1 [34]. After rotating to computational basis measurements, include both real QCs and state-of-art classical quantum circuit
these correspond to being in the |1⟩, |+⟩ (or any other equal simulators. SuperSim cuts circuits in order to maximize the
superposition), and |0⟩ states respectively. It turns out this size of Clifford circuits and minimize the size of non-Clifford
significantly reduces the number of shots that are needed circuits. These fragments are simulated classically, but minor
to determine ⟨P⟩ versus generic estimation where statistical adjustments to our framework would enable the non-Clifford
error scales inversely with the square root of the number of shots. By circuits to be run on a real quantum machine. In the future, we
analogy, our setting is akin to being promised that we have a envision the development of dynamic techniques that identify
completely-biased heads coin (-1), fair coin (0), or completely- circuit fragment and quantum machine / circuit simulator
biased tails coin (+1). If we flip the coin a few times and find pairings, running fragment variants on the most resource-
at least one heads and at least one tails, we immediately know efficient backend.
we have a fair coin. In the other two cases, we will get only
heads or only tails. In a similar fashion, in our setting, if we XII. ONGOING: ROADMAP FOR FUTURE EVALUATION
measure just 10 shots and find |1⟩ (|0⟩) every time, we can
be quite confident that ⟨P⟩ is -1 (+1). The results presented here are a preliminary effort to assess
SuperSim’s performance in terms of runtime and processing
The second optimization (fewer downstream stitching
quality in terms of output fidelity. However, they are
calculations) stems from the observation that in Clifford
incomplete, and efforts are in progress to expand the coverage of
circuits, the output state must have ⟨P⟩ = 0 for many Pauli
SuperSim benchmarking to discover the applications for which
observables. Consider the one-qubit case, where the six Clifford
it provides significant performance gains for quantum circuit
states lie at the vertices of the Bloch octahedron. For each of these states,
simulation. Our intuition guided us to explore near-Clifford
two of ⟨X⟩, ⟨Y⟩, and ⟨Z⟩ must be equal to 0. Plugging in zeros
circuits as our first set of experiments, but our results in
to the circuit cutting stitching equations, we find that some of
Section VII lead us to believe that we need to additionally
the “downstream” circuits that would otherwise need to be
search for near-Clifford circuits that are characterized by high
evaluated can be skipped. Moreover, this phenomenon scales
entanglement. Further, we need to increase the scale of our
favorably: for multi-qubit observables, the fraction of non-
evaluation so that we can discover what upper bounds exist in
zero Pauli observables approaches 0, significantly reducing
terms of SuperSim’s qubit capacity. These next steps are all
downstream calculations.
part of the SuperSim development roadmap.
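The two Clifford-specific optimizations of Section IX can be illustrated with a small sketch; this is not the SuperSim implementation, and all function names are hypothetical. The first part classifies ⟨P⟩ from a handful of computational-basis shots using the coin analogy. The second part checks, by direct calculation, that each of the six single-qubit stabilizer states has exactly two vanishing Pauli expectation values. The last function uses the standard fact that an n-qubit stabilizer state has nonzero expectation for exactly 2^n of the 4^n Pauli strings (identity included).

```python
import math


def classify_expectation(shots):
    """Map computational-basis shots (0 or 1) of a rotated Pauli measurement to <P>."""
    if not shots:
        raise ValueError("at least one shot is required")
    if all(s == 0 for s in shots):
        return +1  # every shot found |0>
    if all(s == 1 for s in shots):
        return -1  # every shot found |1>
    return 0       # mixed outcomes: the "fair coin" case


def false_certainty_probability(num_shots):
    """Chance that a <P> = 0 observable gives the same outcome in every shot."""
    return 2 * 0.5 ** num_shots


R = 1 / math.sqrt(2)
STATES = {
    "|0>": (1 + 0j, 0j), "|1>": (0j, 1 + 0j),
    "|+>": (R + 0j, R + 0j), "|->": (R + 0j, -R + 0j),
    "|+i>": (R + 0j, R * 1j), "|-i>": (R + 0j, -R * 1j),
}
PAULIS = {
    "X": ((0, 1), (1, 0)),
    "Y": ((0, -1j), (1j, 0)),
    "Z": ((1, 0), (0, -1)),
}


def expectation(state, pauli):
    """<psi|P|psi> for a single-qubit state (a, b) and a 2x2 Pauli matrix."""
    a, b = state
    (m00, m01), (m10, m11) = pauli
    value = a.conjugate() * (m00 * a + m01 * b) + b.conjugate() * (m10 * a + m11 * b)
    return round(value.real, 12)


def zero_expectations(state):
    """Names of the Pauli observables whose expectation vanishes for this state."""
    return [name for name, p in PAULIS.items() if expectation(state, p) == 0]


def nonzero_pauli_fraction(num_qubits):
    """Fraction of Pauli strings with nonzero expectation on a stabilizer state."""
    return 2 ** num_qubits / 4 ** num_qubits


for name, state in STATES.items():
    print(name, "has zero expectation for", zero_expectations(state))
print(false_certainty_probability(10))
```

With 10 shots, an observable with ⟨P⟩ = 0 is mistaken for ±1 only if all ten outcomes agree, which is the quantity `false_certainty_probability(10)` computes.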
Pending the careful theoretical development and software
implementation of these ideas, such optimizations will be XIII. CONCLUSION
incorporated into future iterations of SuperSim, potentially
providing major computational complexity and runtime im- Quantum computing benefits from quantum circuit simula-
provements. tion for applications such as VQAs and QEC. Exact quan-
tum circuit simulators quickly become intractable to evaluate
X. ONGOING: PERFORMANCE IMPROVEMENTS THROUGH PARALLELIZATION on classical hardware as a quantum system grows in size.
However, there has recently been progress in developing
more efficient quantum circuit simulation techniques for near-
We observe that the following steps in the circuit-cutting Clifford quantum circuits. In this industry track paper, we
process are amenable to parallelization: debut the Super.tech SuperSim framework, a new approach for
• Parsing circuits to find cut locations. high fidelity and scalable quantum circuit simulation based on
• Simulating each variant associated with a circuit frag- Clifford-based simulation and circuit cutting. Our results show
ment. that Clifford-based circuit cutting accelerates the simulation of
• Postprocessing the probability distributions generated by near-Clifford circuits beyond the frontier of alternative
the aforementioned simulations in order to reconstruct the state-of-art techniques, allowing hundreds of qubits to be evaluated with
final probability distribution for the original circuit. modest runtimes.
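Because fragment variants are independent of one another, the second step listed in Section X is straightforward to parallelize. The sketch below is an illustration under assumptions, not SuperSim code: `simulate_variant` is a hypothetical placeholder that returns a dummy distribution, and the variant labels (fragment name, prepared state, measurement basis) are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def simulate_variant(variant):
    """Hypothetical placeholder for simulating one fragment variant."""
    fragment_id, prep_state, meas_basis = variant
    distribution = {"0": 1.0}  # dummy result; a real backend would be called here
    return variant, distribution


def simulate_all_variants(variants, max_workers=4):
    """Simulate independent fragment variants concurrently and collect the results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(simulate_variant, variants))


# Assumed labels: two fragments, four prepared states, three measurement bases.
variants = list(product(["F0", "F1"], ["Z+", "Z-", "X+", "Y+"], ["X", "Y", "Z"]))
results = simulate_all_variants(variants)
print(len(results), "variants simulated")
```

A thread pool keeps the sketch simple; for CPU-bound simulation in Python, a process pool or a cluster scheduler would likely be the more appropriate choice.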
REFERENCES [26] A. Kissinger and J. van de Wetering, “Simulating quantum circuits with
zx-calculus reduced stabiliser decompositions,” Quantum Science and
[1] S. Aaronson and D. Gottesman, “Improved simulation of stabilizer Technology, 2022.
circuits,” Physical Review A, vol. 70, no. 5, p. 052328, 2004. [27] D. E. Koh, M. D. Penney, and R. W. Spekkens, “Computing quopit
[2] E. R. Anschuetz, H.-Y. Hu, J.-L. Huang, and X. Gao, “Interpretable clifford circuit amplitudes by the sum-over-paths technique,” arXiv
quantum advantage in neural sequence learning,” arXiv preprint preprint arXiv:1702.03316, 2017.
arXiv:2209.14353, 2022. [28] J. Liu, A. Gonzales, and Z. H. Saleem, “Classical simulators as quantum
[3] R. S. Bennink, E. M. Ferragut, T. S. Humble, J. A. Laska, J. J. Nutaro, error mitigators via circuit cutting,” arXiv preprint arXiv:2212.07335,
M. G. Pleszkoch, and R. C. Pooser, “Unbiased simulation of near-clifford 2022.
quantum circuits,” Physical Review A, vol. 95, no. 6, p. 062337, 2017. [29] Y. A. Liu, X. L. Liu, F. N. Li, H. Fu, Y. Yang, J. Song, P. Zhao,
[4] J. Biamonte, P. Wittek, N. Pancotti, P. Rebentrost, N. Wiebe, and Z. Wang, D. Peng, H. Chen, C. Guo, H. Huang, W. Wu, and D. Chen,
S. Lloyd, “Quantum machine learning,” Nature, vol. 549, no. 7671, pp. “Closing the ”quantum supremacy” gap: Achieving real-time simulation
195–202, 2017. of a random quantum circuit using a new sunway supercomputer,”
[5] S. Bravyi, D. Browne, P. Calpin, E. Campbell, D. Gosset, and in Proceedings of the International Conference for High Performance
M. Howard, “Simulation of quantum circuits by low-rank stabilizer Computing, Networking, Storage and Analysis, ser. SC ’21. New
decompositions,” Quantum, vol. 3, p. 181, 2019. York, NY, USA: Association for Computing Machinery, 2021. [Online].
[6] M. Cerezo, A. Sone, T. Volkoff, L. Cincio, and P. J. Coles, “Cost function Available: https://doi.org/10.1145/3458817.3487399
dependent barren plateaus in shallow parametrized quantum circuits,” [30] D. Lykov, R. Schutski, A. Galda, V. Vinokur, and Y. Alexeev, “Tensor
Nature communications, vol. 12, no. 1, p. 1791, 2021. network quantum simulator with step-dependent parallelization,” in 2022
[7] J. I. Cirac, D. Perez-Garcia, N. Schuch, and F. Verstraete, “Matrix IEEE International Conference on Quantum Computing and Engineer-
product states and projected entangled pair states: Concepts, symmetries, ing (QCE). IEEE, 2022, pp. 582–593.
theorems,” Reviews of Modern Physics, vol. 93, no. 4, p. 045003, 2021. [31] I. L. Markov and Y. Shi, “Simulating quantum computation by contract-
[8] P. Czarnik, A. Arrasmith, P. J. Coles, and L. Cincio, “Error mitigation ing tensor networks,” SIAM Journal on Computing, vol. 38, no. 3, pp.
with clifford quantum-circuit data,” 2020. 963–981, 2008.
[9] A. S. Darmawan and D. Poulin, “Tensor-network simulations of the [32] M. Medvidović and G. Carleo, “Classical variational simulation of the
surface code under realistic noise,” Physical review letters, vol. 119, quantum approximate optimization algorithm,” npj Quantum Informa-
no. 4, p. 040502, 2017. tion, vol. 7, no. 1, pp. 1–7, 2021.
[10] P. Das, S. Tannu, S. Dangwal, and M. Qureshi, “Adapt: Mitigating idling [33] N. Moll, P. Barkoutsos, L. S. Bishop, J. M. Chow, A. Cross, D. J. Egger,
errors in qubits via adaptive dynamical decoupling,” in MICRO-54: S. Filipp, A. Fuhrer, J. M. Gambetta, M. Ganzhorn et al., “Quantum op-
54th Annual IEEE/ACM International Symposium on Microarchitecture, timization using variational algorithms on near-term quantum devices,”
2021, pp. 950–962. Quantum Science and Technology, vol. 3, no. 3, p. 030503, 2018.
[11] C. M. Dawson and M. A. Nielsen, “The solovay-kitaev algorithm,” arXiv [34] M. A. Nielsen and I. Chuang, Quantum computation and quantum
preprint quant-ph/0505030, 2005. information. Cambridge University Press, 2010.
[12] K. De Raedt, K. Michielsen, H. De Raedt, B. Trieu, G. Arnold, [35] J. O’Gorman and E. T. Campbell, “Quantum computation with realistic
M. Richter, T. Lippert, H. Watanabe, and N. Ito, “Massively parallel magic-state factories,” Physical Review A, vol. 95, no. 3, p. 032338,
quantum computer simulator,” Computer Physics Communications, vol. 2017.
176, no. 2, pp. 121–136, 2007. [36] H. Pashayan, O. Reardon-Smith, K. Korzekwa, and S. D. Bartlett,
[13] C. Developers, “Cirq,” Dec. 2022, See full list of authors “Fast estimation of outcome probabilities for quantum circuits,” PRX
on Github: https://github.com/quantumlib/Cirq/graphs/contributors. Quantum, vol. 3, no. 2, p. 020361, 2022.
[Online]. Available: https://doi.org/10.5281/zenodo.7465577 [37] E. Pednault, J. A. Gunnels, G. Nannicini, L. Horesh, T. Magerlein,
[14] E. Farhi, J. Goldstone, S. Gutmann, and L. Zhou, “The quantum ap- E. Solomonik, and R. Wisnieff, “Breaking the 49-qubit barrier in
proximate optimization algorithm and the sherrington-kirkpatrick model the simulation of quantum circuits,” arXiv preprint arXiv:1710.05867,
at infinite size,” Quantum, vol. 6, p. 759, 2022. vol. 15, 2017.
[15] A. G. Fowler, M. Mariantoni, J. M. Martinis, and A. N. Cleland, “Surface [38] T. Peng, A. W. Harrow, M. Ozols, and X. Wu, “Simulating large quantum
codes: Towards practical large-scale quantum computation,” Physical circuits on a small quantum computer,” Physical Review Letters, vol.
Review A, vol. 86, no. 3, p. 032324, 2012. 125, no. 15, p. 150504, 2020.
[16] X. Gao, E. R. Anschuetz, S.-T. Wang, J. I. Cirac, and M. D. Lukin, “En- [39] M. Perlin, T. Tomesh, B. Pearlman, W. Tang, Y. Alexeev, and
hancing generative models via quantum correlations,” Physical Review M. Suchara, “Parallelizing simulations of large quantum circuits,” 2019.
X, vol. 12, no. 2, p. 021037, 2022. [40] M. A. Perlin, Z. H. Saleem, M. Suchara, and J. C. Osborn, “Quantum
[17] C. Gidney, “Stim: a fast stabilizer circuit simulator,” Quantum, vol. 5, circuit cutting with maximum-likelihood tomography,” npj Quantum
p. 497, 2021. Information, vol. 7, no. 1, pp. 1–8, 2021.
[18] P. Gokhale, E. Anschuetz, C. Campbell, F. Chong, E. Dahl, P. Frederick, [41] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J.
E. Jones, B. Hall, S. Issa, P. Goiporia et al., “Supercheq: Quantum Love, A. Aspuru-Guzik, and J. L. O’brien, “A variational eigenvalue
advantage for distributed databases,” arXiv preprint arXiv:2212.03850, 2022.
[19] D. Gottesman, “The Heisenberg representation of quantum computers,” arXiv preprint quant-ph/9807006, 1998.
[20] D. Gottesman, “Theory of fault-tolerant quantum computation,” Physical Review A, vol. 57, no. 1, p. 127, 1998.
[21] D. Gottesman and I. L. Chuang, “Demonstrating the viability of universal quantum computation using teleportation and single-qubit operations,” Nature, vol. 402, no. 6760, pp. 390–393, 1999.
[22] T. Grurl, J. Fuß, and R. Wille, “Noise-aware quantum circuit simulation with decision diagrams,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 2022.
[23] A. W. Harrow, A. Hassidim, and S. Lloyd, “Quantum algorithm for linear systems of equations,” Physical Review Letters, vol. 103, no. 15, p. 150502, 2009.
[24] A. Kandala, A. Mezzacapo, K. Temme, M. Takita, M. Brink, J. M. Chow, and J. M. Gambetta, “Hardware-efficient variational quantum eigensolver for small molecules and quantum magnets,” Nature, vol. 549, no. 7671, pp. 242–246, 2017.
solver on a photonic quantum processor,” Nature Communications, vol. 5, no. 1, pp. 1–7, 2014.
[42] G. S. Ravi, P. Gokhale, Y. Ding, W. Kirby, K. Smith, J. M. Baker, P. J. Love, H. Hoffmann, K. R. Brown, and F. T. Chong, “CAFQA: A classical simulation bootstrap for variational quantum algorithms,” in Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 1, 2022, pp. 15–29.
[43] J. Roffe, “Quantum error correction: an introductory guide,” Contemporary Physics, vol. 60, no. 3, pp. 226–245, Jul 2019. [Online]. Available: http://dx.doi.org/10.1080/00107514.2019.1667078
[44] U. Schollwöck, “The density-matrix renormalization group in the age of matrix product states,” Annals of Physics, vol. 326, no. 1, pp. 96–192, 2011.
[45] P. W. Shor, “Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer,” SIAM Review, vol. 41, no. 2, pp. 303–332, 1999.
[46] A. Strikis, D. Qin, Y. Chen, S. C. Benjamin, and Y. Li, “Learning-based quantum error mitigation,” 2021.
[25] A. Kerzner, “Clifford simulation: Techniques and applications,” Master’s thesis, University of Waterloo, 2021.
[47] A. tA v, M. S. ANIS, Abby-Mitchell, H. Abraham, AduOffei, R. Agarwal,
G. Agliardi, M. Aharoni, V. Ajith, I. Y. Akhalwaya, G. Aleksandrowicz,
T. Alexander, M. Amy, S. Anagolum, Anthony-Gandon, I. F. Araujo,
E. Arbel, A. Asfaw, A. Athalye, A. Avkhadiev, C. Azaustre,
P. BHOLE, V. Bajpe, A. Banerjee, S. Banerjee, W. Bang, A. Bansal,
P. Barkoutsos, A. Barnawal, G. Barron, G. S. Barron, L. Bello,
Y. Ben-Haim, M. C. Bennett, D. Bevenius, D. Bhatnagar, P. Bhatnagar,
A. Bhobe, P. Bianchini, L. S. Bishop, C. Blank, S. Bolos, S. Bopardikar,
S. Bosch, S. Brandhofer, Brandon, S. Bravyi, Bryce-Fuller, D. Bucher,
L. Burgholzer, A. Burov, F. Cabrera, P. Calpin, L. Capelluto, J. Carballo,
G. Carrascal, A. Carriker, I. Carvalho, R. Chakrabarti, A. Chen, C.-F. Chen,
E. Chen, J. C. Chen, R. Chen, F. Chevallier, K. Chinda, R. Cholarajan,
J. M. Chow, S. Churchill, CisterMoke, C. Claus, C. Clauss,
C. Clothier, R. Cocking, R. Cocuzzo, J. Connor, F. Correa, Z. Crockett,
A. J. Cross, A. W. Cross, S. Cross, J. Cruz-Benito, C. Culver,
A. D. Córcoles-Gonzales, N. D, S. Dague, T. E. Dandachi, A. N. Dangwal,
J. Daniel, M. Daniels, M. Dartiailh, A. R. Davila, F. Debouni,
A. Dekusar, A. Deshmukh, M. Deshpande, D. Ding, J. Doi, E. M. Dow,
P. Downing, E. Drechsler, M. S. Drudis, E. Dumitrescu, K. Dumon,
I. Duran, K. EL-Safty, E. Eastman, G. Eberle, A. Ebrahimi, P. Eendebak,
D. Egger, ElePT, I. Elsayed, Emilio, A. Espiricueta, M. Everitt, D. Facoetti,
Farida, P. M. Fernández, S. Ferracin, D. Ferrari, A. H. Ferrera,
R. Fouilland, A. Frisch, A. Fuhrer, B. Fuller, M. GEORGE, J. Gacon,
B. G. Gago, C. Gambella, J. M. Gambetta, A. Gammanpila, L. Garcia,
T. Garg, S. Garion, J. R. Garrison, J. Garrison, T. Gates, N. Gavrielov,
G. Gentinetta, H. Georgiev, L. Gil, A. Gilliam, A. Giridharan, Glen,
J. Gomez-Mosquera, Gonzalo, S. de la Puente González, J. Gorzinski,
I. Gould, D. Greenberg, D. Grinko, W. Guan, D. Guijo, Guillermo-Mijares-Vilarino,
J. A. Gunnels, H. Gupta, N. Gupta, J. M. Günther,
M. Haglund, I. Haide, I. Hamamura, O. C. Hamido, F. Harkins, K. Hartman,
A. Hasan, V. Havlicek, J. Hellmers, Ł. Herok, R. Hill, S. Hillmich,
C. Hong, H. Horii, C. Howington, S. Hu, W. Hu, C.-H. Huang, J. Huang,
R. Huisman, H. Imai, T. Imamichi, K. Ishizaki, Ishwor, R. Iten, T. Itoko,
A. Ivrii, A. Javadi, A. Javadi-Abhari, W. Javed, Q. Jianhua, M. Jivrajani,
K. Johns, S. Johnstun, Jonathan-Shoemaker, JosDenmark, JoshDumo,
J. Judge, T. Kachmann, A. Kale, N. Kanazawa, J. Kane, Kang-Bae,
A. Kapila, A. Karazeev, P. Kassebaum, T. Kehrer, J. Kelso, S. Kelso,
H. van Kemenade, V. Khanderao, S. King, Y. Kobayashi, Kovi11Day,
A. Kovyrshin, R. Krishnakumar, P. Krishnamurthy, V. Krishnan, K. Krsulich,
P. Kumkar, G. Kus, R. LaRose, E. Lacal, R. Lambert, H. Landa,
J. Lapeyre, D. Lasecki, J. Latone, S. Lawrence, C. Lee, G. Li, T. J. Liang,
J. Lishman, D. Liu, P. Liu, Lolcroc, A. K. M, L. Madden, Y. Maeng,
S. Maheshkar, K. Majmudar, A. Malyshev, M. E. Mandouh, J. Manela,
Manjula, J. Marecek, M. Marques, K. Marwaha, D. Maslov, P. Maszota,
D. Mathews, A. Matsuo, F. Mazhandu, D. McClure, M. McElaney,
J. McElroy, C. McGarry, D. McKay, D. McPherson, S. Meesala,
D. Meirom, C. Mendell, T. Metcalfe, M. Mevissen, A. Meyer, A. Mezzacapo,
R. Midha, D. Millar, D. Miller, H. Miller, Z. Minev, A. Mitchell,
N. Moll, A. Montanez, G. Monteiro, M. D. Mooring, R. Morales,
N. Moran, D. Morcuende, S. Mostafa, M. Motta, R. Moyard, P. Murali,
D. Murata, J. Müggenburg, T. NEMOZ, D. Nadlinger, K. Nakanishi,
G. Nannicini, P. Nation, E. Navarro, Y. Naveh, S. W. Neagle,
P. Neuweiler, A. Ngoueya, T. Nguyen, J. Nicander, Nick-Singstock,
P. Niroula, H. Norlen, NuoWenLei, L. J. O’Riordan, O. Ogunbayo,
P. Ollitrault, T. Onodera, R. Otaolea, S. Oud, D. Padilha, H. Paik, S. Pal,
Y. Pang, A. Panigrahi, V. R. Pascuzzi, S. Perriello, E. Peterson, A. Phan,
K. Pilch, F. Piro, M. Pistoia, C. Piveteau, J. Plewa, P. Pocreau, C. Possel,
A. Pozas-Kerstjens, R. Pracht, M. Prokop, V. Prutyanov, S. Puri, D. Puzzuoli,
Pythonix, J. Pérez, Quant02, Quintiii, R. I. Rahman, A. Raja,
R. Rajeev, I. Rajput, N. Ramagiri, A. Rao, R. Raymond, O. Reardon-Smith,
R. M.-C. Redondo, M. Reuter, J. Rice, M. Riedemann, Rietesh,
D. Risinger, P. Rivero, M. L. Rocca, D. M. Rodríguez, RohithKarur,
B. Rosand, M. Rossmannek, M. Ryu, T. SAPV, N. R. C. Sa, A. Saha,
A. Ash-Saki, A. Salman, S. Sanand, M. Sandberg, H. Sandesara,
R. Sapra, H. Sargsyan, A. Sarkar, N. Sathaye, N. Savola, B. Schmitt,
C. Schnabel, Z. Schoenfeld, T. L. Scholten, E. Schoute, M. Schulterbrandt,
J. Schwarm, P. Schweigert, J. Seaward, Sergi, I. F. Sertage,
K. Setia, F. Shah, N. Shammah, W. Shanks, R. Sharma, P. Shaw, Y. Shi,
J. Shoemaker, A. Silva, A. Simonetto, D. Singh, D. Singh, P. Singh,
P. Singkanipa, Y. Siraichi, Siri, J. Sistos, J. Sistos, I. Sitdikov, S. Sivarajah,
Slavikmew, M. B. Sletfjerding, J. A. Smolin, M. Soeken, I. O. Sokolov,
I. Sokolov, V. P. Soloviev, SooluThomas, Starfish, D. Steenken,
M. Stypulkoski, A. Suau, S. Sun, K. J. Sung, M. Suwama, O. Słowik,
R. Taeja, H. Takahashi, T. Takawale, I. Tavernelli, C. Taylor, P. Taylour,
S. Thomas, K. Tian, M. Tillet, M. Tod, M. Tomasik, C. Tornow, E. de la Torre,
J. L. S. Toural, K. Trabing, M. Treinish, D. Trenev, TrishaPe,
F. Truger, TsafrirA, G. Tsilimigkounakis, D. Tulsi, D. Tuna, W. Turner,
Y. Vaknin, C. R. Valcarce, F. Varchon, A. Vartak, A. C. Vazquez,
P. Vijaywargiya, V. Villar, B. Vishnu, D. Vogt-Lee, C. Vuillot, WQ,
J. Weaver, J. Weidenfeller, R. Wieczorek, J. A. Wildstrom, J. Wilson,
E. Winston, WinterSoldier, J. J. Woehr, S. Woerner, R. Woo, C. J. Wood,
R. Wood, S. Wood, J. Wootton, M. Wright, L. Xing, J. YU,
Yaiza, B. Yang, U. Yang, J. Yao, D. Yeralin, R. Yonekura, D. Yonge-Mallo,
R. Yoshida, R. Young, J. Yu, L. Yu, Yuma-Nakamura, C. Zachow,
L. Zdanski, H. Zhang, E. Zheltonozhskii, I. Zidaru, B. Zimmermann,
B. Zindorf, C. Zoufal, aeddins ibm, alexzhang13, b63, bartek bartlomiej,
bcamorrison, brandhsn, nick bronn, chetmurthy, choerst ibm,
comet, dalin27, deeplokhande, dekel.meirom, derwind, dime10, ehchen,
ewinston, fanizzamarco, fs1132429, gadial, galeinston, georgezhou20,
georgios ts, gruu, hhorii, hhyap, hykavitha, itoko, jeppevinkel, jessica angel7,
jezerjojo14, jliu45, johannesgreiner, jscott2, kUmezawa, klinvill,
krutik2966, ma5x, michelle4654, msuwama, nico lgrs, nrhawkins, ntgiwsvp,
ordmoj, sagar pahwa, pritamsinha2304, rithikaadiga, ryancocuzzo,
saktar unr, saswati qiskit, sebastian mair, septembrr, sethmerkel, sg495,
shaashwat, smturro2, sternparky, strickroman, tigerjack, tsura crisaldo,
upsideon, vadebayo49, welien, willhbang, wmurphy collabstar, yang.luh,
yuri@FreeBSD, and M. Čepulkovskis, “Qiskit: An open-source framework for quantum computing,” 2021.
[48] W. Tang, T. Tomesh, M. Suchara, J. Larson, and M. Martonosi, “CutQC: using small quantum computers for large quantum circuit evaluations,” in Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, 2021, pp. 473–486.
[49] Q. A. team and collaborators, “qsim,” Sep. 2020. [Online]. Available: https://doi.org/10.5281/zenodo.4023103
[50] S. D. Team, “SuperstaQ: Connecting applications to quantum hardware,” www.super.tech/about-superstaq, 2021.
[51] T. Tomesh, P. Gokhale, V. Omole, G. S. Ravi, K. N. Smith, J. Viszlai, X.-C. Wu, N. Hardavellas, M. R. Martonosi, and F. T. Chong, “SupermarQ: A scalable quantum benchmark suite,” in 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA). IEEE, 2022, pp. 587–603.
[52] G. Uchehara, T. M. Aamodt, and O. Di Matteo, “Rotation-inspired circuit cut optimization,” arXiv preprint arXiv:2211.07358, 2022.
[53] V. Veitch, S. A. Hamed Mousavian, D. Gottesman, and J. Emerson, “The resource theory of stabilizer quantum computation,” New Journal of Physics, vol. 16, no. 1, p. 013009, Jan 2014. [Online]. Available: http://dx.doi.org/10.1088/1367-2630/16/1/013009
[54] G. Vidal, “Efficient classical simulation of slightly entangled quantum computations,” Physical Review Letters, vol. 91, no. 14, p. 147902, 2003.
[55] X.-C. Wu, S. Di, E. M. Dasgupta, F. Cappello, H. Finkel, Y. Alexeev, and F. T. Chong, “Full-state quantum circuit simulation by using data compression,” in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, ser. SC ’19. New York, NY, USA: Association for Computing Machinery, 2019. [Online]. Available: https://doi.org/10.1145/3295500.3356155
[56] Y. Zhou, E. M. Stoudenmire, and X. Waintal, “What limits the simulation of quantum computers?” Physical Review X, vol. 10, no. 4, p. 041038, 2020.