Fast Reconstruction Algorithm Based On HMC Sampling
com/scientificreports
In the Noisy Intermediate-Scale Quantum (NISQ) era, the scarcity of qubit resources has prevented
many quantum algorithms from being implemented on quantum devices. Circuit cutting alleviates
this problem: it allows larger quantum circuits to run on real quantum machines with currently
limited qubit resources, at the cost of additional classical overhead. However, the classical overhead
of circuit cutting grows exponentially with the number of cuts and qubits, and the excessive
post-processing overhead makes it difficult to apply circuit cutting to large-scale circuits.
In this paper, we propose a fast reconstruction algorithm based on Hamiltonian Monte Carlo (HMC)
sampling, which uses Hamiltonian dynamics to sample high-probability solutions from a state space
whose dimension grows exponentially with the number of qubits. Our algorithm avoids excessive
computation when reconstructing the original circuit probability distribution and greatly reduces
the circuit cutting post-processing overhead. The improvement is crucial for extending circuit
cutting to larger scales on NISQ devices.
Quantum computing has demonstrated its superiority over classical computing in various scientific fields1,
including machine learning2,3, chemistry4,5, code breaking6, finance7,8, and other fields9, where some quantum
algorithms have exponential speedup in theory over classical algorithms. However, due to technical limitations,
available quantum devices are currently called Noisy Intermediate-Scale Quantum devices. "Noisy" indicates
that quantum devices are affected by noise, while "intermediate-scale" signifies that the number of qubits in these
devices ranges from 50 to several hundred10. In the NISQ era, the challenge of constructing a reliable quantum
device increases significantly as the number of qubits increases due to quantum noise. As a result, the current
NISQ devices suffer from limited reliability and scalability11. Because of this limitation of qubit resources, many
quantum algorithms cannot be applied to practical problems and cannot demonstrate the superiority of quantum
computing.
To address this problem, scholars worldwide have been studying circuit knitting techniques, which can
simulate larger quantum circuits on small-scale quantum computing devices12–18. One of the crucial methods to realize
circuit knitting is by implementing circuit cutting. Circuit cutting can divide large quantum circuits into multiple
small quantum subcircuits, which can be executed independently. Moreover, the results of all subcircuits can
be reconstructed into the theoretical results of the original circuit through additional classical post-processing
algorithms. This approach enables us to run larger quantum circuits on a limited qubit resource quantum device,
while incurring an increase in classical overhead costs.
Although circuit cutting is crucial to alleviate this problem of insufficient qubit resources in the NISQ era,
circuit cutting currently has some limitations. The classical resources consumed by the post-processing of circuit
cutting grow exponentially with the required number of cuts and the total number of qubits in the quantum
circuit. Many algorithms have been developed to reduce post-processing overhead. For example, Lowe et al.
addressed this issue by implementing random measurements19. Saleem et al. reduced the overhead by reducing
the number of cuts17. Tang et al. proposed a dynamic-definition (DD) query algorithm, which can reduce the
overhead by recursively and efficiently finding the circuit probability distribution from state space18. Chen et al.
proposed an approximate reconstruction algorithm to sample the high probability solution space by MCMC
sampling to reduce the overhead20.
This paper introduces a fast reconstruction algorithm. Through the evaluation of various random circuits,
our reconstruction algorithm demonstrates an average runtime that is 45.6 times faster than the traditional exact
reconstruction algorithm [Eq. (4)], 2.32 times faster than the DD algorithm, and approximately 1.47 times faster
than the approximate reconstruction algorithm. Unlike the traditional exact reconstruction that reconstructs
the probability distribution of the original circuit by traversing all possible quantum states, our reconstruction
algorithm samples the high probability solutions by Hamiltonian dynamics from state space. By HMC sampling,
our reconstruction algorithm reduces the overhead associated with reconstructing the original circuit results
after each cut. Specifically, the contributions of this paper are as follows:
1. We propose a fast reconstruction algorithm based on HMC. The algorithm allows faster selection of high
probability circuit results from state space by Hamiltonian dynamics, greatly reducing the time overhead.
2. In our reconstruction algorithm, only high probability solutions are sampled without the need for recon-
structing all quantum states. This eliminates any 0-probability solution states or states with extremely low
probabilities. The algorithm returns solution states with high probabilities, leading to a substantial reduction
in space overhead.
3. We propose a method to decrease the number of cuts by decomposing two-qubit gates. This approach is
employed in the experimental circuit described in this paper, which successfully reduces the number of cuts
to 1.
This paper is organized as follows. In “Background”, we provide a brief introduction to the relevant fundamen-
tals. In “Methods”, we outline the main content of our reconstruction algorithm. In “Results”, we conduct single
cut reconstruction experiments using our reconstruction algorithm along with other approaches. In this section,
a virtual two-qubit gate technique is employed to preprocess the reconfigured experimental circuit, enabling
successful single cut splitting. In “Conclusion”, we conclude the paper and offer perspectives for future work.
Background
Circuit cutting
Circuit cutting is also known as a time-like cut. In quantum computing, any unitary matrix can be theoretically
decomposed into a combination of the Pauli matrices {I, X, Y, Z}18. Specifically, any matrix A can be
decomposed as follows:
\[
A = \frac{\mathrm{Tr}(AI)\,I + \mathrm{Tr}(AX)\,X + \mathrm{Tr}(AY)\,Y + \mathrm{Tr}(AZ)\,Z}{2} \tag{1}
\]
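We can further decompose the Pauli matrices into their eigenvalues and eigenvectors, which yields
Eq. (2):
\[
\begin{aligned}
A_1 &= [\mathrm{Tr}(AI) + \mathrm{Tr}(AZ)]\,|0\rangle\langle 0| \\
A_2 &= [\mathrm{Tr}(AI) - \mathrm{Tr}(AZ)]\,|1\rangle\langle 1| \\
A_3 &= \mathrm{Tr}(AX)\,[\,2|{+}\rangle\langle{+}| - |0\rangle\langle 0| - |1\rangle\langle 1|\,] \\
A_4 &= \mathrm{Tr}(AY)\,[\,2|{+i}\rangle\langle{+i}| - |0\rangle\langle 0| - |1\rangle\langle 1|\,] \\
A &= (A_1 + A_2 + A_3 + A_4)/2
\end{aligned}
\tag{2}
\]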
Each trace operator corresponds to a set of measurements in a particular Pauli basis, while each density matrix
composed of eigenvectors corresponds to a set of initialization operations. Since the physical implementation
of the measurement in the I basis and the Z basis is identical, we can knit together the cut points with three
measurement operations (X, Y, Z) and four initialization operations (|0⟩, |1⟩, |+⟩, |+i⟩). This yields
three distinct upstream subcircuits (referred to as Fragment 1) with different measurement bases, and four
downstream subcircuits (referred to as Fragment 2) with different initializations, as illustrated in Fig. 1. These
subcircuits can be run independently, and the results can be measured.
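The identities in Eqs. (1) and (2) can be checked numerically. The following is a minimal sketch using NumPy; the helper names (`proj`, `pauli_form`, `eigenstate_form`) are illustrative and not taken from the paper's code, and |+i⟩ is assumed to be (|0⟩ + i|1⟩)/√2.

```python
import numpy as np

# Pauli matrices
I = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def proj(v):
    """Projector |v><v| onto the normalized state v."""
    k = np.asarray(v, dtype=complex).reshape(2, 1)
    k = k / np.linalg.norm(k)
    return k @ k.conj().T

P0, P1 = proj([1, 0]), proj([0, 1])
P_plus = proj([1, 1])       # |+><+|
P_plus_i = proj([1, 1j])    # |+i><+i| (assumed convention)

def pauli_form(A):
    """Right-hand side of Eq. (1)."""
    return sum(np.trace(A @ P) * P for P in (I, X, Y, Z)) / 2

def eigenstate_form(A):
    """Right-hand side of Eq. (2): measurement traces times initialization states."""
    a1 = (np.trace(A @ I) + np.trace(A @ Z)) * P0
    a2 = (np.trace(A @ I) - np.trace(A @ Z)) * P1
    a3 = np.trace(A @ X) * (2 * P_plus - P0 - P1)
    a4 = np.trace(A @ Y) * (2 * P_plus_i - P0 - P1)
    return (a1 + a2 + a3 + a4) / 2

# Any 2x2 matrix should be reproduced by both forms.
rng = np.random.default_rng(0)
A = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
print(np.allclose(pauli_form(A), A), np.allclose(eigenstate_form(A), A))
```

Both forms reproduce A because 2|+⟩⟨+| − I = X and 2|+i⟩⟨+i| − I = Y, which is why only four initialization states are needed.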
However, it should be noted that since the last qubit of the upstream subcircuit does not appear in the final
output of the uncut circuit, the results of the upstream subcircuits need additional processing. The results of the
upstream subcircuits need to be multiplied by a factor α = ±1. The sign of α depends on both the measurement
basis and the measurement result of the last qubit. When the measurement basis is I, α = 1 regardless of the
measurement result of the last qubit. However, when the measurement basis is in {X, Y, Z}, α = 1 if the
measurement result of the last qubit is |0⟩, and α = −1 if the measurement result is |1⟩. This relationship is
summarized in Eq. (3) below:
\[
\begin{cases}
x_0,\ x_1 \rightarrow +x, & M = I \\
x_0 \rightarrow +x,\quad x_1 \rightarrow -x, & M \in \{X, Y, Z\}
\end{cases}
\tag{3}
\]
Figure 1. A quantum circuit can be divided into two parts using a single cut. These two parts create multiple
subcircuits by incorporating different measurement bases and initialization operations. These subcircuits can
run independently and the output of the original circuit can be reconstructed through classical post-processing.
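The sign rule of Eq. (3) can be sketched as a small post-processing helper. This is an illustrative sketch, not the paper's implementation: the function name `signed_upstream` is hypothetical, and the cut qubit is assumed to be the first character of each measured bitstring (the actual bit order depends on the simulator convention).

```python
def signed_upstream(probs, basis):
    """Fold the cut-qubit outcome into a signed distribution over the remaining qubits.

    probs: dict mapping bitstring -> probability for one upstream subcircuit.
           Assumption: the first character is the outcome of the cut qubit.
    basis: measurement basis of the cut qubit, one of "I", "X", "Y", "Z".
    """
    out = {}
    for bits, p in probs.items():
        cut, rest = bits[0], bits[1:]
        # Eq. (3): alpha = +1 for basis I or outcome 0, alpha = -1 otherwise.
        alpha = 1 if (basis == "I" or cut == "0") else -1
        out[rest] = out.get(rest, 0.0) + alpha * p
    return out

upstream = {"00": 0.5, "10": 0.25, "11": 0.25}
print(signed_upstream(upstream, "I"))  # plain marginal over the remaining qubit
print(signed_upstream(upstream, "Z"))  # outcome 1 on the cut qubit contributes with a minus sign
```

The resulting values can be negative, so they are signed weights rather than probabilities; they only become a valid distribution after being combined with the downstream terms.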
Following the processing of the upstream subcircuit, it is necessary to compute the terms that correspond to
both the upstream and downstream subcircuits to reconstruct the original circuit’s probability distribution.
Assuming that the original circuit is cut as in Fig. 1, the cut position is at the (n/2)-th qubit, and the
circuit quantum state is $|x_0 x_1 x_2 \ldots x_n\rangle$. The quantum state associated with the upstream
subcircuit is $x_{up} = |x_{n/2+1} \ldots x_n\rangle$, and the quantum state associated with the downstream
subcircuit is $x_{down} = |x_0 \ldots x_{n/2}\rangle$, where $x_i \in \{0, 1\}$.
According to Eqs. (2) and (3), we know that during the reconstruction process, the upstream subcircuit
consists of four terms,
\[
p_{1,1} = p(|0\,x_{up}\rangle \mid I) + p(|1\,x_{up}\rangle \mid I) + p(|0\,x_{up}\rangle \mid Z) - p(|1\,x_{up}\rangle \mid Z)
\]
The probability distribution of the original circuit is obtained simply by calculating the probability of each
quantum state in the original circuit using Equation (4).
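\[
\cdots + \frac{1}{8}\cos\theta\,\sin\theta \sum_{(\alpha_1,\alpha_2)\in\{\pm 1\}^2} \alpha_1\alpha_2
\Big[ S\big((I + \alpha_1 A_1)\otimes(I + i\alpha_2 A_2)\big) + S\big((I + i\alpha_1 A_1)\otimes(I + \alpha_2 A_2)\big) \Big]
\tag{6}
\]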
Therefore, it can be deduced that a two-qubit gate of the form $e^{i\theta A_1 \otimes A_2}$ can be decomposed into a
series of single-qubit gates, as shown in Fig. 2.
If $A \in \{X, Y, Z\}$, then $I + \alpha_1 A_1$ can be realized by projective measurements in the A basis, and
$I + i\alpha_2 A_2$ can be realized by a single-qubit rotation gate around the A axis, as derived in Ref.21.
Figure 2. Decompose the two-qubit gate into a series of single-qubit gate sequences.
\[
H(x, p) = U(x) + K(p) \tag{7}
\]
The Hamiltonian characterizes the interconversion between kinetic energy and potential energy as an object
moves. The following Hamiltonian equation can be analyzed quantitatively by differentiation.
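\[
\begin{aligned}
\frac{\partial x_i}{\partial t} &= \frac{\partial H}{\partial p_i} = \frac{\partial K(p)}{\partial p_i} \\
\frac{\partial p_i}{\partial t} &= -\frac{\partial H}{\partial x_i} = -\frac{\partial U(x)}{\partial x_i}
\end{aligned}
\tag{8}
\]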
If it is possible to express ∂U(x)/∂xi and ∂K(p)/∂pi with some initial conditions (e.g., at time t0, initial posi-
tion point x0, and initial momentum p0), it becomes feasible to predict the object’s position and momentum at
any subsequent time instant t = t0 + T .
The Hamiltonian equation captures the continuous time evolution of an object’s motion. To numerically simulate
Hamiltonian dynamics, it is essential to discretize time to obtain an approximate solution of the Hamiltonian
equation. The time interval T can be divided into smaller sub-intervals of length δ, which approximates
continuous time. This process can be accomplished using the leapfrog algorithm23, which alternately updates
the momentum and position variables. The algorithm proceeds by first updating the momentum over a half step δ/2,
then updating the object’s position over a full step δ, and finally completing the momentum update over the
remaining half step. The algorithm follows the steps outlined below.
a. Begin by calculating the change in momentum after half the time interval δ/2:
\[
p_i\!\left(t + \frac{\delta}{2}\right) = p_i(t) - \frac{\delta}{2}\,\frac{\partial U}{\partial x_i(t)} \tag{9}
\]
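b. Next, compute the change in position over the entire time interval δ:
\[
x_i(t + \delta) = x_i(t) + \delta\,\frac{\partial K}{\partial p_i\!\left(t + \frac{\delta}{2}\right)} \tag{10}
\]
c. Finally, complete the momentum update over the remaining half interval δ/2:
\[
p_i(t + \delta) = p_i\!\left(t + \frac{\delta}{2}\right) - \frac{\delta}{2}\,\frac{\partial U}{\partial x_i(t + \delta)} \tag{11}
\]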
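The leapfrog scheme described above can be sketched in a few lines. This is a minimal illustration that assumes the kinetic energy K(p) = pᵀp/2 (so that ∂K/∂p = p) and uses a harmonic potential U(x) = xᵀx/2 as a stand-in test case; neither the test potential nor the function names come from the paper.

```python
import numpy as np

def leapfrog(x, p, grad_U, delta, n_steps):
    """Integrate Hamiltonian dynamics for n_steps leapfrog steps of size delta."""
    x = np.array(x, dtype=float)
    p = np.array(p, dtype=float)
    for _ in range(n_steps):
        p = p - 0.5 * delta * grad_U(x)  # half step in momentum
        x = x + delta * p                # full step in position (dK/dp = p)
        p = p - 0.5 * delta * grad_U(x)  # remaining half step in momentum
    return x, p

def hamiltonian(x, p):
    """H = U + K for the illustrative choice U(x) = x.x/2 and K(p) = p.p/2."""
    return 0.5 * float(x @ x) + 0.5 * float(p @ p)

x0, p0 = np.array([1.0]), np.array([0.0])
x1, p1 = leapfrog(x0, p0, grad_U=lambda x: x, delta=0.1, n_steps=50)
print(hamiltonian(x0, p0), hamiltonian(x1, p1))  # energies should nearly agree
```

For this test potential the exact trajectory is x(t) = cos t, so the integrated position can be compared against cos(5) after 50 steps of size 0.1. The small residual energy error is what the acceptance step of HMC compensates for.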
The core idea of HMC is to construct the Hamiltonian function H(x, p). By leveraging this function, it
becomes more efficient to explore the target distribution P(x). The canonical distribution of statistical mechanics
can be used to relate H(x, p) to P(x). Given the energy function of a state as E(θ), the corresponding canonical
distribution can be defined as
\[
P(\theta) = \frac{1}{Z}\,e^{-E(\theta)} \tag{12}
\]
where Z is the normalization factor that ensures that $\int P(\theta)\,d\theta = 1$, thus creating a valid probability
distribution. Since the energy function is equal to the sum of the potential and kinetic energies in the system,
it gives the following:
E(θ) = H(x, p) = U(x) + K(p) (13)
Then the canonical function of the Hamiltonian kinetic energy function can be expressed as
According to Eq. (14), we can decompose the joint distribution P(x, p) into the product of the distributions
of P(x) and P(p), indicating that the two variables are independent of each other. As a result, their respective
distributions can be utilized to sample their joint probability distributions. The introduction of an auxiliary
variable p can expedite the convergence of the Markov chain. Given that variables x and p are independent, the
momentum variable p can be sampled from any distribution, with N(0, 1) commonly being selected. The
corresponding kinetic energy function in the Hamiltonian is given by:
\[
K(p) = \frac{p^{T} p}{2} \tag{15}
\]
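As a worked sketch, the factorization discussed above can be written out by combining Eqs. (12), (13), and (15). This is a reconstruction from the surrounding definitions, not a quotation of Eq. (14), and it uses the same symbols as those equations.

```latex
\begin{aligned}
P(x, p) &= \frac{1}{Z}\, e^{-H(x,p)} = \frac{1}{Z}\, e^{-U(x)}\, e^{-K(p)} \;\propto\; P(x)\, P(p), \\
P(p) &\propto e^{-p^{T} p / 2} \quad\Longrightarrow\quad p \sim N(0, I).
\end{aligned}
```

The second line shows why a standard normal momentum is the natural match for the quadratic kinetic energy: drawing p from N(0, 1) samples exactly from the momentum marginal of the joint distribution.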
In HMC, after defining K(p), the remaining task is to find the potential energy function U(x) for a given
target distribution P(x). The potential energy function is usually defined as:
\[
U(x) = -\log P(x) \tag{16}
\]
The gradient function G(x) of the potential energy function is then:
\[
G(x) = \frac{\partial U(x)}{\partial x} \tag{17}
\]
Next, Hamiltonian dynamics can be applied to MCMC to sample the target distribution P(x). However,
discretizing time introduces an error, so the simulated trajectory may not exactly match the target distribution.
An acceptance rate is therefore introduced to offset the error, and the acceptance rate α is
\[
\alpha = \min\left(1,\ \exp\left(-U(x_L) + U(x_0) - K(p_L) + K(p_0)\right)\right) \tag{18}
\]
Here, $(x_0, p_0)$ represents the initial state, while the new state $(x_L, p_L)$ is obtained after executing the
leapfrog algorithm L times. A random number u is drawn from a uniform distribution between 0 and 1. If the
acceptance rate α is greater than u, the point $x_L$ is accepted into the Markov chain. After enough samples have
been drawn to pass the burn-in period of HMC sampling, the Markov chain converges towards a stationary
distribution, which corresponds to the target distribution P(x).
The random walk in the MCMC algorithm can lead the Markov chain to converge to a stationary distribution,
P(x), but it is often considered inefficient. Hamiltonian Monte Carlo leverages the principles of Hamiltonian
dynamics in physics to calculate the future states of the Markov chain, rather than relying solely on a proposal
probability distribution. This approach enables more efficient exploration of the state space and achieves faster
convergence compared to a random walk.
Methods
Fast reconstruction algorithm
Traditional exact reconstruction algorithms involve traversing through all possible quantum states in the original
circuit’s state space to reconstruct its probability distribution. This requires processing all potential combinations
of qubit strings through brute force computation. However, the time complexity of these algorithms grows expo-
nentially. As the number of qubits in the circuit increases, reconstruction time also experiences an exponential
explosion. Additionally, exact reconstruction algorithms may encounter difficulties when applied to large-scale
quantum circuits due to overhead. To overcome these challenges, this paper presents a fast reconstruction algo-
rithm that differs from the approximate reconstruction method proposed by Chen et al. based on MH sampling20.
In this work, we draw inspiration from MCMC sampling. However, our reconstruction algorithm deviates from
the traditional MH approach, where the future state of a Markov chain is calculated through a random walk.
Instead, our method employs Hamiltonian dynamics, rooted in the concept of a physical system, to determine
future states. This technique enables more efficient exploration of the state space compared to a random walk
and facilitates faster sampling of high probability solutions from state space, thus accelerating the convergence
of Markov chains towards the original circuit’s probability distribution. For a comprehensive understanding of
HMC sampling, please refer to the references24,25.
Our reconstruction algorithm follows the steps outlined below.
Step 1: Assume that the original circuit probability distribution is P(x), then define the potential energy
function $U(x) = -\log P(x)$, the gradient function $G(x) = \partial U(x)/\partial x$, and the kinetic energy
function $K(p) = p^{T}p/2$.
Step 2: Once the potential, gradient, and kinetic energy functions have been initialized, we must establish
an initial state $(x_0, p_0)$ for HMC sampling. Here, $x_0$ represents a randomly chosen quantum state from the
original circuit’s probability distribution, while $p_0$ typically denotes a random number drawn from a standard
normal distribution N(0, 1). Additionally, it is essential to initialize a dictionary to store the original circuit’s
probability distribution.
Step 3: Based on the HMC sampling principle mentioned above, simulating Hamiltonian dynamics numerically
requires discretizing continuous time. This is typically achieved using the leapfrog algorithm. To
obtain the new state $(x_l, p_l)$, the initial state $(x_0, p_0)$ is iteratively updated L times using the leapfrog algorithm.
Step 4: Once the new state has been obtained, it is necessary to calculate the acceptance rate
$\alpha = \min\left(1, \exp\left(-U(x_l) + U(x_0) - K(p_l) + K(p_0)\right)\right)$ for the state $(x_l, p_l)$. This
calculation is crucial because HMC sampling estimates the target distribution through probabilistic sampling.
In essence, HMC sampling constructs a Markov chain that traverses the state space. The traversal is achieved by
computing future states using the principles of dynamics in physical systems. To ensure that this Markov chain
has the desired stationary distribution, it is necessary to define an acceptance rate α that determines whether
the chain transitions to the proposed future state. If the acceptance rate α is greater than a number u drawn
uniformly from [0, 1], the point is accepted as a sample from the original circuit’s probability distribution and
stored in the dictionary. This procedure is repeated for multiple iterations. Following the burn-in period of the
algorithm, the Markov chain produced by HMC sampling stabilizes into the desired stationary distribution,
which corresponds to the original circuit’s probability distribution.
The pseudocode for our reconstruction algorithm can be obtained based on the description above.
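Steps 1 to 4 can be sketched as a generic HMC sampler. The code below is an illustrative sketch rather than the authors' implementation: the function name `hmc_sample` and its parameters are hypothetical, the target is a one-dimensional standard normal chosen only so the result can be checked, and this section does not specify how discrete bitstrings are mapped to the continuous variable x, so that mapping is left out.

```python
import numpy as np

def hmc_sample(U, grad_U, x0, n_samples, delta=0.1, L=20, burn_in=100, seed=0):
    """Draw samples from P(x) proportional to exp(-U(x)) with Hamiltonian Monte Carlo."""
    rng = np.random.default_rng(seed)
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    samples = []
    for it in range(n_samples + burn_in):
        p0 = rng.standard_normal(x.shape)      # Step 2: momentum from N(0, 1)
        x_new, p = x.copy(), p0.copy()
        for _ in range(L):                     # Step 3: L leapfrog updates
            p = p - 0.5 * delta * grad_U(x_new)
            x_new = x_new + delta * p
            p = p - 0.5 * delta * grad_U(x_new)
        # Step 4: log of the acceptance rate, with K(p) = p.p / 2
        log_alpha = -U(x_new) + U(x) - 0.5 * (p @ p) + 0.5 * (p0 @ p0)
        if np.log(rng.uniform()) < min(0.0, log_alpha):
            x = x_new                          # accept the proposed state
        if it >= burn_in:                      # keep samples only after burn-in
            samples.append(x.copy())
    return np.array(samples)

# Illustrative target: standard normal, U(x) = x.x / 2, G(x) = x.
samples = hmc_sample(U=lambda x: 0.5 * float(x @ x), grad_U=lambda x: x,
                     x0=[3.0], n_samples=2000)
print(samples.mean(), samples.var())
```

In a reconstruction setting, the accepted samples would be tallied into the dictionary of Step 2 (for example, by counting how often each state is visited) to estimate the probabilities of the high-probability states.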
Results
As the overhead of circuit reconstruction grows exponentially with the number of cuts, this paper also investi-
gates how to reduce the minimum number of cuts required for circuit cutting, which can be achieved by virtual
two-qubit gate decomposition.
The two-qubit gate decomposition technique can be applied to certain fully connected quantum circuits to
reduce their structural complexity, resulting in fewer cuts and a significant reduction in the circuit’s processing
overhead. Figure 3 illustrates the change in the minimum number of required cuts before and after applying a
two-qubit gate decomposition on a fully connected 5-qubit QFT26 circuit.
Figure 3. Comparison of the change in the minimum number of cuts required before and after two-qubit gate
decomposition for a 5-qubit QFT circuit.
If the circuit shown in Fig. 3a is cut directly, we need to cut it at least 4 times to decompose the circuit. We
decompose the CP gate between qubit 0 and qubit 3 of the 5-qubit QFT circuit using the decomposition method
shown in Equation (6); the decomposed circuit is shown in Fig. 3b. The CP-Cut in Fig. 3b indicates the
gates that result from decomposing the CP gate.
After applying the two-qubit decomposition principle to decompose the 5-qubit QFT circuit, it was found that
only three cuts were needed to separate the circuit. Furthermore, we conducted additional experiments using
the circuit-knitting-toolbox toolkit to explore the furthest two-qubit gate decomposition for varying qubit sizes
in QFT circuits. The minimum number of cuts required before and after decomposing the two-qubit gate was
then calculated. The experimental results are depicted in Fig. 4.
Analysis of Fig. 4 reveals that decomposing two-qubit gates effectively reduces the number of cuts in 4-qubit
to 21-qubit QFT circuits. This reduction is more prominent as the number of qubits in the QFT circuit increases.
This trend can be extrapolated to complex quantum circuits, where decomposition of two-qubit gates serves as a
strategy to minimize circuit cuts. Larger circuits benefit more from this technique, experiencing a greater
reduction in the number of cuts after the decomposition of two-qubit gates.
This paper evaluates the performance of our reconstruction algorithm through several experiments. Specifically,
we measure the runtime of single cut reconstruction for randomized circuits with varying qubit sizes and compare
it against three other algorithms: the traditional exact reconstruction algorithm, Tang’s DD algorithm, and
Chen’s approximate reconstruction algorithm. In cases where a single cut is insufficient to cut the circuit, we
apply two-qubit gate decomposition to minimize the number of required cuts. These experiments were conducted
using the qasm simulator, and the corresponding results are illustrated in Fig. 5a.
Figure 5a shows that the traditional exact reconstruction algorithm (Exact) has a remarkably short runtime
for small qubit sizes. However, due to its requirement to traverse all quantum states, the runtime of Exact grows
exponentially as the number of qubits in the quantum circuit increases. Conversely, the other three
reconstruction algorithms demonstrate smoother time distributions in Fig. 5a and do not show significant
fluctuations with increasing qubit counts. Among the three algorithms, the DD algorithm shows the longest average
runtime due to its use of continuous recursion for merging active qubits into bins, aiming to reconstruct
solution states18. However, the time-consuming process of merging quantum states per recursion contributes to
the relatively slower average runtime of the DD algorithm compared to both the fast reconstruction algorithm
(FRA) and the approximate reconstruction algorithm (ARA) based on sampling. Notably, FRA outperforms ARA
in terms of speed as it uses Hamiltonian dynamics from physical systems to compute future
states within the Markov chain, in contrast to the random walk approach employed by ARA.
Figure 4. Comparison of the minimum number of cuts required for QFT circuit before and after decomposing
a two-qubit gate.
Figure 5. Comparison of runtime and MSE for different post-processing algorithms for a single cut of random
circuits.
The experimental results indicate that FRA outperforms other reconstruction algorithms in terms of runtime.
Specifically, FRA’s average runtime is 45.6 times faster than the traditional reconstruction algorithm, 2.32 times
faster than the DD algorithm, and around 1.47 times faster than ARA.
In addition to the runtime analysis, this paper also compares the accuracy of the reconstructed probability
distribution over all quantum states. The evaluation metric used in this paper is the mean
squared error (MSE), as in Eq. (19):
\[
\mathrm{MSE} = \sum_{i} \left(x_i - y_i\right)^2 \tag{19}
\]
Here, $x_i$ refers to the reconstructed result, while $y_i$ represents the result obtained from the original circuit. A
smaller MSE corresponds to a lower error rate. Fig. 5b presents a comparison of the MSE
values for different algorithms. Based on the experimental findings, Exact shows the lowest error and is
closest to the probability distribution of the original circuit. It is followed by the DD algorithm, the
FRA proposed in this paper, and finally ARA.
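The error metric can be sketched for distributions stored as dictionaries. This is a hedged illustration: it assumes that Eq. (19) sums squared differences over states without a 1/N normalization, and that a state missing from a sampled reconstruction is treated as having probability 0. The function name and the example distributions are hypothetical.

```python
def mse(reconstructed, reference):
    """Sum of squared differences between two distributions given as dicts.

    States absent from either dict are treated as probability 0 (assumption),
    which matters for sampling-based methods that return only some states.
    """
    states = set(reconstructed) | set(reference)
    return sum((reconstructed.get(s, 0.0) - reference.get(s, 0.0)) ** 2
               for s in states)

# Illustrative example: the reconstruction misses the state "01" entirely.
reference = {"00": 0.5, "11": 0.25, "01": 0.25}
reconstructed = {"00": 0.5, "11": 0.5}
print(mse(reconstructed, reference))
```

Taking the union of states makes the missing low-probability entries count toward the error, which is one way a sampling-based reconstruction can end up with a larger MSE than an exact one.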
The MSE of FRA is lower than the MSE of ARA in this experiment. However, a gap remains between FRA and the
traditional exact reconstruction and DD algorithms. This discrepancy arises from
the inherent sampling-based nature of FRA and ARA, which introduces a certain degree of error compared to
the non-sampling algorithms.
Quantum circuits can be broadly classified into two types. The first type of circuit results in a sparse probabil-
ity distribution, where only a few solution states have a significantly high probability, while non-solution states
have a probability of 0. Quantum algorithms like QAOA27, Grover28, and BV29 generally belong to this type. On
the other hand, the second type of circuit produces a dense probability distribution, where many quantum states
have non-zero probabilities, such as the 2-D random circuits30. In QAOA and similar algorithms, it is sufficient
to highlight the solution state in the reconstructed probability distribution to obtain the algorithm’s solution.
There is no need to excessively focus on achieving high accuracy for all possible solutions. Consequently, we
have tested the first type of circuit, specifically the QAOA circuit, as demonstrated below using an example to
address the Max-Cut problem.
This paper makes a single cut to the 6-qubit QAOA circuit. This circuit is used to solve the Max-Cut problem
of Fig. 6a, and the probability distribution reconstructed using FRA in this paper is experimentally compared
with the results of running the original circuit.
The Max-Cut problem is a common combinatorial optimization problem in graph theory with important
applications in statistical physics and circuit design. Michel Goemans and David Williamson proposed a classical
approximation algorithm based on semidefinite programming (SDP) for solving the Max-Cut problem in 1995,
which is the best-known polynomial-time approximation algorithm31. The effectiveness of QAOA depends
on the number of layers of the unitary transformation used, and in theory, it is possible to find an excellent
approximation with enough layers, but it can also be time-consuming.
The Max-Cut problem involves dividing the nodes of a graph into two sets so that the number of edges
between the sets is maximized. The objective function for its transformation into a combinatorial optimization
problem can be formulated as follows:
\[
C = \frac{1}{2} \sum_{ij \in E} \left( Z_i Z_j - I \right) \tag{20}
\]
Assuming a max-cut is performed on the graph shown in Fig. 6a, we can then construct a quantum circuit
graph illustrated in Fig. 6b based on the objective function of the Max-cut, represented by Eq. (20).
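For a computational basis state, the objective can be evaluated directly by replacing each Z operator with its eigenvalue. The sketch below assumes the reading of Eq. (20) in which the sum runs over (Z_i Z_j − I), and it uses a hypothetical 6-node ring graph for illustration; the actual graph of Fig. 6a is not reproduced here.

```python
def maxcut_cost(bits, edges):
    """Evaluate C = 1/2 * sum over edges of (z_i * z_j - 1) for one bitstring.

    Each bit b is mapped to the Z eigenvalue z = 1 - 2*b, so an edge whose
    endpoints differ contributes -1 and an uncut edge contributes 0.
    """
    spins = [1 - 2 * int(b) for b in bits]
    return 0.5 * sum(spins[i] * spins[j] - 1 for i, j in edges)

# Hypothetical example graph: a ring on 6 nodes (not taken from Fig. 6a).
ring6 = [(i, (i + 1) % 6) for i in range(6)]
print(maxcut_cost("010101", ring6))  # every edge is cut
print(maxcut_cost("000000", ring6))  # no edge is cut
```

Under this sign convention the cost equals minus the number of cut edges, so minimizing C is the same as maximizing the cut.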
Figure 6. (a) is a diagram of the maximum cut problem to be solved, and (b) is a QAOA quantum circuit built
by the objective function; we decompose the farthest CZ gate of the circuit and make one cut of the circuit.
Figure 7. Figures (a) and (b) illustrate the expected value of the QAOA circuit and the variance of the
measurements with the parameter β, γ for p = 1 (where p represents the number of iterations of the QAOA
algorithm), respectively.
We optimize the parameters β, γ (2β is the rotation angle of the Rzz gate, 2γ is the rotation angle of the Rx
gate) by the classical COBYLA optimizer. The changes in the circuit’s expected value and variance are observed
and presented in Fig. 7. In the expectation value measurement, a larger shaded region indicates a smaller value,
indicating a better fit of the parameters and closer proximity of the circuit results to the approximate optimal
solution. On the other hand, the variance represents the stability of the solution, where a larger shaded region
corresponds to a smaller variance, indicating a more stable and reliable solution.
After finding the optimal parameters to construct the complete QAOA circuit in conjunction with Fig. 7, the
farthest two-qubit Rzz gate of the QAOA circuit is decomposed, using the fact that
\[
R_{zz}(\theta) = e^{-i\frac{\theta}{2} Z \otimes Z} \tag{21}
\]
According to Eq. (6), the Rzz gate can be decomposed into a series of Z gates and a combination of Rz gates.
Subsequently, we divide the circuit into two parts by making a single cut and then reconstruct the circuit results
using FRA. We conducted a comparative analysis of the results obtained from running the original circuit, the
circuit after decomposing the two-qubit gate, and the reconstructed circuit after decomposing the two-qubit gate
and performing a single cut. These results are visually represented in Fig. 8.
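The form of the Rzz gate in Eq. (21) can be checked numerically against its closed form. This is a small sketch using NumPy with illustrative function names; it only verifies the matrix identity and does not implement the sampling-based decomposition itself.

```python
import numpy as np

Z = np.diag([1.0, -1.0]).astype(complex)
ZZ = np.kron(Z, Z)  # diagonal matrix with entries +1, -1, -1, +1

def rzz(theta):
    """exp(-i * theta/2 * Z(x)Z); ZZ is diagonal, so exponentiate its diagonal."""
    return np.diag(np.exp(-0.5j * theta * np.diag(ZZ)))

def rzz_closed_form(theta):
    """cos(theta/2) * I - i * sin(theta/2) * Z(x)Z, valid because (Z(x)Z)^2 = I."""
    return np.cos(theta / 2) * np.eye(4) - 1j * np.sin(theta / 2) * ZZ

theta = 0.7
print(np.allclose(rzz(theta), rzz_closed_form(theta)))
```

Written this way, the gate has the form $e^{i\theta' A_1 \otimes A_2}$ with $A_1 = A_2 = Z$ and $\theta' = -\theta/2$, which is the form of two-qubit gate that the decomposition of Eq. (6) applies to.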
From Fig. 8, we see that the correct results of the circuit are |010101⟩ and |101010⟩, and the correct solution
probabilities of the circuit after decomposing the two-qubit gate are 0.13 and 0.12 respectively, which are lower
than that of the original circuit at 0.15. However, the fast reconstruction algorithm is also able to reconstruct the
high probability solution states based on the circuit after the two-qubit gate decomposition, with correct solution
probabilities of 0.19 and 0.18, which are even higher than that of the original circuit at 0.15. FRA does
not fully reconstruct all quantum states within the state space of the original circuit, such as the low-probability
solution |011111⟩ and others, and this limitation is reflected in its MSE. However, FRA shows stability in
reconstructing high probability solutions, which closely resemble the original distribution. This level of accuracy
is sufficient for solving circuits like QAOA.
Figure 8. Results of the original circuit, the circuit after two-qubit gate decomposition, and results of running
FRA on a single cut circuit after decomposing the two-qubit gate, with the probability values of some high
probability solution states also marked in the figure.
To further validate the conclusion, we performed additional experiments on multiple QAOA circuits. Each
circuit consisted of two high probability solutions, which were then evaluated through FRA after a single cut.
The experiment results were sorted in descending order of probability after running all circuits. Subsequently,
the first two solutions from the original circuits’ running results were compared to the first two solutions of
the FRA reconstruction results. The evaluation metric used in this comparison was the coincidence rate (CR).
\[
CR =
\begin{cases}
0, & \text{no coincident solutions} \\
0.5, & \text{one solution is coincident} \\
1, & \text{the first two solutions are coincident}
\end{cases}
\tag{22}
\]
The CR is equal to 1 when both first two solutions coincide completely, 0.5 when only one solution coincides,
and 0 when there are no coinciding solutions.
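The coincidence rate can be sketched as a comparison of the top-ranked states of two distributions. The function name and the parameter `k` are illustrative, and the example values are illustrative as well (the entry for 011111 in particular is made up for the example).

```python
def coincidence_rate(original, reconstructed, k=2):
    """Fraction of the k most probable states shared by the two distributions.

    With k = 2 this yields 0, 0.5, or 1, matching the three cases of Eq. (22).
    """
    def top(dist):
        return set(sorted(dist, key=dist.get, reverse=True)[:k])
    return len(top(original) & top(reconstructed)) / k

# Illustrative distributions over 6-qubit states.
original = {"010101": 0.15, "101010": 0.15, "011111": 0.02}
reconstructed = {"010101": 0.19, "101010": 0.18}
print(coincidence_rate(original, reconstructed))
```

Because only the ranking of the top states matters, the metric is insensitive to the exact probability values, which suits algorithms such as QAOA where identifying the solution states is the goal.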
We conducted experiments on QAOA circuits with 3 to 13 qubits. In these experiments,
the CR of the FRA reconstruction results is 1 for all experimental circuits when compared to the results
obtained from the original circuits. This signifies that FRA is capable of accurately reconstructing the solution
states for quantum circuits, such as QAOA. Additionally, we have counted the number of results obtained by
various reconstruction algorithms in the experiments to estimate the space overhead, as depicted in Fig. 9.
As shown in Fig. 9, of all the reconstruction algorithms for this experiment, the FRA reconstruction yields
the least number of solutions. The Exact and DD algorithms reconstruct all quantum states within the state space
of the circuit. However, the results obtained by these algorithms increase exponentially as the number of qubits
grows. At a certain point, the storage device may be incapable of accommodating all the results. Conversely, FRA
reconstructs high probability solutions from the original circuit, eliminating the presence of 0-probability or low-
probability solutions. Consequently, the storage space overhead can be significantly reduced, transforming from
exponential to polynomial scale when compared to Exact and DD algorithm for large-scale quantum circuits.
In summary, FRA can efficiently acquire the reconstruction probability distribution of a quantum circuit in
a shorter runtime compared to the Exact, DD, and ARA. Unlike the Exact and DD algorithm, FRA does not
require the reconstruction of probabilities for all quantum states. Instead, it selectively focuses on high prob-
ability solutions, reducing time overhead and space overhead while finding the correct solution for the circuit.
This means that FRA is a low overhead circuit-cutting post-processing algorithm.
Conclusion
This paper introduces a reconstruction algorithm for circuit cutting based on Hamiltonian Monte Carlo.
In our experiments, we also apply two-qubit gate decomposition to minimize the number of cuts in the
experimental circuits. The experiments demonstrate that our reconstruction algorithm efficiently samples
high-probability solutions from the state space without traversing all possible states, and that it
significantly reduces both time and space overhead.
In the NISQ era, circuit cutting plays a crucial role in running larger circuits on NISQ devices. However, its
excessive post-processing overhead makes it impractical for large-scale quantum circuits. An algorithm that
reduces the post-processing overhead of circuit cutting, such as the one proposed in this paper, is therefore
of great importance for the NISQ era. In addition, the idea behind the fast reconstruction algorithm, namely
obtaining only the high-probability solutions, has significant implications for the future development of
quantum computing.
Figure 9. Number of reconstructed results obtained by the Exact, DD, ARA, and FRA algorithms in the QAOA
experiments with 3 to 13 qubits.
Currently, our reconstruction algorithm applies only to circuits with a single cut and has not yet been
extended to circuits with multiple cuts. It is particularly suitable for circuits such as QAOA. Adapting the
algorithm to multiple cuts and extending it to general quantum circuits are promising directions for
future research.
Data availability
The data presented in this paper is available online at https://github.com/hang-dev/FRA.
Code availability
The code for the specific implementation of our method is available online at https://github.com/hang-dev/FRA.
Author contributions
Material preparation, data collection and analysis were performed by H.L., J.X. and Y.Z. The first draft of the
manuscript was written by H.L. and Z.S. Y.L. and Z.F. prepared the figures. All authors contributed to the study
conception and design.
Competing interests
The authors declare no competing interests.
Additional information
Correspondence and requests for materials should be addressed to Z.S.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International
License, which permits use, sharing, adaptation, distribution and reproduction in any medium or
format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the
Creative Commons licence, and indicate if changes were made. The images or other third party material in this
article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the
material. If material is not included in the article’s Creative Commons licence and your intended use is not
permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from
the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.