Provably-Safe Neural Network Training Using Hybrid Zonotope Reachability Analysis
arXiv:2501.13023v1 [cs.LG] 22 Jan 2025

All authors are with the Department of Mechanical Engineering, Georgia Institute of Technology, Atlanta, GA. Corresponding author: [email protected].

Abstract— Even though neural networks are being increasingly deployed in safety-critical applications, it remains difficult to enforce constraints on their output, meaning that it is hard to guarantee safety in such settings. Towards addressing this, many existing methods seek to verify a neural network’s satisfaction of safety constraints, but do not address how to correct an “unsafe” network. On the other hand, the few works that extract a training signal from verification cannot handle non-convex sets, and are either conservative or slow. To address these challenges, this work proposes a neural network training method that can encourage the exact reachable set of a non-convex input set through a neural network with rectified linear unit (ReLU) nonlinearities to avoid a non-convex unsafe region, using recent results in non-convex set representation with hybrid zonotopes and extracting gradient information from mixed-integer linear programs (MILPs). The proposed method is fast, with the computational complexity of each training iteration comparable to that of solving a linear program (LP) with the number of dimensions and constraints linear in the number of neurons and the complexity of the input and unsafe sets. For a neural network with three hidden layers of width 30, the method was able to drive the reachable set of a non-convex input set with 55 generators and 26 constraints out of a non-convex unsafe region with 21 generators and 11 constraints in 490 seconds.

I. INTRODUCTION

Neural networks are universal approximators [1] that have seen success in many domains. However, they are also well-known as “black-box” models, where the relationship between their inputs and outputs is not easily interpretable or directly analyzable due to non-linearity and high-dimensional parameterizations. As such, it is very difficult to certify their safety (e.g. satisfaction of constraints). This limitation imposes many significant drawbacks. For example, robots crash frequently when training their neural network controllers with deep reinforcement learning (RL) algorithms, limiting deep RL’s success in robots where hardware failures are costly, simulations are not readily available, or the sim-to-real gap is too large for reliable performance [2]. In addition, neural networks can also be susceptible to adversarial attacks, where minor perturbations in the input can lead to drastically different results in the output [3], [4]. This makes deploying neural networks in safety-critical tasks a questionable choice, even though it has already been widely done [5]–[7], leading to many injuries and accidents [8]. In this paper, we present a method to enforce safety in neural networks by encouraging their satisfaction of a collision-free constraint, which has potential application in making deep RL safe, neural networks robust to adversarial attacks, and more. An overview of our method is shown in Fig. 1.

A. Related Work

We now review three key approaches to enforce constraints on neural networks: sampling-based approaches that do not have formal guarantees, verification approaches that only check constraint satisfaction, and approaches that combine verification with training, which our method belongs to. Finally, we review relevant literature on hybrid zonotopes, which is the set representation used in our method.

1) Training with Soft Constraints: Many existing works capture safety in neural networks by penalizing constraint violations on sampled points during training [9]–[13]. However, these soft approaches, while often fast and easy to implement, do not provide any safety guarantees beyond the training samples. While there are works that are capable of enforcing hard constraints in neural networks by modifying the training process [14], [15], they can only handle simple affine constraints.

2) Neural Network Verification: A different approach is to certify safety with respect to a set of inputs. Methods in this category tend to analyze the reachable set (i.e. image) of the input set through the neural network, either exactly [16]–[20] or as an over-approximation [16]–[18], [21]–[23] depending on the choice of set representation. That said, most of these works only focus on neural network verification. That is, these methods only answer the yes-no question of “safe” or “unsafe”, with the aftermath of fixing an “unsafe” network left largely unexplored. As a result, engineers can only train via trial-and-error until the desired safety properties have been achieved, which can be slow and ineffective.

3) Training with Verification: To the best of our knowledge, there are only two works that attempted to extract learning signals from the safety verification results using reachability analysis.

First, in [24], given an input set as an H-polytope (i.e. polytope represented by intersection of halfplanes) and a neural network controller embedded in a nonlinear dynamical system, the polytope is expressed as the projection of a high-dimensional hyperrectangle, enabling the use of the CROWN verifier [21] for interval reachability. Then, using a loss function that encourages the vector field of the reachable set to point inwards, the authors were able to train the neural network until the system is forward invariant. With this method, the input set is limited to being a convex polytope. Moreover, since [21] and the interval reachability techniques used are
[Figure 1: three panels titled “Hybrid Zonotope Input Set”, “Hybrid Zonotope Reachable Set”, and “Safe Hybrid Zonotope Reachable Set”; diagram labels: “Neural”, “Unsafe”, “Loss from Relaxed LP”, “Update”.]
Fig. 1. A flowchart of our method, using the example from Sec. VI. Our method takes in a non-convex input set (green), then computes its exact
reachable set (blue) through the neural network. Then, we formulate the reachable set’s collision with the unsafe set (red) as a loss function using a linear
program (LP), which enables us to update the neural network’s parameters via backpropagation. Every several iterations, we check if the reachable set
collides with the unsafe set using a mixed-integer linear program (MILP). If it does not, then the training is complete and our method is successful.
over-approximations, its space of discoverable solutions may be limited.

Second, given an input set and an unsafe region expressed as constrained zonotopes (a convex polytopic representation [25]), our prior work [26] computed the exact reachable set of a neural network as a union of constrained zonotopes. Then, by using a loss function to quantify the “emptiness” of the intersection between the reachable set and the unsafe region, we were able to train the neural network such that the reachable set no longer collides with the unsafe region. Similarly, the input set and the obstacle in this method are limited to being convex polytopes. Moreover, the number of sets needed to represent the reachable set grows exponentially with the size of the neural network, making the method numerically intractable even for very small neural networks.

4) Hybrid Zonotopes: Recently, a non-convex polytopic set representation called the hybrid zonotope [27] was proposed. Hybrid zonotopes are closed under affine mapping, Minkowski sum, generalized intersection, intersection [27], union, and complement [28], with extensive toolbox support in MATLAB [29] and Python [30]. They can also exactly represent the forward reachable set (image) [19] and backward reachable set (preimage) [31] of a neural network with rectified linear units (ReLU) using basic matrix operations, with complexity scaling only linearly with the size of the network. However, existing methods for hybrid zonotopes enforce safety on robots either by formulating a model predictive control (MPC) [28] or a nonlinear optimization problem [32] without neural networks in the loop, whereas those with neural networks only use hybrid zonotopes for verification but not training [19], [31], [33], [34]. In this paper, our contribution is extracting and using learning signals from neural network reachability analysis with hybrid zonotopes.

B. Contributions

Our contributions are twofold:
1) Given a non-convex input set and a non-convex unsafe region, we propose a differentiable loss function for ReLU neural network training based on exact reachability analysis with hybrid zonotopes. This loss function encourages the reachable set of the input set to avoid the unsafe region, the satisfaction of which can be checked using a mixed-integer linear program (MILP).
2) We show that this method is fast and scales fairly well with respect to input dimensions, output dimensions, network size, complexity of the input set, and complexity of the unsafe region. The results significantly outperform our prior method for exact reachability analysis in training [26].

The remainder of the paper is organized as follows: we provide preliminary information in Sec. II, formalize our problem statement in Sec. III, detail our proposed method in Sec. IV, provide experimental analysis in Sec. V, demonstrate the utility of our method in Sec. VI, then give concluding remarks and limitations in Sec. VII.

II. PRELIMINARIES

We now introduce our notation conventions, define hybrid zonotopes and ReLU neural networks, and summarize existing work [19], [31] on representing the image of a hybrid zonotope through a ReLU neural network exactly as a hybrid zonotope.

A. Notation

In this paper, we denote the set of real numbers as $\mathbb{R}$, non-negative real numbers as $\mathbb{R}_+$, natural numbers as $\mathbb{N}$, scalars in lowercase italic, sets in uppercase italic, vectors in lowercase bold, and matrices in uppercase bold. We also denote a matrix of zeros as $\mathbf{0}$, a matrix of ones as $\mathbf{1}$, and an identity matrix as $\mathbf{I}$, with their dimensions either implicitly defined from context or explicitly using subscripts, e.g. $\mathbf{0}_{n_1 \times n_2} \in \mathbb{R}^{n_1 \times n_2}$, $\mathbf{I}_n \in \mathbb{R}^{n \times n}$. An empty array is $[\ ]$. Finally, inequalities $\le$, $\ge$ between vectors are compared element-wise.

B. Hybrid Zonotope

A hybrid zonotope $\mathcal{HZ}(\mathbf{G}_c, \mathbf{G}_b, \mathbf{c}, \mathbf{A}_c, \mathbf{A}_b, \mathbf{b}) \subset \mathbb{R}^n$ is a set parameterized by a continuous generator matrix $\mathbf{G}_c \in \mathbb{R}^{n \times n_g}$,
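a binary generator matrix $\mathbf{G}_b \in \mathbb{R}^{n \times n_b}$, a center $\mathbf{c} \in \mathbb{R}^n$, a continuous linear constraint matrix $\mathbf{A}_c \in \mathbb{R}^{n_c \times n_g}$, a binary linear constraint matrix $\mathbf{A}_b \in \mathbb{R}^{n_c \times n_b}$, and a constraint vector $\mathbf{b} \in \mathbb{R}^{n_c}$ on continuous coefficients $\mathbf{z}_c \in \mathbb{R}^{n_g}$ and binary coefficients $\mathbf{z}_b \in \{-1, 1\}^{n_b}$ as follows [27, Definition 3]:

$$
\mathcal{HZ}(\mathbf{G}_c, \mathbf{G}_b, \mathbf{c}, \mathbf{A}_c, \mathbf{A}_b, \mathbf{b}) = \{\mathbf{G}_c \mathbf{z}_c + \mathbf{G}_b \mathbf{z}_b + \mathbf{c} \mid \mathbf{A}_c \mathbf{z}_c + \mathbf{A}_b \mathbf{z}_b = \mathbf{b},\ \|\mathbf{z}_c\|_\infty \le 1,\ \mathbf{z}_b \in \{-1, 1\}^{n_b}\}. \tag{1}
$$

We denote $n_g$ as the number of continuous generators, $n_b$ as the number of binary generators, and $n_c$ as the number of constraints in a hybrid zonotope.

Consider a pair of hybrid zonotopes $P_1 = \mathcal{HZ}(\mathbf{G}_{c1}, \mathbf{G}_{b1}, \mathbf{c}_1, \mathbf{A}_{c1}, \mathbf{A}_{b1}, \mathbf{b}_1) \subset \mathbb{R}^{n_1}$ and $P_2 = \mathcal{HZ}(\mathbf{G}_{c2}, \mathbf{G}_{b2}, \mathbf{c}_2, \mathbf{A}_{c2}, \mathbf{A}_{b2}, \mathbf{b}_2) \subset \mathbb{R}^{n_2}$. In this paper, we make use of their closed form expressions in generalized intersection under some $\mathbf{R} \in \mathbb{R}^{n_2 \times n_1}$, denoted as $\cap_{\mathbf{R}}$ [27, Proposition 7]:

$$
\begin{aligned}
P_1 \cap_{\mathbf{R}} P_2 &= \{\mathbf{x} \in P_1 \mid \mathbf{R}\mathbf{x} \in P_2\}, \\
&= \mathcal{HZ}\left(
\begin{bmatrix} \mathbf{G}_{c1} & \mathbf{0} \end{bmatrix},
\begin{bmatrix} \mathbf{G}_{b1} & \mathbf{0} \end{bmatrix},
\mathbf{c}_1,
\begin{bmatrix} \mathbf{A}_{c1} & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_{c2} \\ \mathbf{R}\mathbf{G}_{c1} & -\mathbf{G}_{c2} \end{bmatrix},
\begin{bmatrix} \mathbf{A}_{b1} & \mathbf{0} \\ \mathbf{0} & \mathbf{A}_{b2} \\ \mathbf{R}\mathbf{G}_{b1} & -\mathbf{G}_{b2} \end{bmatrix},
\begin{bmatrix} \mathbf{b}_1 \\ \mathbf{b}_2 \\ \mathbf{c}_2 - \mathbf{R}\mathbf{c}_1 \end{bmatrix}
\right).
\end{aligned} \tag{2}
$$

Note that their “regular” intersection $\{\mathbf{x} \in P_1 \mid \mathbf{x} \in P_2\}$, which we denote as $P_1 \cap P_2$, is a particular case of the generalized intersection with $\mathbf{R} = \mathbf{I}$.

Finally, a hybrid zonotope $P = \mathcal{HZ}(\mathbf{G}_c, \mathbf{G}_b, \mathbf{c}, \mathbf{A}_c, \mathbf{A}_b, \mathbf{b}) \subset \mathbb{R}^{n_1}$ is also closed under affine transformation with any matrix $\mathbf{W} \in \mathbb{R}^{n_2 \times n_1}$ and vector $\mathbf{w} \in \mathbb{R}^{n_2}$ as [27, Proposition 7]:

$$
\begin{aligned}
\mathbf{W}P + \mathbf{w} &= \{\mathbf{W}\mathbf{x} + \mathbf{w} \mid \mathbf{x} \in P\}, \\
&= \mathcal{HZ}(\mathbf{W}\mathbf{G}_c, \mathbf{W}\mathbf{G}_b, \mathbf{W}\mathbf{c} + \mathbf{w}, \mathbf{A}_c, \mathbf{A}_b, \mathbf{b}).
\end{aligned} \tag{3}
$$

C. ReLU Neural Network

In this work, we consider a fully-connected, ReLU activated feedforward neural network $\xi : \mathbb{R}^{n_0} \to \mathbb{R}^{n_d}$, with output $\mathbf{x}_d = \xi(\mathbf{x}_0) \in \mathbb{R}^{n_d}$ given an input $\mathbf{x}_0 \in \mathbb{R}^{n_0}$. We denote by $d \in \mathbb{N}$ the depth of the network and by $n_i$ the width of the $i$th layer. Mathematically,

$$
\mathbf{x}_i = \max(\mathbf{W}_i \mathbf{x}_{i-1} + \mathbf{w}_i, \mathbf{0}), \tag{4a}
$$
$$
\mathbf{x}_d = \mathbf{W}_d \mathbf{x}_{d-1} + \mathbf{w}_d, \tag{4b}
$$

where $\mathbf{W}_i \in \mathbb{R}^{n_i \times n_{i-1}}$, $\mathbf{w}_i \in \mathbb{R}^{n_i}$, $i = 1, \cdots, d-1$, $\mathbf{W}_d \in \mathbb{R}^{n_d \times n_{d-1}}$, $\mathbf{w}_d \in \mathbb{R}^{n_d}$, and max is taken elementwise. We denote $\mathbf{W}_1, \cdots, \mathbf{W}_d$ as weights and $\mathbf{w}_1, \cdots, \mathbf{w}_d$ as biases of the network. The function $\max(\cdot, \mathbf{0})$ is known as an $n_i$-dimensional ReLU activation function for $\mathbf{0} \in \mathbb{R}^{n_i}$.

Consider a hybrid zonotope $P_{i-1} \subset \mathbb{R}^{n_{i-1}}$. By applying the operations in (2) and (3), its image through (4a) is exactly a hybrid zonotope [19], [31]:

$$
\{\max(\mathbf{W}_i \mathbf{x}_{i-1} + \mathbf{w}_i, \mathbf{0}) \mid \mathbf{x}_{i-1} \in P_{i-1}\}
= \begin{bmatrix} \mathbf{0} & \mathbf{I}_{n_i} \end{bmatrix} \left( H_{n_i} \cap_{[\mathbf{I}\ \mathbf{0}]} (\mathbf{W}_i P_{i-1} + \mathbf{w}_i) \right), \tag{5}
$$

where $H_{n_i} \subset \mathbb{R}^{2n_i}$ is the graph of an $n_i$-dimensional ReLU activation function over a hypercube domain $\{\mathbf{x} \mid -a\mathbf{1} \le \mathbf{x} \le a\mathbf{1}\}$ for some $a > 0$, which can be represented exactly by a hybrid zonotope as in [31]:

$$
\begin{aligned}
H_{n_i} &= \left\{ \begin{bmatrix} \mathbf{x} \\ \max(\mathbf{x}, \mathbf{0}) \end{bmatrix} \,\middle|\, -a\mathbf{1} \le \mathbf{x} \le a\mathbf{1} \right\}, \\
&= \mathcal{HZ}\left(
\begin{bmatrix} \mathbf{I} \otimes \begin{bmatrix} -\frac{a}{2} & -\frac{a}{2} & 0 & 0 \end{bmatrix} \\ \mathbf{I} \otimes \begin{bmatrix} 0 & -\frac{a}{2} & 0 & 0 \end{bmatrix} \end{bmatrix},
\begin{bmatrix} -\frac{a}{2}\mathbf{I} \\ \mathbf{0} \end{bmatrix},
\frac{a}{2}\mathbf{1},
\mathbf{I} \otimes \begin{bmatrix} \mathbf{I}_2 & \mathbf{I}_2 \end{bmatrix},
\mathbf{I} \otimes \begin{bmatrix} 1 \\ -1 \end{bmatrix},
\mathbf{1}
\right),
\end{aligned} \tag{6}
$$

where $\otimes$ is the Kronecker product. Note that (5) holds as long as $a$ is large enough [19]. As such, the reachable set $P_d \subset \mathbb{R}^{n_d}$ of a hybrid zonotope $Z = P_0 \subset \mathbb{R}^{n_0}$ through a ReLU neural network can be obtained by applying (5) $d - 1$ times, before applying an affine transformation parameterized by $\mathbf{W}_d$ and $\mathbf{w}_d$. This way, if $Z$ has $n_{g,Z}$ continuous generators, $n_{b,Z}$ binary generators, and $n_{c,Z}$ constraints, then $P_d$ will have $n_{g,Z} + n_0 + 4n_n$ continuous generators, $n_{b,Z} + n_n$ binary generators, and $n_{c,Z} + n_0 + 3n_n$ constraints [19], where $n_n := n_1 + \cdots + n_{d-1}$ denotes the number of neurons.

III. PROBLEM STATEMENT

Our goal in this paper is to design a ReLU neural network training method such that the reachable set of a given input set through the network avoids some unsafe regions. As per most other training methods, we assume that the structure (i.e. depth and widths) of the ReLU neural network is fixed as a user choice, and we focus only on updating its weights and biases (a.k.a. trainable parameters). Mathematically, we want to tackle the following problem:

Problem 1 (Training the Reachable Set of a Neural Network to Avoid Unsafe Regions). Given an input set $Z = \mathcal{HZ}(\mathbf{G}_{c,Z}, \mathbf{G}_{b,Z}, \mathbf{c}_Z, \mathbf{A}_{c,Z}, \mathbf{A}_{b,Z}, \mathbf{b}_Z) \subset \mathbb{R}^{n_0}$ with $n_{g,Z}$ continuous generators, $n_{b,Z}$ binary generators, and $n_{c,Z}$ constraints, an unsafe region $U = \mathcal{HZ}(\mathbf{G}_{c,U}, \mathbf{G}_{b,U}, \mathbf{c}_U, \mathbf{A}_{c,U}, \mathbf{A}_{b,U}, \mathbf{b}_U) \subset \mathbb{R}^{n_d}$ with $n_{g,U}$ continuous generators, $n_{b,U}$ binary generators, and $n_{c,U}$ constraints, and a ReLU neural network $\xi$ with fixed depth $d$ and widths $n_0, \cdots, n_d$, we want to find $\mathbf{W}_1, \cdots, \mathbf{W}_d$, $\mathbf{w}_1, \cdots, \mathbf{w}_d$ such that

$$
Q := \{\xi(\mathbf{x}) \mid \mathbf{x} \in Z\} \cap U = \emptyset. \tag{7}
$$

Of course, a trivial solution would be to set $\mathbf{W}_d = \mathbf{0}$ and $\mathbf{w}_d \notin U$, but this kind of solution is not useful. Instead, we aim to design a differentiable loss function such that (7) can be achieved by following a gradient and updating the trainable parameters via backpropagation [35]. Doing so allows our method to integrate with other loss functions to achieve additional objectives, and makes the training applicable to ReLU networks with other structural constraints, such as when they are embedded in a dynamical system [5]–[7].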
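The closed-form operations of Sec. II map directly onto array code. The following is a minimal NumPy sketch of (2), (3), (5), and (6), not the authors' implementation: the class `HybridZonotope` and the functions `affine_map`, `generalized_intersection`, `relu_graph`, and `relu_layer_image` are hypothetical names introduced here, the ReLU graph is assumed to use the stacked ordering [x; max(x, 0)], and the bound `a` is assumed to be large enough for (5) to hold.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class HybridZonotope:
    """HZ(Gc, Gb, c, Ac, Ab, b) as in (1); hypothetical container for this sketch."""
    Gc: np.ndarray
    Gb: np.ndarray
    c: np.ndarray
    Ac: np.ndarray
    Ab: np.ndarray
    b: np.ndarray


def affine_map(P, W, w):
    """W P + w, following (3): only the generators and the center change."""
    return HybridZonotope(W @ P.Gc, W @ P.Gb, W @ P.c + w, P.Ac, P.Ab, P.b)


def generalized_intersection(P1, P2, R):
    """{x in P1 | R x in P2}, following (2)."""
    n1 = P1.Gc.shape[0]
    (nc1, ng1), (nc2, ng2) = P1.Ac.shape, P2.Ac.shape
    nb1, nb2 = P1.Gb.shape[1], P2.Gb.shape[1]
    Gc = np.hstack([P1.Gc, np.zeros((n1, ng2))])
    Gb = np.hstack([P1.Gb, np.zeros((n1, nb2))])
    Ac = np.vstack([
        np.hstack([P1.Ac, np.zeros((nc1, ng2))]),
        np.hstack([np.zeros((nc2, ng1)), P2.Ac]),
        np.hstack([R @ P1.Gc, -P2.Gc]),
    ])
    Ab = np.vstack([
        np.hstack([P1.Ab, np.zeros((nc1, nb2))]),
        np.hstack([np.zeros((nc2, nb1)), P2.Ab]),
        np.hstack([R @ P1.Gb, -P2.Gb]),
    ])
    b = np.concatenate([P1.b, P2.b, P2.c - R @ P1.c])
    return HybridZonotope(Gc, Gb, P1.c, Ac, Ab, b)


def relu_graph(n, a):
    """Graph of an n-dimensional ReLU over [-a, a]^n, stacked as [x; max(x, 0)], as in (6)."""
    eye = np.eye(n)
    Gc = np.vstack([
        np.kron(eye, [[-a / 2, -a / 2, 0.0, 0.0]]),
        np.kron(eye, [[0.0, -a / 2, 0.0, 0.0]]),
    ])
    Gb = np.vstack([-a / 2 * eye, np.zeros((n, n))])
    Ac = np.kron(eye, np.hstack([np.eye(2), np.eye(2)]))
    Ab = np.kron(eye, [[1.0], [-1.0]])
    return HybridZonotope(Gc, Gb, a / 2 * np.ones(2 * n), Ac, Ab, np.ones(2 * n))


def relu_layer_image(P, W, w, a):
    """Image of P through x -> max(W x + w, 0), following (5)."""
    n = W.shape[0]
    select_input = np.hstack([np.eye(n), np.zeros((n, n))])   # [I 0]
    select_output = np.hstack([np.zeros((n, n)), np.eye(n)])  # [0 I]
    cut = generalized_intersection(relu_graph(n, a), affine_map(P, W, w), select_input)
    return affine_map(cut, select_output, np.zeros(n))


# Unit box in R^2 (no binaries, no constraints) through one ReLU layer of width 3.
Z = HybridZonotope(np.eye(2), np.zeros((2, 0)), np.zeros(2),
                   np.zeros((0, 2)), np.zeros((0, 0)), np.zeros(0))
W, w = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]), np.zeros(3)
P1 = relu_layer_image(Z, W, w, a=10.0)
```

In this sketch, a layer of width n adds 4n continuous generators, n binary generators, and 3n constraints to those of the incoming set, so the representation grows linearly with the number of neurons. Because every step is a matrix product or concatenation, the same construction could be written with differentiable tensors so that the constraint matrices depend on the weights.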
IV. METHODS

In this section, we first formulate a MILP to check whether a hybrid zonotope is empty. Then, we explain how to obtain useful gradient information from this MILP to train the ReLU network such that the reachable set is out of the unsafe region.

A. Hybrid Zonotope Emptiness Check

Before constructing a loss function for training, we first need a way to check whether (7) is true. From (2) and (5), the left-hand side of (7), $Q$, can be straightforwardly computed as a hybrid zonotope $\mathcal{HZ}(\mathbf{G}_{c,Q}, \mathbf{G}_{b,Q}, \mathbf{c}_Q, \mathbf{A}_{c,Q}, \mathbf{A}_{b,Q}, \mathbf{b}_Q) \subset \mathbb{R}^{n_d}$ with $n_{g,Q} = n_{g,Z} + n_0 + 4n_n + n_{g,U}$ continuous generators, $n_{b,Q} = n_{b,Z} + n_n + n_{b,U}$ binary generators, and $n_{c,Q} = n_{c,Z} + n_0 + 3n_n + n_{c,U} + n_d$ constraints. Then, the image of the input set is not in collision with the unsafe region iff $Q$ is empty. To check whether a hybrid zonotope is empty, existing methods formulate a feasibility MILP with $n_{g,Q}$ continuous variables and $n_{b,Q}$ binary variables [27]:

$$
\begin{aligned}
\text{find} \quad & \mathbf{z}_c, \mathbf{z}_b, \\
\text{s.t.} \quad & \mathbf{A}_{c,Q} \mathbf{z}_c + \mathbf{A}_{b,Q} \mathbf{z}_b = \mathbf{b}_Q, \\
& \|\mathbf{z}_c\|_\infty \le 1, \\
& \mathbf{z}_b \in \{-1, 1\}^{n_b},
\end{aligned} \tag{8}
$$

which is infeasible iff $Q = \emptyset$. Note that (8) is NP-complete [36]. However, not only is it not always feasible, it is also unclear how to derive a loss function from the optimizers to drive $Q$ to be empty. Instead, consider the following MILP with one more continuous variable than (8):

Proposition 2 (Hybrid Zonotope Emptiness Check). Given a hybrid zonotope $P = \mathcal{HZ}(\mathbf{G}_c, \mathbf{G}_b, \mathbf{c}, \mathbf{A}_c, \mathbf{A}_b, \mathbf{b}) \subset \mathbb{R}^n$, where $\mathbf{A}_c \in \mathbb{R}^{n_c \times n_g}$ and $\mathbf{A}_b \in \mathbb{R}^{n_c \times n_b}$, consider the following MILP:

$$
\begin{aligned}
\min \quad & r, \\
\text{s.t.} \quad & \mathbf{A}_c \mathbf{z}_c + \mathbf{A}_b \mathbf{z}_b = \mathbf{b}, \\
& \|\mathbf{z}_c\|_\infty \le r, \\
& \mathbf{z}_b \in \{-1, 1\}^{n_b},
\end{aligned} \tag{9}
$$

where $r \in \mathbb{R}$. If $r^*$ is the optimal value of (9), then $P = \emptyset$ iff $r^* > 1$.

Proof. This follows from the definition of hybrid zonotope in (1).

By construction, (9) is feasible as long as $\exists\, \mathbf{z}_c \in \mathbb{R}^{n_g}$, $\mathbf{z}_b \in \{-1, 1\}^{n_b}$ such that $\mathbf{A}_c \mathbf{z}_c + \mathbf{A}_b \mathbf{z}_b = \mathbf{b}$. If this condition is not met for $Q$, then we have $Q = \emptyset$ anyway and no training is needed. Importantly, it has been shown in [26] that the minimum upper bound of the norm of the continuous coefficients is useful for gauging the extent of collision between two constrained zonotopes, which are subsets of a hybrid zonotope. As such, (9) gives a good foundation for constructing a loss function for encouraging $Q$ to be empty.

B. Loss Function to Encourage Emptiness

We now construct a loss function which, when minimized, makes $Q$ empty. Naïvely, since $Q = \emptyset$ iff $r^* > 1$, where $r^*$ is the optimal value of (9) with $P = Q$, we can construct the loss function $\ell \in \mathbb{R}$ as:

$$
\ell = 1 - r^*, \tag{10}
$$

such that when $\ell$ is decreased to a negative value, we must have $Q = \emptyset$. To minimize $\ell$ using backpropagation, from the chain rule, we must compute $\frac{\partial \ell}{\partial r^*}$, $\frac{\partial r^*}{\partial \mathbf{A}_{c,Q}}$, $\frac{\partial r^*}{\partial \mathbf{A}_{b,Q}}$, $\frac{\partial r^*}{\partial \mathbf{b}_Q}$, $\frac{\partial \mathbf{A}_{c,Q}}{\partial \mathbf{W}_1}, \cdots, \frac{\partial \mathbf{b}_Q}{\partial \mathbf{W}_d}$, and $\frac{\partial \mathbf{A}_{c,Q}}{\partial \mathbf{w}_1}, \cdots, \frac{\partial \mathbf{b}_Q}{\partial \mathbf{w}_d}$. Since expressing $\mathbf{A}_{c,Q}$, $\mathbf{A}_{b,Q}$, and $\mathbf{b}_Q$ in terms of $\mathbf{W}_1, \cdots, \mathbf{W}_d$ and $\mathbf{w}_1, \cdots, \mathbf{w}_d$ involves only basic matrix operations à la (2) and (5), $\frac{\partial \ell}{\partial r^*}$, $\frac{\partial \mathbf{A}_{c,Q}}{\partial \mathbf{W}_1}, \cdots, \frac{\partial \mathbf{b}_Q}{\partial \mathbf{W}_d}$, and $\frac{\partial \mathbf{A}_{c,Q}}{\partial \mathbf{w}_1}, \cdots, \frac{\partial \mathbf{b}_Q}{\partial \mathbf{w}_d}$ can be straightforwardly obtained from automatic differentiation [37]. However, obtaining $\frac{\partial r^*}{\partial \mathbf{A}_{c,Q}}$, $\frac{\partial r^*}{\partial \mathbf{A}_{b,Q}}$, and $\frac{\partial r^*}{\partial \mathbf{b}_Q}$ involves differentiation through an MILP. Since the optima of an MILP can remain unchanged under small differences in its parameters, its gradient can be 0 or non-existent, which are uninformative [38]. Instead, consider the following convex relaxation of (9):

$$
\begin{aligned}
\min \quad & \tilde{r} - \mu\left(\mathbf{1}^\top \ln(\tilde{\mathbf{z}}_{c1}) + \mathbf{1}^\top \ln(\tilde{\mathbf{z}}_{c2}) + \mathbf{1}^\top \ln(\tilde{\mathbf{z}}_b) + \ln(\tilde{r}) + \mathbf{1}^\top \ln(\mathbf{s})\right), \\
\text{s.t.} \quad & \mathbf{A}_c(\tilde{\mathbf{z}}_{c1} - \tilde{\mathbf{z}}_{c2}) + \mathbf{A}_b(2\tilde{\mathbf{z}}_b - \mathbf{1}) = \mathbf{b}, \\
& \begin{bmatrix} \tilde{\mathbf{z}}_{c1} - \tilde{\mathbf{z}}_{c2} - \tilde{r}\mathbf{1} \\ \tilde{\mathbf{z}}_{c2} - \tilde{\mathbf{z}}_{c1} - \tilde{r}\mathbf{1} \\ \tilde{\mathbf{z}}_b \end{bmatrix} + \mathbf{s} = \begin{bmatrix} \mathbf{0}_{n_g \times 1} \\ \mathbf{0}_{n_g \times 1} \\ \mathbf{1} \end{bmatrix},
\end{aligned} \tag{11}
$$

where $\tilde{r} \in \mathbb{R}$, $\tilde{\mathbf{z}}_{c1} \in \mathbb{R}^{n_g}$, $\tilde{\mathbf{z}}_{c2} \in \mathbb{R}^{n_g}$, $\tilde{\mathbf{z}}_b \in \mathbb{R}^{n_b}$, $\mathbf{s} \in \mathbb{R}^{n_g + n_g + n_b}$, $\mu \in \mathbb{R}_+$ is the cut-off multiplier from the solver [39], and $\ln(\cdot)$ is applied elementwise. (11) is the standard linear program (LP) form of (9) with log-barrier regularization and without the integrality constraints, and can be obtained by replacing $r$ with $\tilde{r}$, $\mathbf{z}_c$ with $\tilde{\mathbf{z}}_{c1} - \tilde{\mathbf{z}}_{c2}$, and $\mathbf{z}_b$ with $2\tilde{\mathbf{z}}_b - \mathbf{1}$ (such that all variables are non-negative), and introducing slack variable $\mathbf{s}$ (such that inequality constraints become equality constraints) [40].

The optimization problem (11) can be solved quickly using solvers such as IntOpt [39]. Moreover, if $\tilde{r}^*$ is the optimal value of (11), $\frac{\partial \tilde{r}^*}{\partial \mathbf{A}_c}$, $\frac{\partial \tilde{r}^*}{\partial \mathbf{A}_b}$, and $\frac{\partial \tilde{r}^*}{\partial \mathbf{b}}$ can be obtained by differentiating the Karush-Kuhn-Tucker (KKT) conditions of (11); we refer the reader to [38, Appendix B] for the mathematical details. Not only are these gradients well-defined, easily computable, and informative, but they have also been shown to outperform other forms of convex relaxation in computation speed and in minimizing loss functions derived from MILPs [38, Appendix E].

Therefore, instead of the loss function $\ell$, we propose to backpropagate with respect to a surrogate loss function $\tilde{\ell} \in \mathbb{R}$:

$$
\tilde{\ell} = 1 - \tilde{r}^*, \tag{12}
$$

where $\tilde{r}^*$ is the optimal value of (11) with $\mathbf{A}_c = \mathbf{A}_{c,Q}$, $\mathbf{A}_b = \mathbf{A}_{b,Q}$, and $\mathbf{b} = \mathbf{b}_Q$.

Unfortunately, since $\tilde{\ell}$ does not necessarily equal $\ell$, we cannot use (12) to simultaneously verify and train the neural network. In practice, we solve (8) in between some iterations of training with (12) to check whether (7) has been achieved. If it has, then the training is complete and Problem 1 has been solved.

V. EXPERIMENTS

We now assess the scalability of our method by observing the results under different problem parameters. We also wish to compare our results with [26] to assess our contribution to the state of the art. All experiments were performed in Python¹ on a desktop computer with a 24-core i9 CPU, 32 GB RAM, and an NVIDIA RTX 4090 GPU.

¹ We are preparing our code for open-source release.

A. Experiment Setup and Method

We test our method’s performance under different conditions by varying the width of the first layer $n_1 \in \{10, 20, 30\}$, the depth of the network $d \in \{2, 3, 4\}$, the input dimension $n_0 \in \{2, 4, 6\}$, the output dimension $n_d \in \{2, 4, 6\}$, the complexity of the input set $n_{b,Z} \in \{0, 10, 20\}$, and the complexity of the unsafe region $n_{b,U} \in \{0, 10, 20\}$. We opted not to show results from higher dimensions, set complexities, and larger networks here as we do not wish to introduce large confounding variables from the increased difficulties in training with standard supervised learning.

We define input and unsafe sets as follows. The input set is given by:

$$
Z = \mathcal{HZ}\left(\frac{1}{m_Z}\mathbf{I},\ \frac{1}{m_Z}\mathbf{1}_{1 \times (m_Z - 1)} \otimes \mathbf{I},\ \mathbf{0},\ [\ ],\ [\ ],\ [\ ]\right), \tag{13a}
$$
$$
m_Z = \frac{n_{b,Z}}{n_0} + 1, \tag{13b}
$$

which is a hypercube with length 2 centered at the origin formed from a union of $m_Z^{n_0}$ smaller hypercubes (represented as $2^{m_Z - 1}$ overlapping hypercubes). We want its image through the neural network to avoid the unsafe region:

$$
U = \mathcal{HZ}\left(\frac{0.5}{m_U}\mathbf{I},\ \frac{0.5}{m_U}\mathbf{1}_{1 \times (m_U - 1)} \otimes \mathbf{I},\ 1.5\,\mathbf{1},\ [\ ],\ [\ ],\ [\ ]\right), \tag{14a}
$$
$$
m_U = \frac{n_{b,U}}{n_d} + 1, \tag{14b}
$$

which is a hypercube with length 1 centered at $1.5\,\mathbf{1}_{n_d \times 1}$ formed from a union of $m_U^{n_d}$ smaller hypercubes (represented as $2^{m_U - 1}$ overlapping hypercubes). We choose these particular parameters such that the reachable set of the input set and the unsafe region all have shapes similar to those shown in Fig. 2a before we apply our method in Sec. IV. Also, when $n_0 = 2$, $n_d = 2$, $n_{b,Z} = 0$, and $n_{b,U} = 0$, we recover the problem setup in [26], which we will compare our method against.

We then ensure our ReLU neural network represents a nonlinear function that intersects the unsafe set. In particular, we use standard supervised learning (implemented in PyTorch [37]) to train the network to approximate a function $f : \mathbb{R}^{n_0} \to \mathbb{R}^{n_d}$ defined as:

$$
f(\mathbf{x}) = \mathbf{1}_{0.5 n_d \times 1} \otimes \begin{bmatrix} x_{\text{odd}}^2 + \sin(x_{\text{even}}) \\ x_{\text{even}}^2 + \sin(x_{\text{odd}}) \end{bmatrix}, \tag{15a}
$$
$$
x_{\text{odd}} = \frac{1}{\lceil 0.5 n_0 \rceil} \sum_{i=1}^{n_0} x_i\, 1_{\text{odd}}(i), \tag{15b}
$$
$$
x_{\text{even}} = \frac{1}{\lfloor 0.5 n_0 \rfloor} \sum_{i=1}^{n_0} x_i\, (1 - 1_{\text{odd}}(i)), \tag{15c}
$$

where $\lceil \cdot \rceil$ is the ceiling function, $\lfloor \cdot \rfloor$ is the floor function, $\mathbf{x} = [x_1, \cdots, x_{n_0}]^\top$, and $1_{\text{odd}} : \mathbb{R}_+ \to \{0, 1\}$ is the indicator function for odd numbers, such that $1_{\text{odd}}(i) = 1$ if $i$ is odd and $1_{\text{odd}}(i) = 0$ if $i$ is even.

Given the pretrained network, we begin training to obey the safety constraint. In each training iteration, we use IntOpt [39] to compute the loss function (12) and PyTorch [37] with optim.SGD as the optimizer to update the trainable parameters in the network. Every 10 iterations, we use Gurobi [41] to solve the MILP in (8) to check the emptiness of $Q$. We are successful in solving Problem 1 if $Q = \emptyset$, at which point we terminate the training instead of updating the parameters. Note that each training iteration is done on CPU instead of GPU. Furthermore, we chose not to solve the MILP in every iteration because solving (8) can be many times slower than solving (11).

We also compare against a constrained zonotope safe training method [26]. We tested the method with $n_1 = 10$, $d = 2$, $n_0 = 2$, $n_d = 2$, $n_{b,Z} = 0$, and $n_{b,U} = 0$, which are the parameters used in the example in [26]. To compare the scalability of both methods, we also tested [26] on $n_1 = 20$ and $n_1 = 30$. To ensure fairness, we do not include the objective loss and only add the constraint loss when it is positive (see [26] for details). We terminate the training once the constraint loss has reached zero (i.e. the reachable set is out of collision with the unsafe set).

B. Hypotheses

Since the most complex operations in our method are solving the relaxed LP (11) and the MILP (8), we expect our performance to be dependent on the solvers’ (i.e. IntOpt and Gurobi) ability to scale with the number of variables and constraints, which in turn scale linearly with the dimensions, network size, and set complexity (see Sec. IV-A). As such, we expect the computation time for each iteration of our method to be significantly faster than that of [26], which scales exponentially with the number of neurons. That said, since [26] verifies (7) in every iteration (whereas our method only checks it every 10 iterations), it is also possible for [26] to terminate the training earlier than our method does.

C. Results and Discussion

We report the results of our experiments in Table I. All reachable sets have been successfully driven out of the unsafe regions, except for [26] with $n_1 = 30$, which failed to even compute the reachable set. We show the training progression of one of the experiments in Fig. 2, which clearly shows the loss function driving the reachable set out of collision.
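To make Proposition 2 and the relaxation behind (12) concrete, the sketch below solves (9) with SciPy's `milp`, once with the integrality constraints and once without them. It is an illustration under assumptions rather than the pipeline used in the experiments: it substitutes SciPy's bundled solver for Gurobi and IntOpt, omits the log-barrier term of (11), and does not compute gradients. The function name `emptiness_radius` is hypothetical.

```python
import numpy as np
from scipy.optimize import Bounds, LinearConstraint, milp


def emptiness_radius(Ac, Ab, b, relaxed=False):
    """Optimal value of (9) for constraints (Ac, Ab, b); np.inf if infeasible.

    With relaxed=True the binaries may take any value in [-1, 1], which is
    the LP relaxation of (9) without the log-barrier term of (11).
    """
    Ac, Ab, b = (np.asarray(M, dtype=float) for M in (Ac, Ab, b))
    ng, nb = Ac.shape[1], Ab.shape[1]
    # Decision vector v = [zc (ng), t (nb), r], with zb = 2 t - 1 and t in [0, 1].
    cost = np.r_[np.zeros(ng + nb), 1.0]
    # Ac zc + Ab (2 t - 1) = b   <=>   Ac zc + 2 Ab t = b + Ab 1
    A_eq = np.hstack([Ac, 2.0 * Ab, np.zeros((Ac.shape[0], 1))])
    rhs = b + Ab @ np.ones(nb)
    # ||zc||_inf <= r   <=>   zc - r 1 <= 0  and  -zc - r 1 <= 0
    eye, pad, col = np.eye(ng), np.zeros((ng, nb)), -np.ones((ng, 1))
    A_in = np.vstack([np.hstack([eye, pad, col]), np.hstack([-eye, pad, col])])
    constraints = [LinearConstraint(A_in, -np.inf, 0.0)]
    if A_eq.shape[0] > 0:
        constraints.append(LinearConstraint(A_eq, rhs, rhs))
    lower = np.r_[np.full(ng, -np.inf), np.zeros(nb), 0.0]
    upper = np.r_[np.full(ng, np.inf), np.ones(nb), np.inf]
    t_kind = np.zeros(nb) if relaxed else np.ones(nb)  # 1 marks an integer variable
    integrality = np.r_[np.zeros(ng), t_kind, 0.0]
    res = milp(cost, integrality=integrality, bounds=Bounds(lower, upper),
               constraints=constraints)
    return res.fun if res.status == 0 else np.inf


# Toy set with one continuous and one binary coefficient: zc + zb = b.
Ac, Ab = np.array([[1.0]]), np.array([[1.0]])
r_exact = emptiness_radius(Ac, Ab, [0.5])                  # zb = 1, zc = -0.5
r_relaxed = emptiness_radius(Ac, Ab, [0.5], relaxed=True)  # zb = 0.5, zc = 0
surrogate_loss = 1.0 - r_relaxed                           # (12) without the barrier term
is_empty = emptiness_radius(Ac, Ab, [3.0]) > 1.0           # Proposition 2
```

In this toy case the exact optimum is 0.5 while the relaxed optimum is 0, so the surrogate loss differs from (10). This is the gap that motivates solving the exact check (8) periodically instead of relying on the relaxed value for verification.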
[Figure 2: training progression, with panels at 0, 10, 20, 30, and 40 iterations.]
TABLE I
Summary of the duration required to drive a neural network’s reachable set (i.e. image of a given input set) out of an unsafe region under different network sizes, input and output dimensions, and complexities of the input set and the unsafe region. The dimensions $n_{g,Q}$, $n_{c,Q}$, and $n_{b,Q}$ of the hybrid zonotope intersection of the reachable set and the unsafe region represent the complexity of the LPs and MILPs that must be solved during the training iterations, which took up a majority of the computation time.
in the LPs and MILPs solved are also an order of magnitude larger. Despite this, our method is still able to drive the reachable set out of collision with the unsafe set in 20 iterations after 490.290 s. A majority of the computation time was spent solving the MILPs, which took 54.740 s and 194.987 s at the 10th and 20th iteration. In contrast, solving (11) took less than 0.1 s in each iteration.

This demo presents preliminary results on how to train a neural network to obey non-convex constraints with formal guarantees for the first time. However, it also reveals the method’s computational bottleneck of solving the NP-complete problem in (8), which limits its utility in applications that require larger networks and more complex sets. We plan to address this in future work by experimenting with other neural network verification techniques, or by developing over-approximation methods using hybrid zonotopes with simpler representations.

VII. CONCLUSION

This work proposes a new training method for enforcing constraint satisfaction by extracting learning signals from neural network reachability analysis using hybrid zonotopes. This method is exact and can handle non-convex input sets and unsafe regions, and has been shown to be fast and scale fairly well with respect to network sizes, dimensions, and set complexities, significantly outperforming our previous work in [26].

Limitations: Our current implementation has several drawbacks to be addressed in future work. Firstly, while the training step remains fast and efficient with an increase in network sizes and set complexities, the verification step does not, since the MILP in (8) is NP-complete. Secondly, the method is limited to fully-connected networks with ReLU activation functions, which prevents it from being applied to more interesting problems such as those with convolutional
neural networks (CNNs) or those with neural networks embedded in dynamical systems. Finally, as with other neural network training methods, backpropagation through the loss function does not guarantee convergence towards the global minimum. Thus, our method cannot guarantee the discovery of a solution, even if it exists.

Future Work: Going forward, we hope to explore our method’s compatibility with other verification methods to overcome the NP-complete problem in solving the MILP, at the cost of potentially losing exactness. We also plan to apply techniques from [31], [33], [34] to train ReLU networks embedded in dynamical systems, and leverage tricks from [18] to apply hybrid zonotope techniques on CNNs. If successful, they could advance safety in camera-based control for autonomous driving [16] or aircraft landing [44]–[46].

Another particularly exciting possibility this method can enable is a form of set-based training, where instead of training a neural network with features and labels as points, we can represent them as sets around the points, which can make the network provably robust against attacks and disturbances for seen examples. This could be enabled by solving the optimization problems (8) and (11) in parallel on GPU using methods similar to [47].

REFERENCES

[1] M. Leshno, V. Y. Lin, A. Pinkus, and S. Schocken, “Multilayer feedforward networks with a nonpolynomial activation function can approximate any function,” Neural networks, vol. 6, no. 6, pp. 861–867, 1993.
[2] G. Dulac-Arnold, N. Levine, D. J. Mankowitz, et al., “Challenges of real-world reinforcement learning: definitions, benchmarks and analysis,” Machine Learning, vol. 110, no. 9, pp. 2419–2468, 2021.
[3] K. Eykholt, I. Evtimov, E. Fernandes, et al., “Robust physical-world attacks on deep learning visual classification,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2018, pp. 1625–1634.
[4] C. Szegedy, “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199, 2013.
[5] B. Ko, H.-J. Choi, C. Hong, J.-H. Kim, O. C. Kwon, and C. D. Yoo, “Neural network-based autonomous navigation for a homecare mobile robot,” in 2017 IEEE International Conference on Big Data and Smart Computing (BigComp), IEEE, 2017, pp. 403–406.
[6] E. N. Johnson, A. J. Calise, and J. E. Corban, “Adaptive guidance and control for autonomous launch vehicles,” in 2001 IEEE Aerospace Conference Proceedings (Cat. No. 01TH8542), IEEE, vol. 6, 2001, pp. 2669–2682.
[7] J. Ni, Y. Chen, Y. Chen, J. Zhu, D. Ali, and W. Cao, “A survey on theories and applications for self-driving cars based on deep learning methods,” Applied Sciences, vol. 10, no. 8, p. 2749, 2020.
[8] N. H. T. S. Administration et al., “Summary report: standing general order on crash reporting for level 2 advanced driver assistance
[13] K.-C. Hsu, D. P. Nguyen, and J. F. Fisac, “Isaacs: Iterative soft adversarial actor-critic for safety,” in Learning for Dynamics and Control Conference, PMLR, 2023, pp. 90–103.
[14] R. Balestriero and Y. LeCun, “POLICE: Provably optimal linear constraint enforcement for deep neural networks,” in ICASSP 2023-2023 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), IEEE, 2023, pp. 1–5.
[15] J.-B. Bouvier, K. Nagpal, and N. Mehr, “POLICEd RL: Learning Closed-Loop Robot Control Policies with Provable Satisfaction of Hard Constraints,” arXiv preprint arXiv:2403.13297, 2024.
[16] H.-D. Tran, X. Yang, D. Manzanas Lopez, et al., “NNV: the neural network verification tool for deep neural networks and learning-enabled cyber-physical systems,” in International Conference on Computer Aided Verification, Springer, 2020, pp. 3–17.
[17] H.-D. Tran, D. Manzanas Lopez, P. Musau, et al., “Star-based reachability analysis of deep neural networks,” in Formal Methods–The Next 30 Years: Third World Congress, FM 2019, Porto, Portugal, October 7–11, 2019, Proceedings 3, Springer, 2019, pp. 670–686.
[18] H.-D. Tran, S. Bak, W. Xiang, and T. T. Johnson, “Verification of deep convolutional neural networks using imagestars,” in International conference on computer aided verification, Springer, 2020, pp. 18–42.
[19] J. Ortiz, A. Vellucci, J. Koeln, and J. Ruths, “Hybrid zonotopes exactly represent ReLU neural networks,” in 2023 62nd IEEE Conference on Decision and Control (CDC), IEEE, 2023, pp. 5351–5357.
[20] Y. Zhang and X. Xu, “Safety verification of neural feedback systems based on constrained zonotopes,” in 2022 IEEE 61st Conference on Decision and Control (CDC), IEEE, 2022, pp. 2737–2744.
[21] H. Zhang, T.-W. Weng, P.-Y. Chen, C.-J. Hsieh, and L. Daniel, “Efficient neural network robustness certification with general activation functions,” Advances in neural information processing systems, vol. 31, 2018.
[22] N. Kochdumper, C. Schilling, M. Althoff, and S. Bak, “Open- and closed-loop neural network verification using polynomial zonotopes,” in NASA Formal Methods Symposium, Springer, 2023, pp. 16–36.
[23] T. Ladner and M. Althoff, “Automatic abstraction refinement in neural network verification using sensitivity analysis,” in Proceedings of the 26th ACM International Conference on Hybrid Systems: Computation and Control, 2023, pp. 1–13.
[24] A. Harapanahalli and S. Coogan, “Certified Robust Invariant Polytope Training in Neural Controlled ODEs,” arXiv preprint arXiv:2408.01273, 2024.
[25] J. K. Scott, D. M. Raimondo, G. R. Marseglia, and R. D. Braatz, “Constrained zonotopes: A new tool for set-based estimation and fault detection,” Automatica, vol. 69, pp. 126–136, 2016.
[26] L. K. Chung, A. Dai, D. Knowles, S. Kousik, and G. X. Gao, “Constrained feedforward neural network training via reachability analysis,” arXiv preprint arXiv:2107.07696, 2021.
[27] T. J. Bird, H. C. Pangborn, N. Jain, and J. P. Koeln, “Hybrid zonotopes: A new set representation for reachability analysis of mixed logical dynamical systems,” Automatica, vol. 154, p. 111107, 2023.
[28] T. J. Bird and N. Jain, “Unions and complements of hybrid zonotopes,” IEEE Control Systems Letters, vol. 6, pp. 1778–1783, 2021.
[29] J. Koeln, T. J. Bird, J. Siefert, J. Ruths, H. C. Pangborn, and N. Jain, “zonoLAB: A MATLAB toolbox for set-based control systems analysis using hybrid zonotopes,” in 2024 American Control Conference
systems,” US Department of Transport, 2022. (ACC), IEEE, 2024, pp. 2513–2520.
[9] L. Brunke, M. Greeff, A. W. Hall, et al., “Safe learning in robotics: [30] L. Hadjiloizou, F. J. Jiang, A. Alanwar, and K. H. Johansson,
From learning-based control to safe reinforcement learning,” Annual “Formal Verification of Linear Temporal Logic Specifications Us-
Review of Control, Robotics, and Autonomous Systems, vol. 5, no. 1, ing Hybrid Zonotope-Based Reachability Analysis,” arXiv preprint
pp. 411–444, 2022. arXiv:2404.03308, 2024.
[10] S. Gu, L. Yang, Y. Du, et al., “A review of safe reinforce- [31] Y. Zhang, H. Zhang, and X. Xu, “Backward reachability analysis
ment learning: Methods, theory and applications,” arXiv preprint of neural feedback systems using hybrid zonotopes,” IEEE Control
arXiv:2205.10330, 2022. Systems Letters, vol. 7, pp. 2779–2784, 2023.
[11] Z. Liu, Z. Guo, Y. Yao, et al., “Constrained decision transformer [32] T. J. Bird, J. A. Siefert, H. C. Pangborn, and N. Jain, “A set-based
for offline safe reinforcement learning,” in International Conference approach for robust control co-design,” in 2024 American Control
on Machine Learning, PMLR, 2023, pp. 21 611–21 630. Conference (ACC), IEEE, 2024, pp. 2564–2571.
[12] K. Chakraborty, A. Gupta, and S. Bansal, “Enhancing Safety and [33] H. Zhang, Y. Zhang, and X. Xu, “Hybrid Zonotope-Based Backward
Robustness of Vision-Based Controllers via Reachability Analysis,” Reachability Analysis for Neural Feedback Systems With Nonlinear
arXiv preprint arXiv:2410.21736, 2024. Plant Models,” in 2024 American Control Conference (ACC), IEEE,
2024, pp. 4155–4161.
[34] Y. Zhang and X. Xu, “Reachability analysis and safety verification
of neural feedback systems via hybrid zonotopes,” in 2023 American
Control Conference (ACC), IEEE, 2023, pp. 1915–1921.
[35] D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning
representations by back-propagating errors,” Nature, vol. 323, no. 6088,
pp. 533–536, 1986.
[36] T. Achterberg, R. E. Bixby, Z. Gu, E. Rothberg, and D. Weninger,
“Presolve reductions in mixed integer programming,” INFORMS
Journal on Computing, vol. 32, no. 2, pp. 473–506, 2020.
[37] A. Paszke, S. Gross, F. Massa, et al., “PyTorch: An imperative
style, high-performance deep learning library,” Advances in neural
information processing systems, vol. 32, 2019.
[38] X. Hu, J. Lee, and J. Lee, “Two-Stage Predict+Optimize for MILPs
with Unknown Parameters in Constraints,” Advances in Neural
Information Processing Systems, vol. 36, 2024.
[39] J. Mandi and T. Guns, “Interior point solving for LP-based
prediction+optimisation,” Advances in Neural Information Processing
Systems, vol. 33, pp. 7272–7282, 2020.
[40] S. Boyd and L. Vandenberghe, Convex optimization. Cambridge
University Press, 2004.
[41] Gurobi Optimization, LLC, Gurobi optimizer reference manual, 2021.
[42] S. J. Wright, Primal-dual interior-point methods. SIAM, 1997.
[43] J. A. Siefert, T. J. Bird, A. F. Thompson, et al., “Reachability
analysis using hybrid zonotopes and functional decomposition,”
IEEE Transactions on Automatic Control, 2025.
[44] M. J. Kochenderfer and J. Chryssanthacopoulos, “Robust airborne
collision avoidance through dynamic programming,” Massachusetts
Institute of Technology, Lincoln Laboratory, Project Report ATC-
371, vol. 130, 2011.
[45] M. J. Kochenderfer, J. E. Holland, and J. P. Chryssanthacopoulos,
“Next generation airborne collision avoidance system,” Lincoln
Laboratory Journal, vol. 19, no. 1, pp. 17–33, 2012.
[46] M. J. Kochenderfer, C. Amato, G. Chowdhary, et al., “Optimized
airborne collision avoidance,” 2015.
[47] B. Amos and J. Z. Kolter, “OptNet: Differentiable optimization as a
layer in neural networks,” in International conference on machine
learning, PMLR, 2017, pp. 136–145.