Bayesian Filtering On Graphs
ICASSP 2025 - 2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) | 979-8-3503-6874-1/25/$31.00 ©2025 IEEE | DOI: 10.1109/ICASSP49660.2025.10887710
Abstract—Graph filters are ubiquitous for processing data over graphs. However, most filters obtained from data are point-estimates and may be sensitive to changes in topology or data distributions. Thus, modeling uncertainty in filters is critical to quantify confidence in analyses or improve downstream tasks in low-data regimes. We introduce a Bayesian framework for graph filter design, termed Bayesian graph filters. Given input-output realizations on a graph, we obtain the posterior filter and prior filter precision hyper-parameters via a constrained EM algorithm. The posterior filter leads to uncertainty in its frequency response, which has implications for stability. We study the stability via the integral Lipschitz (IL) property and derive a lower bound for the probability of Bayesian filters being IL. Results show that Bayesian filters can be more stable across the spectrum and under perturbations, provide uncertainty estimates, and can outperform point filters on multiple tasks.

Index Terms—Graph signal processing, graph filter design, Bayesian learning, graph filter stability

This research is supported by the NSF under award CCF-2340481 and the TTW-OTP project GraSPA (project number 19497) financed by the Dutch Research Council (NWO) and by the TU Delft AI programme. Research was sponsored by the Army Research Office and was accomplished under Grant Number W911NF-17-S-0002. The views and conclusions contained in this document are those of the authors and should not be interpreted as representing the official policies, either expressed or implied, of the Army Research Office or the U.S. Army or the U.S. Government. The U.S. Government is authorized to reproduce and distribute reprints for Government purposes notwithstanding any copyright notation herein. Emails: {b.das,e.isufi-1}@tudelft.nl, {nav,segarra}@rice.edu

I. INTRODUCTION

Graph filters are fundamental tools in graph signal processing (GSP) and machine learning [1], [2]. Their ubiquity has yielded numerous applications and theoretical results on their effects on graph signals [2–5]. Analysis of graph filters is thus not only intrinsically valuable for processing data, but obtaining them can also aid other tasks. For example, filter banks and graph neural networks (GNNs) learn compositions of graph filters from data for downstream predictions [6], [7], and network inference can be improved by jointly estimating graph filters [8–10]. Myriad works obtain graph filters from nodal observations, either in the vertex domain [8], [11–13] or the frequency domain [6], [14].

However, real-world data is often imperfect or missing, yielding uncertain estimates. Many tasks can benefit from accounting for confidence in graph filter quality, such as assessing the robustness of a model to perturbations [14], [15]. However, previous methods to estimate filters obtain point-estimates without measuring confidence in the learned filters [5], [8], [11], [16], [17]. While uncertainty in GSP has been considered, it is typically restricted to that of graph signals [18–20] or the underlying topology [21–23]. For example, robust graph filter identification aims to account for noisy data or perturbed connections [14], [16], [24–26]. Uncertainty in GSP has been modeled for recovering graph signals or network topology through Bayesian approaches [19], [22], [23], [27–29], but confidence in graph filter learning has yet to be explored.

We propose a probabilistic model for Bayesian graph filters associated with input and output nodal observations. Representing the filter coefficients as a random variable allows us to quantify uncertainty through the variance. Furthermore, we consider filter design from nodal data perturbed by additive white noise. With a Gaussian prior over the filter coefficients, we apply expectation-maximization (EM) to obtain the Gaussian posterior distribution for the filter along with constrained precision hyper-parameters [30]. In turn, this yields a random graph filter along with its uncertainty at different hops and the correlations between different taps. We further analyze the filter frequency response, which ends up being multivariate Gaussian in the spectral domain as a consequence of the Gaussian filter density. Unlike point-estimates, the uncertainty in the proposed Bayesian filter allows us to provide and empirically show probabilistic guarantees on its stability. In particular, we provide a lower bound on the likelihood that the proposed filter will be stable, i.e., integral Lipschitz (IL) [15]. Our contributions are as follows.

(i) We define Bayesian graph filters for processing graph signals while quantifying their output uncertainty.
(ii) We propose a filter design from an observed graph and pairs of input-output signals through constrained EM with hyper-parameters controlling how much uncertainty to permit.
(iii) We obtain the uncertainty of the Bayesian filter frequency response in the spectral domain. Using this, we derive probabilistic bounds for the stability of the Bayesian graph filters, that is, we show when the graph filters are IL [15].
(iv) We showcase the differences between point and Bayesian filters via numerical experiments on stability, perturbation, and transferability for forecasting, demonstrating that uncertainty in estimation can improve stability and performance in some cases.

II. BAYESIAN GRAPH FILTERS

Consider a graph G = {V, E} with node set V of N nodes and edge set E. Let S ∈ R^{N×N} be a shift operator of G [2], [31], [32]. We have a set D = {(xr, yr)}_{r=1}^{R} of R pairs of graph signals over G, where we consider xr as the input graph signal and yr its corresponding output. In particular, we consider each yr to be the output of a graph filter H(S) = Σ_{k=0}^{K} hk S^k expressed as a polynomial of S, which can be fully characterized by the filter parameters h = [h0, . . . , hK]⊤ [1], [2]. To estimate the filter, we define Sxr = [xr, . . . , S^K xr] ∈ R^{N×(K+1)} to collect xr and its K shifts over G. Our observation model is

yr = H(S)xr + nr = Sxr h + nr,    (1)

where the noise nr ∼ N(0N, β⁻¹IN) is additive white Gaussian with precision β > 0. Typically, graph filters are obtained as point-estimates by solving a problem of the form

min_{h ∈ R^{K+1}} l(yr, xr, h) + g(h),    (2)

where l(yr, xr, h) measures the discrepancy between the filtered input Sxr h and yr, while g(h) enforces a prior on h. Common choices of prior include the squared ℓ2 norm of the filter ||h||₂² or an assumed distribution for the optimal h, which corresponds to the maximum a-posteriori (MAP) filter [18]. Instead, our goal is to learn filters which are random variables that incorporate the uncertainty surrounding the task. We do this via learning a density for the filter parameters h in a Bayesian fashion, which we outline as follows.
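To make the observation model (1) and the point estimate (2) concrete, the following NumPy sketch simulates input-output pairs over a graph and recovers a ridge-regularized point filter in closed form. This is a minimal illustration, not the paper's experimental setup: the cycle-graph shift operator, the noise precision, and the regularization weight are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, R = 20, 3, 50      # nodes, filter order, number of signal pairs
beta = 100.0             # noise precision (noise variance is 1/beta)

# A simple shift operator: normalized adjacency of a cycle graph (illustrative
# choice with a well-spread spectrum; any graph shift operator works here).
idx = np.arange(N)
A = np.zeros((N, N))
A[idx, (idx + 1) % N] = 1.0
A = A + A.T
S = A / 2.0

h_true = rng.standard_normal(K + 1)   # ground-truth filter taps h

def shift_matrix(S, x, K):
    """Collect x and its K shifts: S_x = [x, Sx, ..., S^K x], shape (N, K+1)."""
    cols = [x]
    for _ in range(K):
        cols.append(S @ cols[-1])
    return np.stack(cols, axis=1)

# Observation model (1): y_r = S_{x_r} h + n_r with white Gaussian noise
X = [rng.standard_normal(N) for _ in range(R)]
Sx = [shift_matrix(S, x, K) for x in X]
Y = [Sxr @ h_true + rng.standard_normal(N) / np.sqrt(beta) for Sxr in Sx]

# Point estimate (2) with g(h) = mu * ||h||_2^2: ridge regression over all pairs
mu = 1e-3
G = sum(Sxr.T @ Sxr for Sxr in Sx) + mu * np.eye(K + 1)
b = sum(Sxr.T @ yr for Sxr, yr in zip(Sx, Y))
h_point = np.linalg.solve(G, b)

print(np.round(h_true, 3))
print(np.round(h_point, 3))   # close to h_true for large R and high beta
```

Note that this closed-form solution is a single deterministic vector; the Bayesian treatment below replaces it with a full Gaussian density over h.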
Authorized licensed use limited to: George Mason University. Downloaded on July 28,2025 at 23:40:23 UTC from IEEE Xplore. Restrictions apply.
As is common in Bayesian inference, we consider a prior p(h) on the parameters h as h ∼ N(0, A). We further specify A = diag(α0⁻¹, . . . , αK⁻¹), implying independent filter coefficients with αk as the prior precision for the k-th hop coefficient hk. Given the prior p(h), the posterior density p(h|D) obeys

p(h|D) ∝ p(y|h, x)p(h)p(x),    (3)

where p(y|h, x) and p(x) are the densities of the output and input, respectively. We assume the input density p(x) does not depend on h, a common assumption such as for stationary observations yr, where p(x) is white Gaussian noise [33], [34]. The inference problem thus corresponds to (i) finding the posterior filter density p(h|D) and (ii) obtaining the precision parameters of the noise β and the filter α0, . . . , αK. We do this via the expectation-maximization (EM) algorithm [30], a widely used statistical framework to maximize the joint likelihood of the data D and h. This can also be seen as solving a linear regression problem using EM. In the E step, the posterior Gaussian density is obtained with covariance and mean as

Σp = ( β Σ_{r=1}^{R} Sxr⊤ Sxr + A⁻¹ )⁻¹,    µp = Σp β Σ_{r=1}^{R} Sxr⊤ yr,    (4)

given the current {αk}_{k=0}^{K} and β. In the M step, we update

αk = 1 / ( [Σp]kk + [µp]k² ),    β = NR / Σ_{r=1}^{R} ( ||yr − Sxr µp||₂² + tr(Sxr Σp Sxr⊤) ).    (5)

In cases where the filter order K is low or the data is smooth over G, the EM algorithm may converge to a point filter without measuring confidence. More specifically, uncertainty becomes negligible as {αk}_{k=0}^{K} and β increase, so we provide constraints for these values to control the level of uncertainty in the estimates. We apply constrained EM [35] by restricting αk ∈ [αmin, αmax] and β ∈ [βmin, βmax] such that 1/αmin and 1/αmax are the maximum and minimum prior variances allowed for each filter tap. This allows a degree of flexibility in the filter design. Given the dataset D and the graph G, the constrained EM approach provides our Bayesian graph filter hp ∼ N(µp, Σp) and the hyper-parameters β and {αk}_{k=0}^{K}. For pre-specified β and A, the mean µp in (4) is the MAP filter [18]. Note that we solve for the hyper-parameters in a probabilistic sense (EM), whereas point filter estimates typically rely on cross-validation and related techniques [36].

With Bayesian inference, we can interpret the graph filter distribution. The k-th element of µp is the LMMSE-type estimate of the filter coefficient hk, which weights the information from the k-hop neighborhood. The k-th diagonal of Σp indicates the confidence in this weight, or the uncertainty associated with accumulating information from the k-th hop. The covariance Σp is influenced by the correlation between shifts evaluated across the training set. This is seen from the term Σ_{r=1}^{R} Sxr⊤ Sxr, which is the sum of outer products of vectors that contain the signal and all its shifts at a fixed node. The off-diagonal elements of Σp highlight the relation between filter taps of different neighborhood hops, also originating from Σ_{r=1}^{R} Sxr⊤ Sxr, and can reveal information about how the graph structure influences the filter. This is important in the context of sampling from the posterior filter for unseen data and potentially on new graphs.

III. FREQUENCY ANALYSIS

Bayesian Filter Frequency Response. Let λ = [λ1, . . . , λN]⊤ be the vector of eigenvalues of S. Let V ∈ R^{N×(K+1)} be the Vandermonde matrix of the eigenvalues with i-th row vi := [1, λi, . . . , λi^K]⊤. The frequency response of a filter hp ∼ N(µp, Σp) is ĥp(λ) = Σ_{k=0}^{K} [hp]k λ^k. The response over the spectrum of G is ĥp(λ) = Vhp. Since hp is multivariate Gaussian, ĥp(λ) ∼ N(µ̂p, Σ̂p) is also Gaussian with mean and covariance

µ̂p := Vµp,    Σ̂p := VΣpV⊤.    (6)

We note a few key implications. First, the frequency response at each eigenvalue is now Gaussian with

ĥp(λi) ∼ N(vi⊤µp, vi⊤Σpvi).    (7)

Thus, µ̂p can serve as a point-estimate of the filter frequency response at any eigenvalue, but more importantly, the variance of ĥp provides uncertainty for the frequency response. The off-diagonal elements of the covariance matrix Σ̂p model the relationship between the response at different eigenvalues. The relation between the i-th and j-th eigenvalues equals

[Σ̂p]ij = vi⊤Σpvj = Σ_{m=0}^{K} Σ_{n=0}^{K} λi^m [Σp]mn λj^n.    (8)

Each term on the R.H.S. of (8) comprises a product of three factors: (i) λi^m, related to the m-hop neighborhood of G; (ii) λj^n, related to the n-hop neighborhood; and (iii) [Σp]mn, which captures the covariance between the filter coefficients for the m- and n-hop neighborhoods, respectively. Moreover, observe from (4) that Σp depends on the inverse of a matrix containing Σ_r Sxr⊤ Sxr.

Bayesian Filter Stability. A desirable property of graph filters is stability [15], [24], [37–40]. This ensures that filter outputs across nodes are robust to changes in topology, crucial for tasks like recommendation or stock prediction. Filters sampled from the posterior N(µp, Σp) yield a multivariate Gaussian frequency response in ĥp(λ), allowing us to consider uncertainty in the stability of such filters. Although graph filter stability has been studied from several viewpoints, we consider the integral Lipschitz (IL) property [15], which offers a trade-off between the stability of a filter and its discriminability at high frequencies [15]. In the following proposition, we characterize Bayesian graph filter stability w.r.t. its IL property.

Proposition 1. Let hp ∼ N(µp, Σp). Given C > 0 and the filter ĥp(λ), for which ĥp(λi) − ĥp(λj) ∼ N(µij, σij²), for any pair (λi, λj), we have for δ = 2C|λi − λj|/(λi + λj)

Pr[ |ĥp(λi) − ĥp(λj)| < δ ] ≥
  1 − (1/√(2π)) σij/(δ − |µij|) exp(−(|µij| − δ)²/(2σij²))
    − (1/√(2π)) σij/(|µij| + δ) exp(−(|µij| + δ)²/(2σij²)),   if |µij| < δ,
  (1/√(2π)) σij/(|µij| − δ) (1 − σij²/(|µij| − δ)²) exp(−(|µij| − δ)²/(2σij²))
    − (1/√(2π)) σij/(|µij| + δ) exp(−(|µij| + δ)²/(2σij²)),   if |µij| > δ,    (9)

i.e., the Bayesian graph filter is IL with lower-bounded probability.

Proof. See Appendix.

This result shows that the IL property holds with probability depending on the statistics of ĥp(λi) − ĥp(λj), and thus, on λi and λj. A few comments are in order. First, consider two high frequencies λi, λj which are closely separated, i.e., λi − λj = ϵ and λi + λj ≈ 2λ for a small ϵ and large λ. If the difference in frequency response has zero mean, i.e., µij = 0, the lower bound
Fig. 1: IL probability of the point (blue) and Bayesian (orange) filter with the lower bound (yellow) for a BA graph with (left) αmin = 10⁻³ and (right) αmin = 1, for C = 0.1.

Fig. 2: Relative perturbation of the point filter (blue) and the Bayesian filter for different values of αmin, evaluated over different fractions of edge rewiring for (left) NOAA and (right) ML100K data sets.
translates to 1 − √(2/π) (1/x) exp(−x²/2) for x = Cϵ/(σij λ). This bound increases with x, i.e., a lower σij implies a higher chance of being IL. For λi, λj very low, assuming λ/ϵ → 1, i.e., x = C/σij, the lower bound is affected more by σij. Thus, if the posterior responses have the same mean at consecutive eigenvalues, the variance of the difference allows some level of discrimination between the frequencies while maintaining IL behaviour in probability. Second, even when |µij| > δ, i.e., the frequency response difference is high, our Bayesian filter can still satisfy the IL property. Thus, the Bayesian filter provides some prior information about the IL nature of the filter and its transferability w.r.t. stability, if this filter is used for the same task on another graph. The lower bound in (9) may also be used as a guide for this purpose.

IV. EXPERIMENTAL RESULTS

We compare our Bayesian filter to the point-estimate filter, obtained as a solution to an ℓ2-norm regularized graph filter. Throughout, we consider S to be the degree-normalized Laplacian, i.e., S = D^{−1/2} L D^{−1/2}, where D is a diagonal matrix containing the degree of each node. We fix αmax = βmax = 10, i.e., the minimum prior variance for each filter tap is 0.1.

Integral Lipschitz Nature. Here, we contrast the IL nature of Bayesian and point filters. We consider an N = 100 node Barabasi-Albert (BA) graph. We generate R = 500 (xr, yr) pairs where xr ∼ N(0, I), and we obtain a low-pass yr by filtering xr in the spectral domain. In particular, consider the eigendecomposition S = UΛU⊤ such that Λ is the diagonal matrix of eigenvalues λ1, . . . , λN. We compute yr = UĤ(Λ)U⊤xr, where Ĥ(Λ) is diagonal with i-th diagonal element e^{−4λi}. We estimate all filters of order K = 5. Next, we grid eigenvalues over [0, 2], generate 10⁴ instances of hp ∼ N(µp, Σp) and compute the sample probability of the graph filters satisfying the IL criterion in Proposition 1 for a given C. We take λi and λj to be consecutive points on the grid with λi < λj. Since the point filter is fixed, its IL probability for each (λi, λj) will either be zero or one. We also evaluate the lower bound in (9).

Figure 1 showcases the empirical probabilities of (i) the point filter and (ii) the Bayesian filter being IL, and (iii) the lower bound on the IL probability across the spectrum for C = 0.1 and two different values of αmin ∈ {10⁻³, 1}. The point filter is not IL at low and high frequencies. The Bayesian filter, however, maintains a nonzero IL probability at high frequencies and over a greater portion of the spectrum. When αmin = 10⁻³, the IL probability drops steeply outside of the zone where the point filter is IL. Indeed, a higher prior variance in the filter taps can cause greater fluctuations in |ĥp(λi) − ĥp(λj)|. This can also be seen from the second figure, where αmin = 1 leads to a more gradual reduction in the IL probability. The results suggest a trade-off: for regions where the point filter is IL almost surely, the Bayesian filter is IL with lesser probability. However, this leads to regions where the point filter is not IL but the Bayesian filter has nonzero IL probability, probabilistically adding more stability to the proposed filter. Thus, the uncertainty in the Bayesian filter influences its IL nature, and we can tune it to be more stable or more discriminative depending on the application.

Perturbation. We evaluate the stability of Bayesian filters w.r.t. graph perturbations compared to point estimates for two real-world datasets. First, we consider Movielens-100k (ML100K), comprising R = 943 users and N = 1682 items [41]. We build a 35 nearest neighbour (NN) item graph with the normalized Laplacian S from the ratings of 500 randomly selected users based on Pearson correlation [3]. We use the rating graph signals from another 300 and 143 users for training and testing, where we learn the filter from the training users and use them to test stability on the test users. Each xr contains 40% of a user's ratings and the corresponding yr contains all the ratings. Second, we consider the NOAA temperature dataset over N = 109 nodes, from which we build a 15 NN graph [42]. Each pair (xr, yr) is the temperature graph signal measured at 40% of the nodes and the full signal, respectively. We use 300 and 98 hourly temperature signals for training and testing. We use a filter of order K = 7 to perform item collaborative filtering and temperature prediction, respectively. We perturb S to obtain Ŝ via degree-preserving edge re-wiring. We rewire from 10 up to 30% of the edges. For a test signal x, we measure the relative squared perturbation

Per(x) = ||H(S)x − H(Ŝ)x||₂² / ||H(S)x||₂²,    (10)

which empirically measures the stability of the filter. We repeat the experiment 10 times for each data set.

Figure 2 shows the test mean perturbation for both data sets for different αmin. For ML100K, the point filter returns lower perturbation than the Bayesian filter when αmin is lower (higher variance as prior), that is, decreasing αmin leads to higher relative perturbation. Indeed, users can vary greatly in rating styles, so allowing more uncertainty in the estimated filter by decreasing αmin may exacerbate perturbation over the test users. When αmin is small enough, the set of feasible estimates contains the optimal solution, so the perturbation varies less as αmin continues to decrease. Indeed, as αmin decreases, the standard deviation of the perturbation is on the order of 10⁻² to 10⁻³. However, for the NOAA dataset with smooth temporal variations, the Bayesian filter is more stable for all αmin. The lesser perturbation or higher stability alludes to the IL nature and how Bayesian filters tend to be IL in probability in regions where the point filter is not. As graph filters aggregate information over hops, as long as the K-hop neighbourhoods do not change much with perturbation, the uncertainty learned from training will still be valid for accumulating information. The relative perturbation for NOAA reduces with increased rewiring. This occurs partly due to a progressively increasing concentration of eigenvalues around a spectral point as a consequence of rewiring, along with the filters being IL around that point. To conclude, the
Fig. 3: (left) Relative squared error across test samples for Bayesian filters (blue box plots) and point filters (green) for one train-test sub-graph;
(centre) ĥ(λ) of the point filter and (right) ĥp (λ) of the Bayesian filter with spectral response at the training (black squares) and test (blue squares)
sub-graphs. Shaded area for the Bayesian filters shows the marginal variance at each eigenvalue.
Bayesian filter incorporates uncertainty across different hops while learning from data and thus can be more robust to perturbations.

Transferability. We evaluate if the informed randomness learned by Bayesian filters transfers to another graph. We consider the task of forecasting temperature one hour ahead for the NOAA dataset. For training, we sample a 50 node connected sub-graph via random walks from the 35 NN graph. The pair (xr, yr) corresponds to signals with 10% observed data and the whole signal one hour ahead, defined on the same subgraph. We train filters of order K = 5 over the first R = 300 hours to predict the temperature one hour ahead. For testing, we sample another subgraph of the same size and use the trained filter for the same task, i.e., one hour forecasting given 10% observed data over the following 98 hour samples. We use αmin = 10⁻³ and repeat the experiment for 10 sampled train and test subgraph pairs.

Fig. 3 illustrates our findings. The leftmost plot shows the normalized squared errors across the 98 test samples for the point filter (green) and the Bayesian filter (blue), where the mean squared errors are 0.42 and 0.41, respectively. Bayesian filters provide uncertainty across predictions and thus the error, so we provide box plots for the Bayesian error, which the point filter cannot provide. The second and third plots show one advantage of Bayesian filters in the frequency domain. We plot the point and the mean Bayesian filter frequency responses (for µp) along with the responses at the training (black squares) and test (blue squares) spectra for another train-test pair, along with the theoretical marginal variance at each eigenvalue [cf. (7)] for the Bayesian filter (shaded portion). As the NOAA data tends to be smoother, estimating the response at higher frequencies is more difficult, hence the increased variance at high frequencies for the sampled Bayesian filters. However, the point filter gives a fixed response, which can be unreliable at higher frequencies, as shown in the figure, if data comes from a low-pass filtering process. A Bayesian filter is more robust as it samples based on the learned uncertainty, thereby countering the effect of test eigenvalues placed further away. Indeed, the mean errors for this instance are 0.53 and 0.26 for the point and Bayesian filter, respectively, and the median errors 0.61 and 0.32, respectively.

V. CONCLUSION

This work proposes Bayesian graph filter design given a graph and nodal observations. In particular, we obtain a joint posterior distribution for the graph filter coefficients and their precision hyper-parameters via expectation-maximization, precluding the need to specify prior distributions. Our Bayesian filter distribution not only yields an estimate of the graph filter but also models the uncertainty in its estimation. This uncertainty also extends into the frequency response, which allows us to provide probabilistic bounds for the stability of such filters via integral Lipschitz-ness. Comparisons with point filters showcase the advantages of such filters w.r.t. stability and transferability. For future work, we aim to analyze the effect of Bayesian graph filters on graph convolutional neural networks. Additionally, we will consider structural priors on inputs for a more general posterior density.

VI. APPENDIX

Lemma 1. Consider t ∼ N(0, σ²). Let Pr[t > x] ≤ g(x, σ) and Pr[t > x] ≥ h(x, σ) be upper and lower tail bounds for x > 0 and suitable functions g(x, σ) and h(x, σ). Then for t̄ ∼ N(µ, σ²), we have the following:

Pr[|t̄| ≤ x] ≥ 1 − g(x − |µ|, σ) − g(x + |µ|, σ),   if |µ| < x,
Pr[|t̄| ≤ x] ≥ h(|µ| − x, σ) − g(x + |µ|, σ),       if |µ| > x.    (11)

Proof of Lemma 1. For |µ| < x, we have

Pr[|t̄| ≥ x] = Pr[t̄ ≥ x] + Pr[t̄ ≤ −x]
            = Pr[t > x − |µ|] + Pr[t < −x − |µ|]
            = Pr[t > x − |µ|] + Pr[t > x + |µ|]    (12)
            ≤ g(x − |µ|, σ) + g(x + |µ|, σ),

where we use t = t̄ − |µ| and Pr[t < −x − |µ|] = Pr[t > x + |µ|] from the symmetry of t. Therefore,

Pr[|t̄| ≤ x] ≥ 1 − g(x − |µ|, σ) − g(x + |µ|, σ).    (13)

For |µ| > x, we have

Pr[|t̄| ≥ x] = Pr[t̄ ≥ x] + Pr[t̄ ≤ −x]
            = Pr[t > x − |µ|] + Pr[t < −x − |µ|]
            = 1 − Pr[t < x − |µ|] + Pr[t < −x − |µ|]    (14)
            = 1 − Pr[t > |µ| − x] + Pr[t > x + |µ|].

Given Pr[t > |µ| − x] ≥ h(|µ| − x, σ), we have

Pr[|t̄| ≥ x] ≤ 1 − h(|µ| − x, σ) + g(x + |µ|, σ), or,
Pr[|t̄| ≤ x] ≥ h(|µ| − x, σ) − g(x + |µ|, σ).    (15)

Proof of Proposition 1. To prove the IL property, note that ĥp(λi) − ĥp(λj) is Gaussian with mean µij = (vi − vj)⊤µp and variance

σij² = (ei − ej)⊤ Σ̂p (ei − ej),    (16)

where ei is the i-th column of the identity matrix IN. Additionally, two standard tail bounds for t ∼ N(0, σ²) are

Pr[t > x] ≤ (σ / (√(2π) x)) exp(−x²/(2σ²)) = g(x, σ), and    (17)
Pr[t > x] ≥ (1/√(2π)) (σ/x − σ³/x³) exp(−x²/(2σ²)) = h(x, σ).    (18)

The proof is complete by applying Lemma 1 with t̄ = ĥp(λi) − ĥp(λj), µ = µij, σ = σij, and x = 2C|λi − λj|/(λi + λj).
REFERENCES

[1] E. Isufi, F. Gama, D. I. Shuman, and S. Segarra, "Graph filters for signal processing and machine learning on graphs," IEEE Trans. Signal Process., 2024.
[2] A. Sandryhaila and J. M. F. Moura, "Discrete signal processing on graphs: Frequency analysis," IEEE Trans. Signal Process., vol. 62, no. 12, pp. 3042–3054, 2014.
[3] W. Huang, A. G. Marques, and A. R. Ribeiro, "Rating prediction via graph signal processing," IEEE Trans. Signal Process., vol. 66, no. 19, pp. 5066–5081, 2018.
[4] M. Coutino, E. Isufi, and G. Leus, "Advances in distributed graph filtering," IEEE Trans. Signal Process., vol. 67, no. 9, pp. 2320–2333, 2019.
[5] S. Segarra, G. Mateos, A. G. Marques, and A. Ribeiro, "Blind identification of graph filters," IEEE Trans. Signal Process., vol. 65, no. 5, pp. 1146–1159, 2016.
[6] M. He, Z. Wei, Z. Huang, and H. Xu, "BernNet: Learning arbitrary graph spectral filters via Bernstein approximation," in Advances in Neural Info. Process. Syst., vol. 34, 2021, pp. 14239–14251.
[7] Y. Dong, K. Ding, B. Jalaian, S. Ji, and J. Li, "AdaGNN: Graph neural networks with adaptive frequency response filter," in Intl. Conf. Info. & Knowledge Management, 2021, pp. 392–401.
[8] H. E. Egilmez, E. Pavez, and A. Ortega, "Graph learning from filtered signals: Graph system and diffusion kernel identification," IEEE Trans. Signal and Info. Process. over Netw., vol. 5, no. 2, pp. 360–374, 2018.
[9] R. Shafipour, S. Segarra, A. G. Marques, and G. Mateos, "Identifying the topology of undirected networks from diffused non-stationary graph signals," IEEE Open J. of Signal Process., vol. 2, pp. 171–189, 2021.
[10] A. Natali, M. Coutino, and G. Leus, "Topology-aware joint graph filter and edge weight identification for network processes," in Intl. Wrkshp. on Machine Learning for Signal Process. (MLSP), 2020, pp. 1–6.
[11] S. Chen, A. Sandryhaila, J. M. F. Moura, and J. Kovačević, "Adaptive graph filtering: Multiresolution classification on graphs," in IEEE Global Conf. Signal and Info. Process. (GlobalSIP), 2013, pp. 427–430.
[12] B. Das and E. Isufi, "Graph filtering over expanding graphs," in IEEE Data Science and Learning Wrkshp. (DSW), 2022, pp. 1–8.
[13] ——, "Online filtering over expanding graphs," in Asilomar Conf. Signals, Syst., and Computers, 2022, pp. 43–47.
[14] L. Testa, C. Battiloro, S. Sardellitti, and S. Barbarossa, "Stability of graph convolutional neural networks through the lens of small perturbation analysis," in IEEE Intl. Conf. Acoust., Speech and Signal Process. (ICASSP). IEEE, 2024, pp. 6865–6869.
[15] F. Gama, J. Bruna, and A. Ribeiro, "Stability properties of graph neural networks," IEEE Trans. Signal Process., vol. 68, pp. 5680–5695, 2020.
[16] S. Rey, V. M. Tenorio, and A. G. Marques, "Robust graph filter identification and graph denoising from signal observations," IEEE Trans. Signal Process., 2023.
[17] Y. Zhu, F. J. I. Garcia, A. G. Marques, and S. Segarra, "Estimating network processes via blind identification of multiple graph filters," IEEE Trans. Signal Process., vol. 68, pp. 3049–3063, 2020.
[18] G. Sagi and T. Routtenberg, "MAP estimation of graph signals," IEEE Trans. Signal Process., 2023.
[19] R. Torkamani and H. Zayyani, "Statistical graph signal recovery using variational Bayes," IEEE Trans. Circuits and Syst. II: Express Briefs, vol. 68, no. 6, pp. 2232–2236, 2020.
[20] A. Kroizer, T. Routtenberg, and Y. C. Eldar, "Bayesian estimation of graph signals," IEEE Trans. Signal Process., vol. 70, pp. 2207–2223, 2022.
[21] D. Romero, M. Ma, and G. B. Giannakis, "Kernel-based reconstruction of graph signals," IEEE Trans. Signal Process., vol. 65, no. 3, pp. 764–778, 2017.
[22] Y.-C. Zhi, Y. C. Ng, and X. Dong, "Gaussian processes on graphs via spectral kernel learning," IEEE Trans. Signal and Info. Process. over Netw., vol. 9, pp. 304–314, 2023.
[23] F. Opolka, Y.-C. Zhi, P. Lió, and X. Dong, "Adaptive Gaussian processes on graphs via spectral graph wavelets," in Intl. Conf. Artif. Intell. Stat. (AISTATS), ser. PMLR, vol. 151, 2022, pp. 4818–4834.
[24] H.-S. Nguyen, Y. He, and H.-T. Wai, "On the stability of low pass graph filter with a large number of edge rewires," in IEEE Intl. Conf. Acoust., Speech and Signal Process. (ICASSP). IEEE, 2022, pp. 5568–5572.
[25] Z. Gao and E. Isufi, "Learning stochastic graph neural networks with constrained variance," IEEE Trans. Signal Process., vol. 71, pp. 358–371, 2023.
[26] ——, "Learning stable graph neural networks via spectral regularization," in Asilomar Conf. Signals, Syst., and Computers, 2022, pp. 1–5.
[27] A. Möllers, A. Immer, E. Isufi, and V. Fortuin, "Uncertainty in graph contrastive learning with Bayesian neural networks," in Advances in Approx. Bayesian Inference, 2023.
[28] Q. Lu and K. D. Polyzos, "Gaussian process dynamical modeling for adaptive inference over graphs," in IEEE Intl. Conf. Acoust., Speech and Signal Process. (ICASSP). IEEE, 2023, pp. 1–5.
[29] L. Lorch, J. Rothfuss, B. Schölkopf, and A. Krause, "DiBS: Differentiable Bayesian structure learning," Adv. Neur. Inf. Proces. Sys., vol. 34, pp. 24111–24123, 2021.
[30] A. P. Dempster, N. M. Laird, and D. B. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," J. Royal Stat. Soc: Series B (Stat. Methodol.), vol. 39, no. 1, pp. 1–22, 1977.
[31] D. I. Shuman, S. K. Narang, P. Frossard, A. Ortega, and P. Vandergheynst, "The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains," IEEE Signal Process. Mag., vol. 30, no. 3, pp. 83–98, 2013.
[32] P. Djuric and C. Richard, Cooperative and Graph Signal Processing: Principles and Applications. Academic Press, 2018.
[33] A. G. Marques, S. Segarra, G. Leus, and A. Ribeiro, "Stationary graph processes and spectral estimation," IEEE Trans. Signal Process., vol. 65, no. 22, pp. 5911–5926, 2017.
[34] B. Pasdeloup, V. Gripon, G. Mercier, D. Pastor, and M. G. Rabbat, "Characterization and inference of graph diffusion processes from observations of stationary signals," IEEE Trans. Signal and Info. Process. over Netw., vol. 4, no. 3, pp. 481–496, 2018.
[35] K. Takai, "Constrained EM algorithm with projection method," Comp. Stat., vol. 27, pp. 701–714, 2012.
[36] Z. Chen et al., "Bayesian filtering: From Kalman filters to particle filters, and beyond," Statistics, vol. 182, no. 1, pp. 1–69, 2003.
[37] H. Kenlay, D. Thanou, and X. Dong, "On the stability of polynomial spectral graph filters," in IEEE Intl. Conf. Acoust., Speech and Signal Process. (ICASSP). IEEE, 2020, pp. 5350–5354.
[38] H. Kenlay, D. Thanou, and X. Dong, "On the stability of graph convolutional neural networks under edge rewiring," in IEEE Intl. Conf. Acoust., Speech and Signal Process. (ICASSP). IEEE, 2021, pp. 8513–8517.
[39] H. Kenlay, D. Thanou, and X. Dong, "Interpretable stability bounds for spectral graph filters," in Intl. Conf. on Machine Learning (ICML). PMLR, 2021, pp. 5388–5397.
[40] R. Levie, E. Isufi, and G. Kutyniok, "On the transferability of spectral graph filters," in Intl. Conf. Samp. Th. Appl. (SampTA). IEEE, 2019, pp. 1–5.
[41] F. M. Harper and J. A. Konstan, "The MovieLens datasets: History and context," ACM Trans. Interactive Intell. Syst., vol. 5, no. 4, pp. 1–19, 2015.
[42] A. Arguez, I. Durre, S. Applequist, R. S. Vose, M. F. Squires, X. Yin, R. R. Heim, and T. W. Owen, "NOAA's 1981–2010 US climate normals: An overview," Bul. Amer. Meteor. Soc., vol. 93, no. 11, pp. 1687–1697, 2012.