High level analysis of microarray data
Lecture 2
Claudio Altafini
SISSA
http://people.sissa.it/altafini
Claudio Altafini, February 9, 2007
p. 1/60
High level analysis of microarrays

1. CLUSTERING ALGORITHMS
   put together genes with similar expression profiles
2. PRINCIPAL COMPONENT ANALYSIS
   reduce the dimension of a data set keeping the most significant directions
3. ONTOLOGICAL ENRICHMENT
   add functional annotation (e.g. GO)
   perform statistical tests on the ontological information
Gene network inference methods

1. LESS MODEL-DEPENDENT METHODS (e.g. probabilistic, statistical, etc.)
   look only for the core relationships of a network
   not quantitative
   can be used for large-scale networks
2. MODEL-DEPENDENT METHODS (e.g. ODEs)
   provide both the network topology and the functional relationships
   useful mostly for small/mid-scale networks
   quantitative

warning: the classification is not sharp!
Clustering
Clustering

example: cluster these
[figure: a set of expression profiles to be grouped into clusters]
Clustering

Clustering = dividing a set of data into relatively homogeneous groups according to a user-defined metric d, with
  d(x, x) = 0
  d(x, y) > 0 for x ≠ y, with d(x, y) = d(y, x)
  d(x, y) ≤ d(x, z) + d(z, y)

typically: Lp norm
  d(x, y) = ||x − y||_p ,   p = 1, ..., ∞
example: Euclidean norm (p = 2)
  d(x, y) = sqrt( (x1 − y1)^2 + ... + (xn − yn)^2 )

3 main algorithms:
  k-means
  hierarchical clustering
  SOM: Self Organizing Maps
Clustering algorithms: k-means

Inputs: data x1, ..., xn; number of clusters k
Output: k clusters
Algorithm:
1. select k centroids
2. assign each element xi to the cluster with the nearest centroid
3. recompute the centroids
4. repeat until convergence
Properties:
  need to choose k
  the initialization step can change the result
  sensitive to perturbations
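The four steps above can be sketched in a few lines of Python. This is a minimal 1-D illustration, not the lecture's implementation; the function name `kmeans`, the toy data and the fixed seed are all invented for the example:

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain k-means on a list of 1-D points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)            # 1. select k centroids
    for _ in range(iters):
        # 2. assign each point to the cluster with the nearest centroid
        clusters = [[] for _ in range(k)]
        for x in points:
            j = min(range(k), key=lambda i: abs(x - centroids[i]))
            clusters[j].append(x)
        # 3. recompute the centroids as cluster means
        new = [sum(c) / len(c) if c else centroids[i]
               for i, c in enumerate(clusters)]
        if new == centroids:                     # 4. repeat until convergence
            break
        centroids = new
    return centroids, clusters

cents, clus = kmeans([1.0, 1.1, 0.9, 5.0, 5.2, 4.8], 2)
```

On this well-separated toy data the centroids settle near 1.0 and 5.0 regardless of which pair of points is drawn in step 1; with overlapping clusters the initialization sensitivity noted above becomes visible.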
Hierarchical Clustering

Inputs: data x1, ..., xn
Output: clustering tree (dendrogram over x1, x2, ..., x7 in the figure)
Algorithm:
  put each xi in a cluster Ci = {xi}
  compute the merging cost between each pair of clusters
  merge the two clusters with the cheapest merging cost
  repeat until only 1 cluster is left

cost of merging:
  single linkage:   min_{x ∈ Ci, y ∈ Cj} d(x, y)   (→ loose clusters)
  average linkage:  (1 / (|Ci| |Cj|)) Σ_{x ∈ Ci} Σ_{y ∈ Cj} d(x, y)
  complete linkage: max_{x ∈ Ci, y ∈ Cj} d(x, y)   (→ tight clusters)

properties:
  greedy algorithm
  tends to build big clusters
  need to choose a threshold on the number of clusters
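The greedy agglomerative loop can be sketched with single linkage as the merging cost (a naive O(n^3) version on 1-D toy data; the function name and the stopping criterion "stop at n_clusters clusters" are choices made for the example, standing in for the threshold mentioned above):

```python
def single_linkage(points, n_clusters):
    """Agglomerative clustering, single-linkage merging cost, 1-D data."""
    clusters = [[x] for x in points]           # each point starts in its own cluster
    while len(clusters) > n_clusters:
        # merging cost = smallest pairwise distance between two clusters
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = min(abs(x - y) for x in clusters[i] for y in clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        clusters[i] += clusters[j]             # merge the cheapest pair
        del clusters[j]
    return clusters

result = single_linkage([0.0, 0.2, 0.3, 9.0, 9.1], 2)
```

Swapping `min` for `max` in the cost line gives complete linkage, and averaging all pairwise distances gives average linkage.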
Clustering: Self Organizing Maps

Inputs: data x1, ..., xn; SOM topology (k nodes)
Output: k clusters
Algorithm:
1. start with a simple topology
2. select a random data point p
3. move all nodes towards p according to the rule
     f_{i+1}(x) = f_i(x) + τ(d(x, x_p), i) (p − f_i(x))
   where
     f_i(x) = position of node x at iteration i
     x_p = node closest to p
     τ = τ(d, i) = learning rate
4. go to 2. until convergence
properties:
  even more computationally costly, but more robust
  neighboring clusters are similar: elements on the border can belong to both clusters
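A toy 1-D version of the update rule above can be written directly; the slide leaves τ(d, i) abstract, so the decaying learning rate and the neighbourhood schedule below are arbitrary illustrative choices, as are the function name and data:

```python
import random

def som_1d(points, k, iters=400, seed=1):
    """Toy 1-D self-organizing map with k nodes on a line topology."""
    rng = random.Random(seed)
    lo, hi = min(points), max(points)
    # 1. start with a simple topology: nodes evenly spaced over the data range
    nodes = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    for t in range(iters):
        p = rng.choice(points)                  # 2. select a random data point p
        w = min(range(k), key=lambda i: abs(nodes[i] - p))  # node closest to p
        rate = 0.5 * (1 - t / iters)            # learning rate decays with t
        for i in range(k):
            # neighbourhood: the winner moves fully, the other nodes only
            # weakly and only during the first half of training
            h = rate if i == w else (0.3 * rate if t < iters // 2 else 0.0)
            nodes[i] += h * (p - nodes[i])      # 3. move nodes towards p
    return nodes

nodes = som_1d([0.0, 0.1, 0.2, 5.0, 5.1, 5.2], 2)
```

Because neighbouring nodes are dragged along with the winner early on, adjacent SOM clusters end up similar, which is the border effect noted in the properties above.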
Clustering: quality indices

homogeneity:
  H = (1 / n_genes) Σ_i d(x_i, C(x_i))
  = average distance between each x and the centroid of the cluster it belongs to
  reflects the compactness of the clusters

separation:
  S = (1 / Σ_{i ≠ j} n_{Ci} n_{Cj}) Σ_{i ≠ j} n_{Ci} n_{Cj} d(c_i, c_j)
  = weighted average distance between cluster centroids
  reflects the distance between clusters

silhouette width: composition of the two indices
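The two indices are straightforward to compute once a clustering is fixed; here is a stdlib-only sketch on 1-D data (the function name is invented, and a good clustering should show low homogeneity and high separation):

```python
def homogeneity_separation(clusters):
    """clusters: list of clusters, each a list of 1-D points."""
    cents = [sum(c) / len(c) for c in clusters]          # cluster centroids
    n = sum(len(c) for c in clusters)
    # homogeneity: average distance of each point to its own centroid
    hom = sum(abs(x - cents[i]) for i, c in enumerate(clusters) for x in c) / n
    # separation: centroid distances weighted by the cluster sizes
    num = den = 0.0
    for i in range(len(clusters)):
        for j in range(len(clusters)):
            if i != j:
                w = len(clusters[i]) * len(clusters[j])
                num += w * abs(cents[i] - cents[j])
                den += w
    return hom, num / den

hom, sep = homogeneity_separation([[0.0, 0.2], [4.0, 4.4]])
```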
More advanced clustering

example: rather than a distance one can use a Pearson correlation

  d(x, y) = Σ_{i=1}^n (x_i − x̄)(y_i − ȳ) / ( sqrt( Σ_{i=1}^n (x_i − x̄)^2 ) sqrt( Σ_{i=1}^n (y_i − ȳ)^2 ) )

where x̄ = (1/n) Σ_{i=1}^n x_i and ȳ = (1/n) Σ_{i=1}^n y_i

Pearson metric:
  uses differences from the mean rather than the mean
  normalized by the standard deviation => d(x, y) ∈ [−1, 1]
  invariant to scaling and shifting of x and y
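The formula above and its invariance to scaling and shifting are easy to verify numerically (function name and toy profiles are illustrative; as on the slide, the value is the correlation itself in [−1, 1], with 1 meaning identical profile shape):

```python
import math

def pearson_correlation(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (math.sqrt(sum((a - mx) ** 2 for a in x))
           * math.sqrt(sum((b - my) ** 2 for b in y)))
    return num / den

x = [1.0, 2.0, 3.0, 4.0]
y = [10.0, 20.0, 30.0, 40.0]      # y = 10 * x: same shape, different scale
shifted = [v + 5.0 for v in x]    # shifting x does not change the correlation
```

This is why Pearson-based clustering groups genes with the same expression *shape* even when their absolute expression levels differ, unlike the Euclidean distance.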
Clustering: drawbacks

clustering: similar expression => similar function
Is it really useful to infer common function and co-regulation?
example: [diagram: coregulation vs. same cluster — coregulated genes need not end up in the same cluster, and genes in the same cluster need not be coregulated]
Clustering: hippocampus time-series

time instants: 0 min, 30 min, 90 min, 180 min
what does the time series look like for all genes?
[figure: expression level (0-15000) vs. time (20-180 min) for all genes]
mess....
Hippocampus time series

take the log
[figure: log2 expression level (4-14) vs. time (20-180 min)]
still mess....
Hippocampus time series

to identify interesting genes:
1. select only genes that show differential expression in a fold-change analysis
   blue = genes that stay up for all 3 time samples
   red = genes that stay up for 2 out of 3 time samples
   [figures: genes upregulated in 3 and 2 time samples, at a 4-fold and at a 3-fold threshold]
2. select only genes with sufficiently high variance
Similar pattern: clustering

clustering: similar gene expression time course => similar function (or at least co-regulation)?
if I filter out the genes with little variance (the majority) and cluster the remaining:
[figure: k-means clustering of profiles, Euclidean distance]
Similar pattern: clustering

clustering depends a lot on the algorithm:
  previous page: Euclidean distance
  here: Pearson correlation as distance
[figure: k-means clustering of profiles, Pearson correlation]
Principal Component Analysis
PCA: Principal Component Analysis

PCA detects the directions that capture most of the information available in the data
PCA is performed by a linear transformation of the data set based on the Singular Value Decomposition (SVD)
idea of principal component analysis: take linear combinations of the x as basis elements so that the new basis elements are orthogonal => they contain no redundant information
successive principal components capture less and less information about the data
we can truncate the representation of the data to a limited number of principal components => dimensionality reduction
use SVD to decompose X (n × m matrix):
  X = U Σ V^T
  U: n × m with orthonormal columns, U^T U = I_m
  V: m × m orthogonal, V V^T = I_m
PCA: Principal Component Analysis

  X = [ x_1^1 ... x_1^m ; x_2^1 ... x_2^m ; ... ; x_n^1 ... x_n^m ] = U Σ V^T
(the m columns are the experiments: 1st exp. ... mth exp.)

  Σ = diag(σ_1, ..., σ_ℓ, 0, ..., 0),   ℓ = rank(X)

σ_1, ..., σ_ℓ singular values: σ_j = sqrt(λ_j)
λ_j = eigenvalues of the covariance matrix of X
covariance matrix of X after centering:
  (X − [x̄_1 ... x̄_m])^T (X − [x̄_1 ... x̄_m])

is PCA improving your clustering algorithm? Not necessarily... see
K. Y. Yeung, W. L. Ruzzo, "Principal Component Analysis for clustering gene expression data", Bioinformatics 17, pages 763-774, 2001
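Since σ_j = sqrt(λ_j), the principal components can also be read directly off the covariance matrix. Here is a stdlib-only 2-D sketch using the closed-form eigendecomposition of a symmetric 2×2 matrix (the function name and toy data are invented for the example; real microarray data would of course use an SVD routine):

```python
import math

def pca_2d(data):
    """PCA of 2-D points via the 2x2 covariance matrix eigendecomposition
    (equivalent to the SVD of the centered data matrix)."""
    n = len(data)
    mx = sum(p[0] for p in data) / n
    my = sum(p[1] for p in data) / n
    # centered covariance matrix [[a, b], [b, c]]
    a = sum((p[0] - mx) ** 2 for p in data) / n
    c = sum((p[1] - my) ** 2 for p in data) / n
    b = sum((p[0] - mx) * (p[1] - my) for p in data) / n
    # eigenvalues of a symmetric 2x2 matrix (quadratic formula)
    d = math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    l1, l2 = (a + c) / 2 + d, (a + c) / 2 - d
    # first principal direction = eigenvector for the largest eigenvalue l1
    if b != 0:
        v = (l1 - c, b)
    else:
        v = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(*v)
    return (l1, l2), (v[0] / norm, v[1] / norm)

(l1, l2), direction = pca_2d([(0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0)])
```

For points lying exactly on the line y = x, the second eigenvalue is 0: all the variance sits in the first principal component, which is the dimensionality-reduction idea above in its simplest form.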
Ontological enrichment
Gene Ontology

GO = Gene Ontology project: provides a controlled vocabulary to describe gene and gene product attributes in any organism
genes are associated with GO terms by trained curators
GO annotations give function labels to genes
cross-links to most common gene banks, pathway databases, etc.

http://www.geneontology.org
Structure of GO

GO terms:
1. Biological Process
2. Molecular Function
3. Cellular Component
a gene may belong to many categories
Structure of GO

ontologies are structured as a hierarchical directed acyclic graph (DAG)
terms can have more than one parent, and zero, one or more children
Ontological enrichment

questions you would like to ask:
  what is the main functional annotation of the interesting genes (e.g. differentially expressed genes, or genes having similar expression profiles)?
  do genes involved in the same process/function have a similar profile of expression?

many tools exist that use GO to answer these questions:
http://www.geneontology.org/GO.tools.microarray.shtml

most of these tools work in a similar way:
  input a gene list and a subset of interesting genes
  the tool shows which GO categories have most interesting genes associated with them, i.e. which categories are enriched for interesting genes
  the tool provides a statistical measure to determine whether the enrichment is significant
Ontological enrichment

1. select a set of significant genes (e.g. via a t-test)
2. obtain all the GO categories corresponding to them
3. analyze the GO terms for significance

example of statistical measure: hypergeometric test
  N genes on the microarray; Bio a GO term:
    M genes ∈ Bio, N − M genes ∉ Bio
  K = number of significant genes
  what is the probability of having exactly x genes out of K of type Bio?

  P(X = x | N, M, K) = C(M, x) C(N − M, K − x) / C(N, K)

P-value = probability of having at least x of the K genes in Bio (cumulative probability distribution):

  p-val = 1 − Σ_{i=0}^{x−1} C(M, i) C(N − M, K − i) / C(N, K)
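The two formulas above translate almost literally into Python with `math.comb`; the gene counts below are invented toy numbers, not from the lecture:

```python
from math import comb

def enrichment_pvalue(N, M, K, x):
    """P(X >= x) for a hypergeometric X: N genes on the array, M of them
    annotated to the GO term Bio, K significant genes, x of them in Bio."""
    def pmf(i):
        # P(X = i | N, M, K); comb returns 0 when the draw is impossible
        return comb(M, i) * comb(N - M, K - i) / comb(N, K)
    return 1.0 - sum(pmf(i) for i in range(x))

# toy numbers: 1000 genes, 40 annotated to the category, 50 significant
# genes, 8 of which fall in the category (expected by chance: only 2)
p = enrichment_pvalue(1000, 40, 50, 8)
```

Seeing 8 hits where 2 are expected by chance gives a small p-value, so the category would be reported as enriched (before any multiple-testing correction, which the tools on the next slide add on top).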
GO Tools

Tools:

Tool          | Statistical model                              | Correction for multiple experiments
Onto-Express  | chi^2, binomial, hypergeometric                | Sidak, Holm, Bonferroni, FDR
GoMiner       | Fisher's exact test                            |
EASEonline    | Fisher's exact test                            | Bonferroni
GeneMerge     | Hypergeometric                                 | Bonferroni
FatiGO        | Percentage                                     | Step-down minP, FDR
GOstat        | chi^2, Fisher's exact test                     | FDR, Holm
GOToolBox     | Hypergeometric, binomial, Fisher's exact test  | Bonferroni, Holm, Hochberg, Hommel, FDR
GoSurfer      | Relative enrichment, Fisher's exact test       | q-value, DAG

Affymetrix also provides a Gene Ontology Mining Tool as part of their NetAffx Analysis Center which returns GO terms for probe sets
Example: Onto-Express

Onto-Express is available at http://vortex.cs.wayne.edu/projects.htm
[screenshot]
Example: Onto-Express

[screenshot]
Example: Onto-Express

[screenshot]
Other Ontologies: KEGG pathways

[screenshot]
Biclustering

clustering can be carried out:
  w.r.t. gene expression
  with respect to some other condition (e.g. the clinical condition in which the sample is taken, or ontological information)
two-axis clustering => biclustering
[diagram: genes × conditions matrix, clustered along both axes]
Biclustering: expression + ontology

E. Segal, N. Friedman, D. Koller, A. Regev, "A module map showing conditional activity of expression modules in cancer", Nature Genetics 36, 1090-1098 (2004)

idea: individual genes -> regulatory modules -> biological processes

rather than working with single genes and their regulatory mechanisms, is it possible to lump genes together into modules = sets of genes that act in concert to carry out a specific function?
here: DNA microarray data in a comprehensive analysis aimed at identifying the shared and unique molecular modules underlying human malignancies
in the paper, modules are extracted and used to characterize gene-expression profiles in tumors as a combination of activated and deactivated modules
A cancer compendium

26 studies
expression of 14,145 genes
1,975 arrays: Stanford Microarray Database, Whitehead Institute Database
2,849 gene sets:
  Gene Ontology (1,281)
  KEGG: Kyoto Encycl. of Genes and Genomes (114)
  Gene MicroArray Pathway Profiler (53)
  other: tissue-specific gene sets (101)
  other: clustered sets of coexpressed genes (1,300)
whole analysis: data mining tool called GeneXPress
Method: modules & clinical conditions

number of statistically significant modules = 456
(spanning various processes and functions: metabolism, transcription, degradation, cellular and neuronal signalling, growth, cell cycle, apoptosis, extracellular matrix and cytoskeleton components)
next: characterize clinical conditions according to the combination of active/inactive modules -> 263 biological and clinical conditions (tissue type, tumor type, diagnosis and prognosis info, molecular markers)
Modules vs clinical conditions

[figure: module map of modules vs. clinical conditions]
Interpretation of the network

interpretation:
1. clinical conditions -> modules
2. modules -> clinical conditions

some modules (e.g. cell cycle) are common to many tumor types -> tumorigenic processes?
some others are specific (e.g. neural processes repressed in a set of brain tumors)
various tumors of hematologic nature involve similar immune, inflammation, growth regulation and signalling modules

Conclusion:
large-scale analysis across different tissues/conditions/experimental settings yields statistically significant results
in studying tumors:
  activation of some modules is specific to particular types of tumor
  other modules are shared across different clinical conditions, suggestive of common tumor progression mechanisms
Inferring Regulatory Networks
Limitations of clustering/PCA

clustering: methods of information extraction from data based on co-regulation:
  similar expression pattern over a set of experiments => similar function
all the clustering algorithms give the same results if the time points are randomly permuted
  cannot reveal causal/dynamical connections
  => does not reveal what is behind the co-regulation
more ambitious goal: find the transcriptional regulatory network
The reverse engineering paradigm

basic idea: the architecture of the network is inferred (or "reverse engineered") from the observed response of the system to a series of experimental perturbations

[diagram: INPUTS (perturbations, external stimuli) -> ? -> OUTPUTS (measured signals)]

measured signals: [mRNA], [proteins], [metabolites]
global response: measure the entire state vector
  time series (e.g. cell cycle)
  single time point (e.g. steady state)
perturbations: experimental interventions that alter the state of interest
The reverse engineering paradigm

TASK: from gene expression profiles to a gene-gene graph
  extract the network structure
  quantify the interactions

computationally the task is hard: a very large amount of data is required
data rich / data poor paradox: many data ≠ significant data for network inference
what are the significant data? Those obtained by perturbing systematically the variables of interest
regulation is dynamical: we see it as static because most of the time we cannot observe the transient period (in which the system reacts to the change), but just measure the new steady state in which the system resettles following a perturbation
The reverse engineering paradigm

what are then the perturbations? everything that moves the cell from its standard working condition
  biochemical, environmental, genetic, transcriptional, etc.
  examples: stress factors, starvation, infection, hormonal and growth factors; chemical inhibitors/activators, protein activity, metabolite concentration, gene overexpression and inhibition, gene knockout and mutations, miRNA
perturbations can be:
  temporary (e.g. activating or inhibiting a signalling protein by phosphorylation) or permanent (e.g. gene knockout)
  time dependent (e.g. a time-varying stimulus) or static (e.g. gene knockout)
  local (i.e., affecting a single gene) or global (i.e., a change in temperature or pH)
  of small or large amplitude
Network inference algorithms

a few methods:
1. BAYESIAN NETWORKS
   attain a probabilistic graph through Bayesian learning
   (exact) complexity: superexponential
2. ASSOCIATION NETWORKS
   learn a graph through a similarity measure
   polynomial complexity
3. LINEAR ODE MODELS
   linear complexity
   suffer from underdetermination
   model-dependent
Bayesian networks

Bayesian networks:
  are a probabilistic framework aiming at capturing the conditional dependence or conditional independence between states in a set of data
  the approach is statistical in nature => able to cope with noisy data & insufficiently many experimental data
  useful when each state depends only on a relatively small number of other components -> networks with low connectivity

A Bayesian network can:
  learn the regulatory network
  find the best set of parameters for the conditional distribution of that network
  "best" is to be taken in a Bayesian sense, as the most probable given the data

N. Friedman, M. Linial, I. Nachman, and D. Pe'er, "Using Bayesian Networks to Analyze Expression Data", J. Computational Biology 7:601-620, 2000
Bayesian networks

example:
  "A causes B" is the rule used to construct the graph
  tables = conditional probability distributions
[diagram: example network with conditional probability tables]
Bayes rule

posterior probability = (likelihood × prior probability) / marginal coefficient

if A and B are independent events:
  p(A ∩ B) = p(A) p(B)
if A and B are not independent:
  p(A ∩ B) = p(A|B) p(B) = p(B|A) p(A)
=> conditional probability (Bayes rule):
  p(A|B) = p(B|A) p(A) / p(B)
Bayes rule: example

p(C = 1) = 0.5,  p(C = 0) = 0.5
p(R = 1 | C = 1) = 0.8,  p(R = 1 | C = 0) = 0.2

from the likelihood:
  P(R = 1) = p(C = 0, R = 1) + p(C = 1, R = 1)
           = p(R = 1 | C = 0) p(C = 0) + p(R = 1 | C = 1) p(C = 1)
           = 0.2 * 0.5 + 0.8 * 0.5 = 0.5

Bayes rule:
  p(C | R) = p(R | C) p(C) / p(R)

e.g. if we see it is raining (R = 1) =>
  p(C = 0 | R = 1) = p(R = 1 | C = 0) p(C = 0) / p(R = 1) = (0.2 * 0.5) / 0.5 = 0.2

if instead p(C = 0) = 0.9  =>  P(R = 1) = 0.26 and p(C = 0 | R = 1) = 0.69!!
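The arithmetic of the cloudy/rain example can be checked in a few lines (the function name is invented; the likelihoods 0.2 and 0.8 are the ones on the slide):

```python
def bayes_posterior(prior_c0, lik_r1_c0=0.2, lik_r1_c1=0.8):
    """p(C=0 | R=1) for the cloudy/rain example, via Bayes rule.
    Returns (posterior, marginal p(R=1))."""
    prior_c1 = 1.0 - prior_c0
    # marginal likelihood: p(R=1) = sum over C of p(R=1|C) p(C)
    p_r1 = lik_r1_c0 * prior_c0 + lik_r1_c1 * prior_c1
    # Bayes rule: posterior = likelihood * prior / marginal
    return lik_r1_c0 * prior_c0 / p_r1, p_r1

post_uniform, _ = bayes_posterior(0.5)   # uniform prior -> posterior 0.2
post_skewed, _ = bayes_posterior(0.9)    # strong prior for C=0 -> posterior 0.69
```

The jump from 0.2 to 0.69 shows how strongly the prior shapes the posterior: the same observation R = 1 leads to very different conclusions under different priors.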
Bayes rule: example

how about if you observe the grass is wet? is it because of the sprinkler or the rain?

  p(S = 1 | W = 1) = p(S = 1, W = 1) / p(W = 1) = 0.43
  p(R = 1 | W = 1) = p(R = 1, W = 1) / p(W = 1) = 0.7

from the joint probability, we deduce the different conditional probabilities
Bayesian inference: find the probability of conditional events given the Bayesian network, or find both the conditional events and the network structure
Bayesian networks
High level analysis of microarrays
Clustering
Principal Component Analysis
Ontological enrichment
Inferring Regulatory Networks
Bayesian Networks
Bayesian networks
Bayes rule
Bayesian networks
Equivalence classes
Variable representation
Learning the network
Discovering features
Drawbacks
Improvements
Dynamic Bayesian Netw
Bayesian networks are graphical representations of joint probability distributions and consist of 2 components:
1. an annotated directed acyclic graph (DAG) G with
   nodes = random variables X1 , . . . , Xn (e.g. Xi = gene expression)
   arcs = causal relationships between nodes Xi → Xj
2. conditional probability distributions p(Xi | parents(Xi )) for each Xi
the graph encodes the Markov assumption: each Xi is independent of its non-descendants, given its parents
joint distribution
p(X1 , . . . , Xn ) = ∏_{i=1}^{n} p(Xi | parents(Xi ))
on the joint distribution one can do inference and choose likely causalities (conditional distributions)
Bayesian networks
to reduce the number of conditionals to compute in the joint distribution: conditional independence
from the Markov assumption, for all the non-descendant nodes there is conditional independence:
i(X; Y |Z) means X is independent of Y given Z
example (DAG with nodes A, B, C, D, E and arcs A → B, E → B, B → C, A → D)
conditional independencies
i(A; E), i(B; D|A, E), i(C; A, D, E|B), i(D; B, C, E|A)
joint distribution
p(A, B, C, D, E) = p(A) p(B|A, E) p(C|B) p(D|A) p(E)
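One of these independencies can be verified numerically from the factorization. This is a sketch with made-up binary conditional probability tables (any values would do, since the independence follows from the graph structure, not the numbers):

```python
import random
from itertools import product

random.seed(0)

# Hypothetical binary CPTs for the example DAG
# (arcs A -> B, E -> B, B -> C, A -> D, as read off the factorization).
pA, pE = 0.6, 0.3
pB = {(a, e): random.random() for a, e in product((0, 1), repeat=2)}  # p(B=1 | A, E)
pC = {b: random.random() for b in (0, 1)}                             # p(C=1 | B)
pD = {a: random.random() for a in (0, 1)}                             # p(D=1 | A)

def bern(p, x):
    return p if x else 1 - p

def joint(a, b, c, d, e):
    # p(A, B, C, D, E) = p(A) p(E) p(B|A, E) p(C|B) p(D|A)
    return (bern(pA, a) * bern(pE, e) * bern(pB[(a, e)], b)
            * bern(pC[b], c) * bern(pD[a], d))

def marg(**fixed):
    """Marginal probability of the fixed variables, summing out the rest."""
    total = 0.0
    for a, b, c, d, e in product((0, 1), repeat=5):
        vals = dict(a=a, b=b, c=c, d=d, e=e)
        if all(vals[k] == v for k, v in fixed.items()):
            total += joint(a, b, c, d, e)
    return total

# check i(C; A|B): p(C = 1 | A = 1, B = 1) should equal p(C = 1 | B = 1)
lhs = marg(a=1, b=1, c=1) / marg(a=1, b=1)
rhs = marg(b=1, c=1) / marg(b=1)
print(abs(lhs - rhs) < 1e-9)  # True
```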
Equivalence classes of Bayesian Networks
a Bayesian network implies a set of independencies I(G) (more than just the ones following from the Markov assumption)
Bayesian networks that have the same independencies belong to the same equivalence class
example: G : X → Y and G′ : Y → X are equivalent
rather than by a DAG (directed acyclic graph), a class is represented by a PDAG (partially directed acyclic graph): a graph such that
if there is a directed edge X → Y, then all members of the equivalence class must contain the edge with the same direction
some edges may be undirected, X – Y (meaning that in the equivalence class both X → Y and Y → X are present)
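The two-node example can be checked directly: the DAGs X → Y and Y → X parameterize exactly the same set of joint distributions. In this sketch the joint table values are arbitrary, chosen only for illustration:

```python
# Any joint p(X, Y) can be written both as p(X) p(Y|X) and as p(Y) p(X|Y),
# so the two single-edge DAGs are score-equivalent given complete data.
joint = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}  # p(X, Y)

pX = {x: joint[(x, 0)] + joint[(x, 1)] for x in (0, 1)}
pY = {y: joint[(0, y)] + joint[(1, y)] for y in (0, 1)}
pY_given_X = {(x, y): joint[(x, y)] / pX[x] for x in (0, 1) for y in (0, 1)}
pX_given_Y = {(x, y): joint[(x, y)] / pY[y] for x in (0, 1) for y in (0, 1)}

for x in (0, 1):
    for y in (0, 1):
        fwd = pX[x] * pY_given_X[(x, y)]   # factorization of X -> Y
        bwd = pY[y] * pX_given_Y[(x, y)]   # factorization of Y -> X
        assert abs(fwd - joint[(x, y)]) < 1e-12
        assert abs(bwd - joint[(x, y)]) < 1e-12
print("both DAGs represent the same joint")
```

This is why observational data alone cannot orient such an edge: only the equivalence class (the PDAG) is identifiable.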
Variable representation
Different types of representations for X1 , . . . , Xn
1. discrete variables: Xi take values in a finite set
   binary = {0, 1}
   { low expression; normal; over-expressed }
   ⇒ multinomial distribution
   can capture combinatorial effects
   discretization ⇒ loss of information
2. continuous variables: in order to compute posteriors in closed form one must use linear Gaussian distributions
   p(X|u1 , . . . , uk ) ∼ N(a0 + Σ_i ai ui , σ²)
   can capture only linear effects
3. hybrid models: mix of the two cases
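A minimal sketch of option 1: mapping continuous expression log-ratios to the three levels { under-expressed; normal; over-expressed }. The ±1 thresholds are an arbitrary choice for illustration, not a prescription from the slide:

```python
# Discretize a log-ratio into three expression levels. The threshold is a
# hypothetical parameter; in practice it is tuned to the data.
def discretize(log_ratio, threshold=1.0):
    if log_ratio <= -threshold:
        return "under"
    if log_ratio >= threshold:
        return "over"
    return "normal"

values = [-2.3, -0.4, 0.1, 1.7]
print([discretize(v) for v in values])  # ['under', 'normal', 'normal', 'over']
```

The loss of information mentioned above is visible here: -0.4 and 0.1 become indistinguishable after discretization.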
Learning the network
PROBLEM FORMULATION:
given a training set D = (x1 , . . . , xn ) of independent instances of the random variables X1 , . . . , Xn , find the network G (or equivalence class of networks) that best matches D
complete data: the entire state vector is measured ⇒ full observation, unknown structure case.
Learning the structure (e.g. via Bayesian score algorithms) is known to be an NP-hard problem (the number of candidate structures grows superexponentially)
from the Bayes rule
p(G|D) = p(D|G) p(G) / p(D)
where
p(G|D) = posterior probability of the network structure
p(G) = prior probability of the network structure
Learning the network
take the log: scoring function
S(G : D) = log p(G|D) = log p(D|G) + log p(G) + C
where
C = − log p(D) = const.
p(D|G) = marginal likelihood = averages the probability of the data over all possible parameter assignments Θ to G
p(D|G) = ∫ p(D|G, Θ) p(Θ|G) dΘ
complete data ⇒ the integral is tractable
solution:
model:  G∗ = arg max_G S(G : D)
parameters:  Θ∗ = arg max_Θ S(Θ|G∗ , D)
Learning the network
this is still NP-hard
simplifications
complete data ⇒ G and G′ with equivalent graphs give the same posterior score
score is decomposable
S(G : D) = Σ_i ScoreContribution(Xi , parents(Xi ) : D)
contribution of each Xi to the total score depends only on its own value and on the value of its parents in G
heuristics:
to cope with complexity: local search procedure that changes one arc at each move: evaluation of the gain made by adding/removing/reversing a single arc
further complexity reduction: # of parents is bounded (fan-in) ⇒ sparseness
greedy algorithm, but performing well in practice
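The greedy local search with a decomposable score and a fan-in bound can be sketched as follows. The score used here is a simple BIC-style penalized log-likelihood over binary data (an illustrative stand-in for the Bayesian score on the slide), and the data are a tiny synthetic sample in which X1 copies X0:

```python
import math
from itertools import product

DATA = [  # columns: X0, X1, X2 ; X1 copies X0, X2 is unrelated
    (0, 0, 1), (0, 0, 0), (1, 1, 0), (1, 1, 1),
    (0, 0, 0), (1, 1, 1), (1, 1, 0), (0, 0, 1),
]
N_VARS, FAN_IN = 3, 2

def family_score(x, parents):
    """Decomposable contribution of node x given its parent set (log-lik - penalty)."""
    counts = {}
    for row in DATA:
        key = tuple(row[p] for p in parents)
        c = counts.setdefault(key, [0, 0])
        c[row[x]] += 1
    ll = 0.0
    for c0, c1 in counts.values():
        for c in (c0, c1):
            if c:
                ll += c * math.log(c / (c0 + c1))
    return ll - 0.5 * math.log(len(DATA)) * (2 ** len(parents))  # BIC-style penalty

def acyclic(parents):
    seen, done = set(), set()
    def visit(v):
        if v in done:
            return True
        if v in seen:
            return False          # back edge -> cycle
        seen.add(v)
        ok = all(visit(p) for p in parents[v])
        done.add(v)
        return ok
    return all(visit(v) for v in range(N_VARS))

# greedy hill climbing: apply the best single-arc add/remove until no gain remains
parents = {v: set() for v in range(N_VARS)}
improved = True
while improved:
    improved = False
    best_gain, best_move = 1e-9, None
    for child, par in product(range(N_VARS), repeat=2):
        if child == par:
            continue
        trial = {v: set(s) for v, s in parents.items()}
        if par in parents[child]:
            trial[child].discard(par)     # try removing the arc
        elif len(parents[child]) < FAN_IN:
            trial[child].add(par)         # try adding the arc (fan-in bound)
        else:
            continue
        if not acyclic(trial):
            continue
        gain = (family_score(child, sorted(trial[child]))
                - family_score(child, sorted(parents[child])))
        if gain > best_gain:
            best_gain, best_move = gain, trial
    if best_move is not None:
        parents, improved = best_move, True

print({v: sorted(s) for v, s in parents.items()})
```

On this toy sample the search links X0 and X1 (the direction is an arbitrary tie-break, consistent with the equivalence-class discussion) and leaves X2 isolated. Decomposability is what makes each move cheap: only the one changed family is rescored.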
Discovering features
result is a joint distribution over all random variables
rather than obtaining a single optimal model G∗ , one gets a set of models with different high scores
idea: compare highly scoring models for common features
simplest features: pairwise relations, e.g. Markov relations
Markov blanket = minimal set of variables that shield X from the rest of the variables in the model ⇒ X is independent of the rest given the blanket
2 nodes X and Y in the blanket either are directly linked or share parenthood of a node
biologically it means that X and Y are related in a joint process
assessing the confidence of a model: bootstrap = slightly perturb your data, re-apply the learning procedure and verify the overlap
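The bootstrap step can be sketched as follows: resample the profiles with replacement, re-run the learning procedure on each replicate, and report how often a feature reappears. The "learner" below is deliberately trivial (it thresholds a sample correlation, a hypothetical stand-in for the full Bayesian-score search), and the data are synthetic:

```python
import random

random.seed(1)

def toy_data(n=50):
    """Synthetic profiles: variable 1 tracks variable 0, variable 2 is independent."""
    rows = []
    for _ in range(n):
        x = random.gauss(0, 1)
        y = x + 0.3 * random.gauss(0, 1)
        z = random.gauss(0, 1)
        rows.append((x, y, z))
    return rows

def corr(rows, i, j):
    n = len(rows)
    mi = sum(r[i] for r in rows) / n
    mj = sum(r[j] for r in rows) / n
    num = sum((r[i] - mi) * (r[j] - mj) for r in rows)
    den = (sum((r[i] - mi) ** 2 for r in rows)
           * sum((r[j] - mj) ** 2 for r in rows)) ** 0.5
    return num / den

def learn_edges(rows, thresh=0.6):
    """Toy structure learner: keep pairs whose |correlation| exceeds a threshold."""
    pairs = [(0, 1), (0, 2), (1, 2)]
    return {p for p in pairs if abs(corr(rows, *p)) > thresh}

data = toy_data()
B = 200
counts = {(0, 1): 0, (0, 2): 0, (1, 2): 0}
for _ in range(B):
    boot = [random.choice(data) for _ in data]   # resample with replacement
    for edge in learn_edges(boot):
        counts[edge] += 1

confidence = {e: c / B for e, c in counts.items()}
print(confidence)   # (0, 1) gets confidence near 1, the other pairs near 0
```

Features that survive the perturbation in most replicates (here the 0-1 relation) are the ones worth trusting biologically.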
Drawbacks
finding the best structure is an NP-hard problem
PDAG rather than DAG: not all cause-effect relations can be resolved: a Bayesian network is a model of dependencies between variables rather than of causality
the sparseness assumption is initialized by genes that are co-expressed in a clustering: this is reasonable but may arbitrarily and erroneously restrict the search space
the graph must be acyclic: the network found has no regulatory loops
Improvements and developments
to cope with unmeasured quantities (e.g. missing data: part of the state vector not measured in some of the experiments): hidden Markov models
to cope with acyclicity: Dynamic Bayesian Networks
idea: feedback is seen as a delay, unfolding in time into an acyclic graph
(figure: a feedback loop unfolded over time slices t1, t2, t3)
Dynamic Bayesian Networks
D. Husmeier, "Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks", Bioinformatics 19, p. 2271-82, 2003
each time slice is a Bayesian network
to tame complexity: the transition probabilities between slices are the same ∀t ⇒ homogeneous Markov model
intraslice connections (i.e., instantaneous interactions) are not allowed
directional ambiguity is avoided: temporal causality
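A minimal sketch of the unfolding idea: the cyclic dependency X ⇄ Y, which a static Bayesian network cannot represent, becomes acyclic once every arc crosses a time slice (X_t → Y_{t+1}, Y_t → X_{t+1}). The transition probabilities below are invented for illustration and, as on the slide, are the same for every t (homogeneous Markov model):

```python
import random

random.seed(2)

# inter-slice CPTs, identical at every transition (hypothetical values)
p_x_next = {0: 0.2, 1: 0.9}   # p(X_{t+1} = 1 | Y_t)
p_y_next = {0: 0.1, 1: 0.8}   # p(Y_{t+1} = 1 | X_t)

def step(state):
    """One time-slice transition of the unrolled two-gene feedback loop."""
    x, y = state
    x_new = 1 if random.random() < p_x_next[y] else 0
    y_new = 1 if random.random() < p_y_next[x] else 0
    return (x_new, y_new)

# unroll over time slices t1, t2, t3, ...
trajectory = [(0, 0)]
for _ in range(5):
    trajectory.append(step(trajectory[-1]))
print(trajectory)
```

Within a slice X and Y are not connected; each influences the other only one step later, so the unrolled graph is acyclic and the feedback is still captured.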
Dynamic Bayesian Networks: drawbacks
the bottleneck is that the time series of data are short ⇒ the posterior distribution over network structures is vague...
other problems:
p(G|D) = p(D|G) p(G) / p(D)
the prior on the network structure p(G) has a non-negligible influence on the posterior p(G|D)
⇒ p(G) should capture known features of biological networks
⇒ need to know a lot to initialize G
needless to say: computational complexity