
High level analysis of microarray data

Lecture 2
Claudio Altafini
SISSA
http://people.sissa.it/altafini


High level analysis of microarrays

Model-free methods:

1. CLUSTERING ALGORITHMS
   - put together genes with similar expression profiles
2. PRINCIPAL COMPONENT ANALYSIS
   - reduce the dimension of a data set, keeping the most significant directions
3. ONTOLOGICAL ENRICHMENT
   - add functional annotation (e.g. GO)
   - perform statistical tests on the ontological information

Gene network inference methods

1. LESS MODEL-DEPENDENT METHODS (e.g. probabilistic, statistical, etc.)
   - look only for the core relationships of a network
   - not quantitative
   - can be used for large-scale networks
2. MODEL-DEPENDENT METHODS (e.g. ODEs)
   - provide both the network topology and the functional relationships
   - useful mostly for small/mid-scale networks
   - quantitative

warning: the classification is not sharp!

Clustering

example: cluster these
(figure: an example data set to be grouped into clusters)

Clustering

Clustering = dividing a set of data into relatively homogeneous groups according to a user-defined metric $d$:
- $d(x, x) = 0$
- $d(x, y) > 0$ for $x \neq y$, with $d(x, y) = d(y, x)$
- $d(x, y) \leq d(x, z) + d(z, y)$

typically: $L^p$ norm

$d(x, y) = \|x - y\|_p, \qquad p = 1, \ldots, \infty$

example: Euclidean norm ($p = 2$)

$d(x, y) = \sqrt{(x_1 - y_1)^2 + \ldots + (x_n - y_n)^2}$

3 main algorithms:
- k-means
- hierarchical clustering
- SOM: Self Organizing Maps
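The two norms above are a one-liner each in code; a minimal sketch (NumPy assumed available, profiles treated as plain vectors):

```python
import numpy as np

def lp_distance(x, y, p=2):
    """L^p distance between two expression profiles (p=2 is Euclidean)."""
    return float(np.sum(np.abs(x - y) ** p) ** (1.0 / p))

x = np.array([1.0, 2.0, 3.0])
y = np.array([2.0, 0.0, 3.0])
print(lp_distance(x, y, p=1))  # Manhattan distance: 3.0
print(lp_distance(x, y, p=2))  # Euclidean distance: sqrt(5) ≈ 2.24
```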

Clustering algorithms: k-means

Inputs:
- data $x_1, \ldots, x_n$
- number of clusters $k$
Output: $k$ clusters
Algorithm:
1. select $k$ centroids
2. assign each element $x_i$ to the cluster with the nearest centroid
3. recompute the centroids
4. repeat until it converges
Properties:
- need to choose $k$
- the initialization step can change the result
- sensitive to perturbations
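A minimal k-means sketch following the four steps above (plain NumPy; the random initialization and the convergence test are illustrative choices, not prescribed by the slide):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Minimal k-means: X is (n_samples, n_features).
    Sketch assumes no cluster ever becomes empty during the iterations."""
    rng = np.random.default_rng(seed)
    # 1. select k centroids (here: k distinct random data points)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # 2. assign each x_i to the cluster with the nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3. recompute the centroids
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        # 4. repeat until it converges
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids
```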

Hierarchical Clustering

Inputs: data $x_1, \ldots, x_n$
Output: clustering tree (dendrogram, e.g. over $x_1, x_2, \ldots, x_7$ in the figure)
Algorithm:
- put each $x_i$ in a cluster $C_i = \{x_i\}$
- compute the merging cost between each pair of clusters
- merge the two clusters with the cheapest merging cost
- repeat until only 1 cluster is left

cost of merging:
- single linkage: $\min_{x \in C_i, y \in C_j} d(x, y)$ (⇒ loose clusters)
- average linkage: $\frac{1}{|C_i|\,|C_j|} \sum_{x \in C_i} \sum_{y \in C_j} d(x, y)$
- complete linkage: $\max_{x \in C_i, y \in C_j} d(x, y)$ (⇒ tight clusters)

properties:
- greedy algorithm
- tends to build big clusters
- need to choose a threshold on the number of clusters
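The three merging costs above correspond to the 'single', 'average' and 'complete' methods of SciPy's hierarchical-clustering module; a sketch assuming SciPy is available (the toy data and the cut at 3 clusters are illustrative):

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# 20 toy expression profiles over 4 time points (illustrative data)
X = np.random.default_rng(0).normal(size=(20, 4))

# merging cost: 'single', 'average' or 'complete', as on the slide
Z = linkage(X, method="average", metric="euclidean")

# threshold on the number of clusters: cut the tree into (at most) 3 clusters
labels = fcluster(Z, t=3, criterion="maxclust")
print(labels)
```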

Clustering: Self Organizing Maps

Inputs:
- data $x_1, \ldots, x_n$
- SOM topology ($k$ nodes)
Output: $k$ clusters
Algorithm:
1. start with a simple topology
2. select a random data point $p$
3. move all nodes towards $p$ according to the rule
   $f_{i+1}(x) = f_i(x) + \tau(d(x, x_p), i)\,(p - f_i(x))$
   where
   - $f_i(x)$ = position of node $x$ at iteration $i$
   - $x_p$ = node closest to $p$
   - $\tau = \tau(d, i)$ = learning rate
4. go to 2. until convergence
properties:
- even more computationally costly, but more robust
- neighboring clusters are similar: elements on the border can belong to both clusters
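A sketch of the update rule in step 3 for a 1-D node topology; the exponential neighborhood function and the linear decay are illustrative stand-ins for the unspecified learning rate $\tau(d, i)$:

```python
import numpy as np

def som_1d(X, k=5, n_iter=2000, seed=0):
    """Sketch of a 1-D SOM over profiles X (n_samples, n_features)."""
    rng = np.random.default_rng(seed)
    nodes = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for i in range(n_iter):
        p = X[rng.integers(len(X))]                 # 2. select a random data point
        winner = np.linalg.norm(nodes - p, axis=1).argmin()  # node closest to p
        lr = 0.5 * (1.0 - i / n_iter)               # learning rate decays with i
        for x in range(k):
            tau = lr * np.exp(-abs(x - winner))     # weaker pull far from the winner
            nodes[x] += tau * (p - nodes[x])        # 3. move the nodes towards p
    return nodes
```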

Clustering: quality indices

homogeneity

$H = \frac{1}{n_{genes}} \sum_i d(x_i, C(x_i))$

- average distance between each $x$ and the centroid of the cluster it belongs to
- reflects the compactness of the clusters

separation

$S = \frac{1}{\sum_{i \neq j} n_{C_i} n_{C_j}} \sum_{i \neq j} n_{C_i} n_{C_j}\, d(c_i, c_j)$

- weighted average distance between cluster centroids
- reflects the distance between clusters

silhouette width: a composition of the two indices
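The two indices translate directly into code; a minimal sketch (NumPy assumed; `labels` and `centroids` as returned by a clustering routine such as the k-means sketch above):

```python
import numpy as np

def homogeneity(X, labels, centroids):
    """Average distance from each profile to the centroid of its own cluster."""
    return float(np.mean(np.linalg.norm(X - centroids[labels], axis=1)))

def separation(centroids, sizes):
    """Weighted average distance between all pairs of cluster centroids."""
    num = den = 0.0
    for i in range(len(centroids)):
        for j in range(len(centroids)):
            if i != j:
                w = sizes[i] * sizes[j]
                num += w * np.linalg.norm(centroids[i] - centroids[j])
                den += w
    return num / den
```

By the definitions above, a good clustering has low homogeneity (compact clusters) and high separation (well-spread centroids).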

More advanced clustering

example: rather than a distance, one can use the Pearson correlation

$d(x, y) = \frac{\sum_{i=1}^n (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^n (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^n (y_i - \bar{y})^2}}$

where $\bar{x} = \frac{1}{n}\sum_{i=1}^n x_i$ and $\bar{y} = \frac{1}{n}\sum_{i=1}^n y_i$

Pearson metric:
- uses the deviations from the mean rather than the raw values
- normalized by the standard deviations ⇒ $d(x, y) \in [-1, 1]$
- invariant to scaling and shifting of $x$ and $y$
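A direct transcription of the formula, with a check of the invariance property (minimal sketch, NumPy assumed):

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation between two expression profiles; lies in [-1, 1]."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) /
                 (np.sqrt(np.sum(xc ** 2)) * np.sqrt(np.sum(yc ** 2))))

x = np.array([1.0, 2.0, 3.0, 4.0])
print(pearson(x, 2 * x + 5))  # 1.0: invariant to scaling and shifting
print(pearson(x, -x))         # -1.0: perfectly anti-correlated
```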

Clustering: drawbacks

clustering: similar expression ⇒ similar function

Is it really useful for inferring common function and co-regulation?

example:
(figure: two diagrams contrasting coregulation with membership in the same cluster)

Clustering: hippocampus time-series

time instants: 0 min, 30 min, 90 min, 180 min

what does the time series look like for all genes?
(figure: raw Expression Level vs. Time (min) for all genes, 0-180 min)

a mess...

Hippocampus time series

take the log
(figure: Log2 Expression Level vs. Time (min) for all genes)

still a mess...

Hippocampus time series

to identify interesting genes:
1. select only genes that show differential expression in a fold-change analysis
   - blue = genes that stay up for all 3 time samples
   - red = genes that stay up for 2 out of 3 time samples
   (figures: genes upregulated in 3 and in 2 of the time samples, at 4-fold and 3-fold thresholds; Expression Level vs. Time (min))
2. select only genes with sufficiently high variance

Similar pattern: clustering

clustering: similar gene expression time course ⇒ similar function (or at least co-regulation)?

if I filter out the genes with little variance (the majority) and cluster the remaining:
(figure: "KMeans Clustering of Profiles", one panel per cluster of log2 expression vs. time)

Similar pattern: clustering

clustering depends a lot on the algorithm:
- previous page: Euclidean distance
- here: Pearson correlation as distance
(figure: "KMeans Clustering of Profiles" recomputed with the Pearson metric)

Principal Component Analysis

PCA: Principal Component Analysis

- PCA detects the directions that capture most of the information available in the data
- PCA is performed by a linear transformation of the data set based on the Singular Value Decomposition (SVD)
- idea of principal component analysis: take linear combinations of the $x$ as basis elements so that the new basis elements are orthogonal ⇒ they contain no redundant information
- successive principal components capture less and less information about the data
- we can truncate the representation of the data to a limited number of principal components ⇒ dimensionality reduction
- use SVD to decompose the $n \times m$ matrix $X$:
  $X = U \Sigma V^T$
  - $U$: $n \times m$ with orthonormal columns, $U^T U = I_m$
  - $V$: $m \times m$ orthogonal, $V V^T = I_m$

PCA: Principal Component Analysis

$X = \begin{pmatrix} x_1^1 & \ldots & x_1^m \\ x_2^1 & \ldots & x_2^m \\ \vdots & & \vdots \\ x_n^1 & \ldots & x_n^m \end{pmatrix} = U \Sigma V^T$ (one column per experiment: 1st exp. ... m-th exp.)

$\Sigma = \mathrm{diag}(\sigma_1, \ldots, \sigma_\ell, 0, \ldots, 0), \qquad \ell = \mathrm{rank}(X)$

- $\sigma_1, \ldots, \sigma_\ell$ singular values: $\sigma_j = \sqrt{\lambda_j}$
- $\lambda_j = \mathrm{eig}(C)$ = eigenvalues of the covariance matrix $C$ of $X$ after centering:
  $C = (X - [\bar{x}^1 \ldots \bar{x}^m])^T (X - [\bar{x}^1 \ldots \bar{x}^m])$

is PCA improving your clustering algorithm? Not necessarily... see
K. Y. Yeung, W. L. Ruzzo, "Principal Component Analysis for clustering gene expression data", Bioinformatics 17, pages 763-774, 2001
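A compact PCA-by-SVD sketch matching the decomposition above (NumPy assumed; rows are genes, columns are experiments, and the toy data are illustrative):

```python
import numpy as np

def pca(X, n_components):
    """PCA via SVD of the column-centered data matrix X."""
    Xc = X - X.mean(axis=0)                            # center each experiment
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)  # X_c = U diag(s) V^T
    scores = U[:, :n_components] * s[:n_components]    # data in the new basis
    explained = s ** 2 / np.sum(s ** 2)                # information per component
    return scores, Vt[:n_components], explained

X = np.random.default_rng(0).normal(size=(100, 4))
scores, components, explained = pca(X, n_components=2)
print(explained[:2])  # fraction of variance captured by the first two directions
```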

Ontological enrichment

Gene Ontology

- GO = the Gene Ontology project provides a controlled vocabulary to describe gene and gene product attributes in any organism
- genes are associated with GO terms by trained curators
- GO annotations give function labels to genes
- cross-links to the most common gene banks, pathway databases, etc.

http://www.geneontology.org

Structure of GO

GO terms:
1. Biological Process
2. Molecular Function
3. Cellular Component

a gene may belong to many categories

Structure of GO

- ontologies are structured as a hierarchical directed acyclic graph (DAG)
- terms can have more than one parent, and zero, one or more children

Ontological enrichment

questions you would like to ask:
- what is the main functional annotation of the interesting genes (e.g. differentially expressed genes, or genes having similar expression profiles)?
- do genes involved in the same process/function have a similar profile of expression?

many tools exist that use GO to answer these questions:
http://www.geneontology.org/GO.tools.microarray.shtml

most of these tools work in a similar way:
- input a gene list and a subset of interesting genes
- the tool shows which GO categories have most interesting genes associated with them, i.e. which categories are enriched for interesting genes
- the tool provides a statistical measure to determine whether the enrichment is significant

Ontological enrichment

1. select a set of significant genes (e.g. by a t-test)
2. obtain all the GO categories corresponding to them
3. analyze the GO terms for significance

example of statistical measure: hypergeometric test
- $N$ genes on the microarray; Bio is a GO term: $M$ genes $\in$ Bio, $N - M$ genes $\notin$ Bio
- $K$ = number of significant genes
- what is the probability of having exactly $x$ genes from $K$ of type Bio?

$P(X = x \mid N, M, K) = \frac{\binom{M}{x}\binom{N-M}{K-x}}{\binom{N}{K}}$

P-value = probability of having at least $x$ of the $K$ genes of type Bio (cumulative probability distribution):

$p\text{-}val = 1 - \sum_{i=0}^{x-1} \frac{\binom{M}{i}\binom{N-M}{K-i}}{\binom{N}{K}}$
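The p-value above is the survival function of the hypergeometric distribution; a sketch using SciPy (mind SciPy's own parameter naming, which differs from the slide's; the numbers are purely illustrative):

```python
from scipy.stats import hypergeom

def go_enrichment_pvalue(N, M, K, x):
    """P(X >= x): at least x of the K significant genes carry the GO term.
    N = genes on the array, M = genes annotated with the term, K = significant genes."""
    # scipy's hypergeom(M=population, n=successes, N=draws); sf(x-1) = P(X >= x)
    return hypergeom.sf(x - 1, N, M, K)

# e.g. 10000 genes on the array, 200 annotated with the term,
# 150 significant genes, 12 of them carrying the term (illustrative numbers)
print(go_enrichment_pvalue(N=10000, M=200, K=150, x=12))
```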

GO Tools

Tools:

Tool          Statistical model                           Correction for multiple experiments
Onto-Express  χ², binomial, hypergeometric                Šidák, Holm, Bonferroni, FDR
GoMiner       Fisher's exact test                         -
EASEonline    Fisher's exact test                         Bonferroni
GeneMerge     hypergeometric                              Bonferroni
FatiGO        percentage                                  step-down minP, FDR
GOstat        χ², Fisher's exact test                     FDR, Holm
GOToolBox     hypergeometric, binomial                    Bonferroni, Holm, Hochberg, Hommel, FDR
GoSurfer      relative enrichment, Fisher's exact test    q-value, DAG

Affymetrix also provides a Gene Ontology Mining Tool as part of its NetAffx Analysis Center, which returns GO terms for probe sets.

Example: Onto-Express

Onto-Express is available at
http://vortex.cs.wayne.edu/projects.htm

(three slides of Onto-Express screenshots)

Other Ontologies: KEGG pathways

(figure: a KEGG pathway map)

Biclustering

clustering can be carried out:
- w.r.t. gene expression
- with respect to some other condition (e.g. the clinical condition in which the sample is taken, ontological information)

two-axis clustering ⇒ biclustering
(figure: a genes × conditions matrix reordered along both axes)

Biclustering: expression + ontology

E. Segal, N. Friedman, D. Koller, A. Regev, "A module map showing conditional activity of expression modules in cancer", Nature Genetics 36, 1090-1098 (2004)

idea: individual genes → regulatory modules → biological process

- rather than working with single genes and their regulatory mechanics, is it possible to lump genes together into modules = sets of genes that act in concert to carry out a specific function?
- here: DNA microarray data in a comprehensive analysis aimed at identifying the shared and unique molecular modules underlying human malignancies
- in the paper, modules are extracted and used to characterize gene-expression profiles in tumors as a combination of activated and deactivated modules

A cancer compendium

- 26 studies
- expression of 14,145 genes
- 1,975 arrays: Stanford Microarray Database, Whitehead Institute Database
- 2,849 gene sets:
  - Gene Ontology (1,281)
  - KEGG: Kyoto Encyclopedia of Genes and Genomes (114)
  - Gene MicroArray Pathway Profiler (53)
  - other: tissue-specific gene sets (101)
  - other: clustered sets of coexpressed genes (1,300)
- whole analysis: data mining tool called GeneXPress

Method: modules & clinical conditions

- number of statistically significant modules = 456 (spanning various processes and functions: metabolism, transcription, degradation, cellular and neuronal signalling, growth, cell cycle, apoptosis, extracellular matrix and cytoskeleton components)
- next: identify clinical conditions according to the combination of active/inactive modules → 263 biological and clinical conditions (tissue type, tumor type, diagnosis and prognosis info, molecular markers)

Modules vs clinical conditions

(figure: module map relating the modules to the clinical conditions)

Interpretation of the network

interpretation:
1. clinical conditions → modules
2. modules → clinical conditions

- some modules (e.g. cell cycle) are common to many tumor types → tumorigenic processes?
- some others are specific (e.g. neural processes repressed in a set of brain tumors)
- various tumors of hematologic nature involve similar immune, inflammation, growth regulation and signalling modules

Conclusion:
- large-scale analysis across different tissues/conditions/experimental settings yields statistically significant results
- in studying tumors:
  - activation of some modules is specific to particular types of tumor
  - other modules are shared across different clinical conditions, suggestive of common tumor progression mechanisms

Inferring Regulatory Networks

Limitations of clustering/PCA

- clustering: a method of information extraction from data based on co-regulation: similar expression pattern over a set of experiments ⇒ similar function
- all clustering algorithms give the same results if the time points are randomly permuted
- clustering cannot reveal causal/dynamical connections ⇒ it does not reveal what is behind the co-regulation
- more ambitious goal: find the transcriptional regulatory network

The reverse engineering paradigm

basic idea: the architecture of the network is inferred (or "reverse engineered") from the observed response of the system to a series of experimental perturbations

(figure: perturbations and external stimuli enter as INPUTS into an unknown network "?", whose OUTPUTS are the measured signals)

- measured signals: [mRNA], [proteins], [metabolites]
  - global response: measure the entire state vector
  - time series (e.g. cell cycle)
  - single time point (e.g. steady state)
- perturbations: experimental interventions that alter the state of interest

The reverse engineering paradigm

TASK: from gene expression profiles to a gene-gene graph
- extract the network structure
- quantify the interactions

- computationally the task is hard: a very large amount of data is required
- data rich/data poor paradox: many data ≠ significant data for network inference
- what are the significant data? Those obtained by perturbing systematically the variables of interest
- regulation is dynamical: we see it as static because most of the time we cannot observe the transient period (in which the system reacts to the change), but just measure the new steady state in which the system resettles following a perturbation

The reverse engineering paradigm

what, then, are the perturbations?
- everything that moves the cell away from its standard working condition: biochemical, environmental, genetic, transcriptional, etc.
- examples: stress factors, starvation, infection, hormonal and growth factors, chemical inhibitors/activators, protein activity, metabolite concentration, gene overexpression and inhibition, gene knockout and mutations, miRNA

perturbations can be:
- temporary (e.g. activating or inhibiting a signalling protein by phosphorylation) or permanent (e.g. gene knockout)
- time dependent (e.g. a time-varying stimulus) or static (e.g. gene knockout)
- local (i.e., affecting a single gene) or global (i.e., a change in temperature or pH)
- of small or large amplitude

Network inference algorithms

a few methods:
1. BAYESIAN NETWORKS
   - attain a probabilistic graph through Bayesian learning
   - (exact) complexity: superexponential
2. ASSOCIATION NETWORKS
   - learn a graph through a similarity measure
   - polynomial complexity
3. LINEAR ODE MODELS
   - linear complexity
   - suffer from underdetermination
   - model-dependent

Bayesian networks

Bayesian networks:
- are a probabilistic framework aiming at capturing the conditional dependence or conditional independence between states in a set of data
- the approach is statistical in nature ⇒ able to cope with noisy data & with not sufficiently many experimental data
- useful when each state depends only on a relatively small number of other components → networks with low connectivity

A Bayesian network can:
- learn the regulatory network
- find the best set of parameters for the conditional distribution of that network
- "best" is to be taken in a Bayesian sense, as the most probable given the data

N. Friedman, M. Linial, I. Nachman, and D. Pe'er, "Using Bayesian Networks to Analyze Expression Data", J. Computational Biology 7:601-620, 2000

Bayesian networks

example: "A causes B" is the rule used to construct the graph
(figure: an example network; the tables attached to the nodes are the conditional probability distributions)

Bayes rule

$\text{posterior probability} = \frac{\text{likelihood} \times \text{prior probability}}{\text{marginal coefficient}}$

if $A$ and $B$ are independent events:
$p(A \cap B) = p(A)\,p(B)$

if $A$ and $B$ are not independent:
$p(A \cap B) = p(A|B)\,p(B) = p(B|A)\,p(A)$

⇒ conditional probability (Bayes rule):

$p(A|B) = \frac{p(B|A)\,p(A)}{p(B)}$

Bayes rule: example

(in the classic sprinkler example: $C$ = cloudy, $R$ = rain)

$p(C=1) = 0.5$, $p(C=0) = 0.5$
$p(R=1 \mid C=1) = 0.8$, $p(R=1 \mid C=0) = 0.2$

from the likelihood:

$P(R=1) = p(C=0, R=1) + p(C=1, R=1) = p(R=1|C=0)\,p(C=0) + p(R=1|C=1)\,p(C=1) = 0.2 \cdot 0.5 + 0.8 \cdot 0.5 = 0.5$

Bayes rule:

$p(C|R) = \frac{p(R|C)\,p(C)}{p(R)}$

e.g. if we see it is raining ($R = 1$) ⇒

$p(C=0 \mid R=1) = \frac{p(R=1|C=0)\,p(C=0)}{p(R=1)} = \frac{0.2 \cdot 0.5}{0.5} = 0.2$

if instead $p(C=0) = 0.9$ ⇒ $P(R=1) = 0.26$ and $p(C=0 \mid R=1) = 0.69$!
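The whole example is three lines of arithmetic; a minimal sketch reproducing both numbers on the slide:

```python
def posterior_not_cloudy_given_rain(p_c0, p_r1_given_c1=0.8, p_r1_given_c0=0.2):
    """Bayes rule on the slide's cloudy/rain example: p(C=0 | R=1)."""
    p_c1 = 1.0 - p_c0
    p_r1 = p_r1_given_c0 * p_c0 + p_r1_given_c1 * p_c1   # marginal P(R=1)
    return p_r1_given_c0 * p_c0 / p_r1                   # Bayes rule

print(posterior_not_cloudy_given_rain(0.5))  # 0.2
print(posterior_not_cloudy_given_rain(0.9))  # ≈ 0.69: the prior matters
```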

Bayes rule: example

how about if you observe that the grass is wet? is it because of the sprinkler or because of the rain?

$p(S=1 \mid W=1) = \frac{p(S=1, W=1)}{p(W=1)} = 0.43$

$p(R=1 \mid W=1) = \frac{p(R=1, W=1)}{p(W=1)} = 0.7$

- from the joint probability, we deduce the different conditional probabilities
- Bayesian inference: find the probability of conditional events given the Bayesian network, or find both the conditional events and the network structure

Bayesian networks

Bayesian networks are graphical representations of joint probability distributions and consist of 2 components:
1. an annotated directed acyclic graph (DAG) $G$ with
   - nodes = random variables $X_1, \ldots, X_n$ (e.g. $X_i$ = gene expression)
   - arcs = causal relationships between nodes, $X_i \to X_j$
2. conditional probability distributions $p(X_i \mid \mathrm{parents}(X_i))$ for each $X_i$

- the graph encodes the Markov assumption: each $X_i$ is independent of its non-descendants, given its parents
- joint distribution:

$p(X_1, \ldots, X_n) = \prod_{i=1}^{n} p(X_i \mid \mathrm{parents}(X_i))$

- on the joint distribution one can do inference and choose likely causalities (conditional distribution)

Bayesian networks

- to reduce the number of conditionals to compute in the joint distribution: conditional independence
- from the Markov assumption, for all the non-descendant nodes there is conditional independence: $i(X; Y \mid Z)$ means $X$ is independent of $Y$ given $Z$
- example (figure: DAG over the nodes $A, B, C, D, E$):
  - conditional independencies: $i(A; E)$, $i(B; D \mid A, E)$, $i(C; A, D, E \mid B)$, $i(D; B, C, E \mid A)$
  - joint distribution: $p(A, B, C, D, E) = p(A)\,p(B|A, E)\,p(C|B)\,p(D|A)\,p(E)$
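The factorization can be checked numerically; a sketch with binary variables, where only the structure comes from the slide and the probability tables are made-up illustrative numbers:

```python
# Factorization from the slide: p(A,B,C,D,E) = p(A) p(B|A,E) p(C|B) p(D|A) p(E)
p_A = {0: 0.4, 1: 0.6}
p_E = {0: 0.7, 1: 0.3}
p_B_given_AE = {(a, e): {0: 0.5, 1: 0.5} for a in (0, 1) for e in (0, 1)}
p_B_given_AE[(1, 1)] = {0: 0.1, 1: 0.9}
p_C_given_B = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}
p_D_given_A = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.2, 1: 0.8}}

def joint(a, b, c, d, e):
    """Joint probability assembled from the local conditional distributions."""
    return (p_A[a] * p_E[e] * p_B_given_AE[(a, e)][b]
            * p_C_given_B[b][c] * p_D_given_A[a][d])

# sanity check: the factorized joint sums to 1 over all 2^5 assignments
total = sum(joint(a, b, c, d, e)
            for a in (0, 1) for b in (0, 1) for c in (0, 1)
            for d in (0, 1) for e in (0, 1))
print(total)  # 1.0 (up to floating point)
```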

Equivalence classes of Bayesian Networks

- a Bayesian network implies a set of independencies $I(G)$ (more than just the ones following from the Markov assumption)
- Bayesian networks that have the same independencies belong to the same equivalence class
- example: $G: X \to Y$ and $G': Y \to X$ are equivalent
- rather than by a DAG (Directed Acyclic Graph), a class is represented by a PDAG (Partially Directed Acyclic Graph): a graph such that
  - if there is a directed edge $X \to Y$, then all members of the equivalence class must contain the edge with the same direction
  - some edges may be undirected, $X - Y$ (meaning in the equivalence class both $X \to Y$ and $Y \to X$ are present)

Variable representation

Different types of representations for $X_1, \ldots, X_n$:
1. discrete variables: $X_i$ takes values in a finite set
   - binary = {0, 1}
   - {low expression; normal; over-expressed} ⇒ multinomial distribution
   - can capture combinatorial effects
   - discretization ⇒ loss of information
2. continuous variables: in order to compute posteriors in closed form one must use linear Gaussian distributions
   $p(X \mid u_1, \ldots, u_k) \sim N\left(a_0 + \sum_i a_i u_i,\ \sigma^2\right)$
   - can capture only linear effects
3. hybrid models: a mix of the two cases

Learning the network

PROBLEM FORMULATION: given a training set $D = (x_1, \ldots, x_n)$ of independent instances of the random variables $X_1, \ldots, X_n$, find the network $G$ (or equivalence class of networks) that best matches $D$

- complete data: the entire state vector is measured ⇒ full observation, unknown structure case
- learning the structure (e.g. via Bayesian score algorithms) is known to be an NP-hard problem (superexponential growth)
- from the Bayes rule:

$p(G|D) = \frac{p(D|G)\,p(G)}{p(D)}$

where
- $p(G|D)$ = posterior probability of the network structure
- $p(G)$ = prior probability of the network structure

Learning the network

take the log: scoring function

$S(G : D) = \log p(G|D) = \log p(D|G) + \log p(G) + C$

where
- $C = -\log p(D)$ = const.
- $p(D|G)$ = marginal likelihood = averages the probability of the data over all possible parameter assignments $\Theta$ to $G$:

$p(D|G) = \int p(D|G, \Theta)\,p(\Theta|G)\,d\Theta$

- complete data ⇒ the integral is tractable

solution:
- model: $G^{\ast} = \arg\max_G S(G : D)$
- parameters: $\Theta^{\ast} = \arg\max_\Theta S(\Theta \mid G^{\ast}, D)$

Learning the network

- this is still NP-hard
- simplifications:
  - complete data ⇒ $G$ and $G'$ with equivalent graphs give the same posterior score
  - the score is decomposable:
    $S(G : D) = \sum_i \mathrm{ScoreContribution}(X_i, \mathrm{parents}(X_i) : D)$
  - the contribution of each $X_i$ to the total score depends only on its own value and on the value of its parents in $G$
- heuristics, to cope with the complexity (see the sketch below):
  - local search procedure that changes one arc at each move: evaluation of the gain made by adding/removing/reversing a single arc
  - further complexity reduction: the number of parents is bounded (fan-in) ⇒ sparseness
  - greedy algorithm, but performing well in practice
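A sketch of the greedy local search described above: each move toggles one arc and is kept only if the decomposable score improves and the graph stays acyclic (add/remove moves only, for brevity; `local_score` is a user-supplied stand-in for e.g. a Bayesian score term):

```python
import random

def is_acyclic(parents, n):
    """Check that the parent map encodes a DAG (DFS cycle detection)."""
    color = {i: 0 for i in range(n)}          # 0=unvisited, 1=on stack, 2=done
    def dfs(u):
        color[u] = 1
        for v in parents[u]:                   # walk upstream along v -> u arcs
            if color[v] == 1 or (color[v] == 0 and not dfs(v)):
                return False
        color[u] = 2
        return True
    return all(color[i] == 2 or dfs(i) for i in range(n))

def hill_climb(local_score, n, max_parents=3, n_iter=200, seed=0):
    """Greedy structure search: one arc change per move, bounded fan-in."""
    rng = random.Random(seed)
    parents = {i: set() for i in range(n)}
    for _ in range(n_iter):
        i, j = rng.sample(range(n), 2)         # candidate arc j -> i
        old = set(parents[i])
        new = old - {j} if j in old else old | {j}
        if len(new) > max_parents:             # fan-in bound => sparseness
            continue
        parents[i] = new
        if not is_acyclic(parents, n) or \
           local_score(i, new) <= local_score(i, old):
            parents[i] = old                   # reject the move
    return parents
```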

Discovering features

- the result is a joint distribution over all random variables
- rather than obtaining a single optimal model $G^{\ast}$, one gets a set of models with different high scores
- idea: compare highly scoring models for common features
- simplest features: pairwise relations → Markov relations
- Markov blanket = minimal set of variables that shield $X$ from the rest of the variables in the model ⇒ $X$ is independent of the rest given the blanket
- 2 nodes $X$ and $Y$ in the same blanket either are directly linked or share parenthood of a node
- biologically it means that $X$ and $Y$ are related in a joint process
- assessing the confidence of a model: bootstrap = slightly perturb your data, re-apply the learning procedure and verify the overlap

Drawbacks

- finding the best structure is an NP-hard problem
- PDAG rather than DAG: not all cause-effect relations can be resolved: a Bayesian network is a model of dependencies between variables rather than of causality
- the sparseness assumption is initialized by genes that are co-expressed in a clustering: this is reasonable but may arbitrarily and erroneously restrict the search space
- the graph must be acyclic: the network found has no regulatory loops

Improvements and developments

- to cope with unmeasured quantities (e.g. missing data: part of the state vector not measured in some of the experiments): hidden Markov models
- to cope with acyclicity: Dynamic Bayesian Networks
  - idea: feedback is seen as a delay, unfolding in time into an acyclic graph
(figure: the network unrolled over the time slices $t_1$, $t_2$, $t_3$)

Dynamic Bayesian Networks

D. Husmeier, "Sensitivity and specificity of inferring genetic regulatory interactions from microarray experiments with dynamic Bayesian networks", Bioinformatics 19, p. 2271-82, 2003

(figure: the network unrolled over time slices $t_1, t_2, \ldots$)
- each time slice is a Bayesian network
- to tame the complexity: the transition probabilities between slices are the same for all $t$ → homogeneous Markov model
- intraslice connections (i.e., instantaneous interactions) are not allowed
- directional ambiguity is avoided: temporal causality

Dynamic Bayesian Networks: drawbacks

- the bottleneck is that the time series of data are short ⇒ the posterior distribution over network structures is vague...
- other problems: in

$p(G|D) = \frac{p(D|G)\,p(G)}{p(D)}$

the prior on the network structure $p(G)$ has a non-negligible influence on the posterior $p(G|D)$
⇒ $p(G)$ should capture known features of biological networks
⇒ one needs to know a lot to initialize $G$
- needless to say: computational complexity
