Module 6CCMCS02/7CCMCS02

Theory of Complex Networks


Compact Lecture Notes
Version of 27 November 2023

ACC Coolen
I Neri
Department of Mathematics
King’s College London

1 Introduction 4

2 Definitions and notation 15


2.1 Networks or graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2 The adjacency matrix of a network . . . . . . . . . . . . . . . . . . . . . . . 16
2.3 Paths in networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4 Graphs within graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.5 Connected components in nondirected and directed graphs . . . . . . . . . . 21
2.5.1 Nondirected graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.5.2 Directed graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3 Microscopic structural characteristics of graphs 24


3.1 Node-specific quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.2 Quantities related to pairs of nodes . . . . . . . . . . . . . . . . . . . . . . . 26

4 Macroscopic structural characteristics of graphs 30


4.1 Mean values of single-node features . . . . . . . . . . . . . . . . . . . . . . . 30
4.2 Distributions of single node quantities . . . . . . . . . . . . . . . . . . . . . . 31
4.3 Distributions of multi-node quantities . . . . . . . . . . . . . . . . . . . . . . 35
4.4 Generalisation to node features other than degrees . . . . . . . . . . . . . . . 38
4.5 Modularity and community detection . . . . . . . . . . . . . . . . . . . . . . 39

5 Processes on networks and their relation to spectral features 41


5.1 Voter models on networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.2 Stability of fixed points in the nonlinear voter model . . . . . . . . . . . . . 45
5.3 Diffusion processes: the Laplacian matrix of a graph . . . . . . . . . . . . 46
5.4 Random walks on networks . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
5.5 PageRank . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
5.6 Spectra of adjacency matrices of undirected graphs . . . . . . . . . . . . . . 50
5.7 Spectra of Laplacian matrices . . . . . . . . . . . . . . . . . . . . . . . . . . 54

6 Random graphs 59
6.1 Random graphs as ‘null models’ . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.2 The Erdős-Rényi model . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
6.3 Generating functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
6.4 The Erdős-Rényi model in the finite connectivity regime . . . . . . . . . 66
6.5 Random graphs with a given prescribed degree distribution p(k) . . . . . . . 67
6.6 Percolation theory for random, locally tree-like graphs . . . . . . . . . . . . . 69

7 Appendices 80
7.1 Network software . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.2 The Pearson correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
7.3 Properties of symmetric matrices . . . . . . . . . . . . . . . . . . . . . . . . 81
7.4 Integral representation of the Kronecker δ-symbol . . . . . . . . . . . . . . . 84
7.5 The Landau order symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
7.6 Perron-Frobenius Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

8 Exercises 86

1. Introduction

The challenge of large data sets. In recent decades our ability to collect and store vast
amounts of quantitative data has increased dramatically. This includes socio-economic data,
such as social links between individuals or professional collaboration networks, consumer
preferences and commercial interactions, trade dependencies among corporations, credit
or insurance obligations between financial institutions. We have access to traffic data on
computer and telecommunication systems, satellite networks, the internet, electricity grids,
rail, road or air travel connections and distribution networks. We collect and process large
amounts of geological and meteorological data, data on sea levels, air and water pollution,
volcanic and seismic records, and sizes of polar and glacier ice sheets. Finally, we have seen
an explosion in recent years of biomedical data, such as experimental data on biochemical
processes and structures at cellular, subcellular and even molecular levels, the topologies
of complex composite signalling and information processing systems such as the brain or
the immune system, genomic and epigenetic data (gene expression levels, DNA sequences),
epidemiological data, and vast numbers of patient records with clinical information.
However, one tends to collect data for a reason. This reason is usually the desire to
understand the dynamical behaviour of the complex system that generated the data, to
predict with reasonable accuracy its future evolution and its response to perturbations or
interventions, or to understand how it was formed. We may want this for commercial gain,
to improve and optimise a system’s efficiency, to design effective regulatory controls, or
(in the case of medicine) to understand and cure diseases. For small and simple systems
the translation of observation into qualitative and quantitative understanding of design
and function is usually not difficult. In contrast, if we collect data on complex systems
with millions of nonlinearly interacting variables, just having a list of the parts and their
connections and observations of their collective patterns of behaviour is no longer enough to
understand how these systems work.

Networks as data reduction and visualisation tools. A first and useful step in modelling
structural data on complex many-variable systems is to visualise these systems as networks
or graphs. The idea is to represent each observed system component as a node in the
network, and each observed interaction between two components as a link between the
two corresponding nodes. Dependent upon one’s research domain, the nodes of such
networks may represent anything ranging from genes, molecules, or proteins (in biology), via
processors or servers (in computer science), to people, airports, power plants or corporations
(in social science or economics). The links could refer to biochemical reactions, wired
or wireless communication channels, friendships, financial contracts, etc. The price paid
for the complexity reduction is the loss of information. Limiting ourselves to a network
representation means that we only record which parts interact, and disregard how and when
they interact. However, the rationale is that the topology of the interactions between a

Figure 1. Neural networks: areas of the brain mapped using Golgi's staining technique
by Ramón y Cajal around 1900. The nodes are brain cells (neurons), and the links are
coated fibres via which the neurons communicate electrical signals. The staining shows only
a tiny fraction of the links; in reality a neuron is connected on average to around 10^4 other
neurons. The topologies of these networks vary according to the brain area that is being
imaged, ranging from rather regular (in areas related to preprocessing of sensory signals) to
nearly amorphous (in higher cognitive areas).

system’s components should somehow be a fingerprint of its function, and that much can be
learned from the topologies of such networks alone.
For example, DNA contains the assembly instructions for large and complicated macro-
molecules called proteins. These proteins serve as building material for all the parts of the
cell, as molecular processing factories, information readers, translators, sensors, transporters
and messengers. They interact with each other by forming (temporary) complexes, which
are (meta)stable super-molecules formed of multiple proteins that attach to each other
selectively. Many experimental groups produce tables of molecular binding partners such
as that shown in Fig. 2. Network representations of these data have been very useful to
generate intuition on the possible relevance of individual nodes (i.e. proteins), and to suggest
functional modules. There are thousands of other examples of complex systems that tend to
be modelled as networks, of which a selection is shown in the various figures in this section.

Figure 2. Left: data on human protein interactions (from the HPRD database): lists of
pairs of protein species that have been observed to interact, together with codes for each
species and information on the experiments in which the interaction was observed. Each line
reports one pair-interaction. This database contains some 70,000 reported interactions
(about half of the interactions believed to exist among human proteins). Right: the network
representation of the data on the left. Each protein is represented by a node, and each pair-
interaction (each line on the left) is represented by a link connecting the two nodes concerned.
Since interaction of two proteins is a symmetric property, this graph has nondirected links.

Figure 3. Gene regulation networks: here the nodes represent different genes in humans,
and the links (which now are directional) indicate the ability of individual genes (when
‘switched on’) to affect the extent to which other genes are activated. Here the links are
moreover ‘weighted’ (i.e. they carry a numerical value), indicated by solid arrows (positive
value, excitatory effect) or dashed arrows (negative value, inhibitory effect). The figure here
shows only a small subset of the genes – the true number of nodes is of the order of 20,000.

Figure 4. Graphical representation of the internet. Nodes represent computers and other
devices and links represent data connections, e.g., optical fibre lines.

Figure 5. Social network of collaboration partners within an organisation


(here: a subset of IBM).

Figure 6. Network representations of interactions observed between share prices. Nodes


represent individual companies (with a colour code representing a classification of sectors),
and links imply significant observed correlation in the share prices of these companies.

Figure 7. Main arteries of the oil and gas distribution network, and of the national
electricity power grid of the USA. Here one would be interested in questions related to
the network’s vulnerability against targeted attacks, and how to design networks to reduce
the damage done by such attacks.

Figure 8. Phylogenetic trees, constructed from genome similarity. Here the nodes
represent species of organisms, and links represent the most plausible evolutionary ancestor
relationships. Top: general phylogenetic tree showing the main organism families and their
genome lengths. Bottom: focus on different strains of human influenza.

Figure 9. Ecological networks, mapping species and their mutual dependencies in a


given area (predator/prey or parasitic relationships, supply of food or other resources,
reproduction, etc). The nodes represent the different species of organisms, and links indicate
a mutual dependency.

Figure 10. Networks representing economic and financial relationships between the main
players in financial markets. It is now increasingly (and painfully) being realised by
regulators that the complex interconnected nature of the international financial system
means that new mathematical approaches are needed to understand, predict, and prevent
future financial crises. Top: the type of players required in models. Bottom: example of big
players and their dependencies and relations in the international banking network.

Figure 11. Immune networks. Nodes represent different ‘T-clones’ (families of immune cells
that coordinate the adaptive immune response to specific invaders). Links indicate that the
T-clones interact with a common B-clone (the B-clones actually trigger the destruction of
the invaders). In humans there are typically around 10^8 T-clones (i.e. nodes).

Figure 12. Simplified models of magnetic systems in statistical physics. Here the nodes
(of which there are of the order of 10^24) represent sites on regular (2-dim or 3-dim) lattices,
occupied by atoms with intrinsic magnetic fields. The links indicate which pairs of magnetic
atoms are believed to be able to interact and exchange energy.

Figure 13. Mobility networks. These are very important in the modelling and prevention of
the spread of epidemics. Nodes are the main global population centres, and links represent
pairs of population centres with the most intensive traffic of people between them.

Note that network images in themselves are subjective. Different software tools will
use different protocols for deciding on the most appealing placements of nodes, and hence
generally will produce different visual representations of the same network (see e.g. Fig. 14).
It is perfectly fine to use graphical representations as intuitive aids, but we will need precise
mathematical descriptions when it comes to extracting information and testing hypotheses.

Figure 14. The two graphs shown here may look different, but are in fact topologically
identical. They differ only in the choices made for the placements of the nodes in the image
plane, i.e. in cosmetics.

This course will describe some of the mathematical and computational tools that have
been developed over the years to characterise and quantify network and graph topologies,
quantify the complexity of large networks, identify modules in networks, and to understand
the resilience of processes on networks against node or link removal. We also study how to
define and generate synthetic networks with controlled topological features.

2. Definitions and notation

2.1. Networks or graphs


Some authors use ‘networks’ to denote the physical objects in the real world, and ‘graphs’
for their mathematical description. Here we will not make this distinction.
• Definition: an N-node graph G(V, E) is defined by a set of vertices (or nodes)
V = {1, . . . , N}, and a set of edges (or links) E ⊆ {(i, j) | i, j ∈ V }
• Definition: a simple graph is a graph without self-links, i.e. ∀(i, j) ∈ E : i ≠ j
• Definition: a nondirected graph is a graph with symmetric links only, i.e. if (i, j) ∈ E
then also (j, i) ∈ E
• Definition: a directed graph is one that contains non-symmetric links, i.e. ∃(i, j) ∈ E
such that (j, i) ∉ E

[Figure 15: the drawings of the two example graphs are not reproduced here.]

Left graph:
V = {1, 2, 3, 4, 5, 6, 7, 8}
E = {(2, 1), (3, 2), (4, 2), (5, 2), (3, 3), (5, 4), (7, 4), (8, 4), (6, 5)}

Right graph:
V = {1, 2, 3, 4, 5, 6, 7, 8}
E = {(2, 1), (1, 2), (3, 2), (2, 3), (4, 2), (2, 4), (5, 2), (2, 5), (5, 4), (4, 5),
(7, 4), (4, 7), (8, 4), (4, 8), (6, 5), (5, 6)}

Figure 15. Left: example of a directed graph. It is not simple, since it has a self-link (3, 3)
(note that in principle we could leave out arrows when drawing self-links, since these are
reciprocal by definition). Right: example of a simple nondirected graph.

• Conventions: in drawing graphs we use the following conventions, see e.g. Fig 15
(i) a node i is represented by a small filled circle
(ii) in directed graphs a link (i, j) is drawn as an arrow from node j to node i
(iii) in simple nondirected graphs a link (i, j), which in such graphs is always
accompanied by a link (j, i), is drawn as a line segment between nodes i and j
(iv) a self-link (i, i) is drawn as a small circle that starts and ends at i

In situations where the context of the problem at hand makes clear that we are dealing
only with nondirected graphs, we need not mention both (i, j) and (j, i) explicitly, and
simply give one of the two.

In these lectures we will limit ourselves to non-weighted graphs, i.e. we will not consider
graphs in which links carry a numerical value to represent their sign or strength. We consider
links to be binary objects: they are either present or absent.

2.2. The adjacency matrix of a network


Next we switch to a less cumbersome representation of graphs than sets of links, which will
also make subsequent calculations easier. In an N-node graph there are N × N potential
links, the presence of each of which can be coded by a binary number, which we arrange as
the entries of a matrix:
• Definition: the adjacency matrix A ∈ {0, 1}^{N×N} of an N-node graph G(V, E) is defined
by the following entries:

    ∀(i, j) ∈ {1, . . . , N}²:   A_{ij} = 1 if (i, j) ∈ E, i.e. if there is a link j → i
                                 A_{ij} = 0 if (i, j) ∉ E, i.e. if there is no link j → i    (1)

• Consequence: a simple N-node graph has an N × N adjacency matrix with zero diagonal
elements, i.e. A_{ii} = 0 ∀i ∈ {1, . . . , N}.
• Consequence: a nondirected N-node graph has a symmetric N × N adjacency matrix,
i.e. A_{ij} = A_{ji} ∀(i, j) ∈ {1, . . . , N}².
• Consequence: a directed N-node graph has a nonsymmetric N × N adjacency matrix,
i.e. ∃(i, j) ∈ {1, . . . , N}² such that A_{ij} ≠ A_{ji}.
There is a one-to-one correspondence between the N² binary entries of A and the
specification of which links are present in an N-node graph, so each N-node graph
corresponds to a unique adjacency matrix and vice versa. We could therefore equivalently
also have defined our graphs and their types (e.g. directed, nondirected, simple) on the
basis of the adjacency matrices and their properties, instead of starting from the edge and
vertex sets.

We can verify that the two network examples of Fig 15 correspond to the following two
adjacency matrices:

The directed graph (left) corresponds to

        [ 0 0 0 0 0 0 0 0 ]
        [ 1 0 0 0 0 0 0 0 ]
        [ 0 1 1 0 0 0 0 0 ]
        [ 0 1 0 0 0 0 0 0 ]
    A = [ 0 1 0 1 0 0 0 0 ]
        [ 0 0 0 0 1 0 0 0 ]
        [ 0 0 0 1 0 0 0 0 ]
        [ 0 0 0 1 0 0 0 0 ]

and the nondirected graph (right) corresponds to

        [ 0 1 0 0 0 0 0 0 ]
        [ 1 0 1 1 1 0 0 0 ]
        [ 0 1 0 0 0 0 0 0 ]
    A = [ 0 1 0 0 1 0 1 1 ]
        [ 0 1 0 1 0 1 0 0 ]
        [ 0 0 0 0 1 0 0 0 ]
        [ 0 0 0 1 0 0 0 0 ]
        [ 0 0 0 1 0 0 0 0 ]
We observe indeed that the second adjacency matrix is symmetric (as it corresponds to a
nondirected graph), in contrast to the first.
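This construction is easy to automate. Below is a minimal sketch, assuming Python with numpy; the helper name `adjacency_matrix` and the shift from the 1-based node labels of the notes to 0-based indices in code are our own conventions, not from the notes.

```python
import numpy as np

def adjacency_matrix(n, edges):
    """Build the adjacency matrix A of an n-node graph from its edge set E.
    Following Eq. (1), (i, j) in E means a link j -> i, so we set A[i][j] = 1.
    Nodes are labelled 1..n in the notes; code uses indices 0..n-1."""
    A = np.zeros((n, n), dtype=int)
    for i, j in edges:
        A[i - 1, j - 1] = 1
    return A

# Edge set of the directed graph of Fig. 15 (left)
E_directed = [(2, 1), (3, 2), (4, 2), (5, 2), (3, 3), (5, 4), (7, 4), (8, 4), (6, 5)]
A = adjacency_matrix(8, E_directed)

print(A[2, 2])                 # 1: the self-link (3, 3) places a 1 on the diagonal
print(np.array_equal(A, A.T))  # False: the directed graph has a nonsymmetric A
```

Running the same construction on the nondirected edge set of Fig. 15 (right) yields the symmetric matrix shown above.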

2.3. Paths in networks


Consider products of a graph's adjacency matrix entries of the following form, with
i, j ∈ {1, . . . , N}, and with i_ℓ ∈ {1, . . . , N} for all ℓ:

    ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} = A_{i_1 i_2} A_{i_2 i_3} A_{i_3 i_4} · · · A_{i_{k−2} i_{k−1}} A_{i_{k−1} i_k}    (k − 1 factors)    (2)

Since each individual factor is either 0 or 1 we must conclude that

    ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} = 1   if A_{i_ℓ i_{ℓ+1}} = 1 ∀ℓ ∈ {1, . . . , k − 1}    (3)
    ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} = 0   otherwise    (4)

But this implies

    ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} = 1   if the graph contains the path of connected links
                                         i_k → i_{k−1} → . . . → i_2 → i_1    (5)
    ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} = 0   if it does not    (6)

We say that the number of connected edges in a path is its length. For instance, in the first
(directed) graph of Fig 15 we have
    A_{54} A_{42} = 1 : the graph contains the path 2 → 4 → 5 of length 2
    A_{45} A_{52} = 0 : the graph does not contain the path 2 → 5 → 4
    A_{65} A_{54} A_{42} A_{21} = 1 : the graph contains the path 1 → 2 → 4 → 5 → 6 of length 4
    A_{76} A_{65} A_{54} A_{42} = 0 : the graph does not contain the path 2 → 4 → 5 → 6 → 7
and so on.
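These path products can be checked mechanically. A short sketch, assuming numpy, with the directed graph of Fig. 15 (left); note that the entry A_{54} of the notes becomes `A[4, 3]` under 0-based indexing:

```python
import numpy as np

# Adjacency matrix of the directed graph of Fig. 15: A[i][j] = 1 iff there is a link j -> i.
# Nodes 1..8 of the notes become indices 0..7 here.
edges = [(2, 1), (3, 2), (4, 2), (5, 2), (3, 3), (5, 4), (7, 4), (8, 4), (6, 5)]
A = np.zeros((8, 8), dtype=int)
for i, j in edges:
    A[i - 1, j - 1] = 1

# A_54 * A_42 = 1: the path 2 -> 4 -> 5 of length 2 exists
print(A[4, 3] * A[3, 1])   # 1

# A_45 * A_52 = 0: there is no path 2 -> 5 -> 4
print(A[3, 4] * A[4, 1])   # 0
```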

• Definition: a closed path (or cycle) is a path that starts and ends at the same node, so

    ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} = 1  with i_1 = i_k  if the graph contains the cycle
                                        i_1 → i_{k−1} → . . . → i_2 → i_1    (7)
    ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} = 0  with i_1 = i_k  if it does not    (8)

• Definition: a simple cycle in a graph is one that contains no repeated vertices or edges,
i.e. ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} = 1 with i_1 = i_k and all node labels in {i_1, . . . , i_{k−1}} nonidentical.

We can also ask whether there is any path of a specified length from a given initial node
j to a specified target node i. Now we do not care what exactly are the intermediate nodes
visited in between i and j. We are interested in the presence or absence of paths of length 2
or more, since having a path of length 1 means simply that A_{ij} = 1 (which can be read off
directly from the adjacency matrix):

    ∑_{i_1=1}^{N} · · · ∑_{i_k=1}^{N} A_{i i_1} ( ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} ) A_{i_k j} > 0  ⇔  there exists at least one path of
                                                                                       length k + 1 from node j to node i    (9)
    ∑_{i_1=1}^{N} · · · ∑_{i_k=1}^{N} A_{i i_1} ( ∏_{ℓ=1}^{k−1} A_{i_ℓ i_{ℓ+1}} ) A_{i_k j} = 0  ⇔  there exists no path of
                                                                                       length k + 1 from node j to node i    (10)
But since the summations over the indices {i_1, . . . , i_k} in the latter formulae are equivalent to
doing matrix multiplications, we can simplify these formulae. We recall the definitions
of matrix multiplication and powers of matrices, e.g.

    (AB)_{ij} = ∑_{r=1}^{N} A_{ir} B_{rj},   (A^0)_{ij} = δ_{ij},   (A^{k+1})_{ij} = ∑_{r=1}^{N} (A^k)_{ir} A_{rj}

(with the Kronecker δ-symbol, defined as δ_{ii} = 1 for all i and δ_{ij} = 0 for all index pairs
i ≠ j), and we conclude from these that the following is true:

    (A^{k+1})_{ij} > 0  ⇔  there exists at least one path of length k + 1 from node j to node i    (11)
    (A^{k+1})_{ij} = 0  ⇔  there exists no path of length k + 1 from node j to node i    (12)

This already shows the benefit of working with adjacency matrices as opposed to the sets
(V, E) of nodes and links; for large N it would become painful to trace lines in images or
match entries in sets of index pairs, but instead we can simply do matrix multiplication.
Finally, the last step will not come as a surprise. If we don't care about path lengths
but only ask about connectivity, we may write

    ∑_{k≥0} (A^{k+1})_{ij} > 0  ⇔  there exists at least one path from j to i    (13)
    ∑_{k≥0} (A^{k+1})_{ij} = 0  ⇔  there exists no path from j to i
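A minimal numpy sketch of this idea, again on the directed graph of Fig. 15 (left); 0-based indices mean that the entry (A^4)_{61} of the notes is `A4[5, 0]` in code. Since any existing path contains a simple path of length at most N − 1, truncating the sum in Eq. (13) at k + 1 = N is enough to decide reachability:

```python
import numpy as np

edges = [(2, 1), (3, 2), (4, 2), (5, 2), (3, 3), (5, 4), (7, 4), (8, 4), (6, 5)]
A = np.zeros((8, 8), dtype=int)
for i, j in edges:
    A[i - 1, j - 1] = 1   # convention of the notes: (i, j) in E means a link j -> i

# (A^k)_ij > 0  iff  there is at least one path of length k from node j to node i
A4 = np.linalg.matrix_power(A, 4)
print(A4[5, 0] > 0)   # True: 1 -> 2 -> 4 -> 5 -> 6 is a path of length 4

# Summing powers up to length N decides connectivity by a path of any length
reach = sum(np.linalg.matrix_power(A, k) for k in range(1, 9))
print(reach[6, 0] > 0)   # True: node 7 can be reached from node 1 (via 2 and 4)
```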

2.4. Graphs within graphs


• Definition: a subgraph G′(V′, E′) of a graph G(V, E) is a graph such that V′ ⊆ V and
E′ ⊆ E.

For instance, the red graph below has V′ = {1, 2, 4, 5, 6} and E′ = {(1, 2), (2, 1), (1, 5), (5, 1),
(4, 5), (5, 4), (4, 6), (6, 4)}, and is clearly a subgraph of the one at the top of page 15.

• Definition: the connected components of a graph G are the largest subgraphs of G such
that for each subgraph there exists a path between all vertices within the subgraph.

For instance, the graph in black on the left has the connected components shown in
blue and red on the right:

• Definition: we say that the graph G is a tree if it is connected and if there exist no
simple cycles.

• Definition: a clique in a graph G is a maximal subset V′ ⊆ V of vertices in the graph
such that every member of the subset has an edge connecting to every other. Here
'maximal' means that it is impossible to add any further node to V′ such that the new
node connects to all nodes in V′.

In this graph, all cliques of size 2, 3 and 4 are shown in purple, green and pink, respectively.
Note: the immune networks in Figure 11 consist strictly of connected cliques.

• Definition: a bipartite graph G is one in which the vertices can be divided into two
nonempty disjoint subsets, i.e. V = V1 ∪ V2 with |V1 |, |V2 | > 0 and V1 ∩ V2 = ∅, such
that all edges (i, j) ∈ E have either i ∈ V1 and j ∈ V2 or j ∈ V1 and i ∈ V2 .

The nodes in bipartite graphs can be divided into two qualitatively different groups, and
there are no links between indices in the same group. Examples are networks that represent
sexual relations in heterosexual groups, or graphs that represent relations between diseases
(node group 1) and clinical features (node group 2), networks mapping researchers and the
journals in which they publish, networks of resource generators and resource consumers, etc.
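Bipartiteness can be tested algorithmically by a two-colouring breadth-first search: try to assign the two groups V_1, V_2 greedily, and fail exactly when some edge ends up inside one group. A sketch in plain Python (this helper is our own illustration, with 0-based node labels):

```python
from collections import deque

def bipartition(n, edges):
    """Try to split nodes 0..n-1 into two groups so that every (nondirected)
    edge runs between the groups; return the colouring, or None if impossible."""
    adj = [[] for _ in range(n)]
    for i, j in edges:
        adj[i].append(j)
        adj[j].append(i)
    colour = [None] * n
    for start in range(n):
        if colour[start] is not None:
            continue
        colour[start] = 0
        queue = deque([start])
        while queue:
            i = queue.popleft()
            for j in adj[i]:
                if colour[j] is None:
                    colour[j] = 1 - colour[i]   # a neighbour gets the opposite colour
                    queue.append(j)
                elif colour[j] == colour[i]:    # an edge inside one group: not bipartite
                    return None
    return colour

print(bipartition(4, [(0, 1), (1, 2), (2, 3), (3, 0)]) is not None)  # True: a 4-cycle
print(bipartition(3, [(0, 1), (1, 2), (2, 0)]) is not None)          # False: a triangle
```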

2.5. Connected components in nondirected and directed graphs


Connected components are important topological features of graphs. They form the main
objects of study in percolation theory, an important field within random graph theory (see
Chapter 6 of these notes) and in applied areas of network theory.
We previously defined connected components as the largest connected subgraphs of
G. However, since we did not specify what we mean by largest connected subgraphs, the
definition was a bit loose. Therefore, we reconsider the definition with some care, and this
will also bring us naturally to the concept of strongly connected components in directed
graphs.

2.5.1. Nondirected graphs We say that two nodes i and j in a nondirected graph G = (V, E)
are connected when there exists a path in G starting in i and ending in j, or equivalently,
starting in j and ending in i. We denote the connectedness of two nodes by i ↔ j. We
assume that i ↔ i, i.e., a node is always connected to itself.

Connectedness is an equivalence relation, and therefore we can define equivalence classes


associated to ↔.
The equivalence class [i] associated with a node i ∈ V consists of all nodes j connected
to i, i.e.,
[i] = {j ∈ V : j ↔ i} . (14)
The equivalence classes [i] partition V into n_conn nonempty and mutually disjoint sets
V_α, i.e.,
    V = V_1 ∪ V_2 ∪ . . . ∪ V_{n_conn},    (15)
where each set V_α = [j] for some j ∈ V. The connected components of G can now be defined as
Gα = (Vα , Eα ), (16)
where
Eα = {(j, k) ∈ E : j ∈ Vα and k ∈ Vα } . (17)
If a graph G has only one connected component, then we say that G is connected.
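In code, the classes [i] can be grown directly by a graph search rather than via the equivalence-class formalism. A sketch in plain Python (this helper is our own, with 0-based node labels):

```python
def connected_components(n, edges):
    """Partition the vertex set {0, ..., n-1} of a nondirected graph
    into the equivalence classes of the connectedness relation <->."""
    adj = [set() for _ in range(n)]
    for i, j in edges:
        adj[i].add(j)
        adj[j].add(i)    # nondirected: store both directions
    seen, components = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], set()       # grow the class [start] by depth-first search
        while stack:
            i = stack.pop()
            if i in comp:
                continue
            comp.add(i)
            stack.extend(adj[i] - comp)
        seen |= comp
        components.append(comp)
    return components

# Two components: a path 0-1-2 and an isolated edge 3-4
print(connected_components(5, [(0, 1), (1, 2), (3, 4)]))   # [{0, 1, 2}, {3, 4}]
```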

2.5.2. Directed graphs Let G be a directed graph. We say that i → j, when there exists a
path in G starting in i and ending in j. Again, we assume that i → i.
For directed graphs, i → j does not imply that j → i, and hence → is not an equivalence
relation. Therefore, we say that two nodes i and j are strongly connected when i → j and
j → i, and we denote strong connectedness as i ↔ j.
Just as was the case for nondirected graphs, ↔ is an equivalence relation, and we
can define the strongly connected components identically to the connected components in
nondirected graphs [through Eqs. (14)–(17)].
If the number of strongly connected components in a directed graph G equals one, then
we say that the graph G is strongly connected.
Given a strongly connected component G_α = (V_α, E_α), the corresponding outcomponent
G_α^out = (V_α^out, E_α^out) is the subgraph of G consisting of the nodes

    V_α^out = {j ∈ V : ∀k ∈ V_α, k → j}    (18)

and the edges

    E_α^out = {(j, k) ∈ E : j ∈ V_α^out and k ∈ V_α^out}.    (19)

Analogously, we define the incomponent G_α^in = (V_α^in, E_α^in) as the subgraph of G consisting
of the nodes

    V_α^in = {j ∈ V : ∀k ∈ V_α, j → k}    (20)

Figure 16. Sketch of the topology of the world wide web from Reference [Broder et al.,
Computer networks 33, (2000) 309–320]. The GSCC is the largest strongly connected
component of the world wide web, in the sense that it has the largest number of nodes
N. The IN component consists of all webpages that can reach the GSCC, and the OUT
component contains all webpages that can be reached from the GSCC (about a fifth of the
web is located in the IN and OUT components).

and the edges

    E_α^in = {(j, k) ∈ E : j ∈ V_α^in and k ∈ V_α^in}.    (21)
Figure 16 illustrates the so-called bow-tie diagram of the World Wide Web, which consists
of the largest strongly connected component and the corresponding incomponents and
outcomponents.
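Strong connectedness and the outcomponent of Eq. (18) can be computed from reachability. A numpy sketch (the `reachability` helper and the toy 4-node graph are our own illustration, not from the notes):

```python
import numpy as np

def reachability(A):
    """R[i][j] = 1 iff there is a path of any length (including 0) from j to i,
    with the notes' convention A[i][j] = 1 for a link j -> i."""
    n = len(A)
    R = (np.eye(n, dtype=int) + A > 0).astype(int)
    for _ in range(n):
        R = ((R + R @ R) > 0).astype(int)  # concatenate known paths to extend them
    return R

# Toy directed graph: 1 -> 2 -> 3 -> 1 form a cycle, plus a link 3 -> 4 leading out.
# In code: nodes 0..3, and (i, j) below means a link j -> i as in the notes.
edges = [(1, 0), (2, 1), (0, 2), (3, 2)]
A = np.zeros((4, 4), dtype=int)
for i, j in edges:
    A[i, j] = 1

R = reachability(A)
scc_of_0 = {j for j in range(4) if R[j, 0] and R[0, j]}  # strong connectedness 0 <-> j
print(scc_of_0)  # {0, 1, 2}

# Outcomponent of this SCC, Eq. (18): nodes reachable from every node of the SCC
out_component = {j for j in range(4) if all(R[j, k] for k in scc_of_0)}
print(out_component)  # {0, 1, 2, 3}
```

Note that, as in Eq. (18), the outcomponent contains the strongly connected component itself.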
[Figure 17: the three example graphs of this figure are not reproduced here.]

Figure 17. Left: in- and outdegrees k_i^in(A) and k_i^out(A), i.e. the number of arrows flowing
into and out of the given node, in directed graphs; the node i drawn on the left has
k_i^in(A) = 1 and k_i^out(A) = 3. Middle and right: degrees k_i(A), triangle counters T_i(A),
and clustering coefficients C_i(A) in nondirected graphs; the two drawn examples have
k_i(A) = 4, T_i(A) = 1, C_i(A) = 1/6 and k_i(A) = 4, T_i(A) = 2, C_i(A) = 2/6 = 1/3,
respectively. C_i(A) gives the fraction of distinct neighbour pairs of i that are themselves
connected. In the absence of link directionality, there is no distinction in nondirected graphs
between in- and outdegrees.

3. Microscopic structural characteristics of graphs

3.1. Node-specific quantities


To characterize graph topologies more intuitively, we first inspect simple quantities that
inform us about their structure in the vicinity of individual nodes.
Node degrees. The first of these are the indegrees k_i^in(A) ∈ IN and outdegrees k_i^out(A) ∈ IN
of each node i in graph A. They count, respectively, the number of arrows flowing into and
out of node i:
• Definition: the indegree of node i in an N-node graph with adjacency matrix A is
  defined as k_i^in(A) = Σ_{j=1}^N A_ij
• Definition: the outdegree of node i in an N-node graph with adjacency matrix A is
  defined as k_i^out(A) = Σ_{j=1}^N A_ji

We denote the pair of in- and outdegrees of a node i as ~k_i(A) = (k_i^in(A), k_i^out(A)) ∈ IN².
In nondirected graphs we always find k_i^in(A) = k_i^out(A) (see exercises). Here we can
drop the superscripts and simply refer to 'the degree' of a node:
• Definition: the degree of node i in a nondirected N-node graph with adjacency matrix
  A is defined as k_i(A) = Σ_{j=1}^N A_ij.
• Definition: the degree sequence of a nondirected N-node graph with adjacency matrix
  A is defined as the vector (k_1(A), k_2(A), ..., k_N(A)) ∈ IN^N.

Neighbourhood sets. We say that j is a neighbour of i when either Aij = 1 or Aji = 1. For
nondirected graphs, the neighbours of i form a neighbourhood set,

    ∂_i(A) = {j ∈ V : A_ij = 1},   (22)

such that |∂_i(A)| = k_i(A). For directed graphs we distinguish between inneighbours and
outneighbours through the two sets

    ∂_i^in(A) = {j ∈ V : A_ij = 1},   (23)

and

    ∂_i^out(A) = {j ∈ V : A_ji = 1},   (24)

such that |∂_i^in(A)| = k_i^in(A) and |∂_i^out(A)| = k_i^out(A).
Clustering coefficients and closed path counters. There are many ways to characterise a
graph’s local structure beyond counting the neighbours of a node. For nondirected graphs,
the clustering coefficient gives the fraction of node pairs linked to i that are themselves
connected:
• Definition: the clustering coefficient C_i(A) of node i with degree ≥ 2 in a nondirected
  N-node graph with adjacency matrix A is defined as

    C_i(A) = (number of connected node pairs among neighbours of i)
             / (number of node pairs among neighbours of i)

           = Σ_{j,k=1; j,k≠i}^N (1−δ_{j,k}) A_ij A_jk A_ik / Σ_{j,k=1; j,k≠i}^N (1−δ_{j,k}) A_ij A_ik   ∈ [0,1]   (25)

  (we use here the convention that 0/0 = 0).

Note that we have used the Kronecker delta δ_{j,k}, which is defined by

    δ_{j,k} = 1 if j = k,   δ_{j,k} = 0 if j ≠ k.   (26)
We have already seen that products of entries of the adjacency matrix of a graph can
be used to identify paths. We can use this to count the numbers of closed paths of a given
length:
• Claim: the number L_ℓ(A) of closed paths of length ℓ > 0 in an N-node graph with
  adjacency matrix A (directed or nondirected) is given by

    L_ℓ(A) = Σ_{i_1=1}^N ··· Σ_{i_ℓ=1}^N [Π_{k=1}^{ℓ−1} A_{i_k i_{k+1}}] A_{i_ℓ i_1} = Σ_{i=1}^N (A^ℓ)_ii   (27)

This follows directly from our earlier identities on paths. Note that the sum of the diagonal
entries of a matrix is called its trace, Tr(B) = Σ_i B_ii, so we have L_ℓ(A) = Tr(A^ℓ).

• Definition: the number of triangles T_i(A) involving node i in a nondirected N-node
  graph with adjacency matrix A is defined as

    T_i(A) = (1/2) Σ_{j,k=1; j,k≠i}^N (1−δ_{j,k}) A_ij A_jk A_ik   ∈ IN.   (28)

  The factor 1/2 in T_i(A) corrects for overcounting: any triangle starting and ending in
  node i can be drawn with two possible orientations. For simple graphs, T_i(A) can be
  written as T_i(A) = (1/2)(A³)_ii.

Note that in simple nondirected graphs one has (see exercises)

    C_i(A) = 2T_i(A) / (k_i(A)[k_i(A) − 1])   if k_i(A) > 1,
    C_i(A) = 0                                if k_i(A) = 0, 1.   (29)

Show that upon replacing k_i(A) by k_i(A) − A_ii the above formula extends to nondirected
graphs that are not necessarily simple.
In Fig. 17 we illustrate the various node characteristics with some simple examples.
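For small graphs these quantities are easy to compute directly from the adjacency matrix. A minimal Python sketch of Eqs. (25), (28) and (29) for a simple nondirected graph (the function names are ours, for illustration); the example reproduces the middle panel of Fig. 17:

```python
def degree(A, i):
    # k_i(A) = sum_j A_ij for a nondirected graph
    return sum(A[i])

def triangles(A, i):
    # T_i(A) of Eq. (28): half the number of ordered connected neighbour pairs
    N = len(A)
    return sum(A[i][j] * A[j][k] * A[i][k]
               for j in range(N) for k in range(N)
               if j != i and k != i and j != k) // 2

def clustering(A, i):
    # C_i(A) of Eq. (29), with the convention 0/0 = 0
    k = degree(A, i)
    return 2 * triangles(A, i) / (k * (k - 1)) if k > 1 else 0.0

# the middle example of Fig. 17: node i = 0 has degree 4 and sits in one triangle
A = [[0, 1, 1, 1, 1],
     [1, 0, 1, 0, 0],
     [1, 1, 0, 0, 0],
     [1, 0, 0, 0, 0],
     [1, 0, 0, 0, 0]]
print(degree(A, 0), triangles(A, 0), clustering(A, 0))   # → 4 1 0.16666666666666666
```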

3.2. Quantities related to pairs of nodes


The distance between two nodes. We define the distance dij (A) between nodes i and j in a
nondirected graph with adjacency matrix A as the length of the shortest path from node i
to node j. It can be expressed in many ways, e.g. upon using our earlier formulae involving
paths:
• Definition: the distance d_ij(A) between nodes i and j is defined as follows:
    if there is no path from j to i:   d_ij(A) = ∞
    if there is a path from j to i:    d_ij(A) = smallest integer ℓ ≥ 0 such that (A^ℓ)_ij > 0
Note 1: this distance is not the same as the distance between nodes in an image of the graph,
it is only based on how many links need to be crossed when walking from i to j.
Note 2: a path along which the shortest distance between two nodes is realised (there could
be more than one) is also called a geodesic.

An alternative way to obtain the distances between nodes in the graphs, without
checking one by one all possible routes from i to j, is provided by the following identity.
Here the inverse C −1 of an N × N matrix C (if it exists) is the unique matrix with the
property CC −1 = C −1 C = 1I, in which 1I denotes the N ×N identity matrix.
• Claim:

    d_ij = lim_{γ↓0} log[(1I − γA)^{-1}]_ij / log γ   (30)
Proof:
We first note that (1I − γA)^{-1} = Σ_{ℓ≥0} γ^ℓ A^ℓ (see exercises). This series is always
convergent for sufficiently small γ. It then follows that

    [(1I − γA)^{-1}]_ij = Σ_{ℓ≥0} γ^ℓ (A^ℓ)_ij = Σ_{ℓ≥d_ij} γ^ℓ (A^ℓ)_ij = γ^{d_ij} (A^{d_ij})_ij + O(γ^{d_ij+1})
                        = γ^{d_ij} [(A^{d_ij})_ij + O(γ)].

Hence,

    lim_{γ↓0} log[(1I − γA)^{-1}]_ij / log γ
        = lim_{γ↓0} [ d_ij log γ + log((A^{d_ij})_ij + O(γ)) ] / log γ
        = d_ij + lim_{γ↓0} log((A^{d_ij})_ij + O(γ)) / log γ
        = d_ij − lim_{x→∞} (1/x) log((A^{d_ij})_ij + O(e^{−x})) = d_ij.   □

The matrix inversion is usually impossible analytically, so is done numerically (see exercises).
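Alternatively, a breadth-first search from each node yields the same distances, since it finds for every j the smallest ℓ with (A^ℓ)_ij > 0 without any matrix inversion. A minimal Python sketch (the function name is ours, for illustration):

```python
import math

def distance_matrix(A):
    """All distances d_ij(A) in a nondirected graph via breadth-first search;
    unreachable node pairs get distance infinity."""
    N = len(A)
    d = [[math.inf] * N for _ in range(N)]
    for i in range(N):
        d[i][i] = 0
        frontier = [i]
        step = 0
        while frontier:
            step += 1
            nxt = []
            for v in frontier:
                for w in range(N):
                    if A[v][w] and d[i][w] == math.inf:
                        d[i][w] = step       # first time reached: shortest distance
                        nxt.append(w)
            frontier = nxt
    return d

# a path graph 0 - 1 - 2
A = [[0, 1, 0],
     [1, 0, 1],
     [0, 1, 0]]
print(distance_matrix(A))   # → [[0, 1, 2], [1, 0, 1], [2, 1, 0]]
```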

Node centrality. To quantify how important an individual node i may be to sustain traffic or
flow of information over a graph, two measures of ‘centrality’ have been defined for undirected
simple graphs: the closeness centrality and the betweenness centrality.
• Definition: the average distance d_i(A) to node i is d_i(A) = N^{-1} Σ_{j=1}^N d_ij(A).
• Definition: the closeness centrality x_i(A) of node i is defined as x_i(A) = 1/d_i(A).‡
• Definition: the (unnormalised) betweenness centrality y_i(A) of node i is

    y_i(A) = Σ_{j=1}^N Σ_{ℓ=1}^N n_i(j,ℓ;A) / g(j,ℓ;A),   (31)

  where n_i(j,ℓ;A) is the number of shortest paths in the graph from ℓ to j that pass
  through i and where g(j,ℓ;A) denotes the total number of shortest paths from ℓ to j.
  We adopt the convention that n_i(i,i;A) = g(i,i;A) = 1, that n_i(j,j;A) = 0 if j ≠ i, and
  that the ratio equals zero when both n_i(j,ℓ;A) and g(j,ℓ;A) are zero.
• Definition: Let A be the adjacency matrix of a nondirected, connected graph. The
eigenvector centrality vi (A) ≥ 0 is the i-th entry of the eigenvector v associated with
the so-called Perron root µmax (A) of A (the Perron root is the largest eigenvalue of A),
such that

    v_i(A) = (1/μ_max(A)) Σ_{j=1}^N A_ij v_j(A).   (32)

‡ This definition is helpful and makes sense (and is therefore used) only for connected graphs, since otherwise
di (A) = ∞ (due to the appearance of node pairs that give dij (A) = ∞).

According to the Perron-Frobenius theorem the entries of the corresponding eigenvector


have all the same sign (see Appendix 7.6 for a precise statement of the Perron-Frobenius
theorem). Also, note that the eigenvector centrality is defined only up to a positive
proportionality constant, which does not affect the relative ranking of nodes. By
convention, we set the minimal value of vi (A) equal to one.
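Numerically, the eigenvector centrality of Eq. (32) is typically obtained by power iteration. The sketch below (function name ours) iterates with A + 1I rather than A — a standard trick, not part of the definition, that leaves the eigenvectors unchanged but makes the dominant eigenvalue unique, so the iteration also converges for bipartite graphs:

```python
def eigenvector_centrality(A, n_iter=200):
    """Power iteration for Eq. (32) on a connected nondirected graph,
    normalised so that the minimal entry equals one."""
    N = len(A)
    v = [1.0] * N
    for _ in range(n_iter):
        # multiply by A + identity: same eigenvectors as A, unique top eigenvalue
        w = [v[i] + sum(A[i][j] * v[j] for j in range(N)) for i in range(N)]
        norm = max(w)
        v = [x / norm for x in w]
    m = min(v)
    return [x / m for x in v]

# a star graph: the hub (node 0) gets centrality sqrt(3), the three leaves get 1
A = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
print(eigenvector_centrality(A))
```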
Nodes with a high closeness centrality have small typical distances to the other nodes, and
are hence relatively close to any area of the graph. Nodes with a high betweenness centrality
are important relay stations that reduce the shortest path lengths between node pairs in the
graph. They need not be on average close to the other nodes, but tend to be the pivotal
nodes that connect otherwise separate parts of the graph.
Figure 18 illustrates these two notions of centrality for the street network of Venice.
We observe that closeness centrality is mainly determined by the geographical location of a
node (nodes with high closeness centrality are central in the picture), whereas betweenness
centrality is a more subtle measure of centrality, which, loosely speaking, captures the flow
of matter or information through a node.

Figure 18. Illustration of the closeness centrality (left) and betweenness centrality (right)
for a street network (nodes are intersections and edges are streets) of a one-square-mile
sample of Venice [adapted from reference P. Crucitti, V. Latora, and S. Porta, Phys. Rev.
E 73, 036125 (2006)]. The darker a node, the larger its centrality measure.

Note that in the definition of the betweenness centrality we count paths in both
directions. Furthermore, we set n_i(i,i;A) = 1 and n_i(j,j;A) = 0 if j ≠ i, which can
be seen as including paths of length zero into the formula. This may appear superfluous.
However, since this affects all nodes equally (the first contribution leads to an additional
factor two for each node and the second contribution adds a constant term 1), this will not

affect the relative ranking of the nodes in the graph, which is what we are really after.

Similarity between node pairs. The functional role of any node i in an N-node graph with
adjacency matrix A is defined strictly by the specification of the links that flow into or out
of it, i.e. by giving the two sets

    ∂_i^in = {k ≤ N | A_ik = 1},   ∂_i^out = {k ≤ N | A_ki = 1}   (33)

In nondirected graphs these two sets are identical for all i, so there we would simply speak
of the neighbourhood ∂_i (without superscripts). Hence, any measure of similarity of nodes i
and j will somehow quantify the differences between (∂_i^in, ∂_i^out) and (∂_j^in, ∂_j^out). Here we
show two common definitions for nondirected graphs (generalisations to directed graphs are
straightforward), with |S| denoting the number of elements in the set S:
• Definition: the cosine similarity between nodes i and j with nonzero degrees in a
  nondirected N-node graph with adjacency matrix A is defined as

    σ_ij(A) = |∂_i ∩ ∂_j| / √(|∂_i||∂_j|) = Σ_{k=1}^N A_ik A_jk / √(k_i(A) k_j(A))   (34)

  If k_j = 0 or k_i = 0, then we set σ_ij(A) = 0.
• Definition: the Pearson correlation similarity between nodes i and j with nonzero
  degrees in a nondirected N-node graph with adjacency matrix A is defined as

    τ_ij(A) = [ (1/N) Σ_{k=1}^N A_ik A_jk − ((1/N) Σ_{k=1}^N A_ik)((1/N) Σ_{k=1}^N A_jk) ]
              / { √[(1/N) Σ_{k=1}^N A_ik² − ((1/N) Σ_{k=1}^N A_ik)²] √[(1/N) Σ_{k=1}^N A_jk² − ((1/N) Σ_{k=1}^N A_jk)²] }

            = [ Σ_{k=1}^N A_ik A_jk − (1/N) k_i(A) k_j(A) ]
              / { √(k_i(A)[1 − k_i(A)/N]) √(k_j(A)[1 − k_j(A)/N]) }   (35)

These measures obey −1 ≤ σ_ij(A), τ_ij(A) ≤ 1 for all (i,j) (see exercises).
    The origin of the Pearson correlation (or Pearson coefficient) definition of similarity
between node pairs is the following. In statistics the Pearson correlation of two variables
(u,v) with joint distribution P(u,v) measures the degree of linear relationship between u
and v, and is defined as follows (see also 7.2):

    PC = (⟨uv⟩ − ⟨u⟩⟨v⟩) / √((⟨u²⟩ − ⟨u⟩²)(⟨v²⟩ − ⟨v⟩²))   (36)

One obtains formula (35) above by choosing P_ij(u,v) = (1/N) Σ_{k=1}^N δ_{u,A_ik} δ_{v,A_jk}; see
the exercises for the proof.
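The cosine similarity (34), for example, takes only a few lines of Python (the function name is ours, for illustration):

```python
def cosine_similarity(A, i, j):
    # Eq. (34): neighbourhood overlap over the geometric mean of the degrees
    N = len(A)
    ki, kj = sum(A[i]), sum(A[j])
    if ki == 0 or kj == 0:
        return 0.0                      # convention for isolated nodes
    overlap = sum(A[i][k] * A[j][k] for k in range(N))
    return overlap / (ki * kj) ** 0.5

# nodes 0 and 1 share both of their neighbours (2 and 3): maximal similarity
A = [[0, 0, 1, 1],
     [0, 0, 1, 1],
     [1, 1, 0, 0],
     [1, 1, 0, 0]]
print(cosine_similarity(A, 0, 1))   # → 1.0
```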

4. Macroscopic structural characteristics of graphs

The previous characteristics describe the topology of the graph in the neighbourhood of a
specified node. A global characterisation could be giving the full sequence of these local
numbers, as in the degree sequence k(A) = (k1 (A), . . . , kN (A)). However, as networks
get larger, it is increasingly inconvenient to draw conclusions and derive insight from large
sequences. To arrive at quantitative characteristics that are less sensitive to the number of
nodes, one has two simple options that continue to build on single-node features.

4.1. Mean values of single-node features


The first is to consider the mean values of the single-node quantities.
• Definition: the mean indegree of an N-node graph with adjacency matrix A is given by
  k̄^in(A) = N^{-1} Σ_{i=1}^N k_i^in(A)
• Definition: the mean outdegree of an N-node graph with adjacency matrix A is given
  by k̄^out(A) = N^{-1} Σ_{i=1}^N k_i^out(A)

The mean indegree and the mean outdegree in any graph are always identical (see exercises),
which reflects the simple fact that all arrows flowing out of a node will inevitably flow into
another node. So we can use in both cases the simpler notation k̄(A).

• Definition: the number of links L in a directed graph is L = Σ_{i=1}^N Σ_{j=1}^N A_ij. In a
  nondirected graph we do not count A_ij = 1 and A_ji = 1 separately, so here the
  number of links would be L = Σ_{i>j} A_ij + Σ_{i=1}^N A_ii, and in a simple, nondirected
  graph L = Σ_{j>i} A_ij.

• Definition: the density ρ(A) ∈ [0, 1] of an N -node graph with adjacency matrix A is
the number of edges of a graph divided by the maximum possible number of edges.

Hence

    directed graphs:           ρ(A) = Σ_{i=1}^N Σ_{j=1}^N A_ij / N²                               (37)

    nondirected graphs:        ρ(A) = [Σ_{j>i} A_ij + Σ_{i=1}^N A_ii] / [½N(N−1) + N]
                                    = [Σ_{j>i} A_ij + Σ_{i=1}^N A_ii] / [½N(N+1)]

    simple nondirected graphs: ρ(A) = Σ_{j>i} A_ij / [½N(N−1)]                                    (38)
Note: in these definitions we do not count the links (i,j) and (j,i) in nondirected graphs
twice, and the number of independent non-diagonal entries in a symmetric matrix is ½N(N−1). These

densities can be written in terms of the average degree of a graph as follows (see exercises):
    directed graphs:           ρ(A) = k̄^in(A)/N = k̄^out(A)/N    (39)
    simple nondirected graphs: ρ(A) = k̄(A)/(N−1)                 (40)
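As a quick sanity check of Eqs. (38) and (40), a minimal Python sketch (the function name is ours, for illustration):

```python
def density_simple_nondirected(A):
    # Eq. (38): rho = (number of links) / (N(N-1)/2)
    N = len(A)
    L = sum(A[i][j] for i in range(N) for j in range(i + 1, N))
    return L / (N * (N - 1) / 2)

A = [[0, 1, 1],
     [1, 0, 1],
     [1, 1, 0]]                       # complete graph on N = 3 nodes
print(density_simple_nondirected(A))  # → 1.0
kbar = sum(map(sum, A)) / len(A)      # average degree, here 2
print(kbar / (len(A) - 1))            # → 1.0, consistent with Eq. (40)
```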

• Definition: the average shortest path length in a nondirected graph with adjacency
matrix A:
    d̄(A) = (2/N²) Σ_{j≥i=1}^N d_ij(A)   (41)

• Definition: the diameter of a graph with adjacency matrix A is defined as d(A) =
  max_{i≠j} d_ij(A) (i.e. the distance between the pair of nodes that are furthest from one
  another in the graph).
• Definition: the average local clustering coefficient of an N-node graph with adjacency
  matrix A is defined as C̄(A) = N^{-1} Σ_{i=1}^N C_i(A).

Note: a graph is considered 'small-world' if C̄(A) is significantly higher than for a random
graph constructed on the same vertex set, and if the graph has approximately the same
mean shortest path length as its corresponding random graph.

4.2. Distributions of single node quantities


Degree statistics. For large graphs, or when comparing graphs of different sizes, we need
quantities that are intrinsically macroscopic in nature but more informative than just average
values of single-node features. The simplest of these are histograms of the observed values of
the N previously defined local features. If we divide, for each possible value, how often this
value is observed by the total number of observations (i.e. the number of nodes, we obtain
the empirical distribution of the given feature in the graph:
• Definition: the degree distribution of a nondirected N -node graph with adjacency matrix
A is defined as
    ∀k ∈ IN:   p(k|A) = (1/N) Σ_{i=1}^N δ_{k,k_i(A)}   (42)

It gives for each k the fraction of nodes i in the graph that have degree ki (A) = k.
• Definition: the joint in- and outdegree distribution of a directed N -node graph with
adjacency matrix A is defined as
    ∀(k^in, k^out) ∈ IN²:   p(k^in, k^out|A) = (1/N) Σ_{i=1}^N δ_{k^in,k_i^in(A)} δ_{k^out,k_i^out(A)}   (43)

Figure 19. Degree distributions as observed in several nondirected real-world graphs,
suggesting a tendency for these networks to have power-law distributions of the form
p(k) ∼ k −γ , with powers in the range 2 < γ < 3. The evidence for this is somewhat
weak in the first two examples, but the last four do indeed resemble lines in a log-log plot.

It gives for each value of the pair (k^in, k^out) the fraction of nodes i in the graph that have
k_i^in(A) = k^in and k_i^out(A) = k^out. Often we abbreviate δ_{k^in,k_i^in(A)} δ_{k^out,k_i^out(A)} as δ_{~k,~k_i(A)},
with ~k = (k^in, k^out).
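The empirical degree distribution (42) is simply a normalised histogram of the degree sequence; in Python (the function name is ours, for illustration):

```python
from collections import Counter

def degree_distribution(A):
    # Eq. (42): p(k|A) = fraction of nodes with degree k
    N = len(A)
    counts = Counter(sum(row) for row in A)
    return {k: c / N for k, c in sorted(counts.items())}

# a star with three leaves: one node of degree 3, three nodes of degree 1
A = [[0, 1, 1, 1],
     [1, 0, 0, 0],
     [1, 0, 0, 0],
     [1, 0, 0, 0]]
print(degree_distribution(A))   # → {1: 0.75, 3: 0.25}
```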
In many large real-world networks one observes degree distributions of a power-law form,
see e.g. Figure 19. These are also called ‘scale-free’ networks, since there is apparently no
‘typical’ scale for the degrees in such systems. Most nodes typically have small degrees, but
there is a small number of nodes (the so-called ‘hubs’) with very large degrees. Indeed, the
cumulative distribution
    P(k ≥ κ) = Σ_{k=κ}^∞ p(k) ≈ (1/N) ∫_κ^∞ dk k^{−γ} = 1/(N(γ−1) κ^{γ−1}),   (44)

where N is the normalisation constant of the distribution p(k) and where γ > 1. The
magnitude of the maximal degree k_max can be estimated by setting P(k ≥ k_max) = 1/N,
leading to

    k_max ∈ O(N^{1/(γ−1)})   (45)

and hence the maximal degree diverges as N^{1/(γ−1)} for N ≫ 1, implying the existence of
hubs of very large degree. If we compare this with a graph with Poisson degree statistics,
i.e. p(k) = e^{−c} c^k/k!, then k_max ∈ O(log N / log log N).

Other statistics. Similarly we can define the joint distribution p(k, T |A) of degrees and
triangle numbers in nondirected graphs:
    ∀k, T ∈ IN:   p(k, T|A) = (1/N) Σ_{i=1}^N δ_{k,k_i(A)} δ_{T,T_i(A)}   (46)
Now p(k, T |A) is the fraction of nodes that have degree k and that participate in T triangles.
[Figure: six numerically generated example graphs, each shown with its degree distribution
p(k); panels labelled Erdős-Rényi, Modular, Small world (top row) and Scale-free, Regular
random, Periodic lattice (bottom row).]

Figure 20. Examples of small numerically generated nondirected graphs, all with N = 100
and k(A) = 4 (their precise definitions will be given in subsequent sections of these notes).
Clearly, size and average degree do not specify topologies sufficiently – there are still too
many ways to generate graphs with the same size and the same number of links. The degree
distribution provides additional information, but one would still like to go further.

4.3. Distributions of multi-node quantities


A logical next step, after having focused on the statistics of features that characterise nodes,
is to turn to features of links. For instance, for simple nondirected graphs A we can define
• Definition: the joint distribution of degrees of connected node pairs in a nondirected
N -node graph with adjacency matrix A is
    ∀k, k' ≥ 0:   W(k,k'|A) = Σ_{i=1}^N Σ_{j=1 (j≠i)}^N δ_{k,k_i(A)} A_ij δ_{k',k_j(A)} / Σ_{i=1}^N Σ_{j=1 (j≠i)}^N A_ij   (47)

[Diagram: two connected nodes i and j, with A_ij = 1, k_i(A) = k and k_j(A) = k'.]

W(k,k'|A) gives the fraction of non-self links in the network that connect a node of degree k
to a node of degree k'. Clearly W(k,k'|A) = W(k',k|A) for all (k,k'), and W(k,k'|A) = 0
if k = 0 or k' = 0 (or both). From (47) follows also
• Definition: the degree assortativity a(A) in a nondirected graph is the Pearson
correlation between the degrees of connected nonidentical node pairs,
    a(A) = [ Σ_{k,k'>0} W(k,k'|A) k k' − (Σ_{k>0} W(k|A) k)² ]
           / [ Σ_{k>0} W(k|A) k² − (Σ_{k>0} W(k|A) k)² ]   ∈ [−1,1]   (48)

  with the marginal distribution W(k|A) = Σ_{k'>0} W(k,k'|A). We use the convention
  0/0 = 1, so that in a graph without degree fluctuations (i.e. W(k|A) = δ_{k,c} with c
  fixed), the assortativity is equal to one.
If a(A) > 0 there is a preference in the graph for linking high-degree nodes to high-degree
nodes and low-degree nodes to low-degree nodes; if a(A) < 0 the preference is for linking
high-degree nodes to low-degree ones. Upon summing the definition (47) over k' we see that
the marginal W(k|A) follows directly from the degree distribution; for simple graphs the
relation is

    W(k|A) = Σ_{k'>0} W(k,k'|A) = (1/(N k̄(A))) Σ_{i=1}^N δ_{k,k_i(A)} k_i(A) = (k/k̄(A)) p(k|A)   (49)

(for graphs with self-links we would replace k̄(A) → k̄(A) − N^{-1} Σ_{i=1}^N A_ii). The reason
why W(k|A) ≠ p(k|A) is that in W(k|A) the degree likelihood of nodes is conditioned on
these nodes coming up when picking links at random; this favours nodes with more links
over those with fewer. In those graphs where there are no correlations between the degrees of
connected nodes one would find that the joint distribution (47) is simply the product of the
respective marginals (49), W(k,k'|A) = W(k|A)W(k'|A) for all k,k' > 0. Hence, a useful
quantity to characterise correlations is
quantity to characterise correlations is
• Definition: the degree correlation ratio in a simple nondirected graph with adjacency
  matrix A is

    Π(k,k'|A) = W(k,k'|A) / [W(k|A) W(k'|A)] = (k̄²(A)/(k k')) · W(k,k'|A) / [p(k|A) p(k'|A)]   (50)

  This quantity is by definition equal to 1 for graphs without degree correlations. Any
  deviation from Π(k,k'|A) = 1 will signal the presence of degree correlations.

The degree correlations captured by Π(k,k'|A) can provide valuable new information that
is not contained in the degree distribution p(k|A). For instance, in Fig. 21 we show
two networks with nearly identical degree distributions, that are nevertheless seen to be
profoundly different at the level of degree correlations. The result of calculating the
macroscopic characteristics p(k|A) and Π(k,k'|A) for the example protein interaction data
of Fig. 2 is shown in Fig. 22.
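Both W(k,k'|A) of Eq. (47) and Π(k,k'|A) of Eq. (50) are straightforward to tabulate from the adjacency matrix. A minimal Python sketch for a simple nondirected graph (the function name is ours, for illustration):

```python
def degree_correlation_ratio(A):
    """Tabulate W(k,k'|A) of Eq. (47) and Pi(k,k'|A) of Eq. (50) for a
    simple nondirected graph, as dictionaries keyed by degree pairs (k,k')."""
    N = len(A)
    k = [sum(row) for row in A]
    L = sum(A[i][j] for i in range(N) for j in range(N) if j != i)   # ordered pairs
    W = {}
    for i in range(N):
        for j in range(N):
            if j != i and A[i][j]:
                W[(k[i], k[j])] = W.get((k[i], k[j]), 0.0) + 1.0 / L
    kbar = sum(k) / N
    p = {}
    for ki in k:                        # degree distribution p(k|A), Eq. (42)
        p[ki] = p.get(ki, 0.0) + 1.0 / N
    Pi = {(a, b): kbar ** 2 * w / (a * b * p[a] * p[b]) for (a, b), w in W.items()}
    return W, Pi

# a 4-cycle: all degrees equal 2, so W concentrates on (2,2) and Pi(2,2) = 1
A = [[0, 1, 0, 1],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [1, 0, 1, 0]]
W, Pi = degree_correlation_ratio(A)
print(W, Pi)   # → {(2, 2): 1.0} {(2, 2): 1.0}
```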

For directed networks the degree correlations are described by a function W(~k,~k'|A),
where ~k = (k^in, k^out) and ~k' = (k'^in, k'^out), since in directed graphs we must distinguish
between in-in degree correlations, out-out degree correlations, and in-out degree correlations:
• Definition: the joint distribution of in- and outdegrees of connected node pairs in a
  simple directed N-node graph with adjacency matrix A is

    ∀~k, ~k' ∈ IN²:   W(~k,~k'|A) = Σ_{i=1}^N Σ_{j=1}^N δ_{~k,~k_i(A)} A_ij δ_{~k',~k_j(A)} / Σ_{ij} A_ij
                                 = (1/(N k̄(A))) Σ_{i=1}^N Σ_{j=1}^N δ_{~k,~k_i(A)} A_ij δ_{~k',~k_j(A)}   (51)

[Diagram: two connected nodes i and j, with A_ij = 1, ~k_i(A) = ~k and ~k_j(A) = ~k'.]

with ~k_i(A) = (k_i^in(A), k_i^out(A)).



[Figure: two graphs, each shown with its degree distribution p(k) (histogram) and its degree
correlation ratio Π(k,k') (heat map).]

Figure 21. Illustration of the limitations of using only degree statistics to characterise
graphs. The two nondirected N = 5000 graphs shown here look similar and have nearly
indistinguishable degree distributions p(k) (shown as histograms). However, they differ
profoundly at the level of degree correlations, which is visible only after calculating the
functions Π(k,k') for the two graphs, shown as heat maps on the right. In the top graph,
high degree nodes tend to be connected more to other high degree nodes. In the bottom
graph there is a strong tendency for high degree nodes to connect to low degree nodes.

W(~k,~k'|A) gives the fraction of links in the network that connect a node with in- and
outdegrees ~k to a node with in- and outdegrees ~k'. Clearly W(~k,~k'|A) = 0 if ~k = (0,?) or
~k' = (?,0) (or both), but now we may find that W(~k,~k'|A) ≠ W(~k',~k|A).
    The left and right marginals of W(~k,~k'|A) need not be identical (in contrast to
nondirected graphs). For simple directed graphs we find (see exercises):

    W_1(~k|A) = Σ_{~k'∈IN²} W(~k,~k'|A) = p(~k|A) k^in / k̄(A),

[Figure: the degree distribution p(k) (histogram) and the degree correlation ratio Π(k,k')
(heat map) for the protein interaction network.]

Figure 22. The degree distribution p(k) (42) (left), and the normalised degree correlation
kernel Π(k,k') (50) (right, shown as a heatmap) for the protein interaction network of Fig.
2. Here N ≈ 9000 and k̄(A) ≈ 7.5. Significant deviations from Π(k,k') ≈ 1, i.e. deviations
from green in the heat map, imply nontrivial structural properties beyond those captured
by the degree distribution.

for the left marginal, and

    W_2(~k'|A) = Σ_{~k∈IN²} W(~k,~k'|A) = p(~k'|A) k'^out / k̄(A),

for the right marginal.

• Definition: the degree correlation ratio in a simple directed graph with adjacency matrix
  A is

    Π(~k,~k'|A) = W(~k,~k'|A) / [W_1(~k|A) W_2(~k'|A)] = (k̄²(A)/(k^in k'^out)) · W(~k,~k'|A) / [p(~k|A) p(~k'|A)]   (52)

  This quantity is by definition equal to 1 for directed graphs without degree correlations.
  Any deviation from Π(~k,~k'|A) = 1 will signal the presence of degree correlations.

4.4. Generalisation to node features other than degrees


In addition to inspecting the joint statistics of degrees of connected nodes, we can generalise
the idea to arbitrary discrete features xi ∈ X of a node i (which could be the degree of i,
but could also indicate its colour, gender, physical location, etc).
• Definition: the joint distribution of discrete features x of connected node pairs in a
  nondirected N-node graph with adjacency matrix A is

    ∀x, x' ∈ X:   W(x,x'|A) = Σ_{i≠j} δ_{x,x_i(A)} A_ij δ_{x',x_j(A)} / Σ_{i≠j} A_ij   (53)

[Diagram: two connected nodes i and j, with A_ij = 1, x_i(A) = x and x_j(A) = x'.]

W(x,x'|A) gives the fraction of non-self links in the network that connect a node with
features x to a node with features x'. Clearly W(x,x'|A) = W(x',x|A) for all (x,x'). From
(53) follows also
• Definition: the assortativity a(A) relative to the discrete feature x in a nondirected
  graph is the Pearson correlation between features of connected nonidentical node pairs,

    a(A) = [ Σ_{x,x'∈X} W(x,x'|A) x x' − (Σ_{x∈X} W(x|A) x)² ]
           / [ Σ_{x∈X} W(x|A) x² − (Σ_{x∈X} W(x|A) x)² ]   ∈ [−1,1]   (54)

  with the marginal distribution W(x|A) = Σ_{x'∈X} W(x,x'|A).
If a(A) > 0 the linked nodes tend to have positively correlated features; if a(A) < 0 they
tend to have negatively correlated features. Note that this definition (54) is therefore sensible
only for features x whose values are ordered in a meaningful way – like height or age, as
opposed to e.g. colour.
Upon summing (53) over x' we see that the marginal W(x|A) is

    W(x|A) = Σ_{x'∈X} W(x,x'|A) = (1/(N k̄(A))) Σ_{i=1}^N δ_{x,x_i(A)} k_i(A)   (55)

Using the joint distribution p(x,k|A) = N^{-1} Σ_i δ_{x,x_i(A)} δ_{k,k_i(A)} of features and degrees of
nodes we can simplify the marginal of W to

    W(x|A) = Σ_{k>0} (k/k̄(A)) p(x,k|A)   (56)

4.5. Modularity and community detection


Sometimes the prominent structure of a network is modularity, see e.g. Fig 23. In such
graphs nodes connect preferentially to other nodes that have the same module label – in fact
finding the optimal modules, i.e. the optimal assignment of a string (x1 , . . . , xN ) of module

Figure 23. Examples of modular graphs. Here each node has a feature xi that represents
membership of a specific subset of nodes (its ‘module’, here the modules are shown colour-
coded). Nodes are more frequently connected to partners within the same module, as
opposed to partners from another module.

labels to the nodes in the network, is a common problem in network applications. We can
now use the module membership label of each node as its feature in the sense above.
To quantify the extent to which a simple nondirected graph is modular, we compare
the number of ‘like-connects-to-like’ connections (or intra-modular links) in the graph A to
what we would have found if the wiring had been completely random:
    nr of intra−modular links in A :   Lintra(A) = (1/2) Σ_{i≠j=1}^N Aij δ_{xi,xj}    (57)

(where the factor 1/2 reflects the nondirected nature of the graph: we don’t want to count
the same link twice). In contrast, in a random graph A′ (which we will study more
rigorously later) with the same degree sequence k = (k1, . . . , kN) as the graph A we would
calculate the expectation value of the above quantity as

    ⟨Lintra(A′)⟩ = (1/2) Σ_{i≠j} ⟨A′ij δ_{xi,xj}⟩ = (1/2) Σ_{i≠j} ⟨A′ij⟩ δ_{xi,xj} ,

since there is assumed to be no relation between the labels x and the adjacency matrix. Now

    ⟨A′ij⟩ = P(A′ij = 1|k) · 1 + P(A′ij = 0|k) · 0 = P(A′ij = 1|k)    (58)

Here P(A′ij = 1|k) is the probability that in a random graph with degree sequence
k = (k1, . . . , kN) one will find nodes i and j connected. This must be proportional to
ki and kj, so we can estimate that ⟨A′ij⟩ ≈ ki kj /C. The value of C then follows upon
summing both sides over i and j, giving (since A and A′ have the same degrees):

    Σ_{i,j=1}^N ⟨A′ij⟩ = ( Σ_{i=1}^N ki )( Σ_{j=1}^N kj )/C ,   hence   N k̄ = (N k̄)²/C ,   so   C = N k̄    (59)

This leads to our estimate ⟨A′ij⟩ = ki kj /(N k̄), and hence


    nr of intra−modular links in A′ :   Lintra(A′) ≈ (1/2) Σ_{i≠j=1}^N (ki kj /(N k̄)) δ_{xi,xj}    (60)

We can then define (apart from an overall scaling factor) our measure of modularity in terms
of the difference between the number of intra-modular links seen in A and the number we
would expect to find by accident in a random non-modular graph A0 with the same degrees:
• Definition: the modularity of a simple, nondirected graph with adjacency matrix A is
    Q_x(A) = (1/(N k̄(A))) Σ_{i=1}^N Σ_{j=1, j≠i}^N [ Aij − ki(A)kj(A)/(N k̄(A)) ] δ_{xi,xj}    (61)

The modularity obeys −1 ≤ Q(A) ≤ 1 (see exercises).
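A minimal numerical sketch of definition (61) (numpy; the two-triangle graph and its module labels below are hypothetical example data, not from the notes):

```python
import numpy as np

def modularity(A, x):
    # Eq. (61) for a simple nondirected graph A with module labels x
    k = A.sum(axis=1)                  # degrees k_i(A)
    L2 = k.sum()                       # N * kbar(A), the total number of link ends
    same = x[:, None] == x[None, :]    # delta_{x_i, x_j}
    np.fill_diagonal(same, False)      # exclude the i = j terms
    return ((A - np.outer(k, k) / L2) * same).sum() / L2

# hypothetical example: two triangles joined by a single link -> strongly modular
A = np.zeros((6, 6))
for i, j in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1
x = np.array([0, 0, 0, 1, 1, 1])
Q = modularity(A, x)    # positive: far more intra-modular links than expected
```

Randomly shuffling the labels x drives Q towards zero, in line with the random-wiring baseline used in (60).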

5. Processes on networks and their relation to spectral features

We often study networks because they are the infrastructure of some process. Here we
inspect simple dynamical processes for variables placed on the nodes of graphs, to find out
which network aspects impact on the processes that they support. This leads us to the
eigenvalue spectrum of the matrix A, and of the so-called Laplacian matrix of the graph.

5.1. Voter models on networks


Voter model on undirected graphs.

Imagine having a simple nondirected N -node graph with adjacency matrix A. Each node i
represents an individual, and carries a continuous variable si , which represents e.g. a voting
opinion (si > 0: vote for party A; si < 0: vote for party B; si = 0: undecided). Alternatively,
we could think of the si representing the orientations of magnetic atoms (si > 0: north pole
up; si < 0: north pole down).
• Dynamical equations:
Each individual i gathers opinions (or magnetic forces) from his/her social circle ∂i ,
which is its neighbourhood on the graph: ∂i = {j ≤ N | Aij = 1}, and has his/her

opinion si driven by social pressure (the cumulative opinions) from the environment ∂i :
    (d/dt) si(t) = Σ_{j∈∂i} sj(t) − λ si(t) = Σ_{j=1}^N (A − λ1I)ij sj(t).    (62)

The parameter λ > 0 represents a decay parameter – in the absence of peer pressure,
i.e. for nodes without neighbours, so ∂i = ∅, one would find si (t) = si (0)e−λt . In vector
form, with s(t) = (s1(t), . . . , sN(t)), equation (62) reads

    (d/dt) s(t) = (A − λ1I) s(t).    (63)

• Solution of the dynamical equations:


Equation (63) is a set of N coupled, linear differential equations. We want to obtain an
expression for s(t) as a function of the initial state s(0) and the matrix A. To this aim,
we transform this coupled set of equations in a set of N independent, linear differential
equations by decomposing the matrix A into its eigenvectors. Since the matrix A is
symmetric, it has a complete set of N orthogonal eigenvectors êk , with k = 1 . . . N ,
which can be normalised such that
    ∀k :  A êk = µk êk ;    ∀k, k′ :  êk · êk′ = δ_{k,k′} .    (64)

Here {µ1, . . . , µN} are the N (not necessarily distinct) real-valued eigenvalues of A;
since they depend on A we should write µk(A), but if there is no risk of ambiguity we
will drop the argument A to reduce clutter in formulae. We can use the N eigenvectors
as our new basis in IR^N, and write for any s(t) ∈ IR^N:

    s(t) = Σ_{k=1}^N σk(t) êk .    (65)

Inserting this into (63) gives


    (d/dt) Σ_{k=1}^N σk(t) êk = (A − λ1I) Σ_{k=1}^N σk(t) êk
    ⇔  Σ_{k=1}^N (dσk(t)/dt) êk = Σ_{k=1}^N σk(t) (A − λ1I) êk
    ⇔  Σ_{k=1}^N (dσk(t)/dt) êk = Σ_{k=1}^N σk(t) (µk − λ) êk .    (66)

Taking the inner product on both sides of (66) with êℓ, using (64), gives

    (d/dt) σℓ(t) = (µℓ − λ) σℓ(t).    (67)

The solution is

    σℓ(t) = σℓ(0) e^{(µℓ−λ)t} .    (68)

The initial coefficients σℓ(0) are obtained from

    êℓ · s(0) = Σ_{k=1}^N σk(0) êℓ · êk = Σ_{k=1}^N σk(0) δ_{k,ℓ} = σℓ(0),    (69)

where we have used the decomposition (65). Equations (65), (68) and (69) imply that

    s(t) = Σ_{k=1}^N (êk · s(0)) e^{(µk−λ)t} êk ,    (70)

which provides s(t) as a function of the initial state s(0) and the eigenvectors/eigenvalues
of the matrix A.
Eigenvalues of A with µk < λ will have σk (t) → 0 and those with µk > λ will have
σk (t) → ±∞. Hence either |s(t)| → 0 or |s(t)| → ∞ as t → ∞; the dynamical variables
evolve either to zero or to infinity. Since quantities in real systems do not diverge, the model
(62) can only be considered a good description of social or magnetic interactions at finite
times t.

Which opinion do the nodes i in the graph take in the limit of asymptotic times? To
answer this question, we order the eigenvalues from large to small, as µ1 > µ2 ≥ µ3 . . . ≥ µN ,
and we assume that µ1 is a nondegenerate eigenvalue, which is the case when A represents
a connected graph. There are two possible scenarios. Either (i) µ1 < λ, in which case
    lim_{t→∞} |s(t)| = 0    (71)

and nodes develop no opinions in the large time limit, or (ii) µ1 > λ, in which case

    lim_{t→∞} |s(t)| = ∞.    (72)

In the latter case, we can express

    s(t) = e^{(µ1−λ)t} [ (ê1 · s(0)) ê1 + O(e^{(µ2−µ1)t}) ]    (73)
where the entries of the leading eigenvector (ê1 )i = vi > 0 are the eigenvector centralities,
as defined in Eq. (32). Hence,
    lim_{t→∞} sign(si(t)) = sign( Σ_{j=1}^N vj sj(0) )    (74)

for all i ∈ V , where sign(x) is the sign function returning the sign of x. In words, when
µ1 > λ, then asymptotically all nodes take the same opinion given by the sign of the sum of
the initial opinions weighted by their eigenvector centralities.
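The consensus result (74) is easy to verify numerically via the spectral solution (70). The sketch below (numpy) uses a small hypothetical connected graph and hypothetical initial opinions; λ is chosen below µ1, so opinions grow and a common sign emerges:

```python
import numpy as np

# small hypothetical connected graph: a 5-node ring plus the chord (1,3)
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (1, 3)]:
    A[i, j] = A[j, i] = 1

lam = 1.0                            # decay rate, here below mu_1 so opinions grow
mu, E = np.linalg.eigh(A)            # eigenvalues (ascending), orthonormal eigenvectors
s0 = np.array([0.3, -0.2, 0.1, -0.4, 0.25])   # hypothetical initial opinions

def s(t):
    # Eq. (70): s(t) = sum_k (e_k . s(0)) exp((mu_k - lam) t) e_k
    return E @ (np.exp((mu - lam) * t) * (E.T @ s0))

v = E[:, -1]                         # leading eigenvector: eigenvector centralities
v = v * np.sign(v.sum())             # fix the overall sign so all entries are positive
predicted = np.sign(v @ s0)          # Eq. (74): the common asymptotic opinion
late = np.sign(s(50.0))              # at large t every node carries this same sign
```

Here `late` agrees with `predicted` on every node, because the subleading modes are suppressed by the factor exp((µ2 − µ1)t).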

Voter model on directed graphs

We allow for situations where node i affects node j, but node j has no influence on node i.
In this case we obtain a voter model on a directed graph with a nonsymmetric adjacency
matrix A.

• Dynamical equations:
The dynamical equations for a voter model on a directed graph are of the form
    (d/dt) si(t) = Σ_{j∈∂i^in} sj(t) − λ si(t) = Σ_{j=1}^N (A − λ1I)ij sj(t).    (75)

In vector form equation (75) reads

    (d/dt) s(t) = (A − λ1I) s(t).    (76)

• Solution of dynamical equations:


We transform the coupled set of equations (76) into a set of N independent, linear
differential equations. To this aim, we make the assumption that the matrix A
is diagonalisable, which means that there exists an invertible matrix S such that
A = S⁻¹DS, where D is a diagonal matrix. A diagonalisable matrix A has a
biorthonormal set of N right eigenvectors r̂k and left eigenvectors l̂k such that

    ∀k :  A r̂k = µk r̂k ,   A^T l̂k = µk l̂k ;    ∀k, k′ :  l̂k′ · r̂k = δ_{k,k′} .    (77)

Here {µ1, . . . , µN} are the N complex-valued eigenvalues of A and A^T is the matrix
transpose of A. We can use the N right eigenvectors as our new basis in C^N, and write
for any s(t) ∈ C^N:

    s(t) = Σ_{k=1}^N σk(t) r̂k .    (78)

Inserting this into (76) gives


    (d/dt) Σ_{k=1}^N σk(t) r̂k = (A − λ1I) Σ_{k=1}^N σk(t) r̂k
    ⇔  Σ_{k=1}^N (dσk(t)/dt) r̂k = Σ_{k=1}^N σk(t) (A − λ1I) r̂k
    ⇔  Σ_{k=1}^N (dσk(t)/dt) r̂k = Σ_{k=1}^N σk(t) (µk − λ) r̂k .    (79)

Taking the inner product on both sides of (79) with l̂ℓ, using (77), gives

    (d/dt) σℓ(t) = (µℓ − λ) σℓ(t),    (80)

which is solved by

    σℓ(t) = σℓ(0) e^{(µℓ−λ)t} .    (81)
The initial coefficients σℓ(0) are obtained from

    l̂ℓ · s(0) = Σ_{k=1}^N σk(0) (l̂ℓ · r̂k) = Σ_{k=1}^N σk(0) δ_{k,ℓ} = σℓ(0),    (82)

where we have used the decomposition (78). Equations (78), (81) and (82) imply that

    s(t) = Σ_{k=1}^N (l̂k · s(0)) e^{(µk−λ)t} r̂k ,    (83)

which provides s(t) as a function of the initial state s(0), the right and left
eigenvectors, and the eigenvalues of the matrix A.

Eigenvalues of A with Re[µk ] < λ will have σk (t) → 0 and those with Re[µk ] > λ will have
σk (t) → ±∞. Hence, again either |s(t)| → 0 or |s(t)| → ∞ as t → ∞.
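In practice the decomposition (83) can be evaluated with a general (nonsymmetric) eigensolver; the left eigenvectors with the normalisation (77) are then the rows of the inverse of the right-eigenvector matrix. A sketch (numpy; the 3-node directed graph is a hypothetical example, assumed diagonalisable):

```python
import numpy as np

# hypothetical directed graph (assumed diagonalisable): A_ij = 1 for a link j -> i
A = np.array([[0, 1, 0],
              [0, 0, 1],
              [1, 1, 0]], dtype=float)
lam = 0.5
s0 = np.array([1.0, -1.0, 0.5])

mu, R = np.linalg.eig(A)     # complex eigenvalues; columns of R are right eigenvectors
Linv = np.linalg.inv(R)      # rows are the left eigenvectors, normalised as in (77)

def s(t):
    # Eq. (83): s(t) = sum_k (l_k . s(0)) exp((mu_k - lam) t) r_k
    sigma0 = Linv @ s0       # coefficients sigma_k(0), Eq. (82)
    return np.real(R @ (np.exp((mu - lam) * t) * sigma0))

# s(0) reproduces the initial state, and d/dt s at t = 0 equals (A - lam*I) s(0)
```

Since A is real, the complex eigenvalue contributions combine into a real vector; the `np.real` only discards numerical rounding residue.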

5.2. Stability of fixed points in the nonlinear voter model


The voter model admits divergent states. In order to overcome this unwanted feature, we
consider a set of nonlinear, coupled differential equations

    (d/dt) si = Σ_{j=1}^N Aij fij(sj(t)) − λ si(t),    (84)

where fij is a real-valued function that remains finite when |sj(t)| → ∞.


A general solution for (84) is not known. Nevertheless we can determine whether a fixed
point (or stationary state) of the dynamics is stable.
In general, equations of the type (84) admit a large number of fixed points. A fixed
point is a vector s∗ ∈ R^N that satisfies

    Σ_{j=1}^N Aij fij(s∗j) = λ s∗i ,    (85)

and is thus a stationary state of the dynamics (84).


We say that a fixed point is stable when there exists a neighbourhood set U ⊂ R^N (of
finite volume) that contains the fixed point s∗ and limt→∞ s(t) = s∗ for all initial states
s(0) ∈ U. If no such neighbourhood set U exists, then we say that the fixed point is
unstable.

We would like to determine which features of networks render fixed points unstable,
since unstable fixed points are not favourable. We analyse the dynamics (84) in the vicinity
of a fixed point s∗, viz.,

    (d/dt) si = Σ_{j=1}^N Aij fij(s∗j) − λ s∗i + Σ_{j=1}^N [ Aij f′ij(s∗j) − λ δij ] (sj(t) − s∗j)
                + (1/2) Σ_{j=1}^N Aij f″ij(s∗j) (sj(t) − s∗j)² + higher order terms    (86)

Introducing the variable yi(t) = si(t) − s∗i, which measures the deviation between the state
s(t) and the fixed point s∗, using (85), and

    (1/2) Σ_{j=1}^N Aij f″ij(s∗j) (sj(t) − s∗j)² ≤ (1/2) maxj |Aij f″ij(s∗j)| Σ_{j=1}^N (sj(t) − s∗j)² ∈ O(|s(t) − s∗|²),

we obtain for (86)

    (d/dt) yi = Σ_{j=1}^N Bij yj(t) + O(|y(t)|²),    (87)

with Bij = Aij f′ij(s∗j) − λ δij. In other words, the dynamics of (84) in the vicinity of a fixed
point is linear to leading order.
In the previous section we have seen how to solve linear equations of the type (87) if B
is a diagonalisable matrix. We can then decompose B into its right and left eigenvectors to
obtain

    y(t) = Σ_{k=1}^N (l̂k · y(0)) e^{µk t} r̂k ,    (88)

where µk, l̂k and r̂k are now the eigenvalues, left eigenvectors and right eigenvectors of the
matrix B. In the limit t → ∞ the eigenvalue µk with the largest real part Re[µk]
dominates. We obtain the following simple criterion for the stability of fixed points: a fixed
point is stable if all Re[µk] < 0. If there exists at least one eigenvalue with a positive real
part, then the fixed point is unstable.
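The criterion can be applied numerically by building the Jacobian B at a fixed point and inspecting the real parts of its eigenvalues. A sketch (numpy), using the hypothetical choice fij(s) = tanh(s), for which s∗ = 0 is always a fixed point with f′ij(0) = 1, so that B = A − λ1I:

```python
import numpy as np

def is_stable(A, fprime_at_fp, lam):
    # Jacobian of Eq. (87): B_ij = A_ij f'_ij(s*_j) - lam delta_ij
    B = A * fprime_at_fp - lam * np.eye(len(A))
    # stable iff every eigenvalue of B has negative real part
    return bool(np.max(np.linalg.eigvals(B).real) < 0)

# hypothetical choice f_ij = tanh: s* = 0 is a fixed point, f'(0) = 1, so
# s* = 0 is stable iff lam exceeds the largest eigenvalue of A
A = np.ones((4, 4)) - np.eye(4)         # complete graph on 4 nodes: mu_max(A) = 3
unstable = not is_stable(A, 1.0, 2.0)   # lam = 2 < 3: fixed point unstable
stable = is_stable(A, 1.0, 4.0)         # lam = 4 > 3: fixed point stable
```

For this symmetric example the stability boundary sits exactly at λ = µmax(A), connecting the criterion to the spectral bounds derived later in these notes.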

5.3. Diffusion processes: the Laplacian matrix of a graph


Imagine again having a simple nondirected N -node graph with adjacency matrix A. Each
node i of the graph now contains a conserved resource zi (t) ∈ IR+ (e.g. energy, water, food,
money, etc), which can diffuse (or ‘leak’) away to its neighbours, always from high to low
levels. With ‘conserved’ we mean that, in contrast to the variables in our previous dynamical
models, if the amount of the variable increases at one node it has to decrease somewhere else

in compensation. The rate of diffusion between two nodes is larger when their differences in
resource levels are larger, as would be the case with e.g. heat or water pressure.
Based on the above considerations, we define the diffusion model as
    (d/dt) zi(t) = Σ_{j∈∂i} [zj(t) − zi(t)] = Σ_{j=1}^N Aij zj(t) − ki(A) zi(t),    (89)

for all i ∈ V. This process can be written in terms of the N × N so-called Laplacian matrix
L with entries Lij, as (d/dt) zi(t) = − Σ_{j=1}^N Lij zj(t), where

    Lij = ki(A) δi,j − Aij    (90)


In vector form, with z(t) = (z1(t), . . . , zN(t)), this becomes

    (d/dt) z(t) = − L z(t)    (91)

This is again a linear equation, which can be solved similarly to earlier examples by
transformation to the (complete) basis of eigenvectors of the symmetric matrix L, and ends
up giving a solution expressed in terms of eigenvalues and eigenvectors of L, namely,

    z(t) = Σ_{k=1}^N (êk · z(0)) e^{−µk t} êk ,    (92)

where µk and êk should now be understood as the eigenvalues and eigenvectors of the
Laplacian L.
Hence, the dynamical and asymptotic features of diffusion-type processes on graphs are
apparently controlled by the eigenvalue spectrum of the Laplacian matrix, rather than that
of the adjacency matrix. Note: the entries of the Laplacian matrix of a simple nondirected
graph are no longer restricted to {0, 1}: Lij ∈ {0, −1} if i ≠ j, and Lii ∈ IN.
We could again worry about the possibility of exponentially diverging solutions, but we
will see below that all eigenvalues of L are nonnegative, so here this cannot happen. In fact
we can show from (89) that the total amount Z(t) = Σ_{i=1}^N zi(t) is conserved over time:

    (d/dt) Z(t) = Σ_{i=1}^N [ Σ_{j=1}^N Aij zj(t) − ki(A) zi(t) ] = Σ_{j=1}^N kj(A) zj(t) − Σ_{i=1}^N ki(A) zi(t) = 0.    (93)
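Both the spectral solution (92) and the conservation law (93) can be checked numerically; a sketch (numpy, with a hypothetical 4-node path graph as example data):

```python
import numpy as np

# hypothetical graph: the 4-node path 0-1-2-3
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3)]:
    A[i, j] = A[j, i] = 1

k = A.sum(axis=1)
Lap = np.diag(k) - A                  # Laplacian matrix, Eq. (90)
mu, E = np.linalg.eigh(Lap)           # all eigenvalues turn out nonnegative

z0 = np.array([4.0, 0.0, 0.0, 0.0])   # all resource initially on node 0

def z(t):
    # Eq. (92): z(t) = sum_k (e_k . z(0)) exp(-mu_k t) e_k
    return E @ (np.exp(-mu * t) * (E.T @ z0))

# Z(t) = sum_i z_i(t) stays equal to 4 for all t, as in Eq. (93),
# and z(t) relaxes towards the uniform state (1, 1, 1, 1)
```

The zero eigenvalue of L (with uniform eigenvector) carries the conserved total, while all other modes decay; this is why diffusion equilibrates to the uniform state on a connected graph.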

5.4. Random walks on networks


Random walks on nondirected graphs are discrete versions of diffusion processes. We define
pj (n) ∈ [0, 1] as the probability that the walker is at site j at time n ∈ IN. At each time
step (s)he moves to a new site i, selected randomly and with equal probabilities from the

neighbours of j. Since there are kj(A) sites to choose from, the probability to go to each
one of these is 1/kj(A). Hence the dynamical equations are

    pi(n + 1) = Σ_{j∈∂i(A)} pj(n)/kj(A) = Σ_{j=1}^N (Aij /kj(A)) pj(n).    (94)

For simplicity we have excluded the possibility of nodes with degree equal to zero.
The random walk process also obeys mass conservation, as

    Σ_{i=1}^N pi(n + 1) = Σ_{j=1}^N ( Σ_{i=1}^N Aij / kj(A) ) pj(n) = Σ_{j=1}^N pj(n),    (95)

where we have used that Σ_{i=1}^N Aij = kj(A). Since the pj(n) are probabilities, we set
Σ_{j=1}^N pj(0) = 1, and then according to (95) it holds that Σ_{i=1}^N pi(n) = 1 for all values
of n ∈ N. Analogously, let V′ ⊆ {1, 2, . . . , N} be the vertex set of a connected component
of G. It then holds that (try to show this as an exercise)

    Σ_{i∈V′} pi(n) = Σ_{i∈V′} pi(0).    (96)

With the definition Dij(A) = ki(A) δij this can be rewritten as pi(n + 1) =
Σ_{j=1}^N (AD⁻¹)ij pj(n), and hence, upon defining p(n) = (p1(n), . . . , pN(n)), we can write
the solution at any time as

    p(n) = (AD⁻¹)^n p(0)    (97)

At long times n ≫ 1, p(n) may evolve towards a stationary solution p∗ for which it
holds that

    p∗ = AD⁻¹ p∗ .    (98)

We call this a stationary solution as p(n) = p(n − 1) whenever p(n − 1) = p∗, and hence
this state does not evolve in time.

Multiplying the left-hand side of (98) by DD⁻¹ and defining x∗ = D⁻¹ p∗ we find that

    (D − A) x∗ = 0.    (99)

Since D − A is the Laplacian matrix, it holds that

    p∗ = D x∗    (100)

where x∗ is a vector in the kernel of L, i.e., Lx∗ = 0.
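For a connected nondirected graph the kernel of L consists of the constant vectors, so (100) gives the degree-proportional stationary state p∗i = ki(A)/Σj kj(A). A sketch verifying this by iterating (94) (numpy; the small graph below is a hypothetical example, chosen with an odd cycle so the walk is aperiodic and actually converges):

```python
import numpy as np

# hypothetical connected graph: 4-node path plus the chord (0,2),
# which creates an odd cycle, making the walk aperiodic
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (2, 3), (0, 2)]:
    A[i, j] = A[j, i] = 1

k = A.sum(axis=1)
W = A / k                            # W_ij = A_ij / k_j(A): transition matrix of Eq. (94)

p = np.array([1.0, 0.0, 0.0, 0.0])   # walker starts on node 0
for _ in range(2000):                # iterate p(n+1) = A D^{-1} p(n)
    p = W @ p

p_star = k / k.sum()                 # degree-proportional stationary state, via Eq. (100)
```

On a bipartite graph (e.g. a ring of even length) the iteration would oscillate between the two node classes and never converge, even though p∗ still solves (98).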

5.5. PageRank
The World Wide Web is a large repository of webpages. Search algorithms are important in
order to browse the web efficiently and to find the information relevant to you. In
order to provide you with the most useful web pages, search engines use ranking algorithms,
which rank webpages according to their importance. The PageRank algorithm, developed
by Brin and Page, was a breakthrough at the time: it does not rank webpages based on their
content, but rather on how they are connected with each other.
PageRank considers the world wide web as a graph of webpages (nodes) and hyperlinks
(directed edges). A rating pi ≥ 0 is assigned to each webpage i ∈ {1, 2, . . . , N } according to
its importance: the larger the value of pi , the more important is the webpage. In order to
determine the rating, PageRank interprets directed edges j → i as recommendations from a
node j to a node i. The rating pi satisfies the recursive relation
    pi = Σ_{j=1}^N Aij pj / kj^out(A),    (101)
which follows the credo that important webpages are referenced by important webpages.
Notice that the rating pi increases significantly when it is recommended by webpages with
a high rating pj that contain few hyperlinks (small kj^out). A non-zero solution to (101) is
called a PageRank vector.
Comparing (94) with (101), we observe that the rating pi is equal to the probability to
observe a stationary random walker on a directed graph at node i. Indeed, we can write
    pi(t + 1) = Σ_{j=1}^N Aij pj(t) / kj^out .    (102)

The PageRank vector is the stationary state p = limt→∞ p(t).


Unfortunately, since the World Wide Web has a directed structure, the formula (101)
cannot work as it stands. Due to the directed nature of the world wide web (see figure 16
for a general sketch of the network topology of the web) the random walker will get trapped
in nodes with kj^out = 0. Moreover, because of the directed nature of the web, the random
walker can get trapped in a cycle, in which case limt→∞ p(t) does not exist.
Both issues can be resolved through the introduction of a damping factor α > 0,
    pi(t + 1) = α Σ_{j=1}^N Aij pj(t)/kj^out + (1 − α)/N .    (103)
The factor α implies that the random walker can jump to any other node in the graph with
probability (1 − α)/N. In principle, we would like α to be close to one, since otherwise
all information about the webgraph gets lost. At the same time, we wish the iterative
algorithm (103) to converge fast, and for this we need α to be smaller than one. Google
chooses α ≈ 0.85, see reference [S. Brin and L. Page, Computer networks and ISDN systems
30, (1998) 107–117].
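A minimal power-iteration sketch of (103) (numpy; the 4-page web graph below is hypothetical example data, and every page is assumed to have at least one outgoing link so that kj^out > 0):

```python
import numpy as np

def pagerank(A, alpha=0.85, iters=200):
    # iterate Eq. (103); A_ij = 1 means a hyperlink j -> i,
    # and every page is assumed to have k_j^out >= 1
    N = A.shape[0]
    kout = A.sum(axis=0)                         # out-degrees (column sums)
    p = np.full(N, 1.0 / N)
    for _ in range(iters):
        p = alpha * (A @ (p / kout)) + (1.0 - alpha) / N
    return p

# hypothetical 4-page web graph: pages 1, 2 and 3 all link to page 0
A = np.array([[0, 1, 1, 1],
              [1, 0, 1, 0],
              [0, 0, 0, 0],
              [0, 0, 0, 0]], dtype=float)
p = pagerank(A)    # ratings sum to 1; the heavily referenced page 0 ranks highest
```

Each iteration contracts the error by a factor α, which is the trade-off mentioned above: α closer to one preserves more webgraph information but slows convergence.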

5.6. Spectra of adjacency matrices of undirected graphs


Properties of eigenvalue spectra of adjacency matrices. The previous pages showed why the
spectra of adjacency matrices and Laplacians are important, from the point of view of the
impact of topological features of graphs on the processes for which they are the infrastructure.
We now investigate the properties of these spectra, and get some feeling for which spectra
we might expect to find for real networks. We know from linear algebra that all eigenvalues
of symmetric matrices are real-valued, hence the above restriction to nondirected graphs.
Adjacency matrices of directed graphs will indeed normally have complex eigenvalues.

In the remainder of this subsection, let A be the adjacency matrix of a nondirected


N -node graph, and let µmin (A) and µmax (A) denote the smallest and the largest eigenvalue
in the set {µ1 (A), . . . , µN (A)}. Let u be the N -dimensional vector u = (1, 1, . . . , 1):
• Claim: µmin(A) ≤ k̄(A)
Proof: We use the variational principle for eigenvalues (see Appendix formula (213)):

    µmin(A) = min_{x∈IR^N} (x · Ax)/x² ≤ (u · Au)/u² = (1/N) Σ_{i,j=1}^N Aij = k̄(A)

• Claim: µmax(A) ≥ k̄(A)
Proof:

    µmax(A) = max_{x∈IR^N} (x · Ax)/x² ≥ (u · Au)/u² = (1/N) Σ_{i,j=1}^N Aij = k̄(A)

• Claim: µmax(A) ≤ max_{j=1...N} kj(A).
Proof:
Let µ be an eigenvalue of A and x ∈ IR^N the corresponding eigenvector. Define
x⋆ = maxi xi. If x⋆ ≤ 0 (so all components of x are non-positive) we replace x → −x,
so that we will have x⋆ > 0 (note: since x ≠ 0 this is always possible to achieve). Now
choose i to be a site with xi = x⋆, and use xj ≤ x⋆ for all j:

    µ x⋆ = Σ_{j=1}^N Aij xj
    µ = Σ_{j=1}^N Aij xj /x⋆ ≤ Σ_{j=1}^N Aij x⋆/x⋆ = ki(A) ≤ max_{j=1...N} kj(A)

Since this result holds for any eigenvalue µ, we can indeed also state the above claim.

We can combine the above three inequalities into the following corollary:
µmin (A) ≤ k̄(A) ≤ µmax (A) ≤ maxj=1...N kj (A) (104)
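The chain of inequalities (104) can be checked numerically on any nondirected graph; a sketch (numpy, with a hypothetical Erdős-Rényi-style random test graph):

```python
import numpy as np

rng = np.random.default_rng(0)

# hypothetical random nondirected test graph (Erdos-Renyi style)
N = 50
A = (rng.random((N, N)) < 0.1).astype(float)
A = np.triu(A, 1)
A = A + A.T                       # symmetric adjacency matrix, zero diagonal

mu = np.linalg.eigvalsh(A)        # real eigenvalues of the symmetric matrix
k = A.sum(axis=1)                 # degree sequence
kbar = k.mean()

# the chain of inequalities (104)
ok = mu.min() <= kbar <= mu.max() <= k.max()
```

For an irregular graph all three inequalities are typically strict; they all collapse to equalities (except the first) for the q-regular graphs defined next.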
[Figure 24 panels, left to right: human PIN, simple N = 5000, complex N = 5000; each panel shows the eigenvalue density %(µ) for µ ∈ [−5, 5].]

Figure 24. Adjacency matrix eigenvalue distributions (i.e. histograms of the numbers
µ1 (A), . . . , µN (A)) of three graphs shown and quantified in previous figures. Left:
eigenvalue distribution for the human protein interaction graph (see Figures 2 and 22).
Middle and right: eigenvalue distributions for the two graphs in Fig. 21. The middle
histogram refers to the graph with only weak degree correlations (top line in Fig. 21) and
the right histogram refers to the graph with strong degree correlations (bottom line in Fig.
21). The eigenvalue spectra of the last two graphs are seen to be significantly different, in
spite of their nearly identical degree distributions.

• Definition: a nondirected N -node graph with adjacency matrix A in which all degrees
ki (A) are identical to q, i.e. ki (A) = q for all i, is called a q-regular graph.

• Claim: the largest eigenvalue of the adjacency matrix A of a q-regular nondirected
N-node graph is µmax(A) = q.
Proof: this follows directly from (104).

• Claim: the eigenvalue spectrum {µ1(A), . . . , µN(A)} of a simple nondirected N-node
graph obeys

    (1/N) Σ_{k=1}^N µk(A) = 0    (105)

Proof:
We use the fact that for each symmetric matrix A there exists a unitary N × N matrix
U, i.e. one such that UU† = U†U = 1I, such that A = U D(µ) U†, where D(µ) is the
diagonal matrix with entries D(µ)ij = µi(A) δij. Now

    Σ_k µk^ℓ(A) = Σ_k [D(µ)^ℓ]kk = Σ_k [(U†AU)^ℓ]kk
N = 1000, k̄ = 4 N = 1000, k̄ = 2 N = 961, k̄ = 4


Erdős-Rényi periodic ring periodic 2D lattice
1 1 1

%(µ)

0 0 0
-5 -3 -1 1 3 5 -5 -3 -1 1 3 5 -5 -3 -1 1 3 5

µ µ µ

Figure 25. Adjacency matrix eigenvalue distributions (i.e. histograms of the numbers $\mu_1(A),\ldots,\mu_N(A)$) of the three further graphs, to emphasise that the shape of these distributions can vary considerably, which, via the quantities $N^{-1}\sum_k \mu_k^\ell(A)$, reflects the different statistics of closed paths in these graphs. The Erdős-Rényi random graphs will be the subject of a subsequent section of these notes.

\[
\sum_k \Big[U^\dagger A U\,\big(U^\dagger A U\big)^{\ell-1}\Big]_{kk}
= \sum_k \big[U^\dagger A\,A^{\ell-1} U\big]_{kk}
= \sum_k \big[U^\dagger A^\ell U\big]_{kk}
= \sum_k \sum_{ij} (U^\dagger)_{ki}(A^\ell)_{ij}U_{jk}
\]
\[
= \sum_{ij}(A^\ell)_{ij}\sum_k U_{jk}(U^\dagger)_{ki}
= \sum_{ij}(A^\ell)_{ij}(UU^\dagger)_{ji}
= \sum_{ij}(A^\ell)_{ij}\,\delta_{ij}
= \sum_i (A^\ell)_{ii}
\]
For $\ell=1$ this gives $\frac{1}{N}\sum_{k=1}^N \mu_k(A) = \frac{1}{N}\sum_i A_{ii} = 0$.
• Claim: the eigenvalue spectrum $\{\mu_1(A),\ldots,\mu_N(A)\}$ of a nondirected $N$-node graph obeys
\[
\frac{1}{N}\sum_{k=1}^N \mu_k^2(A) = \bar{k}(A) \tag{106}
\]
Proof:
From the previous proof we know that for any integer $\ell>0$: $\frac{1}{N}\sum_{k=1}^N \mu_k^\ell(A) = \frac{1}{N}\sum_i (A^\ell)_{ii}$. Upon choosing $\ell=2$ we find, using $A_{ij}^2 = A_{ij}$ for $A_{ij}\in\{0,1\}$:
\[
\frac{1}{N}\sum_{k=1}^N \mu_k^2(A) = \frac{1}{N}\sum_i (A^2)_{ii} = \frac{1}{N}\sum_{ij} A_{ij}A_{ji}
= \frac{1}{N}\sum_{ij} A_{ij} = \frac{1}{N}\sum_i k_i(A) = \bar{k}(A) \tag{107}
\]

Link between adjacency matrix spectra and closed path statistics. From the eigenvalue spectrum of the adjacency matrix $A$ of a nondirected graph one can also obtain the numbers $L_\ell(A)$ of closed paths of all possible lengths $\ell$ in this graph, since:
• Claim: for any integer $\ell>2$ the eigenvalue spectrum $\{\mu_1(A),\ldots,\mu_N(A)\}$ of a nondirected graph with adjacency matrix $A$ obeys
\[
\frac{1}{N}\sum_{k=1}^N \mu_k^\ell(A) = \frac{1}{N}L_\ell(A) \tag{108}
\]
where $L_\ell(A)$, defined in (27), gives the number of closed paths of length $\ell$ in the graph.
Proof:
From the proof of (105) we also know that for any integer $\ell>2$: $\frac{1}{N}\sum_k \mu_k^\ell(A) = \frac{1}{N}\sum_i (A^\ell)_{ii} = \frac{1}{N}L_\ell(A)$.

• Claim: if the eigenvalue spectrum of the adjacency matrix $A$ of a nondirected $N$-node graph is symmetric, i.e. the histogram of eigenvalues is symmetric with respect to reflection in the line $\mu=0$, and hence $N^{-1}\sum_k f(\mu_k(A)) = 0$ for any anti-symmetric function $f(x)$, then this graph has no closed paths of odd length (no triangles, no pentagons, etc).
Proof:
We use the previous result with $\ell = 2m+1$ and $m\in\mathbb{N}$, and use the fact that the function $f(x)=x^{2m+1}$ is anti-symmetric:
\[
L_{2m+1}(A) = \sum_k \mu_k^{2m+1}(A) = 0
\]

Some examples of adjacency matrix eigenvalue spectra for nondirected graphs that we have already inspected earlier are shown in Fig. 24. Further examples are shown in Fig. 25, to emphasise the large variability in spectra one should expect.
For large values of $N$, it is convenient to describe the spectrum of $A$ in terms of the distribution of eigenvalues, also called the empirical spectral distribution, defined by
\[
\varrho(\mu|A) \equiv \frac{1}{N}\sum_{j=1}^N \delta\big(\mu-\mu_j(A)\big), \tag{109}
\]
where $\delta(x)$ denotes the Dirac distribution, i.e.,
\[
\int_{-\infty}^{+\infty}\!{\rm d}x\ f(x)\,\delta(x-a) = f(a) \tag{110}
\]
for all functions $f$ that are sufficiently smooth (all derivatives exist). For simple, nondirected graphs, it holds that (this problem is left open as an exercise)
\[
\int_{-\infty}^{+\infty}\!{\rm d}\mu\ \mu^\ell\,\varrho(\mu|A) = \frac{1}{N}\sum_{j=1}^N \mu_j^\ell(A) \tag{111}
\]
and therefore
\[
\int_{-\infty}^{+\infty}\!{\rm d}\mu\ \varrho(\mu|A) = 1, \tag{112}
\]
\[
\int_{-\infty}^{+\infty}\!{\rm d}\mu\ \mu\,\varrho(\mu|A) = 0, \tag{113}
\]
and
\[
\int_{-\infty}^{+\infty}\!{\rm d}\mu\ \mu^2\,\varrho(\mu|A) = \bar{k}(A). \tag{114}
\]
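The moment identities (112)–(114) are easy to check numerically. The following sketch (using numpy, not part of the original notes; the 5-node ring is an arbitrary illustrative choice) builds a small nondirected graph and compares the moments of its eigenvalue spectrum with the predictions:

```python
import numpy as np

# A small nondirected graph: a 5-node ring (symmetric adjacency matrix, zero diagonal).
N = 5
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[(i + 1) % N, i] = 1

mu = np.linalg.eigvalsh(A)      # real eigenvalues, since A is symmetric

# Moments of the empirical spectral distribution rho(mu|A):
m0 = len(mu) / N                # integral of rho      -> 1, eq. (112)
m1 = mu.sum() / N               # first moment         -> 0, eq. (113)
m2 = (mu ** 2).sum() / N        # second moment        -> average degree, eq. (114)
k_bar = A.sum() / N             # average degree of the ring (= 2)
```

For the ring the eigenvalues are $2\cos(2\pi k/5)$, so the first moment vanishes and the second moment equals the average degree $2$, as (113) and (114) require.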

In Fig. 26 we show the eigenvalue spectra $\varrho_{\rm Lap}(\mu|A)$ of the Laplacian matrices (defined in the next subsection) of the graphs whose adjacency matrix eigenvalue spectra were shown in Fig. 25.

5.7. Spectra of Laplacian matrices


We saw that an alternative spectral characterisation of nondirected graphs, especially
relevant for graphs describing diffusive processes, is based on the eigenvalues of the so-called
Laplacian N × N matrix L = {Lij }, rather than those of the adjacency matrix A.
• Definition: the Laplacian matrix $L$ of an $N$-node graph with adjacency matrix $A$ is defined by the entries
\[
L_{ij} = k_i(A)\,\delta_{ij} - A_{ij} \tag{115}
\]
Since the Laplacian is a symmetric matrix, it must have real-valued eigenvalues.
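The definition translates directly into code. As a quick numerical illustration (not part of the original notes; the 3-node path graph is an arbitrary choice), one can build $L = D - A$ with numpy and confirm that its spectrum is real and, as proven below, nonnegative:

```python
import numpy as np

# Path graph 1-2-3: adjacency matrix A, Laplacian L_ij = k_i delta_ij - A_ij.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
k = A.sum(axis=1)                 # degree sequence (1, 2, 1)
L = np.diag(k) - A                # L = D - A

symmetric = bool(np.allclose(L, L.T))   # L inherits the symmetry of A
eigs = np.linalg.eigvalsh(L)            # real eigenvalues, ascending order
```

For this path graph the Laplacian eigenvalues are $0$, $1$ and $3$; the zero eigenvalue reflects the single connected component.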
[Figure: Laplacian matrix eigenvalue distributions $\varrho_{\rm Lap}(\mu)$ versus $\mu$ for the three graphs of Figure 25: an Erdős-Rényi random graph ($N=1000$, $\bar{k}=4$), a periodic ring ($N=1000$, $\bar{k}=2$), and a periodic 2D lattice ($N=961$, $\bar{k}=4$).]

Figure 26. Laplacian matrix eigenvalue distributions $\varrho_{\rm Lap}(\mu)$ of the three graphs of Figure 25, which indeed show nonnegative eigenvalues only. It is clear that the eigenvalue spectra of the adjacency matrix and of the Laplacian matrix sometimes will and sometimes will not be similar. In the exercises we will find out why the Laplacian spectra of the middle and right graphs have the same shape as their adjacency matrix spectra in Fig. 25.

• Claim: all eigenvalues of a Laplacian matrix $L$ of a graph are nonnegative.
Proof:
We show that for any $x\in\mathbb{R}^N$ one will find $x\cdot Lx \geq 0$:
\[
x\cdot Lx = \sum_{i,j=1}^N x_i\big(k_i(A)\delta_{ij}-A_{ij}\big)x_j
= \sum_{i=1}^N x_i^2\,k_i(A) - \sum_{i,j=1}^N A_{ij}x_ix_j
\]
\[
= \sum_{i,j=1}^N A_{ij}x_i^2 - \sum_{i,j=1}^N A_{ij}x_ix_j
= \frac{1}{2}\sum_{i,j=1}^N A_{ij}(x_i^2+x_j^2) - \sum_{i,j=1}^N A_{ij}x_ix_j
\]
\[
= \frac{1}{2}\sum_{i,j=1}^N A_{ij}(x_i^2+x_j^2-2x_ix_j)
= \frac{1}{2}\sum_{i,j=1}^N A_{ij}(x_i-x_j)^2 \ \geq\ 0.
\]
Any eigenvector $x$ of $L$ with eigenvalue $\mu<0$ would have given $x\cdot Lx = \mu x^2 < 0$, in contradiction with the above. Hence $L$ cannot have negative eigenvalues.
• Claim: the Laplacian matrix $L$ of a graph always has at least one eigenvalue $\mu=0$.
Proof:
Define $u=(1,1,\ldots,1)$, and show that it is an eigenvector with eigenvalue zero:
\[
(Lu)_i = \sum_{j=1}^N L_{ij}u_j = \sum_{j=1}^N\big(k_i(A)\delta_{ij}-A_{ij}\big)\times 1 = k_i(A)-k_i(A) = 0.
\]

• Claim: the multiplicity of the kernel of a Laplacian matrix $L$ of a graph (i.e. the dimension of the eigenspace corresponding to eigenvalue zero) equals the number of connected components in the graph. In addition, the eigenvectors that span the kernel take constant values on the connected components of the graph.
Proof:
Consider a vector $x$ with eigenvalue zero. Using the identity $x\cdot Lx = \frac{1}{2}\sum_{i,j=1}^N A_{ij}(x_i-x_j)^2$ derived in the previous proof, it follows that for such an eigenvector
\[
0 = \sum_{i,j=1}^N A_{ij}(x_i-x_j)^2
\]
Hence $\forall (i,j)$: $A_{ij}=0$ or $x_i=x_j$. For each connected component $V'\subseteq\{1,\ldots,N\}$ of our graph we have thereby found an eigenvector $x^{V'}\in\mathbb{R}^N$ with eigenvalue 0:
\[
\text{connected component } V':\qquad x_i^{V'}=1 \ \text{ if } i\in V',\qquad x_i^{V'}=0 \ \text{ if } i\notin V'.
\]
Imagine there was a further zero eigenvalue, with an eigenvector $x$ that is not a linear combination of the above. Again we would find $0=\sum_{ij}A_{ij}(x_i-x_j)^2$, which we can decompose as
\[
0 = \sum_{V'}\ \sum_{i,j\in V'} A_{ij}(x_i-x_j)^2.
\]
Hence we would again get, for any connected component $V'$: $x_i=x_j$ for all $i,j\in V'$. But that implies that $x$ is a linear combination of the eigenvectors above, which is not possible. Hence the dimension of the kernel of $L$, i.e., the number of independent eigenvectors with eigenvalue zero, is exactly the number of connected components.
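The claim is easy to illustrate numerically. A minimal sketch (numpy; the example graph, a triangle plus a disjoint edge with two connected components, is an arbitrary choice for this illustration):

```python
import numpy as np

# Disjoint union of a triangle (nodes 0-2) and an edge (nodes 3-4): two components.
A = np.zeros((5, 5))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4)]:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A      # Laplacian L = D - A

eigs = np.linalg.eigvalsh(L)
n_zero = int(np.sum(np.abs(eigs) < 1e-9))   # multiplicity of the eigenvalue 0
```

The spectrum is $\{0,3,3\}$ for the triangle and $\{0,2\}$ for the edge, so the kernel has dimension 2, matching the two connected components.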

Accordingly, if a graph has $n_{\rm conn}$ connected components that partition the vertex set $V$ into $V = V_1\cup V_2\cup\ldots\cup V_{n_{\rm conn}}$, then we can choose the following set $\{\hat{e}^\alpha\}_{\alpha\in\{1,2,\ldots,n_{\rm conn}\}}$ of $n_{\rm conn}$ orthonormal eigenvectors that span the kernel of $L$:
\[
(\hat{e}^\alpha)_i = \begin{cases} 0 & \text{if } i\notin V_\alpha,\\[1mm] 1/\sqrt{|V_\alpha|} & \text{if } i\in V_\alpha. \end{cases} \tag{116}
\]
Introducing the function $\alpha: V\to\{1,2,\ldots,n_{\rm conn}\}$, $i\mapsto\alpha(i)$, such that $i\in V_{\alpha(i)}$, we can write (116) as
\[
(\hat{e}^\alpha)_i = \frac{1}{\sqrt{|V_\alpha|}}\,\delta_{\alpha,\alpha(i)}.
\]
The above spectral features of the Laplacian matrix allow us to immediately predict the stationary state of diffusion processes and random walk processes.
For instance, from expression (92) we may now conclude that in the stationary state $z(\infty)=\lim_{t\to\infty}z(t)$:
\[
\text{for each connected component } V':\quad \forall i\in V':\ \ z_i(\infty) = \frac{1}{|V'|}\sum_{j\in V'} z_j(0). \tag{117}
\]
Indeed, if we order the eigenvalues as follows
\[
\mu_N(L)\geq\mu_{N-1}(L)\geq\ldots\geq\mu_{n_{\rm conn}+1}(L) > \mu_{n_{\rm conn}}(L)=\mu_{n_{\rm conn}-1}(L)=\ldots=\mu_1(L)=0 \tag{118}
\]
then from Eq. (92) it follows that
\[
\lim_{t\to\infty} z(t) = \sum_{\alpha=1}^{n_{\rm conn}} \big(\hat{e}^\alpha\cdot z(0)\big)\,\hat{e}^\alpha. \tag{119}
\]
Using (116), we get $\hat{e}^\alpha\cdot z(0) = \sum_{j\in V_\alpha} z_j(0)/\sqrt{|V_\alpha|}$, with $|V_\alpha|$ the number of nodes in $V_\alpha$, and therefore
\[
\lim_{t\to\infty} z_i(t) = \sum_{\alpha=1}^{n_{\rm conn}} \frac{1}{|V_\alpha|}\sum_{j\in V_\alpha} z_j(0)\,\delta_{\alpha,\alpha(i)} = \frac{1}{|V_{\alpha(i)}|}\sum_{j\in V_{\alpha(i)}} z_j(0). \tag{120}
\]
Therefore, as $t$ approaches infinity, $z_i(t)$ converges to the initial mean mass of the connected component to which node $i$ belongs.
Analogously, one can show that
\[
p_i^*(A) = \frac{k_i(A)}{\sum_{j\in V_{\alpha(i)}} k_j(A)}\,\sum_{j\in V_{\alpha(i)}} p_j(0). \tag{121}
\]
Using Eq. (100) we get
\[
p^* = \sum_{\alpha=1}^{n_{\rm conn}} c_\alpha\,D\hat{e}^\alpha, \tag{122}
\]
where $\hat{e}^\alpha$ are the vectors specified in Eq. (116). Using the fact that the degree matrix $D$ is a diagonal matrix with entries $D_{ij}(A)=k_i(A)\delta_{ij}$, we get component-wise
\[
p_i^* = \sum_{\alpha=1}^{n_{\rm conn}} \frac{c_\alpha}{\sqrt{|V_\alpha|}}\,k_i(A)\,\delta_{\alpha,\alpha(i)} = \frac{c_{\alpha(i)}}{\sqrt{|V_{\alpha(i)}|}}\,k_i(A). \tag{123}
\]
The coefficients $c_\alpha$ are specified by the mass conservation equation (96), which holds for each connected component $V'=V_\alpha$, such that
\[
\sum_{i\in V_\alpha} p_i^* = \sum_{i\in V_\alpha} p_i(0) \tag{124}
\]
for all $\alpha\in\{1,2,\ldots,n_{\rm conn}\}$. Using (123) in the left-hand side we get
\[
c_\alpha = \sqrt{|V_\alpha|}\,\frac{\sum_{i\in V_\alpha} p_i(0)}{\sum_{i\in V_\alpha} k_i} \tag{125}
\]
and using this in (123) yields (121), which is what we set out to show.
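Equation (121) says that a random walk equilibrates, within each connected component, to occupation probabilities proportional to the degrees. A small numerical sketch (numpy, not part of the notes; the 4-node test graph is an assumption of this example, chosen connected and non-bipartite so that the walk converges):

```python
import numpy as np

# Connected non-bipartite graph: triangle 0-1-2 with a pendant node 3 attached to 0.
A = np.zeros((4, 4))
for i, j in [(0, 1), (1, 2), (0, 2), (0, 3)]:
    A[i, j] = A[j, i] = 1
k = A.sum(axis=1)                 # degrees (3, 2, 2, 1)

# Random-walk transition matrix W_ij = A_ij / k_j, iterated from a localised start.
W = A / k[None, :]
p = np.array([1.0, 0.0, 0.0, 0.0])
for _ in range(2000):
    p = W @ p

p_star = k / k.sum()              # prediction (121): p*_i proportional to k_i
```

Since the walk conserves total probability and the graph has a single component, the iteration converges to $p^*_i = k_i/\sum_j k_j = (3,2,2,1)/8$.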

6. Random graphs

6.1. Random graphs as ‘null models’


The need for ‘null models’. We have seen many ways to quantify network topologies. The
values we find for these quantifiers in a network, however, need to be interpreted. We need
to know what values we would have expected to find by default or typically. If we observe
that a graph has a Poissonian degree distribution, should we be excited? If we find that
the number of triangles in an N -node graph equals N/5, is this a large or a small number?
Which features of adjacency matrix spectra are common to most networks, and which are
informative and special? We lack a yardstick against which to measure what we see.
We can define ‘typical’ values as those that we would find in a ‘null model’, which we
define as a random graph that is otherwise similar to the network at hand. But how do
we define ‘similar’? Observations in a null model will depend on which features of the real
network we imposed upon its random counterpart – the devil is in the detail. For instance,
in constructing a measure for modularity, we compared observations in a network to what
we would expect from a randomly generated graph with the same degrees as the observed
one. We could have chosen other quantities than degrees to be copied to our null model ...

• Definition: a random graph ensemble {G, p} is defined as a set G of adjacency matrices


A, together with a measure p that specifies a probability p(A) for each A in G.

• Definition: ensemble averages of observable quantitative features $f(A)$ of random graphs are defined as
\[
\langle f\rangle = \sum_{A\in\mathcal{G}} p(A)\,f(A) \tag{126}
\]

In this section we first define and study the simplest nontrivial random graph ensemble, the Erdős-Rényi model. Later we turn to more systematic ways of defining and constructing random graph ensembles to serve as null models.

6.2. The Erdős-Rényi model


Definition and basic properties. The Erdős-Rényi (ER) model is the random graph ensemble in which $\mathcal{G}$ is the set of all simple nondirected $N$-node graphs, and all links are drawn independently, according to $p(A_{ij}=1)=p^\star$ and $p(A_{ij}=0)=1-p^\star$, with $p^\star\in[0,1]$:
\[
\mathcal{G} = \big\{A\in\{0,1\}^{N\times N}\ \big|\ A_{ij}=A_{ji}\ \text{and}\ A_{ii}=0\ \ \forall i,j\leq N\big\} \tag{127}
\]
\[
p(A) = \prod_{i=1}^N\prod_{j=1}^{i-1}\Big[p^\star\,\delta_{A_{ij},1} + (1-p^\star)\,\delta_{A_{ij},0}\Big], \tag{128}
\]

We have to be careful to distinguish between averages that are defined for a single graph, such as $\bar{k}(A)$, and averages over the ensemble, written as $\langle\ldots\rangle$, which are average values of graph features calculated over randomly generated graph instances $A$.
• Claim: for graphs generated in the ER ensemble, the average value of the average degree $\bar{k}(A)=N^{-1}\sum_{i,j=1}^N A_{ij}$ equals $p^\star(N-1)$.

Proof:
N
X 1 X X 2 X 2 XX
hk̄(A)i = p(A) Ars = p(A) Ars = p(A)Ars
N r,s=1 N r<s N r<s
A∈G A∈G A∈G
N
2 XX Y 
p? δAij ,1 + (1 − p? )δAij ,0

= Ars
N r<s i<j=1
A∈G
1 1
2 X X ? ?
 Y X  ?
p δAij ,1 +(1−p? )δAij ,0

= Ars [p δArs ,1 +(1−p )δArs ,0 ]
N r<s A =0 A =0
rs i<j, (i,j)6=(r,s) ij

2 X ? 2p? X 2p? 1
= p .1 = 1= . N (N − 1) = p? (N − 1)
N r<s N r<s N 2
Note: hki is the average over the ensemble of the average degree k̄(A) of its graphs, i.e.
P
hki = A∈G p(A)k̄(A). Individual random graphs A generated according to (128) will
generally have k̄(A) 6= hki.

Here, and in what follows, we use the short-hand notation i<j for N
P P Pi−1
i=1 j=1 , as well
Q QN Qi−1
as, i<j for i=1 j=1 .
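Sampling from the ensemble (128) is straightforward, and the claim $\langle\bar{k}(A)\rangle = p^\star(N-1)$ can be checked by averaging over samples. A numerical sketch (numpy, not part of the notes; $N$, $p^\star$ and the sample count are arbitrary illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_er(N, p_star, rng):
    """Draw one simple nondirected graph from the ER ensemble (128)."""
    U = rng.random((N, N)) < p_star     # independent Bernoulli(p*) variables
    A = np.triu(U, 1).astype(int)       # keep i < j only: one draw per link
    return A + A.T                      # symmetrise; diagonal stays zero

N, p_star = 200, 0.05
samples = [sample_er(N, p_star, rng) for _ in range(20)]
mean_kbar = np.mean([A.sum() / N for A in samples])   # estimate of <k_bar(A)>
expected = p_star * (N - 1)
```

Note that individual samples scatter around the ensemble mean, as the note above warns: each graph has its own $\bar{k}(A)\neq\langle k\rangle$.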
• Claim: the Erdős-Rényi ensemble assigns equal probabilities to all graphs with the same number of links.
Proof:
Since $A_{ij}\in\{0,1\}$, the probabilities (128) can be written in the alternative form:
\[
p(A) = \prod_{j>i=1}^N (p^\star)^{A_{ij}}(1-p^\star)^{1-A_{ij}}
= (p^\star)^{\sum_{j>i=1}^N A_{ij}}\,(1-p^\star)^{\frac{1}{2}N(N-1)-\sum_{j>i=1}^N A_{ij}}
\]
Hence the dependence of $p(A)$ on $A$ can indeed be expressed fully in terms of the number $L(A)=\sum_{i<j}A_{ij}$ of links in $A$, via
\[
p(A) = (p^\star)^{L(A)}(1-p^\star)^{N(N-1)/2-L(A)} \tag{129}
\]
• Claim: the graph probabilities (128) of the ER ensemble can equivalently be written as
\[
p(A) = \prod_{i<j=1}^N\Big[\frac{\langle k\rangle}{N-1}\,\delta_{A_{ij},1} + \Big(1-\frac{\langle k\rangle}{N-1}\Big)\delta_{A_{ij},0}\Big] \tag{130}
\]
Proof: This follows directly from the above result $\langle k\rangle = p^\star(N-1)$.

In the ER ensemble we control the likelihood of graphs via just one graph observable, which
can either be k̄(A) or the number of links L(A) (one follows from the other), and all graphs
with the same value for this parameter are equally probable. In spite of this superficial
simplicity, analysing this model turns out to be less than trivial.
• Claim: for graphs generated in the ER ensemble, the average value of the degree distribution $p(k|A)=\frac{1}{N}\sum_{i=1}^N\delta_{k_i(A),k}$ is given by
\[
\langle p(k|A)\rangle = \binom{N-1}{k}(1-p^\star)^{N-1-k}(p^\star)^k \tag{131}
\]
for $k\in\{0,1,\ldots,N-1\}$. Note that this is the binomial distribution for $k$ successes in $N-1$ independent Bernoulli trials, each with success probability $p^\star$.
Proof:
Recall the definition of the degree distribution:
\[
p(k|A) = \frac{1}{N}\sum_{i=1}^N \delta_{k,k_i(A)}. \tag{132}
\]
The average degree distribution is, using the integral representation of the Kronecker $\delta$-symbol,
\[
\langle p(k|A)\rangle = \frac{1}{N}\sum_{i=1}^N \Big\langle \delta_{k,\sum_j A_{ij}}\Big\rangle
= \frac{1}{N}\sum_{i=1}^N \frac{1}{2\pi}\int_{-\pi}^{\pi}\!{\rm d}\omega\ e^{i\omega k}\,\Big\langle e^{-i\omega\sum_{\ell=1}^N A_{i\ell}}\Big\rangle
= \frac{1}{N}\sum_{i=1}^N \frac{1}{2\pi}\int_{-\pi}^{\pi}\!{\rm d}\omega\ e^{i\omega k}\,\Big\langle \prod_{\ell=1}^N e^{-i\omega A_{i\ell}}\Big\rangle \tag{136}
\]
Hence, we require
\[
\Big\langle \prod_{\ell=1}^N e^{-i\omega A_{i\ell}}\Big\rangle = \ ? \tag{137}
\]
In order to derive this quantity, we factorise the probability distribution $p(A)$ into a distribution conditioned on the elements $\{A_{i\ell}\}_{\ell=1,\ldots,N}$ and the distribution of those elements. In other words,
\[
p(A) = p\big(A\,\big|\,\{A_{i\ell}\}_{\ell=1,\ldots,N}\big)\prod_{\ell(\neq i)}\big[p^\star\delta_{A_{i\ell},1}+(1-p^\star)\delta_{A_{i\ell},0}\big]. \tag{138}
\]
Hence,
\[
\Big\langle \prod_{\ell=1}^N e^{-i\omega A_{i\ell}}\Big\rangle
= \sum_{A\setminus\{A_{i\ell}\}}\ \sum_{\{A_{i\ell}\}} p\big(A\,\big|\,\{A_{i\ell}\}\big)\prod_{\ell(\neq i)}\big[p^\star\delta_{A_{i\ell},1}+(1-p^\star)\delta_{A_{i\ell},0}\big]\prod_{\ell=1}^N e^{-i\omega A_{i\ell}}
\]
\[
= \sum_{\{A_{i\ell}\}}\ \prod_{\ell(\neq i)}\big[p^\star\delta_{A_{i\ell},1}+(1-p^\star)\delta_{A_{i\ell},0}\big]\prod_{\ell=1}^N e^{-i\omega A_{i\ell}} \tag{139}
\]
where we have used the fact that $p(A\,|\,\{A_{i\ell}\}_{\ell=1,\ldots,N})$ is a normalised distribution. Since
\[
\sum_{x,y} p(x)p(y)f(x)f(y) = \Big[\sum_x p(x)f(x)\Big]\Big[\sum_y p(y)f(y)\Big] \tag{140}
\]
it holds that (the factor $\ell=i$ contributes $e^{-i\omega A_{ii}}=1$ since $A_{ii}=0$, leaving $N-1$ identical factors)
\[
\Big\langle \prod_{\ell=1}^N e^{-i\omega A_{i\ell}}\Big\rangle
= \Big[\sum_{A_{i\ell}=0}^1\big(p^\star\delta_{A_{i\ell},1}+(1-p^\star)\delta_{A_{i\ell},0}\big)e^{-i\omega A_{i\ell}}\Big]^{N-1}
= \big(1-p^\star+p^\star e^{-i\omega}\big)^{N-1}
\]
\[
= \sum_{m=0}^{N-1}\binom{N-1}{m}(1-p^\star)^{N-1-m}(p^\star)^m\,e^{-i\omega m}. \tag{141}
\]
Substituting the expression (141) into equation (136), we obtain
\[
\langle p(k|A)\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi}\!{\rm d}\omega\ e^{i\omega k}\sum_{m=0}^{N-1}\binom{N-1}{m}(1-p^\star)^{N-1-m}(p^\star)^m e^{-i\omega m}
\]
\[
= \sum_{m=0}^{N-1}\binom{N-1}{m}(1-p^\star)^{N-1-m}(p^\star)^m\,\frac{1}{2\pi}\int_{-\pi}^{\pi}\!{\rm d}\omega\ e^{i\omega(k-m)}
= \sum_{m=0}^{N-1}\binom{N-1}{m}(1-p^\star)^{N-1-m}(p^\star)^m\,\delta_{k,m}
\]
\[
= \binom{N-1}{k}(1-p^\star)^{N-1-k}(p^\star)^k, \tag{145}
\]
which is the equation (131) we aimed to derive.

There is also a probabilistic approach to derive the expression for $\langle p(k|A)\rangle$ in equation (131). In this approach, we recognise that
\[
{\rm Prob}\big(\{k_1(A)=k\}\big) = \big\langle\delta_{k_1(A),k}\big\rangle \tag{146}
\]
is the probability that node 1 has degree equal to $k$. Therefore, $\langle p(k|A)\rangle$ is the probability that an arbitrary node in the graph has degree $k$, which equals the probability that node 1 has degree equal to $k$:
\[
\langle p(k|A)\rangle = \frac{1}{N}\sum_{i=1}^N\big\langle\delta_{k_i(A),k}\big\rangle = \big\langle\delta_{k_1,k}\big\rangle = {\rm Prob}\big(\{k_1=k\}\big). \tag{147}
\]
Since the degree $k_1(A)=\sum_{\ell=1}^N A_{1\ell} = \big|\{\ell\in V\setminus\{1\}:\ A_{1\ell}=1\}\big|$, ${\rm Prob}(\{k_1(A)=k\})$ can be interpreted as the probability to obtain $k$ successes out of $N-1$ independent trials, each with success probability $p^\star$. As a consequence, ${\rm Prob}(\{k_1(A)=k\})$ is the binomial distribution for $k$ successes in $N-1$ trials, and is given by (131).
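The binomial prediction (131) can be confirmed by direct simulation. A sketch (numpy plus the standard library's `math.comb`, not part of the notes; $N$, $p^\star$ and the number of sampled graphs are illustrative choices):

```python
import numpy as np
from math import comb

rng = np.random.default_rng(1)
N, p_star = 100, 0.04

# Empirical degree distribution averaged over many ER graphs.
counts = np.zeros(N)
n_graphs = 200
for _ in range(n_graphs):
    U = rng.random((N, N)) < p_star
    A = np.triu(U, 1).astype(int)      # one independent draw per link (i < j)
    A = A + A.T
    counts += np.bincount(A.sum(axis=1), minlength=N)
p_emp = counts / (n_graphs * N)

# Binomial prediction (131): <p(k|A)> = C(N-1,k) (p*)^k (1-p*)^(N-1-k)
p_theory = np.array([comb(N - 1, k) * p_star**k * (1 - p_star)**(N - 1 - k)
                     for k in range(N)])
max_err = float(np.max(np.abs(p_emp - p_theory)))
```

With $200\times 100$ sampled node degrees the empirical histogram matches the binomial curve to within statistical fluctuations.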
• Claim: the average clustering coefficient $C_i=\langle C_i(A)\rangle$ of any node $i$ in graphs generated from the ER ensemble (128), with the definition of $C_i(A)$ given in (25), is
\[
\langle C_i(A)\rangle = p^\star\big[1-(1-p^\star)^{N-1}-p^\star(N-1)(1-p^\star)^{N-2}\big] \tag{148}
\]
Proof:
We use definition (25), and have to be careful to distinguish between $k_i(A)<2$ and $k_i(A)\geq 2$. To handle this implicit conditioning on the degree value we use the integral representation of the Kronecker $\delta$-symbol (see 7.4), $\delta_{nm}=(2\pi)^{-1}\int_{-\pi}^{\pi}{\rm d}\omega\ e^{i(n-m)\omega}$:
\[
\langle C_i(A)\rangle = \sum_{k\geq 0}\Big\langle\delta_{k,k_i(A)}\,C_i(A)\Big\rangle
= \sum_{k\geq 2}\Big\langle\delta_{k,k_i(A)}\,\frac{\sum_{r\neq s}A_{ir}A_{rs}A_{si}}{k_i(A)(k_i(A)-1)}\Big\rangle
= \sum_{k\geq 2}\frac{1}{k(k-1)}\sum_{r\neq s}\Big\langle\delta_{k,k_i(A)}\,A_{ir}A_{rs}A_{si}\Big\rangle
\]
\[
= \sum_{k\geq 2}\frac{2}{k(k-1)}\sum_{r<s}\Big\langle\int_{-\pi}^{\pi}\frac{{\rm d}\omega}{2\pi}\,e^{i\omega(k-\sum_j A_{ij})}A_{ir}A_{rs}A_{si}\Big\rangle
\]
\[
= \sum_{k\geq 2}\frac{2}{k(k-1)}\sum_{r<s,\ r,s\neq i}\int_{-\pi}^{\pi}\frac{{\rm d}\omega}{2\pi}\,e^{i\omega k}\Big\langle A_{rs}\big(A_{ir}e^{-i\omega A_{ir}}\big)\big(A_{is}e^{-i\omega A_{is}}\big)\prod_{j\notin\{i,r,s\}}e^{-i\omega A_{ij}}\Big\rangle
\]
So far we have only substituted definitions, and rearranged factors such that entries of the adjacency matrix are grouped together. Now we do the actual ensemble averages. The measure $p(A)$ in the ER ensemble (128) factorises over the links, reflecting the fact that they are indeed generated independently, which means that the average above reduces to a product of ensemble averages:
\[
\Big\langle A_{rs}\big(A_{ir}e^{-i\omega A_{ir}}\big)\big(A_{is}e^{-i\omega A_{is}}\big)\prod_{j\notin\{i,r,s\}}e^{-i\omega A_{ij}}\Big\rangle
= \langle A_{rs}\rangle\,\langle A_{ir}e^{-i\omega A_{ir}}\rangle\,\langle A_{is}e^{-i\omega A_{is}}\rangle\prod_{j\notin\{i,r,s\}}\langle e^{-i\omega A_{ij}}\rangle
\]
\[
= p^\star\,(p^\star e^{-i\omega})^2\,(p^\star e^{-i\omega}+1-p^\star)^{N-3}
= (p^\star)^3 e^{-2i\omega}(p^\star e^{-i\omega}+1-p^\star)^{N-3}
\]
Next we use Newton's binomial formula to work out the quantity $(p^\star e^{-i\omega}+1-p^\star)^{N-3}$:
\[
(p^\star)^3 e^{-2i\omega}(p^\star e^{-i\omega}+1-p^\star)^{N-3}
= (p^\star)^3 e^{-2i\omega}\sum_{\ell=0}^{N-3}\binom{N-3}{\ell}(p^\star)^\ell e^{-\ell i\omega}(1-p^\star)^{N-3-\ell}
= \sum_{\ell=0}^{N-3}\binom{N-3}{\ell}(p^\star)^{\ell+3}e^{-(\ell+2)i\omega}(1-p^\star)^{N-3-\ell}
\]
We insert this into our expression for $\langle C_i(A)\rangle$, use $\sum_{r<s,\ r,s\neq i}1 = \frac{1}{2}(N-1)(N-2)$, and do some simple cleaning up:
\[
\langle C_i(A)\rangle = \sum_{k=2}^{N-1}\frac{(N-1)(N-2)}{k(k-1)}\int_{-\pi}^{\pi}\frac{{\rm d}\omega}{2\pi}\,e^{i\omega k}\sum_{\ell=0}^{N-3}\binom{N-3}{\ell}(p^\star)^{\ell+3}e^{-(\ell+2)i\omega}(1-p^\star)^{N-3-\ell}
\]
\[
= \sum_{k=2}^{N-1}\frac{(N-1)(N-2)}{k(k-1)}\sum_{\ell=0}^{N-3}\binom{N-3}{\ell}(p^\star)^{\ell+3}(1-p^\star)^{N-3-\ell}\int_{-\pi}^{\pi}\frac{{\rm d}\omega}{2\pi}\,e^{i\omega(k-\ell-2)}
\]
At this stage we use the integral representation of the Kronecker delta-symbol to get rid of the $\omega$-integral:
\[
\langle C_i(A)\rangle = \sum_{k=2}^{N-1}\frac{(N-1)(N-2)}{k(k-1)}\sum_{\ell=0}^{N-3}\binom{N-3}{\ell}(p^\star)^{\ell+3}(1-p^\star)^{N-3-\ell}\,\delta_{k,\ell+2}
= \sum_{k=2}^{N-1}\frac{(N-1)(N-2)}{k(k-1)}\binom{N-3}{k-2}(p^\star)^{k+1}(1-p^\star)^{N-1-k}
\]
We now write out the combinatorial factor explicitly, and clean up the various quantities where possible:
\[
\langle C_i(A)\rangle = \sum_{k=2}^{N-1}\frac{(N-1)(N-2)}{k(k-1)}\,\frac{(N-3)!}{(k-2)!(N-1-k)!}\,(p^\star)^{k+1}(1-p^\star)^{N-1-k}
= \sum_{k=2}^{N-1}\frac{(N-1)!}{k!(N-1-k)!}(p^\star)^{k+1}(1-p^\star)^{N-1-k}
\]
\[
= p^\star\sum_{k=2}^{N-1}\binom{N-1}{k}(p^\star)^k(1-p^\star)^{N-1-k}
= p^\star\Big[\sum_{k=0}^{N-1}\binom{N-1}{k}(p^\star)^k(1-p^\star)^{N-1-k} - (1-p^\star)^{N-1} - (N-1)p^\star(1-p^\star)^{N-2}\Big]
\]
We then recognise that Newton's binomial formula can be used to do the sum over $k$, and proceed to our final result:
\[
\langle C_i(A)\rangle = p^\star\big\{1-(1-p^\star)^{N-1}-(N-1)p^\star(1-p^\star)^{N-2}\big\}. \tag{149}
\]
The above proof is a useful exercise in the use of various bookkeeping tools, such as summation formulae from Calculus, Newton's binomial formula, and the integral representation of the Kronecker delta-symbol. These tools will continue to serve us.
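The closed form (148) can be checked by Monte Carlo. A sketch (numpy, not part of the notes; $N$, $p^\star$ and the sample count are illustrative choices, and $C_i(A)$ is taken to be zero when $k_i<2$, matching the conditioning used in the proof):

```python
import numpy as np

rng = np.random.default_rng(2)
N, p_star = 40, 0.2

def clustering(A, i):
    """C_i(A) as in (25): fraction of connected ordered neighbour pairs; 0 if k_i < 2."""
    nbrs = np.flatnonzero(A[i])
    k = len(nbrs)
    if k < 2:
        return 0.0
    sub = A[np.ix_(nbrs, nbrs)]          # adjacency among the neighbours of i
    return sub.sum() / (k * (k - 1))     # sub.sum() counts both (r,s) and (s,r)

vals = []
for _ in range(3000):
    U = rng.random((N, N)) < p_star
    A = np.triu(U, 1).astype(int)
    A = A + A.T
    vals.append(clustering(A, 0))
mc_estimate = float(np.mean(vals))

# Prediction (148):
theory = p_star * (1 - (1 - p_star)**(N - 1)
                   - (N - 1) * p_star * (1 - p_star)**(N - 2))
```

For these parameters the correction terms in (148) are tiny, so the estimate should sit very close to $p^\star = 0.2$.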

6.3. Generating functions


Averages over the degree distribution p(k) of a graph or an ensemble of graphs can be expressed in terms of the following generating function, calculation of which will reduce the amount of work (and the likelihood of error) in our calculations:
• Definition: the generating function of the degree distribution is defined for $x \in [0,1]$ as

G(x) = \sum_{k \geq 0} p(k)\, x^k   (150)

We see that it obeys $\frac{d}{dx} G(x) \geq 0$, with $G(0) = p(0)$ and $G(1) = 1$.

• Claim: the degree distribution follows from its generating function via

p(k) = \lim_{x \to 0} \frac{1}{k!} \frac{d^k G(x)}{dx^k}   (151)

Proof:
This follows directly from application of the Taylor expansion to the function G(x), which tells us that $G(x) = \sum_{\ell \geq 0} \frac{1}{\ell!} G^{(\ell)}(0)\, x^\ell$.

• Claim: all moments $\langle k^m \rangle$ of the degree distribution, with $m \in \mathbb{N}$, follow from the generating function via

\langle k^m \rangle = \lim_{x \to 1} \Big( x \frac{d}{dx} \Big)^m G(x)   (152)

Proof:
For m = 0 the claim holds trivially. We just work out the recipe on the right for m > 0:

\Big( x \frac{d}{dx} \Big)^m G(x) = \Big( x \frac{d}{dx} \Big)^m \sum_{k \geq 0} p(k)\, x^k = \Big( x \frac{d}{dx} \Big)^{m-1} \sum_{k \geq 0} p(k)\, x \frac{d}{dx} x^k
  = \Big( x \frac{d}{dx} \Big)^{m-1} \sum_{k \geq 0} p(k)\, k\, x^k = \Big( x \frac{d}{dx} \Big)^{m-2} \sum_{k \geq 0} p(k)\, k^2 x^k
  = \ldots = \Big( x \frac{d}{dx} \Big)^0 \sum_{k \geq 0} p(k)\, k^m x^k = \sum_{k \geq 0} p(k)\, k^m x^k

Setting $x \to 1$ then leads to the above claim.
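As a numerical sanity check of the moment recipe (152), the sketch below (in Python, an illustrative choice rather than anything prescribed by the notes) compares the first two moments of a Poissonian degree distribution, computed directly from p(k), with the analytic values of $(x\frac{d}{dx})^m G(x)$ at $x = 1$ for $G(x) = e^{-c(1-x)}$:

```python
import math

def p_poisson(k, c):
    # Poissonian degree distribution p(k) = e^{-c} c^k / k!
    return math.exp(-c) * c**k / math.factorial(k)

c = 2.5
kmax = 80  # truncation point; the Poisson tail beyond k = 80 is negligible for c = 2.5

# moments <k> and <k^2> computed directly from the distribution
m1 = sum(k * p_poisson(k, c) for k in range(kmax))
m2 = sum(k**2 * p_poisson(k, c) for k in range(kmax))

# the same moments from the generating function G(x) = e^{-c(1-x)}:
# (x d/dx) G(x)|_{x=1} = c  and  (x d/dx)^2 G(x)|_{x=1} = c^2 + c
assert abs(m1 - c) < 1e-10
assert abs(m2 - (c**2 + c)) < 1e-10
```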

Let us work out the generating function of the ER ensemble.



• Claim: the average generating function of the ER ensemble is

\langle G(x|A)\rangle = (p^\star x + 1 - p^\star)^{N-1}

Proof:

\langle G(x|A)\rangle = \sum_{k=0}^{N-1} \langle p(k|A)\rangle\, x^k
  = \sum_{k=0}^{N-1} \binom{N-1}{k} (p^\star)^k (1-p^\star)^{N-1-k} x^k
  = \sum_{k=0}^{N-1} \binom{N-1}{k} (x p^\star)^k (1-p^\star)^{N-1-k}
  = (p^\star x + 1 - p^\star)^{N-1}.

Exercise: confirm that indeed $\langle G(0|A)\rangle = \langle p(0|A)\rangle$, $\langle G(1|A)\rangle = 1$, and $\lim_{x\to 1} x\frac{d}{dx}\langle G(x|A)\rangle = p^\star (N-1)$.
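The closed form just derived can be checked numerically against the defining sum over the binomial degree distribution; a minimal Python sketch (the parameter values $N = 12$ and $p^\star = 0.3$ are arbitrary illustrations):

```python
import math

N, p = 12, 0.3  # small illustrative ER parameters N and p*
for x in (0.0, 0.25, 0.7, 1.0):
    closed_form = (p * x + 1 - p)**(N - 1)
    # defining sum: <G(x|A)> = sum_k Binom(N-1, k) p^k (1-p)^{N-1-k} x^k
    series = sum(math.comb(N - 1, k) * p**k * (1 - p)**(N - 1 - k) * x**k
                 for k in range(N))
    assert abs(closed_form - series) < 1e-12
```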

6.4. The Erdős-Rényi model in the finite connectivity regime

We are usually interested in large networks with a finite average degree – these tend to be the ones found in the real world. Therefore many properties of the ER ensemble have been studied in the so-called finite connectivity regime, starting from (130), where $N \to \infty$ with $\langle k\rangle$ finite. It follows from a relation found earlier, namely $\langle \bar{k}(A)\rangle = p^\star (N-1)$, that in this regime $p^\star = O(N^{-1})$. The probability for an individual link to be present must indeed scale as $O(N^{-1})$ in order to have on average a finite number of partners per node in the system. We now investigate properties of the ER ensemble in this finite connectivity limit.

• Claim: in the finite connectivity limit, i.e. for $N \to \infty$ with $\langle k\rangle$ fixed, the degree distribution of the Erdős-Rényi ensemble has the Poissonian form

\lim_{N\to\infty} \langle p(k|A)\rangle = e^{-\langle k\rangle} \langle k\rangle^k / k!   (153)

Proof:
We substitute $p^\star = c/N$, with $c = \langle k\rangle$, into the expression for the generating function, obtaining

\langle G(x|A)\rangle = (1 + (x-1)c/N)^{N-1} = e^{N \log(1+(x-1)c/N)} + O(N^{-1}) = e^{(x-1)c} + O(N^{-1})

The Taylor expansion of $\langle G(x|A)\rangle$ is

\langle G(x|A)\rangle = e^{-c} \sum_{k=0}^{\infty} \frac{c^k}{k!}\, x^k = \sum_{k=0}^{\infty} \langle p(k|A)\rangle\, x^k.   (154)

Hence,

\langle p(k|A)\rangle = e^{-\langle k\rangle} \langle k\rangle^k / k! + O(N^{-1})   (155)

which completes the proof.
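The convergence of the binomial degree distribution to the Poissonian form can also be observed numerically; in the Python sketch below (illustrative, with the arbitrary choice $c = 3$), the maximal pointwise discrepancy shrinks as N grows, consistent with the $O(N^{-1})$ correction in (155):

```python
import math

def binomial_pk(k, N, p):
    # exact ER degree distribution for finite N: Binomial(N-1, p)
    return math.comb(N - 1, k) * p**k * (1 - p)**(N - 1 - k)

def poisson_pk(k, c):
    return math.exp(-c) * c**k / math.factorial(k)

c = 3.0
errs = []
for N in (100, 1000, 10000):
    p = c / N
    errs.append(max(abs(binomial_pk(k, N, p) - poisson_pk(k, c)) for k in range(25)))

# the discrepancy shrinks as N grows, consistent with the O(1/N) correction
assert errs[2] < errs[1] < errs[0]
assert errs[2] < 1e-2
```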


• Claim: in the finite connectivity limit the average clustering coefficients of the Erdős-Rényi ensemble vanish as $O(N^{-1})$.

Proof:
We substitute $p^\star = \langle k\rangle/(N-1)$ into (148) and expand the result for $N \to \infty$, using $\log(1+x) = x + O(x^2)$ and $e^x = 1 + x + O(x^2)$:

\langle C_i(A)\rangle = \frac{\langle k\rangle}{N-1} \Big[ 1 - \Big(1 - \frac{\langle k\rangle}{N-1}\Big)^{N-1} - \langle k\rangle \Big(1 - \frac{\langle k\rangle}{N-1}\Big)^{N-2} \Big]   (156)
  = \frac{\langle k\rangle}{N-1} \Big[ 1 - e^{(N-1)\log(1-\frac{\langle k\rangle}{N-1})} - \langle k\rangle\, e^{(N-2)\log(1-\frac{\langle k\rangle}{N-1})} \Big]
  = \frac{\langle k\rangle}{N-1} \Big[ 1 - e^{-(N-1)\frac{\langle k\rangle}{N-1} + O(N^{-1})} - \langle k\rangle\, e^{-(N-2)\frac{\langle k\rangle}{N-1} + O(N^{-1})} \Big]
  = \frac{\langle k\rangle}{N-1} \Big[ 1 - e^{-\langle k\rangle + O(N^{-1})} - \langle k\rangle\, e^{-\langle k\rangle + O(N^{-1})} \Big]
  = \frac{\langle k\rangle}{N} \Big[ 1 - e^{-\langle k\rangle} - \langle k\rangle\, e^{-\langle k\rangle} \Big] + O(N^{-2})   (157)
So in the finite connectivity scaling regime all clustering coefficients of typical Erdős-Rényi graphs vanish for $N \to \infty$, and the number of triangles per node is of order $N^{-1}$. Using in principle similar tools (but involving calculations that are more tedious), one can show that large random graphs generated from the Erdős-Rényi ensemble (128) with fixed $\langle k\rangle$ will be locally tree-like (i.e. have a vanishing number of short loops per node) and will on average have vanishing degree correlations,

\text{for all } (k,k'): \qquad \lim_{N\to\infty} \langle W(k,k'|A)\rangle = \Big[\lim_{N\to\infty} \langle W(k|A)\rangle\Big] \Big[\lim_{N\to\infty} \langle W(k'|A)\rangle\Big]   (158)

with $W(k,k'|A)$ as defined in (47).

6.5. Random graphs with a given prescribed degree distribution p(k)


Recall that real-world networks are not of a Poissonian form, see figure 19. It is therefore
of interest to consider random graph ensembles with a given prescribed degree distribution.
Graphs from an ensemble with a given prescribed degree distribution p(k) are generated as
follows:
(i) generate the degrees of the graph: draw N independent samples from the degree
distribution p(k)
(ii) generate the edges of the graph: randomly connect the vertices with edges given the
sampled sequence (k1 , k2 , . . . , kN ) of degrees.
The last step is usually done by associating sockets to nodes (k sockets for a node with
degree k), and by then randomly connecting the sockets with edges.
Three canonical examples of random graphs with a prescribed degree distribution are
• Poissonian random graphs: $p(k) = e^{-c} c^k / k!$;
• Regular random graphs: $p(k) = \delta_{c,k}$;
• Exponential random graphs: $p(k) = \big(\frac{c}{1+c}\big)^k / (1+c)$.
Note that the average degree $\langle k\rangle = c$ for each of these ensembles.
Let us work out the generating function (150) for the degree distribution p(k) of these ensembles:
• Claim: the generating function for regular random graphs is

G(x) = x^c.   (159)

Proof: this is trivial, in the sum over k we retain only the term k = c.

• Claim: the generating function for Poissonian random graphs is

G(x) = e^{-c(1-x)}.   (160)

Proof:
We sum over k in (150) using $\sum_{k\geq 0} y^k/k! = e^y$:

G(x) = e^{-c} \sum_{k\geq 0} \frac{(cx)^k}{k!} = e^{-c} e^{cx} = e^{-c(1-x)}   (161)

• Claim: the generating function for exponential random graphs is

G(x) = \frac{1}{1 + c(1-x)}.   (162)

Proof:
We sum over k in (150) using $\sum_{k\geq 0} y^k = 1/(1-y)$:

G(x) = \frac{1}{1+c} \sum_{k\geq 0} \Big(\frac{cx}{1+c}\Big)^k = \frac{1}{1+c}\, \frac{1}{1 - cx/(1+c)} = \frac{1}{1+c-cx} = \frac{1}{1+c(1-x)}   (163)
Note that in the exercises we confirm for these three examples that indeed $G(0) = p(0)$, $G(1) = 1$, and $\lim_{x\to 1} x\frac{d}{dx}G(x) = c$.
In what follows, we will also need the generating function $H(x) = \sum_{k=0}^{\infty} W(k)\, x^k$ of the degree distribution $W(k) = k\, p(k)/\langle k\rangle$ (which is the average of the degree distribution (49)). Since the two generating functions G and H are related by (see tutorials)

H(x) = \frac{x\, \partial_x G(x)}{c}   (164)

we can readily compute H(x) from the expressions (159), (160), and (162). We obtain for regular random graphs

H(x) = x^c,   (165)

for Poissonian random graphs

H(x) = x\, e^{-c(1-x)},   (166)

and for exponential random graphs

H(x) = \frac{x}{(1+c(1-x))^2}.   (167)

Figure 27. Two graphs sampled from the ER ensemble in the finite connectivity limit with parameters $\langle k\rangle = 0.5$, N = 1000 (panel a) and $\langle k\rangle = 1.5$, N = 1000 (panel b). Nodes with degree 0 are not shown.
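Relation (164) can be verified numerically for the three ensembles by comparing the closed forms (165)-(167) with $x G'(x)/c$, where $G'$ is approximated by a finite difference; a small Python sketch (with the arbitrary choice $c = 2$):

```python
import math

c = 2.0

def H_from_G(G, x, h=1e-6):
    # relation (164): H(x) = x G'(x) / c, with G' taken by a central difference
    return x * (G(x + h) - G(x - h)) / (2 * h * c)

pairs = [
    (lambda x: math.exp(-c * (1 - x)), lambda x: x * math.exp(-c * (1 - x))),     # Poissonian
    (lambda x: 1.0 / (1.0 + c * (1 - x)), lambda x: x / (1.0 + c * (1 - x))**2),  # exponential
    (lambda x: x**2, lambda x: x**2),  # regular with c = 2: G(x) = H(x) = x^c
]
for G, H in pairs:
    for x in (0.2, 0.5, 0.9):
        assert abs(H_from_G(G, x) - H(x)) < 1e-6
```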

6.6. Percolation theory for random, locally tree-like graphs


Random graphs with a prescribed degree distribution exhibit a percolation transition. The percolation transition is illustrated in Figures 27 and 28 for the Poissonian and regular ensembles, respectively. The left panel of Figure 27 presents a graph drawn from the Poissonian ensemble with mean degree $\langle k\rangle = 0.5$, and the right panel presents a Poissonian graph with mean degree $\langle k\rangle = 1.5$. The graph on the right exhibits a giant connected subgraph, whereas the graph on the left consists of a large number of small connected components. Random graphs exhibit with probability one a giant component when their mean degree is large enough. The percolation transition is a general phenomenon of random graphs, and Figure 28 illustrates how a giant component emerges in regular random graphs.
Percolating graphs are often functional (consider for instance the example of a transport network or a distribution network). Therefore, we want to develop a quantitative understanding of percolation in random graphs, which is the aim of the present section.

Figure 28. Two graphs sampled from the regular ensemble with parameters c = 2 and N = 1000 (panel a) and c = 3 and N = 1000 (panel b).

Relative size f of the largest connected component.

We count the relative number of vertices that occupy the largest connected component LCC(A), which is the largest connected subgraph of G. Since the number of vertices that occupy the largest component can diverge in the limit of large N, we consider its relative size

f(A) = \frac{|{\rm LCC}(A)|}{N}.   (168)

Percolation transition.
In the limit $N \to \infty$, the random variable f(A) converges to a deterministic limit f. If f > 0, then there exists a giant component, whereas if f = 0, then there exists no giant component. The limiting value $f = \lim_{N\to\infty} f(A)$ exhibits a phase transition: f = 0 for small mean degrees $c < c^\ast$, whereas f > 0 for large enough mean degrees $c > c^\ast$. Erdős and Rényi have proven this result with the following theorem.

Theorem 1. Consider the Erdős-Rényi ensemble in the finite connectivity limit with mean degree c.
• If c < 1, then with probability one it holds that

N f = \alpha \log(N) + O(\log\log N),   (169)

where $\alpha > 0$ is a constant.
• If c ≥ 1, then with probability one

f = \gamma(c) + O(N^{-1/2}),   (170)

with $0 < \gamma(c) < 1$ the solution of

1 - \gamma = e^{-c\gamma}.   (171)

At the critical point $c = c^\ast = 1$ it holds that $f = O(N^{-1/3})$.
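The nontrivial solution of (171) is easily obtained by fixed-point iteration; a minimal Python sketch (illustrative, not part of the notes) recovers $\gamma = 0$ below $c^\ast = 1$ and a positive giant-component fraction above it:

```python
import math

def gamma_er(c, tol=1e-12, itmax=100000):
    # solve 1 - gamma = exp(-c gamma), eq. (171), by iterating gamma <- 1 - exp(-c gamma)
    gamma = 0.5  # start away from the trivial fixed point gamma = 0
    for _ in range(itmax):
        new = 1.0 - math.exp(-c * gamma)
        if abs(new - gamma) < tol:
            return new
        gamma = new
    return gamma

assert gamma_er(0.5) < 1e-6                # below c* = 1: only the trivial solution survives
assert abs(gamma_er(2.0) - 0.7968) < 1e-3  # above c* = 1: a giant component of size gamma(c)
```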
Molloy and Reed have extended this result to the case of random graphs with a prescribed degree distribution.

Theorem 2. Consider a random graph with a (well-behaved) prescribed degree distribution p(k).
• If $\sum_{k=1}^{\infty} k(k-2)\, p(k) < 0$, then with probability one it holds that

N f = O(k_{\rm max}^2(N) \log(N)),   (172)

where $k_{\rm max}(N)$ is the largest degree in the graph.
• If $\sum_{k=1}^{\infty} k(k-2)\, p(k) > 0$, then with probability one it holds that

1 - f = G(y) + O(N^{-1/2}),   (173)

with y the smallest nonnegative solution to

y = \frac{H(y)}{y}.   (174)

The functions G and H in (173) and (174) are the generating functions of the degree distributions p(k) and W(k) (defined in the previous section). If $\sum_{k=1}^{\infty} k(k-2)\, p(k) = 0$, then the ensemble is critical.
The theorem by Molloy and Reed — proven in their paper [Molloy and Reed, Combinatorics, Probability and Computing 7, 295 (1998)] — requires that the degree distribution p(k) is well-behaved in the sense that the tails of the distribution decay fast enough.

Derivation of the Molloy-Reed condition $\sum_{k=1}^{\infty} k(k-2)\, p(k) = 0$.
We derive the condition $\sum_{k=1}^{\infty} k(k-2)\, p(k) = 0$ for the critical mean degree $c^\ast$.
Let i be a node drawn uniformly at random from the graph and let $N_i(d, A)$ be the number of nodes $j\, (\neq i)$ separated a distance $d_{ji} = d$ from the root node i. In what follows, we derive an expression for the average

N(d) = \lim_{N\to\infty} N^{-1} \sum_{i=1}^{N} \big\langle N_i(d, A) \big\rangle   (175)

in the limit $d \to \infty$. The quantity N(1) is the average number of direct neighbours of a randomly selected node and is therefore equal to

N(1) = \sum_{k=0}^{\infty} k\, p(k) = c.   (176)

The quantity N(2) is the average number of neighbours separated a distance d = 2 from the root node, and

N(2) = \sum_{k=0}^{\infty} k\, p(k) \Big( \sum_{k=0}^{\infty} k\, W(k) - 1 \Big),   (177)

since every node neighbouring the root node has on average $\sum_{k=0}^{\infty} k\, W(k)$ neighbours. The −1 appears because one of these neighbours is the root node itself, which we have to subtract. Analogously,

N(3) = \sum_{k=0}^{\infty} k\, p(k) \Big( \sum_{k=0}^{\infty} k\, W(k) - 1 \Big)^2,   (178)

and eventually

N(d) = \sum_{k=0}^{\infty} k\, p(k) \Big( \sum_{k=0}^{\infty} k\, W(k) - 1 \Big)^{d-1}.   (179)

Interestingly,

\lim_{d\to\infty} N(d) = \begin{cases} \infty & \big(\sum_{k=0}^{\infty} k W(k) - 1\big) > 1, \\ 0 & \big(\sum_{k=0}^{\infty} k W(k) - 1\big) < 1. \end{cases}   (180)

We conclude that $\big(\sum_{k=0}^{\infty} k W(k) - 1\big) = 1$ marks the critical point. Using that $W(k) = k\, p(k)/c$, we obtain

\sum_{k=0}^{\infty} \frac{k^2 p(k)}{\sum_{k=0}^{\infty} k\, p(k)} - 2 = 0 \quad \Leftrightarrow \quad \sum_{k=1}^{\infty} k(k-2)\, p(k) = 0,   (181)

which is the Molloy-Reed condition for the percolation transition of a random graph with a prescribed degree distribution.
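The Molloy-Reed sum can be evaluated numerically for the three canonical ensembles; the Python sketch below (illustrative, with a truncation kmax = 150 chosen large enough that the tails are negligible) confirms that the sum changes sign at c = 1 for Poissonian, c = 1/2 for exponential and c = 2 for regular graphs:

```python
import math

def molloy_reed(pk, kmax=150):
    # the Molloy-Reed sum: positive -> giant component, negative -> none, zero -> critical
    return sum(k * (k - 2) * pk(k) for k in range(1, kmax))

poisson = lambda c: (lambda k: math.exp(-c) * c**k / math.factorial(k))
exponential = lambda c: (lambda k: (c / (1 + c))**k / (1 + c))
regular = lambda c: (lambda k: 1.0 if k == c else 0.0)

# c(c-1) = 0, 2c^2 - c = 0 and c(c-2) = 0 give c* = 1, 1/2 and 2 respectively
assert abs(molloy_reed(poisson(1.0))) < 1e-8
assert molloy_reed(poisson(0.8)) < 0 < molloy_reed(poisson(1.2))
assert abs(molloy_reed(exponential(0.5))) < 1e-8
assert molloy_reed(regular(1)) < 0 < molloy_reed(regular(3))
```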

The relative size of the giant component.


We derive a formula for the relative size f of the giant component.
We use that random graphs with a prescribed degree distribution are locally tree-like
(see Figure 29 for a sketch). Locally tree-like random graphs have convenient mathematical
features. The main one is this: given the statistical features of a node i, those of the ki
nodes in its environment ∂i can in leading order be taken as statistically independent, since
each ‘branch’ of the tree centred at i is connected to each other branch only via node i. See
Figure 29. We use this statistical independence to derive a set of recursion relations that are
valid in the limit of large N .
The quantity 1 − f is the probability that a randomly drawn (root) node is not part of the LCC,

1 - f = \text{Prob(randomly drawn node not in LCC)}


Figure 29. In locally tree-like graphs the number of short loops is vanishingly small; for the vast majority of the nodes the local topology is that of a tree. Hence, starting from a node i, we would find that the tree branches descending from i are nearly disconnected – they connect only at site i. In this example $p(k) = \delta_{k,3}$. With non-regular degree distributions we will see local randomness in this environment.


(a) Series (182) for 1 − f (b) Series (185) for y

Figure 30. Graphical illustration of the series (182) and (185).


= \sum_{k=0}^{\infty} p(k) \times \text{Prob(all neighbours of a randomly drawn node with degree k not in LCC)}

Since for locally tree-like graphs the neighbours of the central root node are statistically independent, we obtain

1 - f = \sum_{k=0}^{\infty} p(k)\, y^k,   (182)
where y is the probability of obtaining a finite connected subgraph of G when randomly picking an edge in the graph and following it to one of its end points (see Figure 30(a) for a graphical illustration). Notice that we work in the limit where the graph G is infinitely large. We recognise on the right hand side of (182) the generating function G of p(k), and therefore (182) reads

1 - f = G(y).   (183)

Figure 31. Fraction f of nodes that are part of the giant component as a function of the mean connectivity c in the limit $N \to \infty$, for the Erdős-Rényi (Poissonian), exponential, and regular ensembles.
Analogously, we obtain a self-consistent equation for the probability y, namely

y = W(1) + W(2)\, y + W(3)\, y^2 + \ldots.   (184)

See Figure 30(b) for a graphical illustration of this series. Equation (184) is equivalent to

y = \sum_{j=1}^{\infty} y^{j-1} W(j).   (185)

We recognise on the right hand side of (185) the generating function H divided by y, and therefore (185) reads

y = \frac{H(y)}{y}.   (186)

Equations (183) and (186) are exactly the equations (173) and (174) that appear in the theorem by Molloy and Reed.

Percolation transition in random graphs from (174) and (173).

We analyse the equations (174) and (173) that provide the size of the giant component f. Equation (174) is a self-consistent equation that can be solved iteratively, whereas equation (173) determines the size f as a function of y. We consider the iterative equation

y_{n+1} = \frac{H(y_n)}{y_n}   (187)

with n a natural number and with initial condition $y_1 \in (0,1)$. The size of the giant component is then

f = 1 - \lim_{n\to\infty} G(y_n).   (188)

Writing a small computer program to solve (187), we can obtain the size of the giant component. The results for our three canonical ensembles are presented in Figure 31.

Figure 32. Graphical illustration of the self-consistent equation (174) for the probability y that one of the end points of a randomly selected edge does not belong to the giant component: (a) Erdős-Rényi ensemble, (b) exponential ensemble, (c) regular ensemble.
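Such a small computer program can be sketched in a few lines of Python (an illustrative implementation of (187)-(188), here for the Poissonian ensemble with the arbitrary choice c = 2):

```python
import math

def giant_fraction(G, H, y0=0.5, tol=1e-12, itmax=100000):
    # iterate (187): y_{n+1} = H(y_n)/y_n, then read off f from (188)
    y = y0
    for _ in range(itmax):
        ynew = H(y) / y
        if abs(ynew - y) < tol:
            break
        y = ynew
    return 1.0 - G(y)

c = 2.0  # Poissonian ensemble with mean degree c
G = lambda x: math.exp(-c * (1 - x))
H = lambda x: x * math.exp(-c * (1 - x))
f = giant_fraction(G, H)

# for the Poissonian ensemble this must reproduce gamma(c) from 1 - gamma = e^{-c gamma}
assert abs((1.0 - f) - math.exp(-c * f)) < 1e-9
assert abs(f - 0.7968) < 1e-3
```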
We analyse in more depth the self-consistent equation (186) and the various solutions it admits. The self-consistent equation (186) always admits the trivial solution y = 1, since $H(1) = \sum_{k=0}^{\infty} W(k) = 1$. This solution describes the case where all vertices belong to finite isolated subgraphs. If the mean degree c is large enough, then there exists a second (nontrivial) solution $y \in (0,1)$, as is illustrated in Figure 32, which will determine the size of the giant component.
Finally, we analyse the behaviour of the size of the giant component f in the vicinity of the critical point $c^\ast$. We observe in Figure 31 that f(c) is continuous (the regular ensemble is a special case, since there c is discrete). Hence, we perform a continuous bifurcation analysis of the self-consistent relation (186) around y = 1. Since it is more convenient to work with a quantity that is zero near the critical point, we introduce the parameter $\alpha = 1 - y$, which solves, see (186),

\alpha = 1 - \frac{H(1-\alpha)}{1-\alpha}.   (189)
Using the definition of H(x), we obtain

\alpha = 1 - \sum_{k\geq 0} \frac{k\, p(k)}{c} (1-\alpha)^{k-1} = 1 - \sum_{k\geq 0} \frac{(k+1)\, p(k+1)}{c} (1-\alpha)^k
  = 1 - \sum_{k\geq 0} \frac{(k+1)\, p(k+1)}{c} \sum_{\ell=0}^{k} \binom{k}{\ell} (-\alpha)^\ell
  = 1 - \sum_{\ell\geq 0} (-\alpha)^\ell \sum_{k\geq \ell} \frac{(k+1)\, p(k+1)}{c} \binom{k}{\ell}
  = 1 - \Big\{ \sum_{k\geq 0} \frac{(k+1)\, p(k+1)}{c} - \alpha \sum_{k\geq 1} \frac{(k+1)\, p(k+1)}{c} \binom{k}{1} + \alpha^2 \sum_{k\geq 2} \frac{(k+1)\, p(k+1)}{c} \binom{k}{2} + O(\alpha^3) \Big\}
  = \alpha \sum_{k\geq 1} \frac{k(k+1)\, p(k+1)}{c} - \alpha^2 \sum_{k\geq 2} \frac{(k+1)\, p(k+1)}{c} \frac{k!}{2(k-2)!} + O(\alpha^3)
  = \alpha \sum_{k\geq 1} \frac{k(k+1)\, p(k+1)}{c} - \frac{\alpha^2}{2} \sum_{k\geq 2} \frac{(k+1)k(k-1)\, p(k+1)}{c} + O(\alpha^3)
  = \alpha \sum_{k\geq 0} \frac{k(k-1)\, p(k)}{c} - \frac{\alpha^2}{2} \sum_{k\geq 1} \frac{k(k-1)(k-2)\, p(k)}{c} + O(\alpha^3)   (190)

(where we used $\sum_{k\geq 0} (k+1)p(k+1)/c = \sum_{k\geq 1} k\, p(k)/c = 1$). Hence, our equation for $\alpha$ can be written as

\alpha \Big( 1 - \sum_{k\geq 0} \frac{k(k-1)\, p(k)}{c} + \frac{\alpha}{2} \sum_{k\geq 1} \frac{k(k-1)(k-2)\, p(k)}{c} \Big) + O(\alpha^3) = 0   (191)
For sufficiently small $\alpha$ we may neglect the cubic term. Using $y = 1 - \alpha$ we obtain that close to the critical point $c^\ast$

y = 1 \qquad \text{or} \qquad y = 1 - 2\, \frac{\sum_{k\geq 0} k(k-2)\, p(k)}{\sum_{k\geq 0} k(k-1)(k-2)\, p(k)}.   (192)

The first is the trivial solution, whereas the second is the nontrivial solution that appears for $c > c^\ast$. The nontrivial solution merges with the trivial one ($\alpha = 1 - y = 0$) when $\sum_{k\geq 0} k(k-2)\, p(k) = 0$, which is precisely the Molloy-Reed condition (181). Hence, we have derived this condition from equation (186) using a bifurcation analysis.
In order to derive an analytical expression for the size of the giant component f, we expand the right-hand side of (173) around $\alpha = 1 - y \approx 0$. We obtain that

f = 1 - \sum_{k\geq 0} p(k) (1-\alpha)^k = 1 - \sum_{k\geq 0} p(k) \sum_{\ell=0}^{k} \binom{k}{\ell} (-\alpha)^\ell
  = 1 - \sum_{\ell\geq 0} (-\alpha)^\ell \sum_{k\geq \ell} \binom{k}{\ell} p(k)
  = 1 - \Big\{ \sum_{k\geq 0} p(k) - \alpha \sum_{k\geq 1} \binom{k}{1} p(k) + O(\alpha^2) \Big\}
  = \alpha \sum_{k\geq 1} k\, p(k) + O(\alpha^2) = (1-y) \sum_{k\geq 0} k\, p(k) + O((1-y)^2)   (193)

Equations (192) and (193) imply that close to the percolation transition f takes the form

f = 2c\, \frac{\sum_{k\geq 0} k(k-2)\, p(k)}{\sum_{k\geq 0} k(k-1)(k-2)\, p(k)},   (194)

where we have used $c = \sum_{k\geq 0} k\, p(k)$.

Examples.
We derive the critical value c∗ of the percolation transition for our three canonical examples
of random graphs with a prescribed degree distribution. The numerical solution of equation
(187) in figure 31 indicates that c∗ = 1, c∗ = 2 and c∗ = 0.5 for the Poissonian, regular
and exponential ensembles, respectively. We also derive the expression for f (c) close to the
critical point c∗ :
• Poissonian graphs $p(k) = e^{-c} c^k / k!$:
The critical connectivity $c^\ast$ solves the Molloy-Reed relation (181). The moments of p(k) follow from the generating function $G(x) = e^{-c(1-x)}$ using $\lim_{x\to 1}(x\frac{d}{dx})^m G(x) = \sum_{k=0}^{\infty} k^m p(k)$, viz.

\sum_{k=0}^{\infty} k\, p(k) = c   (195)
\sum_{k=0}^{\infty} k^2 p(k) = c^2 + c,   (196)

and hence

\sum_{k=1}^{\infty} p(k)(k^2 - 2k) = c^2 - c = c(c-1).   (197)

Setting (197) to zero gives two solutions: a trivial solution c = 0 and a nontrivial solution $c^\ast = 1$, which is the critical point in Theorem 1.
Close to the percolation transition, f takes the form (194). Using the generating function G(x), we obtain that

\sum_{k=0}^{\infty} k^3 p(k) = c + 3c^2 + c^3.   (198)

Therefore,

f = 2\, \frac{c^2(c-1)}{c + 3c^2 + c^3 - 3(c^2+c) + 2c} + O((c-1)^2) = 2\, \frac{c - c^\ast}{c} + O((c-c^\ast)^2) = 2(c - c^\ast) + O((c-c^\ast)^2)   (199)

for $c > c^\ast = 1$.
In Figure 31 we present the relative size of the giant component f as a function of c, and in Figure 33(a) we compare this theoretical result with numerical simulation results for finite values of N.
• Exponential graphs $p(k) = \big(\frac{c}{1+c}\big)^k / (1+c)$:
The moments of p are obtained using $\lim_{x\to 1}(x\frac{d}{dx})^m G(x) = \sum_{k=0}^{\infty} k^m p(k)$ and $G(x) = \frac{1}{1+c(1-x)}$, viz.

\sum_{k=0}^{\infty} k\, p(k) = c   (200)
\sum_{k=0}^{\infty} k^2 p(k) = c + 2c^2.   (201)

The Molloy-Reed condition reads

\sum_{k=1}^{\infty} p(k)(k^2 - 2k) = 2c^2 - c = 0.   (202)

We again obtain two solutions to this equation: the trivial solution c = 0 and the critical percolation point $c^\ast = 1/2$ for exponential random graphs.
Close to the percolation transition, we can use formula (194) for f. Using the generating function G(x), we obtain that

\sum_{k=0}^{\infty} k^3 p(k) = c + 6c^2 + 6c^3.   (203)

Therefore,

f = 2\, \frac{c(2c^2 - c)}{c + 6c^2 + 6c^3 - 3(c + 2c^2) + 2c} + O((2c-1)^2) = \frac{2}{3}\, \frac{c - c^\ast}{c} + O((c-c^\ast)^2) = \frac{4}{3}(c - c^\ast) + O((c-c^\ast)^2)   (204)

for $c > c^\ast = 1/2$.
In Figure 31 we present the relative size of the giant component f as a function of c, and in Figure 33(a) we compare this theoretical result with numerical simulation results for finite values of N.
Comparing (204) with (199) we find that although the percolation transition appears earlier for exponential graphs ($c^\ast = 1/2$) than for Poissonian graphs ($c^\ast = 1$), the linear slope at the transition is smaller for exponential graphs ($\frac{4}{3}(c-c^\ast)$) than for Poissonian graphs ($2(c-c^\ast)$). Loosely said: exponential graphs have a higher tendency to develop a giant component, but the size of the giant component will be smaller (than for Poissonian graphs).
• Regular graphs $p(k) = \delta_{k,c}$:
The critical connectivity $c^\ast$ solves the Molloy-Reed relation (181), which now reads

\sum_{k=1}^{\infty} p(k)(k^2 - 2k) = c(c-2) = 0.   (205)

There are two solutions to this equation, namely the trivial solution c = 0 and the critical point $c^\ast = 2$.

(a) Finite size effects for the Erdős-Rényi ensemble  (b) Finite size effects for the regular ensemble

Figure 33. Panel (a): numerical results for $\langle f(A)\rangle$ for Poissonian graphs with finite N are compared with the asymptotic limit f from the theory (170)-(171). Panel (b): $\langle f(A)\rangle$ as a function of N for regular random graphs with $c = c^\ast = 2$ and c = 3. For c = 3 we obtain that f = 1, as predicted from the theory (207). For c = 2 the ensemble is critical: the mean size $\langle f(A)\rangle$ decays slowly as a power law $N^{-\beta}$ with some exponent β. Each symbol is a sample average over 100 graphs that are randomly generated from the described ensembles.

For regular graphs the variable c is discrete and therefore the bifurcation analysis leading to formula (194) does not apply.
Since $G(x) = H(x) = x^c$, equation (186) gives $y = y^{c-1}$, and with $1 - f = G(y) = y^c$ we obtain the following equation for the relative size of the giant component:

1 - f = (1-f)^{c-1}.   (206)

If c = 1, then only f = 0 solves relation (206). If c = 2, then any $f \in [0,1]$ is a solution to (206). If c > 2, then there exist two solutions, f = 0 and f = 1; the latter describes the relative size of the largest connected component of the graph. Hence,

f = \begin{cases} 0 & c \leq 2, \\ 1 & c \geq 3. \end{cases}   (207)

In Figure 32(c) we show the graphical solution of (206) and in Figure 31 we present f as a function of c.
In Figure 33 we compare theoretical predictions for f with numerical data for $\langle f(A)\rangle$ obtained by generating graphs numerically. Figure 33(a) presents results for Erdős-Rényi graphs. For N = 1000 theory and experiment are in very good correspondence. Figure 33(b) presents results for the regular ensemble with c = 2 and c = 3. For c = 3, $\langle f(A)\rangle = 1$, whereas for c = 2 the quantity $\langle f(A)\rangle$ decreases slowly as a function of N and we observe fluctuations in the numerical data. This is because for regular graphs $c = c^\ast = 2$ is the critical connectivity.

7. Appendices

7.1. Network software


The following software resources for imaging and/or analysis of networks, created within the
academic community, are free:
• Cytoscape: www.cytoscape.org
• Gephi: http://gephi.github.io
• R – with igraph package: www.r-project.org
• Pajek: vlado.fmf.uni-lj.si/pub/networks/pajek/

7.2. The Pearson correlation


The Pearson correlation of two random variables (u, v) with joint distribution P(u, v) is defined as

PC = \frac{\langle uv\rangle - \langle u\rangle\langle v\rangle}{\sqrt{(\langle u^2\rangle - \langle u\rangle^2)(\langle v^2\rangle - \langle v\rangle^2)}}   (208)

It tests for statistical dependence in the form of a (partially) linear relationship between u and v. To get some intuition for this, let us work out two simple extreme cases:
• Statistically independent u and v
Now P(u, v) = P(u)P(v), and hence

\langle uv\rangle = \sum_{uv} P(u,v)\, uv = \sum_{uv} P(u)P(v)\, uv = \Big(\sum_u P(u)\, u\Big)\Big(\sum_v P(v)\, v\Big) = \langle u\rangle\langle v\rangle

Hence we obtain PC = 0.
• Linearly related u and v
Suppose $u = \lambda v + c$ for all combinations (u, v). Now we obtain

\langle uv\rangle = \langle v(\lambda v + c)\rangle = \lambda\langle v^2\rangle + c\langle v\rangle
\langle u^2\rangle = \langle(\lambda v + c)^2\rangle = \lambda^2\langle v^2\rangle + c^2 + 2\lambda c\langle v\rangle
\langle u\rangle = \langle \lambda v + c\rangle = \lambda\langle v\rangle + c

Inserting all this into formula (208) then leads to

PC = \frac{\lambda\langle v^2\rangle + c\langle v\rangle - \lambda\langle v\rangle^2 - c\langle v\rangle}{\sqrt{\lambda^2\langle v^2\rangle + c^2 + 2\lambda c\langle v\rangle - \lambda^2\langle v\rangle^2 - c^2 - 2\lambda c\langle v\rangle}\; \sqrt{\langle v^2\rangle - \langle v\rangle^2}}
   = \frac{\lambda\big(\langle v^2\rangle - \langle v\rangle^2\big)}{|\lambda|\big(\langle v^2\rangle - \langle v\rangle^2\big)} = {\rm sgn}(\lambda)

Hence if u and v are perfectly positively linearly correlated we find PC = 1, and if they
are perfectly negatively linearly related we find PC = −1.
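Both extreme cases can be reproduced on sampled data; a small Python sketch (illustrative, with the arbitrary slopes λ = 3 and λ = −0.5):

```python
import math, random

def pearson(us, vs):
    # sample version of formula (208)
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    cov = sum((u - mu) * (v - mv) for u, v in zip(us, vs)) / n
    su = math.sqrt(sum((u - mu)**2 for u in us) / n)
    sv = math.sqrt(sum((v - mv)**2 for v in vs) / n)
    return cov / (su * sv)

random.seed(0)
vs = [random.gauss(0.0, 1.0) for _ in range(1000)]
us_pos = [3.0 * v + 2.0 for v in vs]   # u = lambda v + c with lambda > 0
us_neg = [-0.5 * v + 1.0 for v in vs]  # lambda < 0

assert abs(pearson(us_pos, vs) - 1.0) < 1e-9   # sgn(lambda) = +1
assert abs(pearson(us_neg, vs) + 1.0) < 1e-9   # sgn(lambda) = -1
```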

7.3. Properties of symmetric matrices


Eigenvectors and eigenvalues. We derive some properties of real symmetric N × N matrices A. The eigenvalue polynomial $\det(A - \lambda 1\!{\rm I}) = 0$ is of order N, so A will have N (possibly complex, and possibly coinciding) solutions $\lambda$ of the eigenvalue problem

A x = \lambda x, \qquad x \neq 0   (209)

We denote complex conjugation of complex numbers z in the usual way: if $z = a + ib$ (where $a, b \in {\rm I\!R}$), then $z^\ast = a - ib$ and $|z|^2 = z^\ast z \in {\rm I\!R}$. The inner product on $\mathbb{C}^N$ is $x \cdot y = \sum_i x_i^\ast y_i$.
P

• Claim: all eigenvalues of the matrix A are real.

Proof:
Take the inner product in (209) with the conjugate vector $x^\ast$, which gives

\sum_{i,j=1}^{N} x_i^\ast A_{ij} x_j = \lambda \sum_{i=1}^{N} |x_i|^2

We use the symmetry of A, and substitute $A_{ij} \to \frac{1}{2}(A_{ij} + A_{ji})$:

\lambda = \frac{\frac{1}{2} \sum_{ij} x_i^\ast (A_{ij} + A_{ji}) x_j}{\sum_{i=1}^{N} |x_i|^2} = \frac{\frac{1}{2} \sum_{ij} A_{ij} (x_i^\ast x_j + x_i x_j^\ast)}{\sum_{i=1}^{N} |x_i|^2}

Since $(x_i^\ast x_j + x_i x_j^\ast)^\ast = x_i x_j^\ast + x_i^\ast x_j = x_i^\ast x_j + x_i x_j^\ast$, the above fraction is real-valued.

• Claim: all eigenvectors of the matrix A can be chosen real-valued.

Proof:
We separate real and imaginary parts of every eigenvector:

x = {\rm Re}\, x + i\, {\rm Im}\, x, \qquad {\rm Re}\, x = \frac{1}{2}(x + x^\ast), \qquad {\rm Im}\, x = \frac{1}{2i}(x - x^\ast)

with ${\rm Re}\, x \in {\rm I\!R}^N$ and ${\rm Im}\, x \in {\rm I\!R}^N$. Complex conjugation of (209) gives $A x^\ast = \lambda x^\ast$ (since $\lambda$ is real). Hence, if x is an eigenvector with eigenvalue $\lambda$, so is $x^\ast$. By adding/subtracting the conjugate equation to/from (209) it follows: if x and $x^\ast$ are eigenvectors, so are ${\rm Re}\, x$ and ${\rm Im}\, x$. Since the space spanned by x and $x^\ast$ is the same as the space spanned by ${\rm Re}\, x$ and ${\rm Im}\, x$, we are always allowed to choose the real-valued pair ${\rm Re}\, x$ and ${\rm Im}\, x$.

• Claim: for every linear subspace $L \subseteq {\rm I\!R}^N$ the following holds:

\text{if } A L \subseteq L \text{ then also } A L^\perp \subseteq L^\perp

in which $L^\perp$ denotes the orthogonal complement, i.e. ${\rm I\!R}^N = L \oplus L^\perp$.

Proof:
For each $x \in L$ and $y \in L^\perp$ we find $(x \cdot A y) = (y \cdot A x) = 0$ (since $A x \in L$ and $y \in L^\perp$). Therefore $A y \in L^\perp$, which completes the proof.

• Claim: we can construct a complete orthogonal basis in ${\rm I\!R}^N$ of A-eigenvectors.

Proof:
Consider two eigenvectors $x^a$ and $x^b$ of A, corresponding to different eigenvalues:

A x^a = \lambda_a x^a, \qquad A x^b = \lambda_b x^b, \qquad \lambda_a \neq \lambda_b

We now form:

0 = (x^a \cdot A x^b) - (x^a \cdot A x^b) = (x^a \cdot A x^b) - (x^b \cdot A x^a) = \lambda_b (x^a \cdot x^b) - \lambda_a (x^b \cdot x^a) = (\lambda_b - \lambda_a)(x^a \cdot x^b)

Since $\lambda_a \neq \lambda_b$ it follows that $x^a \cdot x^b = 0$. If all eigenvalues are distinct, this completes the proof, since now there will be N eigenvalues with eigenvectors $x \neq 0$. Since these N eigenvectors are orthogonal, after normalisation they form a complete orthogonal basis. To deal with degenerate eigenvalues we need the third property above. If $A x = \lambda x$, then for all y with $x \cdot y = 0$: $(A y) \cdot x = 0$. Having found an eigenvector for eigenvalue $\lambda$ (not unique in the case of a degenerate eigenvalue), a new reduced $(N-1) \times (N-1)$ matrix can be constructed by restricting ourselves to the subspace $x^\perp$. The new matrix is again symmetric, the eigenvalue polynomial is of order N − 1 (and contains all the previous roots except for one corresponding to the eigenvector just eliminated), and we can repeat the argument. This shows that there must again be N orthogonal eigenvectors.

Basis of eigenvectors and diagonal form. The final consequence of the above facts is that there exists a set of N vectors $\{\hat{e}^i\}$, where $i = 1, \ldots, N$ and $\hat{e}^i \in {\rm I\!R}^N$ for all i, with the following properties:

A \hat{e}^i = \lambda_i \hat{e}^i, \qquad \lambda_i \in {\rm I\!R}, \qquad \hat{e}^i \cdot \hat{e}^j = \delta_{ij}   (210)

We can now bring A onto diagonal form by a simple unitary transformation U, which we construct from the components of the normalised eigenvectors: $U_{ij} = \hat{e}^j_i$. We denote the transpose of U by $U^\dagger$, $U^\dagger_{ij} = U_{ji}$, and show that U is indeed unitary, i.e. $U^\dagger U = U U^\dagger = 1\!{\rm I}$:

\sum_j (U^\dagger U)_{ij} x_j = \sum_{jk} U_{ki} U_{kj} x_j = \sum_{jk} \hat{e}^i_k \hat{e}^j_k x_j = \sum_j \delta_{ij} x_j = x_i
\sum_j (U U^\dagger)_{ij} x_j = \sum_{jk} U_{ik} U_{jk} x_j = \sum_{jk} \hat{e}^k_i \hat{e}^k_j x_j = \sum_k \hat{e}^k_i (\hat{e}^k \cdot x) = x_i

(since $\{\hat{e}^\ell\}$ is a complete orthogonal basis). From U being unitary it follows that U and $U^\dagger$ leave inner products, and therefore also lengths, invariant:

(U x) \cdot (U y) = x \cdot U^\dagger U y = x \cdot y, \qquad U^\dagger x \cdot U^\dagger y = x \cdot U U^\dagger y = x \cdot y

We can see explicitly that U indeed brings A onto diagonal form:

(U^\dagger A U)_{ij} = \sum_{k,l=1}^{N} U_{ki} A_{kl} U_{lj} = \sum_{k,l=1}^{N} \hat{e}^i_k A_{kl} \hat{e}^j_l = \lambda_j \sum_{k=1}^{N} \hat{e}^i_k \hat{e}^j_k = \lambda_j \delta_{ij}   (211)

Note that if all eigenvalues are nonzero, the inverse $A^{-1}$ of the matrix A exists, and can be written as follows:

(A^{-1})_{ij} = \sum_{k=1}^{N} \lambda_k^{-1} \hat{e}^k_i \hat{e}^k_j   (212)

To prove that this is the inverse of A, we work out for any $x \in {\rm I\!R}^N$ the two expressions

(A A^{-1} x)_i = \sum_{k,j=1}^{N} A_{ik} \sum_{\ell=1}^{N} \lambda_\ell^{-1} \hat{e}^\ell_k \hat{e}^\ell_j x_j = \sum_{\ell=1}^{N} \hat{e}^\ell_i (\hat{e}^\ell \cdot x) = x_i

(again since $\{\hat{e}^\ell\}$ forms a complete orthogonal basis), and

(A^{-1} A x)_i = \sum_{k,j=1}^{N} \sum_{\ell=1}^{N} \lambda_\ell^{-1} \hat{e}^\ell_i \hat{e}^\ell_k A_{kj} x_j = \sum_{\ell=1}^{N} \hat{e}^\ell_i (\hat{e}^\ell \cdot x) = x_i
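The spectral form (212) of the inverse can be checked on a small explicit example; a Python sketch (the 2×2 matrix is an arbitrary illustration with known eigenpairs):

```python
import math

# symmetric 2x2 example A = [[2,1],[1,2]] with eigenpairs
# (lambda_1, e^1) = (3, (1,1)/sqrt(2)) and (lambda_2, e^2) = (1, (1,-1)/sqrt(2))
A = [[2.0, 1.0], [1.0, 2.0]]
s = 1.0 / math.sqrt(2.0)
eigs = [(3.0, (s, s)), (1.0, (s, -s))]

# spectral form (212): (A^{-1})_{ij} = sum_k e^k_i e^k_j / lambda_k
Ainv = [[sum(e[i] * e[j] / lam for lam, e in eigs) for j in range(2)]
        for i in range(2)]

# verify A A^{-1} = identity
prod = [[sum(A[i][k] * Ainv[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)]
for i in range(2):
    for j in range(2):
        assert abs(prod[i][j] - (1.0 if i == j else 0.0)) < 1e-12
```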

Variational principle for eigenvalues.

• Claim: let $\mu_{\rm max}(A)$ be the largest eigenvalue of a square, symmetric matrix A of size N. Then,

\mu_{\rm max}(A) = \max_{x \in {\rm I\!R}^N} \frac{x \cdot A x}{x \cdot x}.   (213)

Proof: We decompose x into the eigenvectors $\hat{e}^k$ of A:

x = \sum_{k=1}^{N} \sigma_k \hat{e}^k,   (214)

and obtain that

\frac{x \cdot A x}{x \cdot x} = \frac{\sum_{k=1}^{N}\sum_{j=1}^{N} \sigma_j \sigma_k\, \hat{e}^j \cdot (A \hat{e}^k)}{\sum_{j=1}^{N} \sigma_j^2\, \hat{e}^j \cdot \hat{e}^j}   (215)
  = \frac{\sum_{k=1}^{N}\sum_{j=1}^{N} \sigma_j \sigma_k \mu_k\, \hat{e}^j \cdot \hat{e}^k}{\sum_{j=1}^{N} \sigma_j^2}   (216)
  = \frac{\sum_{j=1}^{N} \sigma_j^2 \mu_j}{\sum_{j=1}^{N} \sigma_j^2} \leq \mu_{\rm max} \frac{\sum_{j=1}^{N} \sigma_j^2}{\sum_{j=1}^{N} \sigma_j^2} = \mu_{\rm max}   (217)

The equality holds when x is the eigenvector associated with $\mu_{\rm max}$. This completes the proof.

• Claim: let $\mu_{\rm min}(A)$ be the smallest eigenvalue of a square, symmetric matrix A of size N. Then,

\mu_{\rm min}(A) = \min_{x \in {\rm I\!R}^N} \frac{x \cdot A x}{x \cdot x}.   (218)

Proof: Follows directly by applying the previous claim to −A.
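The variational principle can be illustrated numerically: for a symmetric matrix the Rayleigh quotient $x \cdot A x / (x \cdot x)$ always lies between the smallest and largest eigenvalues, and attains them at the corresponding eigenvectors. A Python sketch (the 2×2 matrix is an arbitrary example):

```python
import random

A = [[2.0, 1.0], [1.0, 2.0]]  # symmetric, with eigenvalues 3 and 1

def rayleigh(A, x):
    # the Rayleigh quotient x.Ax / x.x from (213)
    Ax = [sum(A[i][j] * x[j] for j in range(len(x))) for i in range(len(x))]
    return sum(xi * Axi for xi, Axi in zip(x, Ax)) / sum(xi * xi for xi in x)

mu_max, mu_min = 3.0, 1.0
random.seed(1)
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0), random.uniform(-1.0, 1.0)]
    r = rayleigh(A, x)
    assert mu_min - 1e-12 <= r <= mu_max + 1e-12

# the bounds are attained at the eigenvectors (1,1) and (1,-1)
assert abs(rayleigh(A, [1.0, 1.0]) - mu_max) < 1e-12
assert abs(rayleigh(A, [1.0, -1.0]) - mu_min) < 1e-12
```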

7.4. Integral representation of the Kronecker δ-symbol

Here we show that for any $n, m \in \mathbb{Z}$ the Kronecker δ can be written in the integral form

\delta_{n,m} = \int_{-\pi}^{\pi} \frac{d\omega}{2\pi}\, e^{i(n-m)\omega}   (219)

To see this one simply does the integral on the right:

n = m: \qquad \int_{-\pi}^{\pi} \frac{d\omega}{2\pi}\, e^{i(n-m)\omega} = \int_{-\pi}^{\pi} \frac{d\omega}{2\pi} = 1

n \neq m: \qquad \int_{-\pi}^{\pi} \frac{d\omega}{2\pi}\, e^{i(n-m)\omega} = \int_{-\pi}^{\pi} \frac{d\omega}{2\pi} \Big[ \cos((n-m)\omega) + i \sin((n-m)\omega) \Big]
  = \frac{1}{2\pi} \Big[ \frac{1}{n-m} \sin((n-m)\omega) - \frac{i}{n-m} \cos((n-m)\omega) \Big]_{\omega=-\pi}^{\omega=\pi} = 0
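The representation (219) can also be checked numerically; in the Python sketch below the ω-integral is discretised by a midpoint rule, which reproduces the Kronecker δ exactly (up to rounding) for |n − m| smaller than the number of sample points:

```python
import cmath, math

def kron_delta(n, m, M=64):
    # midpoint discretisation of (219); exact for |n - m| < M by discrete
    # Fourier orthogonality of the equally spaced sample points
    total = 0j
    for j in range(M):
        omega = -math.pi + 2.0 * math.pi * (j + 0.5) / M
        total += cmath.exp(1j * (n - m) * omega)
    return total / M

assert abs(kron_delta(5, 5) - 1.0) < 1e-12
assert abs(kron_delta(5, 3)) < 1e-12
assert abs(kron_delta(-2, 7)) < 1e-12
```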

7.5. The Landau order symbol

Let f(x) and g(x) be two functions of a variable x that is taken to zero, such that $\lim_{x\to 0} f(x) = \lim_{x\to 0} g(x) = 0$. We then define the order symbol O as follows:

f(x) = O(g(x)) \text{ for } x \to 0 \quad \Leftrightarrow \quad (\exists C > 0,\ \epsilon > 0)(\forall |x| < \epsilon):\ |f(x)/g(x)| < C   (220)

In words: asymptotically for $x \to 0$, f(x) decays to zero equally fast or faster than g(x). Similarly we could use it to characterise the behaviour of functions that diverge, such as

f(x) = O(g(x)) \text{ for } x \to \infty \quad \Leftrightarrow \quad (\exists C > 0,\ X > 0)(\forall x > X):\ |f(x)/g(x)| < C   (221)

In words: asymptotically for $x \to \infty$, f(x) diverges equally fast or slower than g(x).

7.6. Perron-Frobenius Theorem


We state here the Perron-Frobenius Theorem, which is a general theorem in matrix
theory. Usually, this theorem is stated for nonnegative, irreducible matrices. However,
we reformulate it here in terms of adjacency matrices A of graphs that are connected.
Let A be the adjacency matrix of a connected nondirected graph G = (V, E) of order
|V | = N . Then A possesses an eigenvalue µmax (A), which we call the Perron root, such that
85

• µmax > 0
• For any eigenvalue µi (A) of A, |µi (A)| < µmax (A)
• There exists an eigenvector v(A) corresponding to the eigenvalue µmax (A) that has
strictly positive entries, i.e. vi (A) > 0
• The eigenvector v(A) is unique up to a constant multiple
Note that the entries of the adjacency matrix are all nonnegative. If a symmetric matrix,
say B, has both positive and negative entries, then the Perron-Frobenius Theorem does not
apply and the eigenvector v associated with the leading eigenvalue µmax (B) can have both
negative (vi (B) < 0) and positive entries (vi (B) > 0).
There exists also a Perron-Frobenius Theorem for strongly connected directed graphs.
In this case
• µmax is real
• For any eigenvalue µi (A) of A, |µi (A)| < µmax (A)
• There exists a right eigenvector r(A) corresponding to the eigenvalue µmax (A) that has
strictly positive entries, i.e. ri (A) > 0
• The right eigenvector r(A) is unique up to a constant multiple
Note that the same holds for left eigenvectors as the transpose of A also represents a strongly
connected directed graph.
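A minimal numerical illustration of these properties (a sketch using numpy and the 5-node path graph; any connected nondirected graph would do):

```python
import numpy as np

N = 5
# Path graph: connected, nondirected (and bipartite, so -mu_max is also an eigenvalue)
A = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)

mu, V = np.linalg.eigh(A)          # eigh returns eigenvalues in ascending order
mu_max = mu[-1]                    # Perron root
v = V[:, -1] * np.sign(V[0, -1])   # fix the arbitrary overall sign

print(mu_max > 0)                            # True
print(np.max(np.abs(mu)) <= mu_max + 1e-12)  # True: mu_max dominates in modulus
print(np.all(v > 0))                         # True: Perron eigenvector strictly positive
```

Since the path graph is bipartite, −µmax is also an eigenvalue, which illustrates why the domination is in modulus rather than strict.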

8. Exercises

Tutorial 1: Adjacency matrices, paths, and degrees of graphs

Questions marked by an asterisk (∗) should be submitted for feedback.

1. Which of the three graphs below is simple? Which of them are directed? Give for each of
these graphs the vertex set V and the edge set E.

[Diagram: three graphs, referred to throughout these exercises as the first (left), middle,
and right graph; the drawings did not survive text extraction.]

2.∗ Calculate the adjacency matrices for each of the three graphs above, upon relabelling
the nodes of the first graph such that its vertex set becomes V = {1, . . . , 9}.
3. Use the adjacency matrices calculated in the previous exercise to prove that the first
of the three graphs has exactly four paths of length three and no paths of length four or
larger. Argue why we can be sure that the middle and right graphs will contain paths
of any length ℓ > 0.

4. Calculate all indegrees and all outdegrees of the above three graphs.
5. Prove that in nondirected graphs always k_i^in(A) = k_i^out(A). Prove that for nondirected
graphs one always has k_i(A) = (A²)_ii.
6. Prove that in directed graphs always Σ_{i=1}^N k_i^in(A) = Σ_{i=1}^N k_i^out(A).
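For checking answers, the quantities in exercises 2-6 are one-liners in numpy. The sketch below uses a small hypothetical directed graph (a 4-node chain, not one of the graphs drawn above): the entries of A^ℓ count directed paths of length ℓ, and row/column sums give the out-/in-degrees.

```python
import numpy as np

# Hypothetical directed chain 1 -> 2 -> 3 -> 4 (0-based indices in the code).
A = np.zeros((4, 4), dtype=int)
A[0, 1] = A[1, 2] = A[2, 3] = 1

A3 = np.linalg.matrix_power(A, 3)
print(A3.sum())                            # 1: the single path 1->2->3->4
print(np.linalg.matrix_power(A, 4).sum())  # 0: no paths of length 4

k_out = A.sum(axis=1)                      # out-degrees (row sums)
k_in = A.sum(axis=0)                       # in-degrees (column sums)
print(k_in.sum() == k_out.sum())           # True: the identity of exercise 6
```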

Tutorial 2: Cluster coefficient and distance

Questions marked by an asterisk (∗) should be submitted for feedback.


7.∗ Show that in any simple nondirected graph with adjacency matrix A one has
Ci(A) = 2Ti(A)/[ki(A)(ki(A) − 1)], where Ti(A) is the number of triangles in which node i
participates, and ki(A) is its degree.
8. Calculate the clustering coefficients for all nodes in the second and the third of the
above graphs. Why would we not calculate them for the first graph?
9. Let A be the adjacency matrix of a tree. Show that Ci (A) = 0 for all nodes i.
10. Prove the matrix identity (1I − γA)^{−1} = Σ_{ℓ≥0} γ^ℓ A^ℓ. Given a matrix norm |A| that
satisfies the usual conditions (i.e. |A| ∈ IR+, |λA| = |λ| |A| for any λ ∈ IR, |A| = 0
if and only if A = 0, |A1 + A2| ≤ |A1| + |A2|, and |A²| ≤ |A| |A|), show that there
is always a sufficiently small but nonzero value of γ such that the series Σ_{ℓ≥0} γ^ℓ A^ℓ
converges in norm.
11. Consider the triangle graph: the simple nondirected graph on the nodes 1, 2, 3 in which
every pair of nodes is connected by an edge. Show that

    (1I − γA)^{−1} = (1/(1 − γ − 2γ²)) ·  ( 1−γ   γ    γ  )
                                          (  γ   1−γ   γ  )
                                          (  γ    γ   1−γ )

with A the adjacency matrix of the triangle graph. In addition, show that

    lim_{γ↓0}  log[(1I − γA)^{−1}]_{ij} / log γ  =  1 − δ_{i,j}.
12. Consider the N-node graph with N > 2 and the following adjacency matrix entries:
A_ij = δ_{i,j+1 mod N} + δ_{i,j−1 mod N}. Draw the graph for the case of N = 5. Prove that for
this graph

    r, s ∈ {1, . . . , N} :   [(1I − γA)^{−1}]_{rs} = (1/N) Σ_{ℓ=0}^{N−1} e^{2πiℓ(r−s)/N} / [1 − 2γ cos(2πℓ/N)],

where i is the imaginary unit (you may find the geometric series helpful in proving
this, as well as the series (1 − ε)^{−1} = Σ_{m≥0} ε^m; see Calculus lectures).
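The Fourier formula of exercise 12 can be checked numerically against direct matrix inversion; a sketch for N = 5 and a small γ (chosen so that 1I − γA is invertible):

```python
import numpy as np

N, gamma = 5, 0.2
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[i, (i - 1) % N] = 1   # the ring graph of exercise 12

direct = np.linalg.inv(np.eye(N) - gamma * A)

# Fourier-mode formula: average over the N modes ell = 0, ..., N-1
ell = np.arange(N)
fourier = np.zeros((N, N), dtype=complex)
for r in range(N):
    for s in range(N):
        fourier[r, s] = np.mean(
            np.exp(2j * np.pi * ell * (r - s) / N)
            / (1 - 2 * gamma * np.cos(2 * np.pi * ell / N))
        )

print(np.allclose(direct, fourier))  # True
```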

Tutorial 3: Node centrality

Questions marked by an asterisk (∗) should be submitted for feedback.

Figure 34. Example of a tree containing an edge connecting two disjoint regions of size N1
and N2 .

13.∗ Calculate the closeness centrality and the betweenness centrality of nodes i = 2 and
i = 3 in the second graph and the third graph of the first tutorial.
14. Consider the tree in Fig. 34. The edge connecting node 1 to node 2 divides the tree
into two disjoint regions containing N1 and N2 nodes, respectively (with N = N1 + N2 ).
Show that the closeness centrality x1 of node 1 is related to the closeness centrality x2
of node 2 by the formula
    1/x1 + N1/N = 1/x2 + N2/N.                               (222)
15. Consider a tree of size N . Suppose that there exists a node of degree k, say node 1,
such that its removal would divide the tree into k isolated trees of size N1 , N2 , . . ., Nk
(N = N1 + N2 + . . . + Nk + 1). Show that the betweenness centrality of this node is
    y1 = N² − Σ_{α=1}^{k} N_α²                               (223)

Use this result to derive the betweenness centrality of the nodes in the second graph of
Tutorial 1.
16. Calculate the adjacency matrix eigenvalue spectrum {µ1 (A), . . . , µN (A)} of the middle
graph in exercise 1. of the first tutorial. Then determine the eigenvector
centrality vi of all the nodes in the middle graph.
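Identities like Eq. (222) are convenient to verify on a small example. The sketch below assumes the closeness convention x_i = N / Σ_j d_ij (the cancellation argument behind Eq. (222) rests on 1/x_i being proportional to Σ_j d_ij); the tree and the edge (1, 2) are hypothetical examples, not Fig. 34 itself.

```python
from collections import deque

edges = [(1, 2), (1, 3), (2, 4), (2, 5), (4, 6)]   # hypothetical 6-node tree
N = 6
adj = {i: [] for i in range(1, N + 1)}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

def dist_sum(i):
    # sum of graph distances from node i to all other nodes (breadth-first search)
    d = {i: 0}
    q = deque([i])
    while q:
        u = q.popleft()
        for w in adj[u]:
            if w not in d:
                d[w] = d[u] + 1
                q.append(w)
    return sum(d.values())

x1, x2 = N / dist_sum(1), N / dist_sum(2)
N1, N2 = 2, 4    # region sizes on either side of the edge (1, 2): {1,3} and {2,4,5,6}
print(abs(1/x1 + N1/N - (1/x2 + N2/N)) < 1e-12)  # True: Eq. (222) holds
```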

Tutorial 4: Coefficients of similarity between node pairs

Questions marked by an asterisk (∗) should be submitted for feedback.

17.∗ Show that the definition of the Pearson correlation similarity τij(A) can be derived
from the definition of the Pearson correlation of two random variables (u, v), upon
choosing P(u, v) = (1/N) Σ_{k=1}^{N} δ_{u,A_ik} δ_{v,A_jk}.
18. Let x · y denote an inner product on IRN , so that it meets the defining criteria:
(i) (∀x, y, z ∈ IR^N) : (x + y) · z = x · z + y · z
(ii) (∀x, y ∈ IR^N)(∀λ ∈ IR) : x · (λy) = λ x · y
(iii) (∀x, y ∈ IR^N) : x · y = y · x
(iv) (∀x ∈ IR^N) : x · x ≥ 0, with equality if and only if x = 0
Prove the Schwarz inequality: |x·y| ≤ |x||y|. Hint: calculate |x + λy|²|y|² with λ ∈ IR,
and choose a clever value for λ at the end.
19. Explain why the two expressions given for the cosine similarity σij (A) of two nodes
i and j in a nondirected graph, see Eq. (34), are identical. Show that |σij (A)| ≤ 1,
and that σij (A) = 1 if and only if ∂i = ∂j . Hint: define for each node i the vector
a(i) = (Ai1 , Ai2 , . . . , AiN ) ∈ {0, 1}N , and write σij (A) in terms of the two vectors a(i)
and a(j) .
20. Explain why the two expressions given for the Pearson correlation similarity τij(A) of
two nodes i and j in a nondirected graph are identical. Show that |τij(A)| ≤ 1. Hint:
define for each node i the vector a^(i) with entries a_k^(i) = [A_ik − (1/N) Σ_ℓ A_iℓ]/√N, and write
τij(A) in terms of a^(i) and a^(j).
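The vector formulations in exercises 19 and 20 translate directly into code; a sketch on an arbitrary example graph (not one of the graphs from Tutorial 1):

```python
import numpy as np

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]])    # arbitrary simple nondirected example

def cosine_sim(A, i, j):
    # sigma_ij: inner product of neighbourhood vectors, normalised by their lengths
    ai, aj = A[i].astype(float), A[j].astype(float)
    return (ai @ aj) / np.sqrt((ai @ ai) * (aj @ aj))

def pearson_sim(A, i, j):
    # tau_ij: same construction after subtracting each row's mean
    ai = A[i] - A[i].mean()
    aj = A[j] - A[j].mean()
    return (ai @ aj) / np.sqrt((ai @ ai) * (aj @ aj))

sigma = cosine_sim(A, 0, 1)
tau = pearson_sim(A, 0, 1)
print(abs(sigma) <= 1, abs(tau) <= 1)   # True True: both bounded by 1, as proved above
```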

Tutorial 5: Degree distributions and modularity

Questions marked by an asterisk (∗) should be submitted for feedback.

21. Calculate the degree distributions p(k|A) or p(~k|A) for the N -node graphs with the
following adjacency matrices (check carefully whether they are directed or nondirected,
and use the correct degree distribution definition in each case):
(a) Aij = δi,j+1 for j < N , and AiN = 0.
(b) Aij = 1 for all i, j ∈ {1, . . . , N }
(c) Aij = 0 for all i, j ∈ {1, . . . , N }
(d) Aij = 1 if either i, j ∈ {1, . . . , N/2} or i, j ∈ {N/2+1, . . . , N }; Aij = 0 otherwise
(e) Ai1 = 1 for all i > 1, Aij = 0 for all other (i, j).
22.∗ Calculate the joint distribution of degrees of connected node pairs W(k, k′|A) for the
middle graph and right graph in question 1. of the first tutorial. For the middle graph,
compute the marginal W(k|A) and compare your result with the degree distribution
p(k|A) for this graph.
23. Compute the assortativity a(A) for the middle graph and right graph in question 1. of the
first tutorial. Relate the obtained numbers to the graph structure shown in question 1.
24. Show that the degree correlation ratio Π(k, k′|A) of a ‘regular’ simple nondirected graph
A, i.e. one with p(k|A) = δ_{k,k∗} for some k∗ ∈ IN, is always equal to 1 for any (k, k′).
25. Prove that W1(~k|A) = p(~k|A) k^in/k̄(A) and that W2(~k′|A) = p(~k′|A) k′^out/k̄(A).
26. Prove the following general bounds for the modularity: −1 ≤ Q(A) ≤ 1.
27. Assign the following module labels to the nodes of the right graph in exercise 1:
x1 = x2 = 1, x3 = x4 = x5 = 2. Calculate the graph’s modularity Q(A). Next
turn to graphs (b) and (d) in exercise 21. Assign the following module labels to the
nodes: xi = 1 for i ≤ N/2 and xi = 2 for i > N/2 (take N to be even). Calculate the
modularity Q(A) for both graphs.
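The empirical degree distribution p(k|A) = (1/N) Σ_i δ_{k,k_i(A)} used throughout this tutorial is a few lines of numpy; a sketch on an arbitrary simple nondirected example (not one of the matrices (a)-(e) above):

```python
import numpy as np
from collections import Counter

A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 1],
              [0, 1, 0, 0],
              [1, 1, 0, 0]])          # arbitrary simple nondirected example

k = A.sum(axis=1)                     # degrees: [2, 3, 1, 2]
N = len(A)
p = {deg: cnt / N for deg, cnt in Counter(k.tolist()).items()}  # p(k|A)
print(p)                              # {2: 0.5, 3: 0.25, 1: 0.25}
```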

Tutorial 6: Macroscopic properties of graphs

Questions marked by an asterisk (∗) should be submitted for feedback.


28. Show that the total number of links in a directed graph can be written as either
L = Σ_{i=1}^N k_i^in(A) or L = Σ_{i=1}^N k_i^out(A). Show that in simple nondirected graphs the
number of links is L = (1/2) N k̄(A), where k̄(A) = N^{−1} Σ_{i≤N} k_i(A) is the average degree.
29. Let G = (V, E) be a bipartite graph with the two disjoint subsets V1 and V2 , such that
all edges (i, j) ∈ E have either i ∈ V1 and j ∈ V2 or i ∈ V2 and j ∈ V1. Let k̄1 be the
mean degree of nodes i ∈ V1 and let k̄2 be the mean degree of nodes i ∈ V2. Show that

    k̄1 = (N2/N1) k̄2,                                        (224)
where N1 = |V1 | and N2 = |V2 |.
30. Prove the following identities for the density ρ(A) of a graph with adjacency matrix A:

    directed graphs :            ρ(A) = k̄^in(A)/N
    nondirected graphs :         ρ(A) = k̄(A)/(N+1) + Σ_{i=1}^N A_ii /N(N+1)
    simple nondirected graphs :  ρ(A) = k̄(A)/(N−1)
31.∗ Calculate the diameter d(A) and the degree distribution p(k|A) for the middle and
the right graph in question 1. of the first tutorial.
32. Consider the following degree distribution for an infinitely large, nondirected graph:
p(k) = e^{−q} q^k /k!, ∀k ∈ IN, with q a positive number. Calculate the average degree
⟨k⟩ = Σ_{k≥0} p(k) k and the degree variance σ_k² = ⟨k²⟩ − ⟨k⟩².

33. Consider the following degree distribution for an infinitely large, nondirected graph:
p(k) = C e^{−k}, ∀k ∈ IN. Give a formula for the constant C. Calculate the average degree
⟨k⟩ = Σ_{k≥0} p(k) k and the degree variance σ_k² = ⟨k²⟩ − ⟨k⟩².
34. Consider the following power-law distribution

    p(k) = 0 for k = 0;   p(k) = C_N k^{−γ} for k ∈ {1, 2, . . . , N};   p(k) = 0 for k > N,     (225)

which for N → ∞ is the degree distribution of an infinitely large graph.
Calculate the normalization constant C_N. For which γ values is p(k) normalisable for
N → ∞? Give formulas for the mean value ⟨k⟩ and the variance σ_k² = Σ_{k≥0} p(k)k² − ⟨k⟩².
For which γ values are ⟨k⟩ and σ_k² finite in the limit N → ∞? Give an estimate of ⟨k⟩ and
σ_k² for γ = 2.5 and N = 10,000, using the approximation Σ_{k=1}^{N} k^{−λ} ≈ ∫_1^N dk k^{−λ}.
k=1 k
92

Tutorial 7: Spectra of adjacency matrices

Questions marked by an asterisk (∗) should be submitted for feedback.

35.∗ Show how the mean degree k̄(A) and the total number of triangles T(A) in
a simple nondirected N-node graph can be calculated directly from the spectrum
{µ1(A), . . . , µN(A)} of its adjacency matrix.
36. Calculate the adjacency matrix eigenvalue spectrum {µ1 (A), . . . , µN (A)} of the middle
graph in exercise 1. of the first tutorial. Use your result to calculate the average degree,
and to prove that this graph has no closed paths of odd length.
37. Consider a star graph with N nodes centred around the node labelled 1. The entries
of the adjacency matrix A∗ of this star graph are given by

    A∗_ij = δ_{i,1} + δ_{j,1} − 2δ_{i,1} δ_{j,1},   i, j = 1, 2, . . . , N,          (226)

where δ_{i,j} is the Kronecker delta function. As an example, draw a star graph of size
N = 5. Show that µ = √(N−1) is an eigenvalue of a star graph of size N, and derive
an expression for an eigenvector associated with this eigenvalue.
38. Show that

    µmax(A) ≥ √(kmax(A)),                                    (227)

where A is the adjacency matrix of a simple, nondirected graph, kmax(A) is its
largest degree, and µmax(A) is its largest eigenvalue.
Tip: use the inequality µmax(A) ≥ (x·Ax)/(x·x) for an appropriate vector x ∈ IR^N; namely,
choose an x that is closely related to the eigenvector associated with the maximal
eigenvalue of the star (sub)graph centred around the node with maximal degree kmax(A).
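Both statements can be checked with numpy; a sketch (the largest adjacency eigenvalue of a star of size N is √(N−1); the second graph B is an arbitrary example):

```python
import numpy as np

# Exercise 37: star of size N with the hub at index 0.
N = 5
A = np.zeros((N, N))
A[0, 1:] = A[1:, 0] = 1
mu = np.linalg.eigvalsh(A)                   # ascending eigenvalues
print(np.isclose(mu[-1], np.sqrt(N - 1)))    # True: mu_max = sqrt(N-1)

# Exercise 38 bound, mu_max >= sqrt(k_max), on an arbitrary simple nondirected graph.
B = np.array([[0, 1, 1, 0],
              [1, 0, 1, 1],
              [1, 1, 0, 0],
              [0, 1, 0, 0]], dtype=float)
mu_max = np.linalg.eigvalsh(B)[-1]
k_max = B.sum(axis=1).max()
print(mu_max + 1e-9 >= np.sqrt(k_max))       # True
```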

Tutorial 8: Laplacians and diffusion processes

Questions marked by an asterisk (∗) should be submitted for feedback.

39.∗ Calculate the Laplacian matrix L of the middle graph in exercise 1, and its eigenvalue
spectrum. Hint: write L = 1I+B and first find the eigenvalues of B, where 1I is the unity
matrix. Use your result to prove that this graph has only one connected component.
40. Use the results of the previous exercise to solve the dynamical equations describing a
diffusion process on the middle graph in exercise 1, that starts with zi (0) = z0 δi2 (i.e.
diffusion from the central node i = 2). Verify that your result makes sense for t = 0
and in the limit t → ∞. Verify that the quantity Σ_{i=1}^N z_i(t) is conserved over time.
41. Show that for regular N -node graphs, i.e. those for which all N degrees ki (A) are
identical, one can express the Laplacian eigenvalue spectrum in terms of the adjacency
matrix eigenvalue spectrum. Give the mathematical relation between the two spectra
in explicit form.
42. Consider the N-node graph with the following adjacency matrix entries, with N > 2:
A_ij = δ_{i,j+1 mod N} + δ_{i,j−1 mod N}. Calculate the adjacency matrix spectrum ϱ(µ|A) and
the Laplacian spectrum ϱ_Lap(µ|A). Hints: use the result of the previous exercise, and
try Fourier modes x_k = e^{iωk} as an ansatz for the eigenvectors. Confirm that the smallest
eigenvalue of the Laplacian is zero, and use the spectrum to determine the number of
connected components in the graph.
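The Laplacian spectrum of the ring graph in exercise 42 can be confirmed numerically: for this 2-regular graph L = 2·1I − A, so the Laplacian eigenvalues are 2 − 2cos(2πℓ/N). A sketch:

```python
import numpy as np

N = 8
A = np.zeros((N, N))
for i in range(N):
    A[i, (i + 1) % N] = A[i, (i - 1) % N] = 1
L = np.diag(A.sum(axis=1)) - A                 # graph Laplacian

lam = np.sort(np.linalg.eigvalsh(L))
predicted = np.sort(2 - 2 * np.cos(2 * np.pi * np.arange(N) / N))
print(np.allclose(lam, predicted))             # True
print(abs(lam[0]) < 1e-9, lam[1] > 1e-9)       # True True: exactly one zero mode,
                                               # i.e. one connected component
```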

Tutorial 9: Erdős-Rényi model

Questions marked by an asterisk (∗) should be submitted for feedback.

44.∗ Consider the Erdős-Rényi model with parameter p∗. What probability does the Erdős-
Rényi model assign to the middle graph and the right graph in question 1. of the first
tutorial?
45. Consider the following variant of the Erdős-Rényi model. Let (G, p) be the random graph
ensemble where G is the set of simple and nondirected graphs of size N , and where p
associates a probability

    p(A) = Π_{j>i=1}^{N} [ γij δ_{A_ij,1} + (1 − γij) δ_{A_ij,0} ]                  (228)

to each graph, with γij ∈ [0, 1].


Compute an explicit expression for the mean degree ⟨k̄(A)⟩ as a function of the
parameters γij.
46. Consider the finite connectivity limit

    γij = κi κj / Σ_{k=1}^{N} κk                             (229)

where the κi are of order O_N(1). Compute the average degrees ⟨ki(A)⟩ and express the
probability given by Eq. (228) in terms of the average degrees ⟨ki(A)⟩.
47. For the Erdős-Rényi model we know that ⟨k̄(A)⟩ = p∗(N−1). Show that the variance
σ_k̄² = ⟨k̄²(A)⟩ − ⟨k̄(A)⟩² is given by the expression

    σ_k̄² = 2p∗(1 − p∗) (N−1)/N.                              (230)
48. Compute the variance σ_k̄² for the Erdős-Rényi model in the finite connectivity regime
and express it in terms of ⟨k⟩ for N → ∞. What can you conclude from the result?
49. Show that in the Erdős-Rényi model the probability that k̄(A) is equal to 2n/N is given
by

    Prob( k̄(A) = 2n/N ) = \binom{N(N−1)/2}{n} (p∗)^n (1 − p∗)^{N(N−1)/2 − n},      (231)

where n ∈ {0, 1, 2, . . . , N(N−1)/2}.
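A Monte Carlo spot-check of ⟨k̄(A)⟩ = p∗(N−1) (a sketch; N, p∗ and the random seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
N, p_star = 200, 0.1

# Sample a simple nondirected Erdos-Renyi graph: each upper-triangular pair
# gets an edge independently with probability p_star.
U = rng.random((N, N))
A = np.triu((U < p_star).astype(int), k=1)
A = A + A.T                         # symmetric, zero diagonal

k_bar = A.sum(axis=1).mean()
print(k_bar)                        # close to p_star*(N-1) = 19.9
```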

Tutorial 10: Random graphs with a prescribed degree distribution

Questions marked by an asterisk (∗) should be submitted for feedback.

50. Show for the degree distribution

    p(k) = (q/(1+q))^k /(1+q)                                (232)

without using its generating function that Σ_{k≥0} p(k) = 1 and Σ_{k≥0} p(k) k = q.
51.∗ Calculate the generating function G(x) for the degree distribution
    p(k) = α δ_{k,q1} + (1−α) e^{−q2} q2^k /k!,
with α ∈ [0, 1] and q1, q2 ∈ IN.
52. Confirm that the three generating functions for regular, Poissonian, and exponential
random graphs all obey: G(0) = p(0), G(1) = 1, and lim_{x→1} x (d/dx) G(x) = ⟨k⟩. Use the
generating functions to determine expressions for ⟨k²⟩ for the three canonical ensembles.
53. Let G(x) be the generating function of p(k) and H(x) be the generating function of
p(k)k/⟨k⟩. Show that

    H(x) = x ∂x G(x) / ⟨k⟩.                                  (233)
54. Derive the generating function H(x) for the Poisson ensemble with mean degree c. Use
the generating function H to derive an expression for the mean and variance of the
distribution p(k)k/⟨k⟩. Compare with the known results for the Poisson distribution.
55. Consider a random regular graph with degree distribution

    p(k) = δ_{k,3}.                                          (234)

Due to a failure or a catastrophic event, half of the edges of the graph are destroyed.
What is the degree distribution of the graph after the catastrophic event? Does the
graph still have a giant component after half of its edges are destroyed?
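The generating-function identities of exercise 52 can be spot-checked numerically for the Poisson case p(k) = e^{−c} c^k/k! (a sketch; c = 3 and the truncation K = 60 are arbitrary choices, and the derivative is taken by central differences):

```python
from math import exp, factorial

c, K = 3.0, 60
p = [exp(-c) * c**k / factorial(k) for k in range(K)]   # truncated Poisson distribution

def G(x):
    # generating function G(x) = sum_k p(k) x^k
    return sum(pk * x**k for k, pk in enumerate(p))

def Gprime(x, h=1e-6):
    # central-difference derivative of G
    return (G(x + h) - G(x - h)) / (2 * h)

print(abs(G(0) - p[0]) < 1e-12)   # True: G(0) = p(0)
print(abs(G(1) - 1) < 1e-10)      # True: normalisation G(1) = 1
print(abs(Gprime(1) - c) < 1e-4)  # True: G'(1) = <k> = c
```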

Tutorial 11: Percolation transition in random graphs



Let

    G(x) = αx + (1 − α)/(1 + c(1 − x))

be the generating function of the degree distribution pα(k) of an ensemble of simple
nondirected graphs, where α ∈ [0, 1].
56. Provide an explicit expression for the degree distribution pα (k).
57. Derive an expression for the mean degree ⟨k⟩ = Σ_{k≥0} pα(k) k and the variance
var(k) = Σ_{k≥0} pα(k) k² − ⟨k⟩².
58. Derive an explicit expression for the generating function H(x) of the degree distribution
qα(k) = pα(k)k/⟨k⟩, where pα(k) is the degree distribution associated with the
generating function G(x).
59. Show that the critical values of the parameters (α∗, c∗) at which a giant component
appears in this ensemble (i.e., the percolation transition point) satisfy

    α∗ = c∗(2c∗ − 1) / [c∗(2c∗ − 1) + 1].                    (235)
60. Discuss the two limiting cases α → 0 and α → 1. Sketch the critical line (α∗ , c∗ ) in the
plane of (α, c) parameters.
61. Let us now focus on the case α = 0. Show that the smallest nonnegative solution
of the equation y = H(y)/y is given by

    y = 1                                    for c < 1/2,
    y = (1/2)[ (c+2)/c − √((c+4)/c) ]        for c > 1/2,    (236)

and that the relative size f of the giant component is given by

    f = 0                                    for c < 1/2,
    f = 1 − 2/[ c + √(c(c+4)) ]              for c ≥ 1/2.    (237)
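Equations (236) and (237) can be verified numerically for α = 0: there H(y)/y = G'(y)/⟨k⟩ = 1/(1 + c(1−y))², and the smallest nonnegative solution is reached by fixed-point iteration from y = 0. The relation f = 1 − G(y) for the giant-component size is assumed in the sketch below, along with the arbitrary choice c = 2 > 1/2:

```python
from math import sqrt

c = 2.0
y = 0.0
for _ in range(200):                      # monotone iteration converges to the smallest root
    y = 1.0 / (1.0 + c * (1.0 - y)) ** 2

y_exact = 0.5 * ((c + 2) / c - sqrt((c + 4) / c))   # closed form, Eq. (236)
f_exact = 1 - 2 / (c + sqrt(c * (c + 4)))           # closed form, Eq. (237)
f_numeric = 1 - 1 / (1 + c * (1 - y))               # f = 1 - G(y) with alpha = 0

print(abs(y - y_exact) < 1e-10)          # True
print(abs(f_numeric - f_exact) < 1e-10)  # True
```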
