LECTURE 10
RANDOMIZED ALGORITHMS
Learning Outcomes
Understand the concept of randomization
Analyse the randomized quicksort and min cut algorithms
Randomized Algorithm
Makes use of randomness in its computation (in other words, it
flips coins during execution to decide what to do next)
E.g.: quicksort that uses a random pivot
Works not only on the input but also on the random choices
made by the algorithm
[Figure: input and random bits/numbers feed into the algorithm, which produces the output]
During execution, it makes random choices depending on those
random numbers
The output can vary when the algorithm is run multiple times on
the same input
E.g.: the random seed used to shuffle training data before it is fed into a model
Randomized Algorithm
Example (recall what you have learned in probability)
Toss a coin twice
The outcomes are: {HH, HT, TH, TT}, each with
probability ¼
Random variable: assigns a number to each elementary
event. For e.g., define X as the number of heads: X(HH)
= 2, X(HT) = X(TH) = 1, X(TT) = 0
We can compute the expected value of the random variable:
E[X] = 2(1/4) + 1(1/4) + 1(1/4) + 0(1/4) = 1
For any two random variables X, Y, we have E[X+Y] =
E[X] + E[Y]
However, it is not always the case that E[X.Y] = E[X].E[Y],
unless X and Y are independent
Probability Theory
To analyse randomized algorithms, we need to understand probability
From tossing a coin twice, we know that the sample space =
elementary events = possible outcomes of the experiment = {HH, HT,
TH, TT}
Events = subsets of the sample space. For e.g., "at least one head"
gives the event {HH, HT, TH}
Each event has a certain probability. For e.g.: uniform probability,
when each elementary event has equal probability: Pr[HH] = Pr[TH] = Pr[HT] =
Pr[TT] = ¼
Then, we can proceed to random variables and the expected value of
a random variable.
Instead of a single random variable X, we can add in more random
variables, e.g. Y.
For any two random variables X, Y we have E[X+Y] = E[X] +
E[Y]
But in general E[X.Y] ≠ E[X].E[Y] (equality holds when X and Y are independent)
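These identities can be checked by direct enumeration. A minimal sketch in Python, where the `expect` helper and the choice of Y (indicator that the first toss is heads, which is dependent on X) are illustrative, not from the lecture:

```python
from fractions import Fraction
from itertools import product

# Sample space of two fair coin tosses; each elementary event has probability 1/4.
outcomes = list(product("HT", repeat=2))
p = Fraction(1, 4)

def expect(f):
    """Expected value of a random variable f over the uniform sample space."""
    return sum(p * f(o) for o in outcomes)

X = lambda o: o.count("H")              # number of heads
Y = lambda o: 1 if o[0] == "H" else 0   # indicator: first toss is heads

print(expect(X))                                               # E[X] = 1
print(expect(lambda o: X(o) + Y(o)) == expect(X) + expect(Y))  # linearity holds
# X and Y are dependent, so E[X.Y] differs from E[X].E[Y]:
print(expect(lambda o: X(o) * Y(o)), expect(X) * expect(Y))
```

Linearity of expectation holds even for dependent variables, which is exactly why it is so useful in the quicksort analysis later.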
Randomized Algorithm
An RA can be seen as a nondeterministic-like algorithm which
has a probability assigned to each possible transition
Or as a probability distribution on a set of deterministic
algorithms
Traditional worst-case and average-case analyses have
No random variables and
No expectations
When analysing a randomized algorithm, we need to consider the
random variables
The reported cost for an input x is not a maximum but an
expectation over the algorithm's random choices
Randomized Algorithm
Randomized algorithm A
As a probability distribution on deterministic
algorithms:
[Figure: A branches into deterministic algorithms, e.g. with probabilities ¼ and ½]
Where does the randomness come from?
Theoretically, we do not care about how we get the
random bits
Practically, there are three choices, from strongest to weakest:
Physical randomness
A physical phenomenon that is expected to be
random
Cryptographically-secure pseudorandomness:
choose a function that is cryptographically secure
E.g.: the /dev/urandom device in Linux systems
Statistical pseudorandomness: choose a function
that is not cryptographically secure
E.g.: the rand function in the standard C library
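In Python the last two tiers are directly available from the standard library; a small sketch (the tier labels follow this lecture's classification):

```python
import os
import random
import secrets

# Statistical pseudorandomness: fast and reproducible, but NOT
# cryptographically secure (analogous to rand() in the C library).
rng = random.Random(12345)     # a fixed seed reproduces the same sequence
print(rng.random())

# Cryptographically-secure pseudorandomness: bytes from the OS entropy
# source, which is /dev/urandom on Linux systems.
print(os.urandom(8).hex())

# The secrets module wraps the same OS source with a convenient API.
print(secrets.randbelow(100))
```

Reproducibility from a seed is a feature for debugging randomized algorithms, and a liability for security.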
Advantages of Randomised Algorithms
For certain problems, it is simpler
Fast with high probability (for e.g.: quicksort with a
random pivot; determining whether a number is
prime)
Produces the optimum output with very high probability
Plays a more important role in parallel algorithms than in
sequential algorithms
A general design technique to improve algorithms with
bad worst-case but good average-case complexity
Difficulties
Analysing randomized algorithms can be difficult
The running time
The probability of getting the correct answer
The result truly depends on the quality of the random
numbers
Truly random numbers are impossible in practice, so we
need to depend on pseudorandom numbers
Domains involving Randomized Algorithms
Various research directions:
Sample generation in various sets
Bounds on the sample complexity
Stochastic gradient algorithms for control design
RAs for specific applications
Quantum algorithms are randomised
Combination of deterministic/randomised techniques
RAs for nonconvex problems
Randomized Algorithm
Two types of bounds on cost:
Expected bounds
Expected worst-case performance, i.e. the average
amount of time on the worst input of a given size n
High probability bounds
To show that the algorithm does not consume
too much resource most of the time
Randomized Algorithm
Expected bounds represent the average case across all
random numbers used in the algorithm
Easier to get
E.g.: 100 workers, each using an average of 2 hours to
complete an assignment; the expected cost is
100*2=200 hours
High probability bounds
Harder to prove
E.g.: with 100 workers each using an average of 2 hours to
complete an assignment, we cannot say that the
maximum time among all workers is 2 hours. It is
possible that 1 worker uses 101 hours while the other 99 use
1 hour each to complete the assignment, which still gives an
average of 2 hours.
Classes of Randomized Algorithms
Las Vegas
Always correct; expected running time "probably
fast"
Fails with some probability, but we can tell when it
fails
Can run it again until it succeeds
E.g.: randomized quicksort
Las Vegas is preferable
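A toy illustration of the Las Vegas pattern (this example is mine, not from the lecture): rejection sampling of a uniform point in the unit disk. The answer is always correct; only the running time is random, and a detectable failure just triggers another attempt.

```python
import random

def random_point_in_disk():
    """Retry until success; the result is guaranteed correct when returned."""
    while True:
        x = random.uniform(-1.0, 1.0)
        y = random.uniform(-1.0, 1.0)
        if x * x + y * y <= 1.0:   # success: the point lies in the disk
            return (x, y)
        # failure is detectable, so we simply run again

x, y = random_point_in_disk()
print(x * x + y * y <= 1.0)        # always True
```

Each attempt succeeds with probability π/4, so the expected number of attempts is about 1.27: "probably fast".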
Classes of Randomized Algorithms
Monte Carlo (mostly correct):
Probably correct; guaranteed running time
Fails with some probability, but we cannot tell when
it fails
We can only reduce the probability of failure by
running it many times and taking the majority of the
answers
Why do we want to use it?
High probability that it will not be wrong
The cost of the deterministic algorithm is high
E.g.: Karger's algorithm, primality testing (whether a
number is prime), polynomial equality-testing
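A sketch of the polynomial equality-testing example (the coefficient-list representation, trial count, and modulus below are my illustrative choices): evaluate both polynomials at random points; a mismatch proves they differ, while repeated agreement makes equality very likely.

```python
import random

def probably_equal(p, q, trials=20, modulus=10**9 + 7):
    """Monte Carlo test: p, q are coefficient lists (lowest degree first)."""
    def evaluate(coeffs, x):
        return sum(c * pow(x, i, modulus) for i, c in enumerate(coeffs)) % modulus
    for _ in range(trials):
        x = random.randrange(modulus)
        if evaluate(p, x) != evaluate(q, x):
            return False           # definitely unequal: no false negatives
    return True                    # equal with high probability

print(probably_equal([1, 2, 1], [1, 2, 1]))   # (x+1)^2 vs x^2+2x+1: True
print(probably_equal([1, 2, 1], [2, 2, 1]))   # differ: False
```

A nonzero polynomial of degree d has at most d roots, so each trial catches a difference with probability at least 1 - d/modulus; the failure probability shrinks exponentially with the number of trials.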
Randomized Quick Sort
Goal: to sort a sequence S with n elements into
increasing/decreasing order
Previous designs to solve the problem:
Choose the first/last element, or possibly the median,
as the pivot t
Divide S into two sequences, with S1 smaller than
the pivot t and S2 larger than t
Recursively sort both S1 and S2
Merge the sequences into a unique sorted sequence
Randomized Quick Sort
Worst case occurs when:
The input is sorted or reverse sorted
We partition around the min or max element
One side of the partition always has no elements
In the worst-case recursion tree with one side empty, we get an arithmetic
progression:
(n-1) + (n-2) + ... + 1 = n(n-1)/2, i.e. Θ(n²)
If we are lucky, the sequence splits evenly and the complexity is n log n
Randomized Quick Sort
Assume that we always split the sequence in the ratio 1/10 :
9/10 (a proportional split)
When we are lucky, the complexity is n log n
If we are unlucky, the complexity is n²
Quick Sort
Any split of constant proportionality will yield a recursion tree of
depth O(log n), giving complexity O(n log n)
How to make sure that we are always lucky?
Partition around the "middle" so that the sequence is always
split equally
Randomization (works well in practice)
When the pivot element is randomly selected, each
element is equally likely to be selected
Partition around a random element, so that:
The running time is independent of the input order
No assumptions need to be made about the input distribution
The worst case is determined only by the output of a random-
number generator
Randomized Quick Sort
We focus on the comparisons.

Randomized-Partition(A, p, r)
    i = Random(p, r)                // select a random pivot index
    exchange A[r] with A[i]
    return Partition(A, p, r)       // proceed with the usual partition

Randomized-Quicksort(A, p, r)
    if p < r
        q = Randomized-Partition(A, p, r)
        Randomized-Quicksort(A, p, q - 1)
        Randomized-Quicksort(A, q + 1, r)

Partition(A, p, r)
    x = A[r]                        // x is the pivot element
    i = p - 1
    for j = p to r - 1              // this loop compares each A[j] with the pivot
        if A[j] <= x
            i = i + 1
            exchange A[i] with A[j]
    exchange A[i+1] with A[r]
    return i + 1
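The pseudocode above translates directly into runnable Python; a minimal sketch (function names mirror the pseudocode, 0-based indices):

```python
import random

def partition(A, p, r):
    x = A[r]                        # x is the pivot element
    i = p - 1
    for j in range(p, r):           # compare each A[j] with the pivot
        if A[j] <= x:
            i += 1
            A[i], A[j] = A[j], A[i]
    A[i + 1], A[r] = A[r], A[i + 1]
    return i + 1

def randomized_partition(A, p, r):
    i = random.randint(p, r)        # select a random pivot index
    A[r], A[i] = A[i], A[r]
    return partition(A, p, r)

def randomized_quicksort(A, p, r):
    if p < r:
        q = randomized_partition(A, p, r)
        randomized_quicksort(A, p, q - 1)
        randomized_quicksort(A, q + 1, r)

data = [5, 2, 9, 1, 7, 3]
randomized_quicksort(data, 0, len(data) - 1)
print(data)                         # [1, 2, 3, 5, 7, 9]
```

Note that the output is always the correctly sorted sequence (Las Vegas); only the number of comparisons varies between runs.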
Randomized Quick Sort Analysis 1
Assume we have a set S
si = the ith smallest element of S
Consider each possible pivot, k = 1, ..., n
s1 is chosen as pivot with probability 1/n: sub-
problems of sizes 0 and n-1
This means that the other n-1 elements are
compared with s1
sk is chosen as pivot with probability 1/n: sub-
problems of sizes k-1 and n-k
sn is chosen as pivot with probability 1/n: sub-
problems of sizes n-1 and 0
Randomized Quick Sort Analysis 1
Let T(n) = random variable for the running time of randomized quick
sort on an input of size n
Assume the random numbers are independent
All splits are equally likely to occur, each with probability 1/n
Assume the elements are distinct
Thus, each split k occurs with probability 1/n
Recurrence:
E[T(n)] = (1/n) * sum over k=1..n of (E[T(k-1)] + E[T(n-k)]) + Θ(n)
We have seen this before in divide and conquer:
it solves to O(n log n)
By induction: it is true for k < n (E[T(k)] ≤ c k log k), and we need to show
E[T(n)] ≤ c n log n
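The recurrence can also be checked numerically. A sketch, assuming n-1 comparisons as the partition cost (an assumption, since the slide only says Θ(n)); the closed form 2(n+1)Hn - 4n is the classical solution of this recurrence:

```python
import math

# Solve E[T(n)] = (1/n) * sum_{k=1}^{n} (E[T(k-1)] + E[T(n-k)]) + (n - 1)
# bottom-up, and compare against the closed form 2(n+1)H_n - 4n.

N = 1000
T = [0.0] * (N + 1)
running_sum = 0.0                   # sum of T[0..n-1] (T[0] = 0)
for n in range(1, N + 1):
    # sum_{k=1}^{n} T[k-1] + T[n-k] equals 2 * sum_{j=0}^{n-1} T[j]
    T[n] = 2.0 * running_sum / n + (n - 1)
    running_sum += T[n]

H = sum(1.0 / k for k in range(1, N + 1))
closed_form = 2 * (N + 1) * H - 4 * N
print(T[N], closed_form)            # the two values agree
print(T[N] / (N * math.log(N)))     # this ratio tends to 2 as n grows
```

The prefix-sum trick turns the O(n²)-looking recurrence evaluation into a single O(n) pass.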
Randomized Quick Sort Analysis 2
Present the execution as a tree:
Let X = total number of comparisons across all calls to Partition:
X = sum over all pairs i < j of Xij,
where Xij = 1 if si is compared to sj, and 0 otherwise
Randomized Quick Sort Analysis 2
To obtain the average number of comparisons; in probability, this is
called the expected value.
To get the expected number of comparisons:
If pij is the probability that si is compared to sj during
execution, then E[X] = sum over i < j of E[Xij] = sum over i < j of pij
Randomized Quick Sort Analysis 2
pij = probability that si is compared to sj
Recall the number of comparisons when sk is the pivot:
... sk-2, sk-1, sk, sk+1, ...
sk is compared with the k-1 smaller elements and the n-k larger elements
Now, we are interested to know the probability for a pair to be
randomly chosen and compared
Randomized Quick Sort Analysis 2
To compute pij:
si is compared with sj iff si or sj is chosen as a pivot
before any other sl, i < l < j
Since each element is chosen as the pivot with the same
probability, pij = 2/(j-i+1), where j-i+1 represents the total number of
elements in {si, ..., sj}
If si is selected as the pivot, it is compared with sj; if some other
element sl between them is selected first, si and sj fall into different
subtrees and are never compared
The first pivot among {si, ..., sj} is chosen uniformly at random
from these j-i+1 elements, so
pij = P(si is chosen first) + P(sj is chosen first) = 1/(j-i+1) + 1/(j-i+1)
= 2/(j-i+1)
Randomized Quick Sort Analysis 2
Harmonic series (when we have a sum that we cannot
evaluate directly, we bound it using the harmonic series):
Harmonic series (refer MathWorld): Hn = 1 + 1/2 + 1/3 + ... + 1/n
Hn = ln n + O(1)
Fitting our sum to the harmonic series, we
obtain the following:
E[X] = sum over i < j of 2/(j-i+1) ≤ 2n·Hn ≈ 2n log n, hence O(n log n)
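This bound can be checked empirically by counting comparisons; a sketch (the instrumented quicksort below is my helper, not lecture code), comparing the observed average against the exact sum of the pij:

```python
import random

def quicksort_comparisons(a):
    """Randomized quicksort on a copy of `a`; returns the comparison count."""
    a = list(a)
    count = 0
    def sort(p, r):
        nonlocal count
        if p >= r:
            return
        i = random.randint(p, r)            # random pivot
        a[i], a[r] = a[r], a[i]
        x, store = a[r], p
        for j in range(p, r):
            count += 1                      # one comparison with the pivot
            if a[j] <= x:
                a[j], a[store] = a[store], a[j]
                store += 1
        a[store], a[r] = a[r], a[store]
        sort(p, store - 1)
        sort(store + 1, r)
    sort(0, len(a) - 1)
    return count

n, trials = 200, 500
average = sum(quicksort_comparisons(range(n)) for _ in range(trials)) / trials
predicted = sum(2.0 / (j - i + 1) for i in range(n) for j in range(i + 1, n))
print(average, predicted)   # empirical average matches sum over i<j of p_ij
```

Averaged over enough trials, the measured comparison count converges to sum over i<j of 2/(j-i+1), exactly as the analysis predicts.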
Karger’s Min Cut Algorithm
Goal: finding the minimum cut in an undirected graph (a global
minimum cut)
Let G = (V,E) be a connected, undirected, loopfree multigraph with n
vertices
Multigraph = graph where more than one edge may exist between two
vertices
A cut = a partition of the vertex set V into two disjoint nonempty sets V1, V2,
where V = V1 ∪ V2 and V1 ∩ V2 = Ø
[Figure: an example graph on vertices A-F]
E.g.: V1 = {A,C}; V2 = {B,D,E,F}
We need to remove 5 crossing edges
Size of the cut = number of crossing edges
We aim to minimise the size of the cut
Karger’s Min Cut Algorithm
Input: a connected loopfree multigraph G = (V,E) with
at least 2 vertices

while |V| > 2
    select e ∈ E uniformly at random
    G := G/e            // contract edge e
    remove self loops
return |E|

For the two remaining nodes u1 and u2, set V1 =
{nodes contracted into u1} and V2 = {nodes contracted into u2}
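A minimal sketch of the contraction loop in Python, assuming the multigraph is given as an edge list on vertices 0..n-1 (this representation and the label-array bookkeeping are my choices):

```python
import random

def karger_cut(n, edges):
    """One run of the contraction algorithm; returns the size of the cut found."""
    label = list(range(n))                 # live label of each original vertex

    def find(u):                           # follow merges to the current label
        while label[u] != u:
            label[u] = label[label[u]]
            u = label[u]
        return u

    edges = list(edges)
    vertices = n
    while vertices > 2:
        u, v = random.choice(edges)        # select e in E uniformly at random
        label[find(v)] = find(u)           # contract: G := G/e
        vertices -= 1
        # remove self loops created by the contraction
        edges = [(a, b) for (a, b) in edges if find(a) != find(b)]
    return len(edges)                      # |E| between the two remaining nodes

def min_cut(n, edges, runs=100):
    """Repeat and keep the smallest cut seen (correct with high probability)."""
    return min(karger_cut(n, edges) for _ in range(runs))

# Complete graph K4: the minimum cut isolates one vertex and has size 3.
k4 = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
print(min_cut(4, k4))   # 3
```

A single run is Monte Carlo: it returns some valid cut, not necessarily the minimum; `min_cut` boosts the success probability by repetition, as analysed in the following slides.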
Example
[Figure: a multigraph on vertices {0, 1, 2, 3} with edges a, b, c, d, e]
Assume the random generator selects edge 'a' (probability 1/5):
vertices 0 and 1 are contracted into a single vertex {0,1}
Assume the random generator then selects edge 'd' (probability ¼ + ¼ = ½,
as there are two parallel edges connecting {0,1} and {3}):
{0,1} and 3 are contracted; remove the self loop 'c'
Stop when only 2 vertices are left.
However, it may go wrong!
[Figure: a graph on vertices {a, b, c, d, e, f, g, h} whose minimum cut
separates {a,b,c,d} from {e,f,g,h}]
If the selected random edge is {b,e}, then after contracting it
there is no way to split {a,b,c,d} from {e,f,g,h}.
The same thing happens if edge {d,e} is selected.
Exercise
[Figure: a graph on vertices {1, 2, 3, 4, 5, 6} with edges a, b, c, d, e, f, g, h]
Assume that the random sequence of edges
selected is: {d, f, e, b} when Karger's Min Cut
algorithm is applied. Illustrate the steps of the
changes in the graph. List down V1 and V2 and the
edges to remove.
Analyse Karger’s Min Cut Algorithm
Fact 1. If degree(u) denotes the number of edges touching
vertex u, then the sum over all u ∈ V of degree(u) = 2|E|
This is the handshaking lemma: the total sum of the
degrees of the vertices equals 2 * the number of
edges
Each edge contributes two to the total degree
Fact 2. If there are n vertices, the average degree of a
vertex is 2|E|/n (referring to Fact 1)
E[degree(X)] = (1/n) * sum over u ∈ V of degree(u)
= 2|E|/n
where E is the expectation and X is a random variable
representing a uniformly random vertex of the graph
Analyse Karger’s Min Cut Algorithm
Fact 3. The size of the minimum cut is at most 2|E|/n
(the cut that isolates a single vertex v has size degree(v),
so the minimum cut is at most the minimum, and hence at most
the average, degree)
Let f denote the size of the minimum cut; then f ≤ 2|E|/n
Fact 4. If an edge is picked at random, the probability
that it lies across the minimum cut is at most
f/|E| ≤ 2/n (based on Fact 3)
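Facts 1-3 can be sanity-checked on a small multigraph (the example graph below is mine):

```python
from collections import Counter

# An example multigraph on 4 vertices (parallel edges allowed).
edges = [(0, 1), (0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
n = 4

degree = Counter()
for u, v in edges:
    degree[u] += 1
    degree[v] += 1

# Fact 1 (handshaking lemma): sum of degrees = 2|E|
print(sum(degree.values()), 2 * len(edges))          # equal

# Fact 2: average degree = 2|E|/n
print(sum(degree.values()) / n, 2 * len(edges) / n)  # equal

# Fact 3: min cut <= min degree <= average degree = 2|E|/n
print(min(degree.values()) <= 2 * len(edges) / n)    # True
```

The chain in Fact 3 works because isolating any single vertex is itself a valid cut, so the minimum cut can never exceed the smallest degree.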
Analyse Karger’s Min Cut Algorithm
The correct answer is returned as long as the algorithm never picks
an edge across the minimum cut
P(final cut is correct) = P(1st selected edge is not in the
min cut) x P(2nd selected edge is not in the min cut) x ...
≥ (1 - 2/n)(1 - 2/(n-1)) ... (1 - 2/3)
= 2/(n(n-1))
We can execute it many times to increase the
success rate and pick only answers that fulfil our
needs
Analyse Karger’s Min Cut Algorithm
However, how many times should we run it to
"probably" succeed?
Let l denote a constant, and p denote the
probability of succeeding at least once
p = 1 - P(fail in all runs)
A single run fails with probability at most 1 - 2/(n(n-1))
Running l·(n(n-1)/2)·ln n times makes the failure
probability at most (1/n)^l
Conclusion: to get the minimum cut with high probability, we
need to run it at least (n(n-1)/2)·ln n times, i.e. O(n² log n)
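The repetition count above (with l = 1) can be checked numerically; a small sketch, where `runs_needed` is my helper name:

```python
import math

def runs_needed(n):
    """C(n,2) * ln n runs drive the failure probability below 1/n."""
    return math.ceil((n * (n - 1) // 2) * math.log(n))

n = 100
t = runs_needed(n)
single_run_success = 2 / (n * (n - 1))   # lower bound from the analysis
fail_after_t = (1 - single_run_success) ** t
print(t, fail_after_t)                   # failure probability <= 1/n
```

The bound follows from (1 - 1/C(n,2))^(C(n,2)·ln n) ≤ e^(-ln n) = 1/n, since 1 - x ≤ e^(-x).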
Karger’s Min Cut Algorithm
This is useful in image segmentation.
E.g.: to cut out the head of ebee
Each square represents a pixel, which the
computer reads in digital form