Random Walk is not so Random
Vidyanshu Mishra (MACS)
19-December-2020
Abstract
In this article we discuss the most common result in Random-Walk theory and analyse it with a computer program for a very large sample. We see how increasing the number of trials affects the test results. As the discussion continues you will see various applications of the result using Monte Carlo simulation and some concepts from statistical probability. We conclude the article with a problem for our readers to explore.
Key words and phrases: Random Walk, statistical probability, Law of Statistical Behaviour, Monte Carlo simulation
1 Introduction
The term “Random Walk” was originally proposed by Karl Pearson in 1905.
Since then many advancements have been made in this field. Over the years
random-walk models have been extensively and successfully applied in many
fields ranging from solid-state physics and polymer chemistry to photosynthe-
sis, economics and the social sciences. The motives for the choice of a Random-
Walk model to describe a particular empirical situation may vary considerably,
but even so, the approaches followed for different applications are built on the
same formalism. The underlying concepts of Random-Walk theory have played
an important role in the early studies of Brownian motion and diffusion and
have been instrumental in the development of the basic principles of nonequilib-
rium statistical mechanics. In the past fifty years random walks have emerged
as a significant tool for the understanding of a variety of transport processes
occurring in physical systems. In mathematics random walks occupy a place
as a special class of Markov chains and have become an important element of
modern probability theory. The field has reached a high level of mathematical
sophistication due to the interplay with Fourier analysis, generating-function
techniques, potential theory¹ and ergodic theory².
¹ Study of harmonic functions.
² Study of statistical properties of deterministic dynamical systems.
2 Law of Statistical Behaviour
If you pour a glass of water and look at it, you will see a clear uniform fluid with
no trace of any internal structure or motion in it whatsoever (provided, of course,
you do not shake the glass). We know, however, that the uniformity of water is
only apparent and that if the water is magnified a few million times, there will
be revealed a strongly expressed granular structure formed by a large number
of separate molecules closely packed together. Under the same magnification it
is also apparent that the water is far from still, and that its molecules are in
a state of violent agitation moving around and pushing one another as though
they were people in a highly excited crowd. This irregular motion of water
molecules, or the molecules of any other material substance, is known as heat (or
thermal) motion, for the simple reason that it is responsible for the phenomenon
of heat. Although molecular motion as well as molecules themselves are not
directly discernible to the human eye, it is molecular motion that produces a
certain irritation in the nervous fibers of the human organism and produces the
sensation that we call heat. For those organisms that are much smaller than
human beings, such as, for example, small bacteria suspended in a water drop,
the effect of thermal motion is much more pronounced, and these poor creatures
are incessantly kicked, pushed, and tossed around by the restless molecules that
attack them from all sides and give them no rest. This amusing phenomenon,
known as Brownian motion, named after the Scottish botanist Robert Brown,
who first noticed it nearly two centuries ago in a study of tiny plant spores,
is of quite general nature and can be observed in the study of any kind of
sufficiently small particles suspended in any kind of liquid, or of microscopic
particles of smoke and dust floating in the air. It would be, however, a grave
mistake to think that because of the irregularity of thermal motion it must
remain outside the scope of any possible physical description. Indeed the fact
itself that thermal motion is completely irregular makes it subject to a new kind
of law, the Law of Disorder better known as the Law of Statistical Behavior. In
order to understand the above statement let us turn our attention to the famous
problem of a Drunkard’s Walk.
3 Drunkard’s Walk
Suppose we watch a drunkard who has been leaning against a lamp post in the
middle of a large paved city square (nobody knows how or when he got there)
and then has suddenly decided to go nowhere in particular. Thus out he goes,
making a few steps in one direction, then some more steps in another, and so on,
changing his course every few steps in an entirely unpredictable way. How far
will our drunkard be from the lamp post after he has executed, say, a hundred
phases of his irregular journey? One would at first think that, because of the
unpredictability of each turn, there is no way of answering this question. If,
however, we consider the problem a little more attentively we will find that,
although we really cannot tell where the drunkard will be at the end of his
walk, we can answer the question about his most probable distance from the
lamp post after a given large number of turns.
3.1 Most Probable Distance
Our aim in this section is to find out whether there is an interesting relationship
between the number of steps the drunkard takes and how far he is from the
lamp post at the end of those steps. Intuitively, some people expect that the more
steps the drunkard takes, the farther he gets from the lamp post; others feel that,
since he is moving at random, he merely wanders back and forth in all directions
and never really gets far. The drunkard's walk is a probabilistic process, which
makes it impossible to find a relationship that always holds, but we can still do
very much: we can find the most probable distance³ of the drunkard from the
lamp post after n steps. We consider only a special case of the problem, namely
when the drunkard is walking on an infinite 1 × 1 grid and each of his steps is
1 unit long.
Let's phrase the Drunkard's Walk problem in mathematical terms.
Our drunkard is travelling on an infinite grid of 1 × 1 squares, taking steps of
1 unit. Initially he is at the origin. He is not so drunk and moves only north,
south, east or west, so he always stays on the grid. Our aim is to find the
relationship between the number of steps he takes and how far he is from the
origin. Each step changes either the x-coordinate or the y-coordinate by one unit;
let $x_i$ and $y_i$ denote the changes in the x- and y-coordinates on the $i$th
step, so that exactly one of them is $\pm 1$ and the other is $0$. Let $R$ be the
distance of the drunkard from the origin after $n$ steps. Then
$$R^2 = (x_1 + x_2 + x_3 + \cdots + x_n)^2 + (y_1 + y_2 + y_3 + \cdots + y_n)^2$$
Here $x_i, y_i \in \{-1, 0, 1\}$, and on average half of the $x_i$'s and half of the
$y_i$'s are $0$, because at each of the $n$ steps the drunkard is equally likely to
move along the X-axis or the Y-axis. Expanding the two squares gives
$$R^2 = \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} y_i^2 + 2\sum_{i<j} x_i x_j + 2\sum_{i<j} y_i y_j$$
For now let's turn our attention to the cross terms $x_i x_j$ and $y_i y_j$ with
$i \neq j$. Each such product equals $+1$, $-1$ or $0$, and since the directions of
different steps are chosen independently, $+1$ and $-1$ occur with equal
probability. On average, therefore, the cross terms cancel out and we are left with
$$R^2 = \sum_{i=1}^{n} x_i^2 + \sum_{i=1}^{n} y_i^2 = x_1^2 + x_2^2 + \cdots + x_n^2 + y_1^2 + y_2^2 + \cdots + y_n^2$$
Now for each $i$ exactly one of $x_i$, $y_i$ equals $\pm 1$ and the other equals $0$,
so every step contributes exactly $1$ to the sum above. Therefore, on average,
$$R^2 = \underbrace{1 + 1 + \cdots + 1}_{n \text{ times}} = n \quad\Longrightarrow\quad R = \sqrt{n}$$
³ Most probable distance is the distance at which the probability of finding the drunkard is highest.
In this article it is also interpreted as the expected distance of the drunkard after a given number of steps.
If we turn from the planar lattice to the plane itself, allowing the walker to step
a unit distance in any randomly selected direction, the situation becomes more
complicated in some ways and simpler in others. For example, the expected
(average) distance of the walker from his starting spot, after n equal steps, is
again the length of the step times the square root of n. That is, the result we
just derived is valid not only for a random walk on a grid but also for a random
walk in the plane. This was proved by Albert Einstein in a paper on molecular
statistics published in 1905, the same year he published his celebrated first paper
on relativity. (It was independently proved by Marian Smoluchowski. You can
find a simple proof in George Gamow's One Two Three · · · Infinity.)
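To illustrate this off-lattice version of the result, here is a minimal simulation sketch (our own illustration, not code from the sources cited above; the function name planar_walk and the 2,000-trial sample size are arbitrary choices). Each step has unit length in a uniformly random direction, and the root-mean-square distance over many trials is compared with the square root of n; the plain average distance comes out somewhat lower, as the tables in the next section also show.

import random
import math

def planar_walk(n):
    # n unit-length steps, each in a uniformly random direction
    x = 0.0
    y = 0.0
    for i in range(n):
        theta = random.uniform(0, 2 * math.pi)
        x += math.cos(theta)
        y += math.sin(theta)
    return math.sqrt(x**2 + y**2)

trials = 2000
rms = math.sqrt(sum(planar_walk(100)**2 for i in range(trials)) / trials)
print(rms)   # close to sqrt(100) = 10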
3.2 Computer verification of the result
Now we define a function that returns the distance of a random walker from the
origin (his initial position) after he has taken n steps.
# Calculating the distance of a random walker from the origin after n steps
import random
import math

def random_walk(n):
    """Simulate an n-step lattice walk and return the final distance from the origin."""
    x = 0
    y = 0
    for i in range(n):
        step = random.choice(['N', 'S', 'E', 'W'])  # pick a direction at random
        if step == 'N':
            y = y + 1
        elif step == 'S':
            y = y - 1
        elif step == 'E':
            x = x + 1
        else:
            x = x - 1
    d = math.sqrt(x**2 + y**2)
    return d
Now we call this function 10 times, each time finding the distance of the random
walker after 100 steps. Theoretically the distance should come out to be about 10,
but individual walks will deviate from this: any distance from 0 to 100 is possible,
though some values are highly unlikely.
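The table below can be generated with a short driver loop along the following lines (a minimal sketch; the exact loop used to produce the table is not shown in the article, and the error column simply compares each distance with the theoretical value 10):

# Call random_walk ten times with n = 100 and report each distance
# together with its percentage error relative to the theoretical value 10
for walk in range(1, 11):
    d = random_walk(100)
    error = abs(d - 10) / 10 * 100
    print(walk, d, error)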
Walk Expected distance Distance Percentage error
1 10 16.492422502470642 64
2 10 8.0 20
3 10 9.486832980505138 5.2
4 10 4.47213595499958 53
5 10 10.198039027185569 1.9
6 10 5.0990195135927845 50
7 10 5.830951894845301 42
8 10 8.602325267042627 14
9 10 7.0710678118654755 29.3
10 10 17.08800749063506 70
You can see huge percentage errors on either side of the expected distance in the
rightmost column. Our aim is to reduce that error and get a better idea of where
a random walker will be after n steps. To do that we simulate the random walk
100,000 times and average the distances obtained for the individual walks. We do
this for three values of n, namely 64, 100 and 144, for which the expected distances
are 8, 10 and 12 respectively.
The additional code is as follows:

average = 0
for i in range(0, 100000):
    average += random_walk(100)
average = average / 100000
print(average)

In this code, 100 is changed to 64 and 144 for the other two cases.
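Rather than editing the constant by hand, the same experiment can be wrapped in a small helper; the following is an illustrative refactoring (the name average_distance and its default of 100,000 trials are our own choices, not code from the article):

def average_distance(n, trials=100000):
    # Average the final distance of `trials` independent n-step walks
    total = 0.0
    for i in range(trials):
        total += random_walk(n)
    return total / trials

for n in (64, 100, 144):
    print(n, average_distance(n))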
n Expected distance Distance Percentage error
64 8 7.089111335300358 11.3
64 8 7.095063071829727 11.3
64 8 7.081461902211198 11.4
100 10 8.874775004425105 12.3
100 10 8.84569722936279 12.6
100 10 8.870756149939673 12.3
144 12 10.62016807564472 11.4
144 12 10.679130818658091 11
144 12 10.6396952302016 11.3
We see that the percentage error is between 10% and 15% every time and that the
simulated distance is always less than the expected value. (This is consistent with
the fact that √n is really the root-mean-square distance; for large n the mean
distance of a two-dimensional random walk is approximately √(πn)/2 ≈ 0.886√n,
about 11% short of √n.)
4 Monte Carlo Simulation
We use Monte Carlo simulation to solve two problems of a mathematical nature.
Monte Carlo simulations are used to model the probability of different outcomes
in a process that cannot easily be predicted due to the intervention of random
variables. It is a technique used to understand the impact of risk and uncertainty
in prediction and forecasting models.
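As a tiny warm-up example of the technique (purely illustrative and not one of the two problems below), we can estimate the probability that two fair dice sum to 7 by repeated random trials and compare the estimate with the exact value 1/6:

import random

trials = 100000
hits = 0
for i in range(trials):
    # Roll two fair dice and count the throws whose sum is 7
    if random.randint(1, 6) + random.randint(1, 6) == 7:
        hits += 1
print(hits / trials)   # should come out close to 1/6 = 0.1667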
4.1 Problem 1
In this section we try to answer the following question.
What is the longest random walk a drunkard can take so that on average he
ends up 3 steps or fewer from the origin on a two-dimensional infinite 1 × 1
grid? The problem is non-deterministic in nature, which is why the question asks
for the longest walk such that on average he is 3 steps or fewer from the origin.
The problem can be solved mathematically, but instead we will conduct thousands
of random trials and compute the percentage of random walks that end in a short
walk home (at most 3 steps from the origin). Below is the code for the problem.
import random
import math

def random_walk(n):  # here n is the number of blocks the drunkard walks
    """Return the coordinates after an n-block random walk."""
    x = 0
    y = 0
    for i in range(n):
        step = random.choice(['N', 'S', 'E', 'W'])
        if step == 'N':
            y = y + 1
        elif step == 'S':
            y = y - 1
        elif step == 'E':
            x = x + 1
        else:
            x = x - 1
    return (x, y)

number_of_walks = 20000
for walk_length in range(1, 21):
    at_most_3_steps = 0
    for i in range(number_of_walks):
        (x, y) = random_walk(walk_length)
        d = math.fabs(x) + math.fabs(y)  # Manhattan distance from the origin
        if d <= 3:
            at_most_3_steps += 1
    at_most_3_steps_percentage = float(at_most_3_steps) / number_of_walks
    print("walk size = ", walk_length, "/ % of at most 3 steps ",
          100 * at_most_3_steps_percentage)
In the code the walk is over the moment the function returns the coordinates
of the drunkard. To get an accurate estimate we run a Monte Carlo loop of
20,000 random walks for each walk size. Below are the results of three runs of
the program.
Walk size Run 1 (at most 3 steps %) Run 2 (at most 3 steps %) Run 3 (at most 3 steps %)
1 100 100 100
2 100 100 100
3 100 100 100
4 76.55499999999999 76.74 76.74
5 87.83999999999999 87.805 88.16000000000001
6 61.19 60.695 61.419999999999995
7 76.535 76.545 76.465
8 49.95 50.895 50.075
9 67.415 66.995 66.96
10 43.625 42.425000000000004 43.085
11 59.685 60.02499999999999 59.94500000000001
12 37.335 38.224999999999994 37.13
13 53.574999999999996 54.230000000000004 53.769999999999996
14 33.305 33.61 33.22
15 48.475 48.88 48.39
16 29.815 30.005 30.005
17 44.25 44.67 44.36
18 26.935 26.8 26.71
19 40.2 41.075 40.875
20 24.535 25.145 24.795
It is clear from the table that the answer to our problem is 13. If you observe
the table carefully you will notice that walks with an even number of steps have
a smaller chance than the corresponding odd-length walks of ending 3 steps or
fewer from the origin. The same is true if 3 is replaced by any other odd number.
We encourage the reader to think about this.
4.2 Problem 2
The next problem we are going to look at is the following.
Two men start at the same spot on the planar lattice. One makes a random
walk of 70 unit steps, then stops. The other stops after a random walk of 30
unit steps. What is the expected distance between them at the finish? In the
code below we try to solve this problem.
import random
import math

def random_walk(n):  # here n is the number of blocks the drunkard walks
    """Return the coordinates after an n-block random walk."""
    x = 0
    y = 0
    for i in range(n):
        step = random.choice(['N', 'S', 'E', 'W'])
        if step == 'N':
            y = y + 1
        elif step == 'S':
            y = y - 1
        elif step == 'E':
            x = x + 1
        else:
            x = x - 1
    return (x, y)

avg = 0
for i in range(1000):
    (x1, y1) = random_walk(70)
    (x2, y2) = random_walk(30)
    d = math.sqrt((x1 - x2)**2 + (y1 - y2)**2)  # straight-line distance between the two walkers
    avg = avg + d
print(avg / 1000)
The output of this code is the average distance between two random walkers, one
of whom has taken 70 random steps of unit length and the other 30. The table
below shows this average over 1000 random walks, for three runs of the program.
Run Average distance
1 8.803002410103096
2 8.85965528268921
3 8.750006887004405
Can we solve this problem mathematically? Let's give it a try. If you
imagine one man reversing the direction of his walk until he returns to where
he started and then continuing along the other man's path, you will see that
the question is the same as asking for the expected distance from the starting
spot of a single random walk of 100 steps. Boom! We know how to find that
expected distance: it is simply √100 = 10, and the answers our computer program
gives us fall within the same 10-15% error range we observed earlier
(12%, 11.5%, 12.5%).
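The same conclusion can be reached a little more formally with expectations (a sketch of the argument, using only the facts that the two walks are independent and that every step has mean zero). Writing $A$ and $B$ for the two final position vectors,

$$\mathbb{E}\,\lVert A - B\rVert^{2} = \mathbb{E}\,\lVert A\rVert^{2} - 2\,\mathbb{E}[A \cdot B] + \mathbb{E}\,\lVert B\rVert^{2} = 70 + 30 - 0 = 100,$$

since $\mathbb{E}[A \cdot B] = \mathbb{E}[A] \cdot \mathbb{E}[B] = 0$. Hence the root-mean-square separation is $\sqrt{100} = 10$, in agreement with the reversal argument above.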
5 Taking it further
We conclude this article with a problem to explore. Consider a random walk on
an infinite lattice with no barriers. If the walk continues an arbitrarily long time,
the proportion of visits the walker makes to any specified corner approaches zero
as a limit. On the other hand, if the walk continues long enough, the walker is
certain to touch every vertex, including a return visit to his starting spot. It is
logically possible that such a walker can travel forever without reaching a given
corner. To the mathematician, however, it has a practical probability of zero
even though the expected number of steps for reaching any specified corner is
infinite. The distinction is often encountered where infinite sets are concerned.
If a penny is flipped forever, for example, it is logically possible that heads
and tails will forever alternate, although the practical probability that this will
happen is zero. We can express it in this way: If you stand at an intersection on
the infinite lattice while a friend, starting at any other spot, wanders randomly
over the lattice, he will be practically certain to meet you if you are able to wait
an arbitrarily long time. The statement can be even stronger. After the first
meeting the probability is again 1 that if your friend continues wandering, he
will eventually return to you. In other words, it is practically certain that such
a walker, given enough time, will visit every intersection an infinity of times!
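As a possible starting point for this exploration (an illustrative sketch in the same style as the programs above; the function name, step budgets and trial count are our own choices), one can estimate how often a walker returns to its starting corner within a fixed number of steps and watch the fraction grow as the budget increases:

import random

def returns_to_origin(max_steps):
    # Walk for at most max_steps steps; report whether the origin was revisited
    x, y = 0, 0
    for i in range(max_steps):
        dx, dy = random.choice([(0, 1), (0, -1), (1, 0), (-1, 0)])
        x, y = x + dx, y + dy
        if x == 0 and y == 0:
            return True
    return False

trials = 5000
for max_steps in (10, 100, 1000):
    hits = sum(returns_to_origin(max_steps) for i in range(trials))
    print(max_steps, hits / trials)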
6 References
The following resources were used extensively and contain useful extensions of
the results and problems discussed in this article.
1. Martin Gardner - Mathematical Circus
2. George Gamow - One Two Three · · · Infinity
3. Chris H. Rycroft - Introduction to Random Walks and Diffusion
4. Socratica - A Random Walk & Monte Carlo Simulation
5. MIT OCW - Random Walks