
Article in International Journal of Mathematical Education · March 2004 · DOI: 10.1080/00207390310001638313


Applications of Dirac’s delta function in
statistics
ANDRÉ I KHURI
Department of Statistics, Griffin-Floyd Hall,
P.O. Box 118545, University of Florida,
Gainesville, Florida 32611-8545, USA; e-mail: [email protected]

The Dirac delta function has been used successfully in mathematical physics for many years. The purpose of this article is to bring attention to several useful applications of this function in mathematical statistics. These applications include a unified representation of the distribution of a function (or functions) of one or several random variables, which may be discrete or continuous; a proof of a well-known inequality; and a representation of a density function in terms of its noncentral moments.

1. Introduction

The Dirac delta function (δ-function) was introduced by Paul Dirac at the end
of the 1920s in an effort to create the mathematical tools for the development of
quantum field theory (see Dirac [2]). It has since been used with great success in
applied mathematics and mathematical physics.
The δ-function does not actually conform to the usual mathematical definition
of a function, and is therefore referred to as a generalized function. Dirac initially
called it an “improper function” (see Dirac [3], page 58), and he denoted it by δ(x),
−∞ < x < ∞. The following are some of the basic properties of δ(x); more details can be found in Hoskins [6], Kanwal [8], and Saichev and Woyczynski [11]:
(a) δ(x) = 0 if x ≠ 0, and ∫_{−∞}^{∞} δ(x) dx = 1.

(b) xδ(x) = 0 for all x.

(c) If f(x) is any function which is continuous in a neighborhood of the point x₀, then

    ∫_{−∞}^{∞} f(x) δ(x − x₀) dx = f(x₀).   (1)

Formula (1) represents the so-called sifting, or sampling, property of the δ-
function. This formula can also be written as
    ∫_a^b f(x) δ(x − x₀) dx = f(x₀)   (2)

for any a, b such that a < x₀ < b. The integral in (2) represents a so-called linear functional, which assigns to the continuous function f(x) the value f(x₀).

(d) If f(x) is any function with continuous derivatives up to the nth order in some neighborhood of x₀, then

    ∫_a^b f(x) δ^(n)(x − x₀) dx = (−1)^n f^(n)(x₀),  n ≥ 0,   (3)

where a < x₀ < b. In particular, we have

    ∫_{−∞}^{∞} f(x) δ^(n)(x − x₀) dx = (−1)^n f^(n)(x₀),  n ≥ 0.   (4)

In both formulas, δ^(n)(x) is a generalized function representing the so-called generalized nth derivative of δ(x). This derivative defines a linear functional which assigns to f(x) the value (−1)^n f^(n)(x₀).

(e) If f(x) has simple zeros at x₁, x₂, …, xₙ and is differentiable at these points with f′(xᵢ) ≠ 0 for i = 1, 2, …, n, then

    δ[f(x)] = ∑_{i=1}^{n} δ(x − xᵢ)/|f′(xᵢ)|.   (5)

In particular, if f(x) has only one simple zero at x = x₀ and f′(x₀) ≠ 0, then

    δ[f(x)] = δ(x − x₀)/|f′(x₀)|.   (6)

In the event f(x) has higher-order zeros, no significance is attached to δ[f(x)] (see Kanwal [8], page 49). For example, if f(x) = ax, where a is a nonzero constant, then

    δ(ax) = δ(x)/|a|.   (7)

Using a = −1 in (7), we conclude that δ(−x) = δ(x), which indicates that δ(x) is an even function.

Formula (5) remains true for an infinite set of simple zeros; for example, for −∞ < x < ∞, we have

    δ(sin x) = ∑_{n=−∞}^{∞} δ(x − nπ)/|cos(nπ)| = ∑_{n=−∞}^{∞} δ(x − nπ).

(f) A closely related function to the δ-function is the Heaviside function H(x), which is defined as the unit step function

    H(x) = 0 for x < 0,  H(x) = 1 for x ≥ 0.   (8)

The generalized derivative of H(x) is δ(x), that is,

    δ(x) = dH(x)/dx   (9)

[see, for example, Hoskins ([6], page 34), Hsu ([7], page 58)]. From (9) it follows that for any fixed x₀,

    δ(x − x₀) = dH(x − x₀)/dx = −dH(x₀ − x)/dx.   (10)

(g) The definition of the δ-function can be extended to Rⁿ, the n-dimensional Euclidean space. Thus, if x ∈ Rⁿ and f(x) is a continuous function in a neighborhood of x = x₀, then

    ∫_{Rⁿ} f(x) δ(x − x₀) dx = f(x₀),   (11)

where dx = dx₁ dx₂ … dxₙ. See, for example, Saichev and Woyczynski ([11], page 28).

2. Applications of the δ-function

In this section, several applications of the δ-function in statistics will be discussed.

2.1. Representation of discrete distributions

Suppose that X is a discrete random variable that assumes the values a₁, a₂, …, aₙ with corresponding probabilities p₁, p₂, …, pₙ such that ∑_{i=1}^{n} pᵢ = 1. The probability mass function, p(x), of X can be represented as a generalized function of the form

    p(x) = ∑_{i=1}^{n} pᵢ δ(x − aᵢ).   (12)
For example, if X has the binomial distribution B(n, p), then

    p(x) = ∑_{i=0}^{n} \binom{n}{i} p^i (1 − p)^{n−i} δ(x − i).

The moments of X can then be derived using the integral notation instead of summation. For example, the kth noncentral moment of X is written as

    ∫_{−∞}^{∞} x^k p(x) dx = ∫_{−∞}^{∞} x^k ∑_{i=1}^{n} pᵢ δ(x − aᵢ) dx
                           = ∑_{i=1}^{n} pᵢ ∫_{−∞}^{∞} x^k δ(x − aᵢ) dx
                           = ∑_{i=1}^{n} aᵢ^k pᵢ,

as can be seen from applying formula (1) to f(x) = x^k. Formula (12) is still applicable if the aᵢ's are vector valued in Rᵐ. Thus, if x and the aᵢ's are in Rᵐ, then

    p(x) = ∑_{i=1}^{n} pᵢ δ(x − aᵢ).
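As a quick check of this representation, the following sketch uses sympy's DiracDelta to reproduce the moment computation above; the support points and probabilities are illustrative choices, not taken from the article.

from sympy import DiracDelta, Rational, integrate, oo, symbols

x = symbols("x", real=True)

# p(x) = sum_i p_i * delta(x - a_i), as in formula (12)
support = [0, 1, 2]
probs = [Rational(1, 4), Rational(1, 2), Rational(1, 4)]
p = sum(pi * DiracDelta(x - ai) for pi, ai in zip(probs, support))

# k-th noncentral moment via the integral representation; the sifting
# property (1) collapses the integral to sum_i a_i**k * p_i
k = 2
print(integrate(x**k * p, (x, -oo, oo)))   # prints 3/2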

2.2. Transformations of random variables


This particular application of the δ-function was discussed by Au and Tam [1]. If X is a continuous random variable with a density function f(x), and if Y = g(X) is a function of X, then the density function of Y, namely h(y), is given by
    h(y) = ∫_{−∞}^{∞} f(x) δ[y − g(x)] dx.   (13)

One interesting advantage of this application is that it does not require the function
g(·) to be one-to-one, nor does it involve the computation of the Jacobian, as is usually
the case with the conventional change-of-variable technique.
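To illustrate formula (13) numerically, the sketch below (not from the article) regularizes δ by a narrow Gaussian δ_ε, a standard approximation, and evaluates the integral on a grid. With f the standard normal density and g(x) = x², which is not one-to-one, h(y) should approach the chi-square density with one degree of freedom, e^{−y/2}/√(2πy).

import numpy as np

def delta_eps(t, eps=1e-2):
    # narrow Gaussian approximation to the Dirac delta function
    return np.exp(-t**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

x = np.linspace(-6, 6, 20001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) / np.sqrt(2 * np.pi)   # standard normal density

y = 1.5
h_y = np.sum(f * delta_eps(y - x**2)) * dx   # discretized formula (13)

exact = np.exp(-y / 2) / np.sqrt(2 * np.pi * y)   # chi-square(1) density
print(h_y, exact)   # the two values agree to about three decimals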

Formula (13) can be extended to a single transformation involving several random variables. If Y = u(X₁, X₂, …, Xₙ), where the Xᵢ's are continuous random variables with a joint density function f(x), where x = (x₁, x₂, …, xₙ)′, then the density function, λ(y), of Y is given by

    λ(y) = ∫_{−∞}^{∞} f(x) δ[y − u(x)] dx,   (14)

where the integral is n-dimensional. A couple of examples were given by Au and Tam [1] to illustrate the usefulness of this representation. Au and Tam [1] also pointed out that the integral in (14) is considerably easier to use and more direct than the conventional approach, which requires the introduction of n − 1 additional random variables.
Another extension of formula (13) is the derivation of the joint distribution of several functions of X₁, X₂, …, Xₙ. For example, if Y = u(X₁, X₂, …, Xₙ) and Z = v(X₁, X₂, …, Xₙ) are two such functions, then the bivariate density function, τ(y, z), of Y and Z is given by

    τ(y, z) = ∫_{−∞}^{∞} f(x) δ[y − u(x)] δ[z − v(x)] dx.   (15)

To show the validity of (15), let T(y, z) denote the cumulative bivariate distribution function of Y and Z. Then,

    T(y, z) = P[u(X₁, X₂, …, Xₙ) ≤ y, v(X₁, X₂, …, Xₙ) ≤ z]
            = ∫_{−∞}^{∞} f(x) H[y − u(x)] H[z − v(x)] dx,

where H(·) is the Heaviside function defined in (8). It follows that

    τ(y, z) = ∂²T(y, z)/∂y∂z
            = ∫_{−∞}^{∞} f(x) (∂H[y − u(x)]/∂y)(∂H[z − v(x)]/∂z) dx
            = ∫_{−∞}^{∞} f(x) δ[y − u(x)] δ[z − v(x)] dx,   (16)

which results from applying formula (10). This particular application was not mentioned in Au and Tam [1]. The extension to more than two multivariable transformations is straightforward.

2.2.1. Examples
Consider the following examples that illustrate the application of the δ-function in
transforming random variables:

Example 1. One particular application of formula (14) is the derivation of the density function of Y = X₁ + X₂. In this case,

    λ(y) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x₁, x₂) δ(y − x₁ − x₂) dx₁ dx₂
         = ∫_{−∞}^{∞} dx₂ ∫_{−∞}^{∞} f(x₁, x₂) δ[x₁ − (y − x₂)] dx₁
         = ∫_{−∞}^{∞} f(y − x₂, x₂) dx₂,   (17)

as can be seen from applying formula (1). If X₁ and X₂ are statistically independent with marginal density functions f₁(x₁) and f₂(x₂), respectively, then

    λ(y) = ∫_{−∞}^{∞} f₁(y − x₂) f₂(x₂) dx₂.   (18)

We can similarly show that

    λ(y) = ∫_{−∞}^{∞} f₁(x₁) f₂(y − x₁) dx₁.   (19)

The integral in (18) [or (19)] is the convolution of f₁ and f₂.
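As a numerical illustration of (18) (not from the article), the density of the sum of two independent Uniform(0, 1) variables can be obtained by discretizing the convolution and compared with the exact triangular density on [0, 2].

import numpy as np

dx = 1e-3
x = np.arange(0, 1, dx)
f1 = np.ones_like(x)              # Uniform(0, 1) density
f2 = np.ones_like(x)

lam = np.convolve(f1, f2) * dx    # discretized version of (18)

y0 = 0.5
exact = y0 if y0 <= 1 else 2 - y0     # triangular density of the sum
print(lam[int(y0 / dx)], exact)       # both approximately 0.5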

Example 2. Let X₁ and X₂ be random variables distributed independently as X₁ ∼ G(α, 2) and X₂ ∼ G(β, 2), where G(·, ·) denotes the gamma distribution. To find the joint density function of Y = X₁/(X₁ + X₂) and Z = X₁ + X₂, we can apply (16), which gives

    τ(y, z) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x₁, x₂) δ(x₁/(x₁ + x₂) − y) δ(x₁ + x₂ − z) dx₁ dx₂
            = [1/(Γ(α)Γ(β)2^{α+β})] ∫_0^∞ ∫_0^∞ x₁^{α−1} x₂^{β−1} exp(−x₁/2 − x₂/2)
              δ(x₁/(x₁ + x₂) − y) δ[x₁ − (z − x₂)] dx₁ dx₂
            = [1/(Γ(α)Γ(β)2^{α+β})] ∫_0^∞ x₂^{β−1} dx₂ ∫_0^∞ x₁^{α−1} exp(−x₁/2 − x₂/2)
              δ(x₁/(x₁ + x₂) − y) δ[x₁ − (z − x₂)] dx₁
            = [1/(Γ(α)Γ(β)2^{α+β})] ∫_0^∞ x₂^{β−1} (z − x₂)^{α−1} exp(−z/2) δ((z − x₂)/z − y) dx₂.   (20)

Now, by applying formula (6) to δ((z − x₂)/z − y), we obtain

    δ((z − x₂)/z − y) = δ[x₂ − (z − zy)] / |−1/z| = z δ[x₂ − (z − zy)],

since z > 0. Making the substitution in (20), we get

    τ(y, z) = [1/(Γ(α)Γ(β)2^{α+β})] y^{α−1} (1 − y)^{β−1} z^{α+β−1} exp(−z/2).

Since Γ(α)Γ(β) = B(α, β)Γ(α + β), this density factors into the product of a Beta(α, β) density in y and a G(α + β, 2) density in z; hence Y and Z are independently distributed, with Y ∼ Beta(α, β) and Z ∼ G(α + β, 2).
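A Monte Carlo sketch (not from the article) confirms this factorization by sampling X₁ and X₂ and testing Y against Beta(α, β) and Z against G(α + β, 2); the parameter values are illustrative.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
alpha, beta, n = 3.0, 2.0, 200_000

x1 = rng.gamma(shape=alpha, scale=2.0, size=n)
x2 = rng.gamma(shape=beta, scale=2.0, size=n)
y, z = x1 / (x1 + x2), x1 + x2

# large p-values indicate agreement with the claimed distributions
print(stats.kstest(y, stats.beta(alpha, beta).cdf).pvalue)
print(stats.kstest(z, stats.gamma(alpha + beta, scale=2.0).cdf).pvalue)
print(np.corrcoef(y, z)[0, 1])   # near zero, consistent with independence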

Example 3. Let X₁ and X₂ be distributed independently as standard normal variates, and suppose we wish to find the joint density function of Y = X₁² + X₂² and Z = X₁/X₂. In this case,

    τ(y, z) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x₁, x₂) δ(x₁² + x₂² − y) δ(x₁/x₂ − z) dx₁ dx₂
            = (1/2π) ∫_{−∞}^{∞} dx₂ ∫_{−∞}^{∞} exp[−(x₁² + x₂²)/2] δ(x₁² + x₂² − y) δ(x₁/x₂ − z) dx₁.

Using (6), δ(x₁/x₂ − z) can be written as

    δ(x₁/x₂ − z) = δ(x₁ − x₂z) / |1/x₂| = |x₂| δ(x₁ − x₂z).

Hence,

    τ(y, z) = (1/2π) ∫_{−∞}^{∞} |x₂| exp[−(x₂²z² + x₂²)/2] δ(x₂²z² + x₂² − y) dx₂.   (21)

Now, applying formula (5) to the simple zeros x₂ = ±[y/(1 + z²)]^{1/2}, we get

    δ(x₂²z² + x₂² − y) = {δ(x₂ − [y/(1 + z²)]^{1/2}) + δ(x₂ + [y/(1 + z²)]^{1/2})} / {2(1 + z²)[y/(1 + z²)]^{1/2}}.

Making the substitution in (21), we obtain

    τ(y, z) = (1/2π) exp(−y/2) / (1 + z²),

which shows that Y and Z are independently distributed, with Y ∼ χ₂² and Z having the Cauchy distribution with density function 1/[π(1 + z²)].
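The same kind of Monte Carlo check (not from the article) applies here: Y should pass a goodness-of-fit test against χ₂² and Z against the standard Cauchy distribution.

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x1, x2 = rng.standard_normal((2, 200_000))
y, z = x1**2 + x2**2, x1 / x2

print(stats.kstest(y, stats.chi2(2).cdf).pvalue)    # large p-value expected
print(stats.kstest(z, stats.cauchy().cdf).pvalue)   # large p-value expected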

2.2.2. The case of discrete random variables

The δ-function approach can also be applied to transformations involving discrete random variables. For example, suppose that Y = g(X), where X is discrete with a probability mass function p(x) as in (12). Then the probability mass function of Y is given by

    q(y) = ∫_{−∞}^{∞} p(x) δ[y − g(x)] dx.   (22)

This representation can be easily verified using formula (12):

    q(y) = ∫_{−∞}^{∞} [∑_{i=1}^{n} pᵢ δ(x − aᵢ)] δ[y − g(x)] dx
         = ∑_{i=1}^{n} pᵢ ∫_{−∞}^{∞} δ[y − g(x)] δ(x − aᵢ) dx
         = ∑_{i=1}^{n} pᵢ δ[y − g(aᵢ)].
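In effect, formula (22) says that the probabilities pᵢ simply accumulate at the points g(aᵢ). The following minimal Python sketch of this pushforward is illustrative, not from the article.

from collections import defaultdict

def pushforward(pmf, g):
    # pmf: dict mapping a_i -> p_i; returns the pmf of Y = g(X)
    q = defaultdict(float)
    for a, p in pmf.items():
        q[g(a)] += p   # q(y) = sum of p_i over all i with g(a_i) = y
    return dict(q)

# Y = X**2 with X uniform on {-1, 0, 1}: q = {1: 2/3, 0: 1/3}
print(pushforward({-1: 1/3, 0: 1/3, 1: 1/3}, lambda a: a**2))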

Formula (22) can be extended to transformations involving several discrete random variables X₁, X₂, …, Xₙ with a joint probability mass function p(x), where x = (x₁, x₂, …, xₙ)′. If Y = w(X₁, X₂, …, Xₙ), then the probability mass function of Y is expressed as

    r(y) = ∫_{−∞}^{∞} p(x) δ[y − w(x)] dx.   (23)

Thus, the δ-function provides a unified representation of the distribution of a transformation involving several continuous or discrete random variables.

Example 4. Suppose that X₁, X₂, …, Xₙ are independent and identically distributed Bernoulli random variables with probability of success p. The purpose of this example is to verify that Y = ∑_{i=1}^{n} Xᵢ has the binomial distribution B(n, p). This can be shown by mathematical induction: we have X₁ ∼ B(1, p). Suppose that Y₁ = ∑_{i=1}^{n−1} Xᵢ is B(n − 1, p); we show that Y ∼ B(n, p). Note that Y = Y₁ + Xₙ, and that Y₁ and Xₙ are independent. The probability mass functions of Y₁ and Xₙ are given by

    p₁(y₁) = p₁(y − xₙ) = ∑_{i=0}^{n−1} \binom{n−1}{i} p^i (1 − p)^{n−1−i} δ(y − xₙ − i),
    pₙ(xₙ) = p δ(xₙ − 1) + (1 − p) δ(xₙ),

respectively, as can be seen from applying formula (12). The probability mass function of Y is therefore the convolution of p₁(y₁) and pₙ(xₙ), that is,

    p(y) = ∫_{−∞}^{∞} {∑_{i=0}^{n−1} \binom{n−1}{i} p^i (1 − p)^{n−1−i} δ(y − xₙ − i)} [p δ(xₙ − 1) + (1 − p) δ(xₙ)] dxₙ

[see formula (18)]. Noting that ∫_{−∞}^{∞} δ(y − xₙ − i) δ(xₙ − 1) dxₙ = δ(y − 1 − i) and ∫_{−∞}^{∞} δ(y − xₙ − i) δ(xₙ) dxₙ = δ(y − i), we obtain

    p(y) = ∑_{i=0}^{n−1} \binom{n−1}{i} p^{i+1} (1 − p)^{n−1−i} δ(y − 1 − i) + ∑_{i=0}^{n−1} \binom{n−1}{i} p^i (1 − p)^{n−i} δ(y − i)
         = ∑_{i=1}^{n} \binom{n−1}{i−1} p^i (1 − p)^{n−i} δ(y − i) + ∑_{i=0}^{n−1} \binom{n−1}{i} p^i (1 − p)^{n−i} δ(y − i)
         = p^n δ(y − n) + ∑_{i=1}^{n−1} [\binom{n−1}{i−1} + \binom{n−1}{i}] p^i (1 − p)^{n−i} δ(y − i) + (1 − p)^n δ(y)
         = p^n δ(y − n) + ∑_{i=1}^{n−1} \binom{n}{i} p^i (1 − p)^{n−i} δ(y − i) + (1 − p)^n δ(y)
         = ∑_{i=0}^{n} \binom{n}{i} p^i (1 − p)^{n−i} δ(y − i).

This is the probability mass function of a binomial B(n, p). ∎
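The induction above amounts to repeated convolution of probability mass functions, which is easy to check numerically; the sketch below (not from the article) compares the n-fold convolution of the Bernoulli pmf with scipy's binomial pmf.

import numpy as np
from scipy import stats

p, n = 0.3, 6
bernoulli = np.array([1 - p, p])       # pmf of each X_i on {0, 1}

pmf = np.array([1.0])                  # pmf of an empty sum
for _ in range(n):
    pmf = np.convolve(pmf, bernoulli)  # add one more X_i, as in the induction

print(np.allclose(pmf, stats.binom.pmf(np.arange(n + 1), n, p)))   # True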

2.3. Markov’s inequality
Let X be a random variable (discrete or continuous). If g(x) is a nonnegative function, then

    P[g(X) ≥ b] ≤ (1/b) E[g(X)],   (24)

provided that E[g(X)] exists, where b is a positive constant. This is known as Markov's inequality. Let us now prove this inequality using the δ-function approach. The density function (or probability mass function) of Y = g(X) is given by (13), where f(x) is the density function (or probability mass function) of X. Then,

    P[g(X) ≥ b] = ∫_b^∞ dy ∫_{−∞}^{∞} f(x) δ[y − g(x)] dx
                = ∫_{−∞}^{∞} f(x) dx ∫_b^∞ δ[y − g(x)] dy
                ≤ (1/b) ∫_{−∞}^{∞} f(x) dx ∫_b^∞ y δ[y − g(x)] dy
                ≤ (1/b) ∫_{−∞}^{∞} f(x) g(x) dx   (25)
                = (1/b) E[g(X)].
Inequality (25) follows from the fact that

    ∫_b^∞ y δ[y − g(x)] dy = 0 if g(x) < b,
                           = g(x)/2 if g(x) = b,
                           = g(x) if g(x) > b,

and hence ∫_b^∞ y δ[y − g(x)] dy ≤ g(x).
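A quick Monte Carlo illustration of (24) (not from the article), with g(x) = x and X exponential with mean 1, so that E[g(X)] = 1:

import numpy as np

rng = np.random.default_rng(2)
x = rng.exponential(scale=1.0, size=1_000_000)

for b in (1.0, 2.0, 4.0):
    # empirical P(X >= b) versus the Markov bound E[X]/b
    print(b, (x >= b).mean(), x.mean() / b)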

2.4. Representation of density functions

Let µₙ denote the nth noncentral moment of a random variable X whose density function is f(x). Then f(x) can be represented as [see Kanwal ([8], Section 13.1)]

    f(x) = ∑_{i=0}^{∞} [(−1)^i / i!] µᵢ δ^(i)(x),   (26)

where δ^(i)(x) is the generalized ith derivative of δ(x), as in formula (4). Using (26), it is easy to verify that

    ∫_{−∞}^{∞} x^n f(x) dx = ∫_{−∞}^{∞} x^n ∑_{i=0}^{∞} [(−1)^i / i!] µᵢ δ^(i)(x) dx
                           = ∑_{i=0}^{∞} [(−1)^i µᵢ / i!] ∫_{−∞}^{∞} x^n δ^(i)(x) dx
                           = µₙ,   (27)

since by (4),

    ∫_{−∞}^{∞} x^n δ^(i)(x) dx = 0 if i ≠ n, and (−1)^n n! if i = n.

Thus (26) provides a representation of f(x) as a generalized function in terms of its noncentral moments. Note that this representation is only meaningful when used inside an integral, as in (27). For example, using (26), we can easily obtain the moment generating function, φ(t), of X as a power series in t:

    φ(t) = E(e^{Xt})
         = ∫_{−∞}^{∞} e^{xt} ∑_{n=0}^{∞} [(−1)^n / n!] µₙ δ^(n)(x) dx
         = ∑_{n=0}^{∞} [(−1)^n µₙ / n!] ∫_{−∞}^{∞} e^{xt} δ^(n)(x) dx   (28)
         = ∑_{n=0}^{∞} [(−1)^n µₙ / n!] (−1)^n [d^n(e^{xt})/dx^n]_{x=0},  by (4),
         = ∑_{n=0}^{∞} (µₙ / n!) t^n.   (29)

The interchange of the order of integration and summation in (28) is permissible if the power series in (29) is uniformly convergent (with respect to t) in some neighborhood of the origin [see, for example, Fulks ([5], page 515)].

Note that the moments of X uniquely determine the distribution of X if the power series

    ∑_{n=0}^{∞} (µₙ / n!) τ^n   (30)

is absolutely convergent for some τ > 0 [see, for example, Fisz ([4], Theorem 3.2.1)]. This follows from the fact that absolute convergence of the series in (30) guarantees uniform convergence of the series in (29) within the interval (−τ, τ) [see, for example, Khuri ([10], Theorem 5.4.4)], and hence the existence of the moment generating function within the same interval.
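As an illustration of (29) (not from the article), take X standard normal, whose noncentral moments are µ_{2k} = (2k)!/(2^k k!) with all odd moments zero; the partial sums of (29) then converge to the known moment generating function e^{t²/2}.

from math import exp, factorial

def mu(n):
    # noncentral moments of the standard normal distribution
    if n % 2 == 1:
        return 0
    k = n // 2
    return factorial(2 * k) // (2**k * factorial(k))

t = 0.7
partial = sum(mu(n) * t**n / factorial(n) for n in range(40))
print(partial, exp(t**2 / 2))   # agree to machine precision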
The representation of f(x) as in (26) is instrumental in deriving an approximation for the integral ∫_a^b ϕ(x) e^{−λΨ(x)} dx using the method of Laplace, where λ is a large positive constant, ϕ(x) is continuous on [a, b], and the first and second derivatives of Ψ(x) are continuous on [a, b] [see Kanwal ([8], Section 13.2)]. This integral was originally used by Pierre Laplace in his development of the central limit theorem. In addition, Laplace's approximation is useful in several areas of statistics, particularly in Bayesian statistics (see Kass, Tierney, and Kadane [9]).
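As a sketch of Laplace's method (the formula below is the standard leading-order term, not taken from the article): if Ψ has an interior minimum at x₀ in [a, b] with Ψ″(x₀) > 0, then ∫_a^b ϕ(x) e^{−λΨ(x)} dx ≈ ϕ(x₀) e^{−λΨ(x₀)} √(2π/(λΨ″(x₀))) for large λ. A numerical check with ϕ(x) = 1 and Ψ(x) = x²:

import numpy as np

lam, a, b = 50.0, -1.0, 1.0
x = np.linspace(a, b, 100_001)
dx = x[1] - x[0]

quad = np.sum(np.exp(-lam * x**2)) * dx      # direct quadrature
laplace = np.sqrt(2 * np.pi / (lam * 2.0))   # Psi''(0) = 2 here
print(quad, laplace)                         # nearly identical for large lam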

3. Concluding remarks
The Dirac delta function provides a very helpful tool in mathematical statistics.
Several examples were presented in this manuscript to demonstrate its usefulness.
One of its main advantages is that it provides a unified approach for the treatment of discrete and continuous distributions. This was demonstrated in the discussions concerning transformations of random variables. Given the generalized nature of the δ-function and its derivatives, the δ-function approach has the potential to facilitate the understanding and development of classical concepts in mathematical statistics.

References
[1] AU, C., and TAM, J., 1999, The American Statistician, 53, 270-272.
[2] DIRAC, P.A.M., 1927, Proceedings of the Royal Society of London, Series A, 113,
621-641.
[3] DIRAC, P.A.M., 1958, The Principles of Quantum Mechanics (London: Oxford
University Press).
[4] FISZ, M., 1963, Probability Theory and Mathematical Statistics, third edition
(New York: Wiley).
[5] FULKS, W., 1978, Advanced Calculus, third edition (New York: Wiley).
[6] HOSKINS, R.F., 1979, Generalized Functions (New York: Wiley).
[7] HSU, H.P., 1984, Applied Fourier Analysis (San Diego, CA: Harcourt Brace
Jovanovich).
[8] KANWAL, R.P., 1998, Generalized Functions: Theory and Technique, second edition (Boston, MA: Birkhäuser).
[9] KASS, R.E., TIERNEY, L., and KADANE, J.B., 1991, in Statistical Multi-
ple Integration, edited by N. Flournoy and R.K. Tsutakawa (Providence, RI:
American Mathematical Society), pp. 89-99.
[10] KHURI, A.I., 2003, Advanced Calculus with Applications in Statistics, second edition (New York: Wiley).
[11] SAICHEV, A.I., and WOYCZYNSKI, W.A., 1997, Distributions in the Physical
and Engineering Sciences (Boston, MA: Birkhäuser).
