Beta distribution
In probability theory and statistics, the beta distribution is a family of continuous probability
distributions defined on the interval [0, 1] and parametrized by two positive shape parameters,
denoted by α and β, that appear as exponents of the random variable and control the shape of
the distribution.
The beta distribution has been applied to model the behavior of random variables limited to
intervals of finite length in a wide variety of disciplines. For example, it has been used as a
statistical description of allele frequencies in population genetics;[1] time allocation in project
management / control systems;[2] sunshine data;[3] variability of soil properties;[4] proportions of
the minerals in rocks in stratigraphy;[5] and heterogeneity in the probability of HIV transmission.[6]
In Bayesian inference, the beta distribution is the conjugate prior probability distribution for
the Bernoulli, binomial, negative binomial and geometric distributions. For example, the beta
distribution can be used in Bayesian analysis to describe initial knowledge concerning the
probability of success, such as the probability that a space vehicle will successfully complete a
specified mission. The beta distribution is a suitable model for the random behavior of
percentages and proportions.
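
As a rough illustration of what conjugacy means in practice, the sketch below updates a Beta(α, β) prior on a success probability after observing independent Bernoulli outcomes: the posterior is again a beta distribution, with the success count added to α and the failure count added to β. The function name `update_beta_prior` and the mission outcomes are hypothetical, chosen only for this example.

```python
def update_beta_prior(a, b, outcomes):
    """Return the posterior (a, b) after observing Bernoulli outcomes.

    Conjugacy: a Beta(a, b) prior combined with s successes and f failures
    gives a Beta(a + s, b + f) posterior.
    """
    successes = sum(1 for o in outcomes if o)
    failures = len(outcomes) - successes
    return a + successes, b + failures


# Assumed prior: Beta(1, 1), i.e. uniform on [0, 1].
prior = (1.0, 1.0)

# Made-up data: True marks a successful mission, False a failed one.
missions = [True, True, False, True, True]

posterior = update_beta_prior(*prior, missions)

# The mean of Beta(a, b) is a / (a + b).
posterior_mean = posterior[0] / (posterior[0] + posterior[1])
print(posterior, posterior_mean)
```

With four successes and one failure, the uniform prior becomes Beta(5, 2), whose mean 5/7 lies between the prior mean 1/2 and the observed success rate 4/5.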
The usual formulation of the beta distribution is also known as the beta distribution of the
first kind, whereas beta distribution of the second kind is an alternative name for the beta
prime distribution.
Characterization
The probability density function (pdf) of the beta distribution, for 0 ≤ x ≤ 1 and shape
parameters α, β > 0, is a power function of the variable x and of its reflection (1 − x) as follows:

$$f(x; \alpha, \beta) = \frac{x^{\alpha-1}(1-x)^{\beta-1}}{\mathrm{B}(\alpha, \beta)} = \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\, x^{\alpha-1}(1-x)^{\beta-1}$$

where Γ(z) is the gamma function. The beta function, B(α, β), is a normalization constant to ensure
that the total probability integrates to 1. In the above equations x is a realization (an observed
value that actually occurred) of a random process X.
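
The density can be evaluated directly from the gamma function. The following is a minimal sketch using only the Python standard library. It works in log space for numerical stability and is restricted to the open interval 0 < x < 1. The names `beta_pdf` and `midpoint_integral`, and the parameter values α = 2, β = 5, are illustrative assumptions. The midpoint rule is used only as a crude check that the normalization constant does its job.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at a point x with 0 < x < 1, for a, b > 0."""
    if not (0.0 < x < 1.0):
        raise ValueError("this sketch only evaluates the open interval 0 < x < 1")
    if a <= 0 or b <= 0:
        raise ValueError("shape parameters must be positive")
    # log B(a, b) = log Gamma(a) + log Gamma(b) - log Gamma(a + b)
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    log_density = (a - 1) * math.log(x) + (b - 1) * math.log1p(-x) - log_beta
    return math.exp(log_density)


def midpoint_integral(f, n=100_000):
    """Midpoint-rule approximation of the integral of f over (0, 1)."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


# The total probability should come out very close to 1.
total = midpoint_integral(lambda x: beta_pdf(x, 2.0, 5.0))
print(total)
```

For α = β = 2 the density reduces to 6x(1 − x), so `beta_pdf(0.5, 2.0, 2.0)` should be close to 1.5.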
This definition includes both ends x = 0 and x = 1, which is consistent with definitions for
other continuous distributions supported on a bounded interval which are special cases of the
beta distribution, for example the arcsine distribution, and consistent with several authors,
like N. L. Johnson and S. Kotz.[7][8][9][10] However, the inclusion of x = 0 and x = 1 does not work
for α, β < 1, where the density is unbounded at the ends; accordingly, several other authors,
including W. Feller,[11][12][13] choose to exclude the ends x = 0 and x = 1 (such that the two ends
are not actually part of the domain of the density function) and consider instead 0 < x < 1.
Several authors, including N. L. Johnson and S. Kotz,[7] use the symbols p and q (instead of
α and β) for the shape parameters of the beta distribution, reminiscent of the symbols
traditionally used for the parameters of the Bernoulli distribution, because the beta distribution
approaches the Bernoulli distribution in the limit when both shape parameters α and β
approach the value of zero.
In the following, a random variable X beta-distributed with parameters α and β will be denoted
by:[14][15]

$$X \sim \operatorname{Beta}(\alpha, \beta)$$
Other notations for beta-distributed random variables used in the statistical literature are
$X \sim \mathcal{B}e(\alpha, \beta)$[16] and $X \sim \beta_{\alpha, \beta}$.
Properties
Measures of central tendency
Mode
The mode of a beta-distributed random variable X with α, β > 1 is the most likely value of the
distribution (corresponding to the peak in the pdf), and is given by the following expression:[7]

$$\frac{\alpha - 1}{\alpha + \beta - 2}$$

When both parameters are less than one (α, β < 1), this is the anti-mode: the lowest point
of the probability density curve.[9]
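
As a quick sanity check, the closed form (α − 1)/(α + β − 2) for the interior mode can be compared with a brute-force search for the peak of the density. The sketch below assumes α, β > 1. The helper names `beta_mode`, `log_kernel` and `grid_argmax`, and the values α = 2, β = 5, are hypothetical. Because the normalization constant does not move the peak, the search maximizes only the unnormalized log-density.

```python
import math


def beta_mode(a, b):
    """Interior mode of Beta(a, b); the formula is only valid for a, b > 1."""
    if a <= 1 or b <= 1:
        raise ValueError("interior mode formula requires a > 1 and b > 1")
    return (a - 1) / (a + b - 2)


def log_kernel(x, a, b):
    """Log of x^(a-1) * (1-x)^(b-1), the density without its constant."""
    return (a - 1) * math.log(x) + (b - 1) * math.log1p(-x)


def grid_argmax(a, b, n=10_000):
    """Grid point in (0, 1) at which the unnormalized density is largest."""
    grid = [(i + 0.5) / n for i in range(n)]
    return max(grid, key=lambda x: log_kernel(x, a, b))


a, b = 2.0, 5.0
closed_form = beta_mode(a, b)  # (2 - 1) / (2 + 5 - 2) = 0.2
numeric = grid_argmax(a, b)    # agrees up to the grid spacing
print(closed_form, numeric)
```

The same search applied with α, β < 1 would instead run off to an end of the interval, which is consistent with the formula giving an anti-mode in that regime.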
Letting α = β, the expression for the mode simplifies to 1/2, showing that for α = β > 1 the
mode (resp. anti-mode when α = β < 1) is at the center of the distribution: it is symmetric in
those cases. See the "Shapes" section in this article for a full list of mode cases, for arbitrary
values of α and β. For several of these cases, the maximum value of the density function
occurs at one or both ends. In some cases the (maximum) value of the density function
occurring at the end is finite. For example, in the case of α = 2, β = 1 (or α = 1, β = 2), the
density function becomes a right-triangle distribution which is finite at both ends. In several
other cases there is a singularity at one end, where the value of the density function
approaches infinity. For example, in the case α = β = 1/2, the beta distribution simplifies to
become the arcsine distribution. There is debate among mathematicians about some of
these cases and whether the ends (x = 0 and x = 1) can be called modes or not.[12][14]
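
The two kinds of endpoint behavior can be seen numerically. In the sketch below, the density for α = 2, β = 1 (which equals 2x) approaches the finite value 2 as x tends to 1, while the density for α = β = 1/2 (the arcsine density 1/(π√(x(1 − x)))) grows without bound as x tends to 0. The function `beta_pdf` and the evaluation points are assumptions made for this illustration.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at a point x with 0 < x < 1, for a, b > 0."""
    log_beta = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1) * math.log(x) + (b - 1) * math.log1p(-x) - log_beta)


distances = (1e-2, 1e-4, 1e-6)  # distance from the endpoint being approached

# Right-triangle case: evaluate near x = 1, where the density peaks at 2.
triangle_values = [beta_pdf(1.0 - eps, 2.0, 1.0) for eps in distances]

# Arcsine case: evaluate near x = 0, where the density has a singularity.
arcsine_values = [beta_pdf(eps, 0.5, 0.5) for eps in distances]

for eps, t, s in zip(distances, triangle_values, arcsine_values):
    print(f"eps={eps:g}  triangle={t:.6f}  arcsine={s:.3f}")
```

The first column of values levels off just below 2, whereas the second keeps growing roughly like 1/(π√eps) as the evaluation point moves toward the end.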