Negative binomial distribution
In probability theory and statistics, the negative binomial distribution is a discrete probability distribution of the number of successes in a sequence of Bernoulli trials before a specified (non-random) number r of failures occurs. For example, if one throws a die repeatedly until the third time a 1 appears, then the number of non-1s that have appeared follows a negative binomial distribution (here rolling a 1 plays the role of a failure, and r = 3).
To understand the negative binomial distribution:
Suppose there is a sequence of independent Bernoulli trials, each trial having two potential outcomes called success and failure. In each trial the probability of success is p and the probability of failure is 1 − p. We observe this sequence until a predefined number r of failures has occurred. Then the random number of successes we have seen, X, will have the negative binomial (or Pascal) distribution with parameters r and p, written X ~ NB(r, p).
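The stopping rule described above can be sketched as a small simulation. This is only an illustration: the function name simulate_successes, the fixed seed and the sample size are choices made here, and the die example from the introduction supplies the parameters (a 1 counts as a failure, so p = 5/6 and r = 3).

```python
import random

def simulate_successes(r, p, rng):
    """Run Bernoulli(p) trials until r failures occur; return the number of successes seen."""
    successes = 0
    failures = 0
    while failures < r:
        if rng.random() < p:   # success, with probability p
            successes += 1
        else:                  # failure, with probability 1 - p
            failures += 1
    return successes

rng = random.Random(0)  # fixed seed so the run is repeatable (an arbitrary choice)

# Die example: a "failure" is rolling a 1, so p = 5/6 and r = 3.
samples = [simulate_successes(3, 5 / 6, rng) for _ in range(20000)]
sample_mean = sum(samples) / len(samples)
print(sample_mean)
```

Each 1 is preceded on average by five non-1s, so the sample mean should come out close to 15.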
Parameters:
r > 0: number of failures until the experiment is stopped (integer, but the definition can also be extended to reals)
p ∈ (0, 1): success probability in each experiment (real)
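PMF: $P(X = k) = \binom{k+r-1}{k}\, p^k (1-p)^r$, for k = 0, 1, 2, ...
CDF: $P(X \le k) = 1 - I_p(k+1,\, r)$, where $I_p$ is the regularized incomplete beta function
Mean: $E(X) = \dfrac{pr}{1-p}$
Variance: $\operatorname{Var}(X) = \dfrac{pr}{(1-p)^2}$
MGF: $M_X(t) = \left(\dfrac{1-p}{1-pe^t}\right)^r$, for $t < -\ln p$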
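The summary quantities can be checked numerically. The sketch below assumes the mass function P(X = k) = C(k + r − 1, k) p^k (1 − p)^r that follows from the definition above (k successes before the r-th failure), and compares a truncated sum against the closed forms pr/(1 − p) for the mean and pr/(1 − p)^2 for the variance. The name nb_pmf, the values r = 3 and p = 0.5, and the truncation point are illustrative choices.

```python
from math import comb

def nb_pmf(k, r, p):
    """P(X = k): k successes (probability p each) before the r-th failure, integer r."""
    return comb(k + r - 1, k) * p**k * (1 - p)**r

r, p = 3, 0.5
ks = range(200)  # truncation point; the tail beyond it is negligible for these values

total = sum(nb_pmf(k, r, p) for k in ks)                   # should be close to 1
mean = sum(k * nb_pmf(k, r, p) for k in ks)                # compare with p*r/(1-p)
var = sum((k - mean) ** 2 * nb_pmf(k, r, p) for k in ks)   # compare with p*r/(1-p)**2

print(total, mean, var)
```

For r = 3 and p = 0.5 the closed forms give a mean of 3 and a variance of 6, which the truncated sums should reproduce to within floating-point error.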
Example:
Pat is required to sell candy bars to raise money for the 6th grade field trip. There are thirty houses in the neighborhood, and Pat is not supposed to return home until five candy bars have been sold. So Pat goes door to door, selling candy bars. At each house, there is a 0.4 probability of selling one candy bar and a 0.6 probability of selling nothing.
What's the probability mass function for selling the last candy bar at the nth house?
This example swaps the roles of success and failure relative to the definition above: here NegBin(r, p) describes the probability of k failures and r successes in k + r Bernoulli(p) trials, with a success on the last trial. Selling five candy bars means getting five successes, so r = 5 and p = 0.4. The number of trials (i.e. houses) this takes is therefore k + 5 = n. The random variable we are interested in is the number of houses, so we substitute k = n − 5 into a NegBin(5, 0.4) mass function and obtain the following mass function of the distribution of houses (for n ≥ 5):
$$f(n) = \binom{(n-5)+5-1}{n-5}\, 0.4^5\, 0.6^{n-5} = \binom{n-1}{n-5}\, 0.4^5\, 0.6^{n-5}$$
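What's the probability that Pat finishes on the tenth house? Writing f(n) for the probability of finishing at the nth house:
$$f(10) = \binom{9}{5}\, 0.4^5\, 0.6^5 = 0.1003290624$$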
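What's the probability that Pat finishes on or before reaching the eighth house? To finish on or before the eighth house, Pat must finish at the fifth, sixth, seventh, or eighth house. Sum those probabilities, writing f(n) for the probability of finishing at the nth house:
$$f(5) + f(6) + f(7) + f(8) = 0.01024 + 0.03072 + 0.055296 + 0.0774144 = 0.1736704$$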
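What's the probability that Pat exhausts all 30 houses in the neighborhood? This can be expressed as the probability that Pat does not finish on the fifth through the thirtieth house, writing f(n) for the probability of finishing at the nth house:
$$1 - \sum_{n=5}^{30} f(n) \approx 1 - 0.99849 = 0.00151$$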
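The three questions about Pat can be checked with a few lines of code. This is a minimal sketch assuming the mass function from the example, f(n) = C(n − 1, n − 5) 0.4^5 0.6^(n − 5) for n ≥ 5; the function name house_pmf and the variable names are introduced here for illustration.

```python
from math import comb

def house_pmf(n, r=5, p=0.4):
    """Probability that the r-th sale happens at exactly the n-th house (n >= r)."""
    if n < r:
        return 0.0  # at least r houses are needed to make r sales
    # comb(n - 1, r - 1) equals comb(n - 1, n - r): placements of the first r - 1 sales
    return comb(n - 1, r - 1) * p**r * (1 - p)**(n - r)

p_tenth = house_pmf(10)                                   # finishes exactly at house 10
p_by_eighth = sum(house_pmf(n) for n in range(5, 9))      # finishes at house 5, 6, 7 or 8
p_exhaust = 1 - sum(house_pmf(n) for n in range(5, 31))   # not finished after house 30

print(round(p_tenth, 5), round(p_by_eighth, 5), round(p_exhaust, 5))
# prints: 0.10033 0.17367 0.00151
```

So Pat has about a 10% chance of finishing at exactly the tenth house, about a 17% chance of finishing by the eighth, and only about a 0.15% chance of visiting all thirty houses without selling five bars.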
Application:
Application of negative binomial modeling for discrete outcomes: A case study in aging research
Abstract:
We present a case study using the negative binomial regression model for discrete outcome data arising from a clinical trial designed to evaluate the effectiveness of a prehabilitation program in preventing functional decline among physically frail, community-living older persons. The primary outcome was a measure of disability at 7 months that had a range from 0 to 16 with a mean of 2.8 (variance of 16.4) and a median of 1. The data were right skewed with clumping at zero (i.e., 40% of subjects had no disability at 7 months). Because the variance was nearly 6 times greater than the mean, the negative binomial model provided an improved fit to the data and accounted better for overdispersion than the Poisson regression model, which assumes that the mean and variance are the same. Although correcting the variance and corresponding test statistics for overdispersion is a standard procedure in the Poisson model, the estimates of the regression parameters are inefficient because they have more sampling variability than is necessary. The negative binomial model provides an alternative approach for the analysis of discrete data where overdispersion is a problem, provided that the model is correctly specified and adequately fits the data.
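The overdispersion described in the abstract can be illustrated with a rough calculation from the quoted summary statistics alone. The sketch below is not the regression analysis of the study. It assumes this article's parametrization (mean pr/(1 − p), variance pr/(1 − p)^2, and P(X = 0) = (1 − p)^r), matches p and r to the reported mean and variance, and ignores the fact that the outcome is capped at 16. It then compares the implied probability of a zero with that of a Poisson distribution having the same mean.

```python
from math import exp

mean, variance = 2.8, 16.4  # summary statistics quoted in the abstract

dispersion_ratio = variance / mean  # equals 1 under a Poisson model

# Moment matching (an assumption of this sketch, not the study's fitting method):
# solve mean = p*r/(1-p) and variance = p*r/(1-p)**2 for p and r.
one_minus_p = mean / variance
p = 1 - one_minus_p
r = mean * one_minus_p / p  # non-integer here, which the extended definition allows

p_zero_poisson = exp(-mean)        # Poisson P(X = 0) with the same mean
p_zero_negbin = one_minus_p ** r   # negative binomial P(X = 0) = (1 - p)^r

print(dispersion_ratio, r, p_zero_poisson, p_zero_negbin)
```

Under these assumptions the Poisson model puts roughly 6% of its mass at zero, while the moment-matched negative binomial puts roughly 36% there, much closer to the 40% of subjects reported with no disability.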