
Copy of Lecture 5 - Count Data

The document discusses count data models, primarily focusing on the Poisson regression model and its limitations due to overdispersion. It explains the need for alternative models like the Negative Binomial model and methods for testing overdispersion, as well as addressing issues related to truncated and zero-inflated data. Additionally, it highlights an application involving cross-sectional data from a Brazilian survey on food security and crime.

Uploaded by

Aldryn Dylan

Count Data Model

Greene, section 18.4
Cameron and Trivedi, ch. 20
Basic Idea
• Y = non-negative integer count of events, which
in general takes few and small values
(0, 1, 2, ...)
• Ex.: number of visits to a doctor in a year,
number of patents registered by a firm in a
year, number of times you are robbed in a
year
• Starting point: Poisson process
Rare Events Law
• The total number of events approximately
follows the Poisson distribution if an event can
occur in any of a large number of trials,
but the probability of occurrence in any given trial is
small.
• The number of occurrences of the event then has the
Poisson distribution, with probability mass function:
$$\Pr(Y = y) = \frac{\exp(-\lambda)\,\lambda^{y}}{y!}, \qquad y = 0, 1, 2, \dots$$
Rare Events Law
• $\lambda$ = intensity parameter
• Poisson property: $E[Y] = \mathrm{Var}[Y] = \lambda$ →
equidispersion (mean = variance)
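The equidispersion property can be checked numerically. The sketch below is only an illustration: the intensity value 2.5 and the truncation of the infinite sum at 200 terms are assumptions chosen for the example, not values from the lecture.

```python
import math

def poisson_pmf(y: int, lam: float) -> float:
    """Pr(Y = y) = exp(-lam) * lam**y / y!, computed on the log scale."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

lam = 2.5                 # illustrative intensity (assumption)
support = range(0, 200)   # truncated support; the omitted tail is negligible here

mean = sum(y * poisson_pmf(y, lam) for y in support)
var = sum((y - mean) ** 2 * poisson_pmf(y, lam) for y in support)

# Both numbers should be (numerically) equal to lam.
print(round(mean, 6), round(var, 6))
```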
Poisson Regression Model
• Introducing a subscript i for each observation
and the hypothesis that the observations $(y_i, x_i)$ are
independent and identically distributed →
Poisson regression model.
• The model states that each $y_i$ is a draw
from a Poisson population with parameter $\lambda_i$,
which is related to the regressors $x_i$.
Poisson Regression Model
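• The density of this distribution is given by:
$$\Pr(Y = y_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{y_i!}$$

• The usual parametrization of $\lambda_i$ is the log-linear model:
$$\lambda_i = \exp(x_i'\beta) \quad \text{or} \quad \ln(\lambda_i) = x_i'\beta$$

• So, $E[y_i \mid x_i] = \mathrm{Var}[y_i \mid x_i] = \lambda_i = \exp(x_i'\beta)$

• Note: Poisson regression is intrinsically heteroskedastic, since the conditional variance changes with $x_i$.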
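Maximum Likelihood Estimation
$$\Pr(Y = y_i) = \frac{\exp(-\lambda_i)\,\lambda_i^{y_i}}{y_i!}$$
$$\ln[\Pr(Y = y_i)] = \ln(\exp(-\lambda_i)) + \ln(\lambda_i^{y_i}) - \ln(y_i!) = -\lambda_i + y_i \ln \lambda_i - \ln(y_i!)$$

But remember that $\lambda_i = \exp(x_i'\beta)$

So, we can write:

$$\ln L(\beta) = \sum_{i=1}^{n} \left[ y_i x_i'\beta - \exp(x_i'\beta) - \ln(y_i!) \right]$$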
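As a hedged illustration, the log-likelihood above can be coded directly and cross-checked against the sum of log Poisson probabilities. The function name `poisson_loglik`, the toy data, and the trial value of beta are all hypothetical and serve only as an example.

```python
import math

def poisson_loglik(beta, X, y):
    """ln L(beta) = sum_i [ y_i * x_i'beta - exp(x_i'beta) - ln(y_i!) ]."""
    total = 0.0
    for x_i, y_i in zip(X, y):
        xb = sum(b * x for b, x in zip(beta, x_i))  # x_i'beta
        total += y_i * xb - math.exp(xb) - math.lgamma(y_i + 1)  # lgamma(y+1) = ln(y!)
    return total

# Hypothetical toy data: a constant and one regressor.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1, 0, 3, 4]
beta = [0.1, 0.3]  # arbitrary trial value, not an estimate

# Cross-check: sum of ln Pr(Y = y_i) with lambda_i = exp(x_i'beta).
direct = 0.0
for x_i, y_i in zip(X, y):
    lam = math.exp(sum(b * x for b, x in zip(beta, x_i)))
    direct += math.log(math.exp(-lam) * lam ** y_i / math.factorial(y_i))

print(poisson_loglik(beta, X, y), direct)
```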
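Maximum Likelihood Estimation
$$\ln L(\beta) = \sum_{i=1}^{n} \left[ y_i x_i'\beta - \exp(x_i'\beta) - \ln(y_i!) \right]$$

$$\frac{\partial \ln L(\beta)}{\partial \beta} = \sum_{i=1}^{n} \left[ y_i x_i - \exp(x_i'\beta)\, x_i \right] = \sum_{i=1}^{n} \left[ y_i - \exp(x_i'\beta) \right] x_i = 0$$

$$\frac{\partial^2 \ln L(\beta)}{\partial \beta\, \partial \beta'} = -\sum_{i=1}^{n} \exp(x_i'\beta)\, x_i x_i'$$

The Hessian is negative definite for all x and $\beta$, so $\ln L(\beta)$ is globally concave.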
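Because the log-likelihood is globally concave, a Newton-Raphson iteration built from the score and the Hessian is a natural way to solve the first-order conditions. The sketch below assumes a model with a constant and a single regressor, so the 2x2 system can be solved by hand; the function names and the toy data are hypothetical. It also returns the inverse of minus the Hessian at the solution, which is the usual estimate of the covariance matrix of the estimator.

```python
import math

def score_and_information(b0, b1, x, y):
    """Score g = sum (y_i - lam_i) x_i and A = sum lam_i x_i x_i' (minus the Hessian),
    with x_i = (1, x_i)' and lam_i = exp(b0 + b1 * x_i)."""
    g0 = g1 = a00 = a01 = a11 = 0.0
    for x_i, y_i in zip(x, y):
        lam = math.exp(b0 + b1 * x_i)
        g0 += y_i - lam
        g1 += (y_i - lam) * x_i
        a00 += lam
        a01 += lam * x_i
        a11 += lam * x_i * x_i
    return (g0, g1), (a00, a01, a11)

def fit_poisson(x, y, tol=1e-10, max_iter=50):
    """Newton-Raphson for a Poisson regression with a constant and one regressor."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the constant-only fit
    for _ in range(max_iter):
        (g0, g1), (a00, a01, a11) = score_and_information(b0, b1, x, y)
        det = a00 * a11 - a01 * a01
        # Newton step: beta_new = beta - H^{-1} g = beta + A^{-1} g
        d0 = (a11 * g0 - a01 * g1) / det
        d1 = (a00 * g1 - a01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < tol:
            break
    # Inverse of A at the final estimate: estimated covariance matrix of beta-hat.
    _, (a00, a01, a11) = score_and_information(b0, b1, x, y)
    det = a00 * a11 - a01 * a01
    cov = [[a11 / det, -a01 / det], [-a01 / det, a00 / det]]
    return (b0, b1), cov

# Hypothetical toy data.
x = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
y = [1, 0, 2, 3, 6, 9]

(b0_hat, b1_hat), cov = fit_poisson(x, y)
(g0, g1), _ = score_and_information(b0_hat, b1_hat, x, y)
print("estimates:", b0_hat, b1_hat)
print("score at the estimates (should be near zero):", g0, g1)
```

At the solution the first score element is the sum of the residuals $y_i - \hat\lambda_i$, so with a constant in the model the fitted values sum to the observed total.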
Maximum Likelihood Estimator
• Newton's method is generally used and
normally converges quickly.
• Variance-covariance matrix estimator for $\hat\beta$:
$$\left[ \sum_{i=1}^{n} \hat\lambda_i\, x_i x_i' \right]^{-1}$$
Conditional mean and variance
• Given the estimates, the predicted mean for each
observation i is given by:
$$E[y_i \mid x_i] = \hat\lambda_i = \exp(x_i'\hat\beta)$$
• The variance of this prediction is given by:
$$\widehat{\mathrm{Var}}[\hat\lambda_i \mid x_i] = \hat\lambda_i^{2}\, x_i' V x_i,$$
where V is the estimated asymptotic covariance
matrix for $\hat\beta$.
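The variance of the prediction follows from the delta method. The short derivation below is a sketch under the usual first-order approximation, with V denoting the estimated asymptotic covariance matrix of the coefficient estimates:

```latex
\hat\lambda_i = \exp(x_i'\hat\beta)
\quad\Rightarrow\quad
\frac{\partial \hat\lambda_i}{\partial \hat\beta} = \exp(x_i'\hat\beta)\, x_i = \hat\lambda_i x_i ,
\qquad
\widehat{\mathrm{Var}}[\hat\lambda_i \mid x_i]
\approx \left(\hat\lambda_i x_i\right)' V \left(\hat\lambda_i x_i\right)
= \hat\lambda_i^{2}\, x_i' V x_i .
```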
Marginal Effects
• As stated, the prediction for each observation i is
given by:
$$E[y_i \mid x_i] = \hat\lambda_i = \exp(x_i'\hat\beta)$$

• So, marginal effects are given by:
$$\frac{\partial E(y \mid x)}{\partial x_j} = \beta_j \exp(x'\beta)$$
• For example, if $\hat\beta_j = 0.25$ and $\exp(x'\hat\beta) = 3$, then a
one-unit change in the j-th regressor increases
the expectation of y by approximately 0.75 units
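The marginal effect is a derivative, so for a full one-unit change it is a first-order approximation. A minimal sketch with the numbers from the example above compares it with the exact discrete change implied by the exponential mean, $\exp(x'\beta)\,[\exp(\beta_j) - 1]$; the variable names are illustrative.

```python
import math

beta_j = 0.25    # coefficient from the example above
mean_at_x = 3.0  # exp(x'beta) from the example above

# Derivative of the conditional mean with respect to x_j.
marginal_effect = beta_j * mean_at_x

# Exact change in the mean when x_j increases by one unit:
# exp(x'beta + beta_j) - exp(x'beta) = exp(x'beta) * (exp(beta_j) - 1).
discrete_change = mean_at_x * (math.exp(beta_j) - 1.0)

print(marginal_effect)            # 0.75
print(round(discrete_change, 4))  # 0.8521
```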
Marginal Effects
• As before, if you want a single summary measure, it is
common to report the average marginal effect,
computed over all individuals in the sample:
$$N^{-1} \sum_{i} \frac{\partial E(y_i \mid x_i)}{\partial x_{ij}} = \hat\beta_j\, N^{-1} \sum_{i} \exp(x_i'\hat\beta)$$
• If $\beta_j$ is twice as large as $\beta_k$, then the effect of
changing the j-th regressor by one unit is twice
that of changing the k-th regressor by one unit.
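A minimal sketch of the average marginal effect computation, assuming an exponential conditional mean. The coefficient vector and the design matrix are hypothetical numbers chosen so that one slope is twice the other, which makes the ratio property visible.

```python
import math

def average_marginal_effects(beta, X):
    """AME_j = beta_j * (1/N) * sum_i exp(x_i'beta) for every coefficient j."""
    n = len(X)
    mean_lambda = sum(
        math.exp(sum(b * x for b, x in zip(beta, x_i))) for x_i in X
    ) / n
    return [b * mean_lambda for b in beta]

# Hypothetical estimates: constant, x1, x2 (the slope on x1 is twice that on x2).
beta_hat = [0.2, 0.5, 0.25]
X = [
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.5],
    [1.0, 2.0, 0.0],
    [1.0, 0.5, 2.0],
]

ame = average_marginal_effects(beta_hat, X)
# The entry for the constant is not a meaningful effect; only slopes are compared.
print(ame[1], ame[2], ame[1] / ame[2])
```

The ratio of the two slope effects equals the ratio of the coefficients because the common factor, the average fitted mean, cancels.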
Important issue
• The Poisson distribution is usually too restrictive for
count data.
• The distribution is parametrized in terms of a
single scalar ($\lambda$), which imposes equidispersion
(conditional mean = conditional variance).
• However, in many applications with count data,
the variance exceeds the mean
(overdispersion).
• So, we need to test for overdispersion.
If it is present, we will need to
estimate other models.
Testing Overdispersion
• A statistical test of overdispersion is therefore
highly desirable after running a Poisson
regression.
• Most count models with overdispersion
specify it to be of the form:
$$V[y_i \mid x_i] = \lambda_i + \alpha\, g(\lambda_i)$$
• where $\alpha$ is an unknown parameter and $g(\cdot)$ is a
known function, most commonly $g(\lambda_i) = \lambda_i^2$
or $g(\lambda_i) = \lambda_i$
Testing Overdispersion
• We assume that, under both the null and the
alternative hypothesis, the mean is correctly
specified as, for example, $\exp(x'\beta)$.
• Under $H_0$: $\alpha = 0$ (equidispersion)
• An overdispersion test statistic can be
computed by estimating the Poisson model,
constructing fitted values $\hat\lambda_i = \exp(x_i'\hat\beta)$

Testing Overdispersion
• and running the auxiliary OLS regression (without a
constant):
$$\frac{(y_i - \hat\lambda_i)^2 - y_i}{\hat\lambda_i} = \alpha\, \frac{g(\hat\lambda_i)}{\hat\lambda_i} + u_i$$
• where $u_i$ is an error term. The t-statistic for $\alpha$ is
asymptotically normal under the null hypothesis
of no overdispersion
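The auxiliary regression has a single regressor and no constant, so it can be computed by hand. The sketch below is an illustration under stated assumptions: it uses a constant-only Poisson model (whose fitted value is the sample mean for every observation), hypothetical count data, the choice $g(\lambda) = \lambda^2$, and the conventional OLS standard error.

```python
import math

def overdispersion_test(y, lam_hat, g=lambda lam: lam ** 2):
    """OLS without a constant of ((y - lam)^2 - y) / lam on g(lam) / lam.
    Returns the estimate of alpha and its conventional t-statistic."""
    n = len(y)
    z = [((y_i - l) ** 2 - y_i) / l for y_i, l in zip(y, lam_hat)]  # dependent variable
    w = [g(l) / l for l in lam_hat]                                 # single regressor
    sww = sum(w_i * w_i for w_i in w)
    alpha_hat = sum(z_i * w_i for z_i, w_i in zip(z, w)) / sww
    resid = [z_i - alpha_hat * w_i for z_i, w_i in zip(z, w)]
    s2 = sum(u * u for u in resid) / (n - 1)  # one estimated parameter
    se = math.sqrt(s2 / sww)
    return alpha_hat, alpha_hat / se

# Hypothetical counts with many zeros and a few large values.
y = [0, 0, 0, 1, 0, 7, 0, 2, 0, 12, 1, 0, 0, 5, 0, 3]

# Constant-only Poisson fit: lambda-hat equals the sample mean for every i.
y_bar = sum(y) / len(y)
lam_hat = [y_bar] * len(y)

alpha_hat, t_stat = overdispersion_test(y, lam_hat)
print(alpha_hat, t_stat)
```

A large positive t-statistic is evidence against equidispersion. In the constant-only case the estimate reduces to (sample variance minus mean) divided by the squared mean, which gives a quick sanity check.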
Negative Binomial Model
• Overdispersion in count data may be due to
unobserved heterogeneity.
• Suppose the distribution of a random count y is
Poisson, conditional on the parameter $\lambda$, so that
$$f(y \mid \lambda) = \frac{\exp(-\lambda)\,\lambda^{y}}{y!}$$
• In the negative binomial model, we assume that $\lambda$ is
random. Let $\lambda = \mu\upsilon$, where $\mu$ is a deterministic
function of x [e.g., $\exp(x'\beta)$] and $\upsilon > 0$ is iid with
density $g(\upsilon \mid \alpha)$
• So, different observations may have different $\lambda$, and
part of this difference is due to the random component
$\upsilon$.
Negative Binomial Model
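• The distribution of $y_i$ conditional on $x_i$ and $\upsilon_i$ remains Poisson, with
conditional mean and variance $\lambda_i = \mu_i\upsilon_i$.

• The unconditional distribution $f(y_i \mid x_i)$ is the expected value (over
$\upsilon_i$) of $f(y_i \mid x_i, \upsilon_i)$:
$$f(y_i \mid x_i) = \int_0^{\infty} \frac{e^{-\mu_i\upsilon_i}\,(\mu_i\upsilon_i)^{y_i}}{y_i!}\, g(\upsilon_i)\, d\upsilon_i \qquad (**)$$

• In general, a gamma distribution is assumed for $\upsilon_i$, so that:
$$g(\upsilon_i) = \frac{\theta^{\theta}}{\Gamma(\theta)}\, e^{-\theta\upsilon_i}\, \upsilon_i^{\theta-1} \qquad (*)$$

Note: if $\theta$ is a natural number (1, 2, 3, ...), then $\Gamma(\theta) = (\theta - 1)!$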
Negative Binomial Model
• Substituting (*) into (**) and manipulating:
$$f(y_i \mid x_i) = \frac{\Gamma(\theta + y_i)}{\Gamma(1 + y_i)\,\Gamma(\theta)} \left\{ \frac{\mu_i}{\mu_i + \theta} \right\}^{y_i} \left\{ 1 - \frac{\mu_i}{\mu_i + \theta} \right\}^{\theta}$$

• which is one form of the negative binomial
distribution. The distribution has conditional mean
$\mu_i$ and conditional variance $\mu_i[1 + (1/\theta)\mu_i]$ (NB2).
• Note that var > mean, since $\theta > 0$ and $\mu_i > 0$
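The stated mean and variance of this negative binomial form can be verified numerically. The sketch below is illustrative only: the parameter values and the truncation point of the sum are assumptions, and the function name `nb2_pmf` is hypothetical.

```python
import math

def nb2_pmf(y: int, mu: float, theta: float) -> float:
    """Negative binomial (NB2) probability of y, computed on the log scale.
    Uses 1 - mu/(mu + theta) = theta/(mu + theta)."""
    log_p = (
        math.lgamma(theta + y) - math.lgamma(1 + y) - math.lgamma(theta)
        + y * math.log(mu / (mu + theta))
        + theta * math.log(theta / (mu + theta))
    )
    return math.exp(log_p)

mu, theta = 3.0, 1.5     # illustrative values (assumption)
support = range(0, 500)  # truncated support; the omitted tail is negligible here

total = sum(nb2_pmf(y, mu, theta) for y in support)
mean = sum(y * nb2_pmf(y, mu, theta) for y in support)
var = sum((y - mean) ** 2 * nb2_pmf(y, mu, theta) for y in support)

# Compare with the formulas: mean mu and variance mu * (1 + mu / theta).
print(total, mean, var, mu * (1 + mu / theta))
```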
Truncation
• In some studies, inclusion in the sample requires that sampled
individuals have been engaged in the activity of interest. Then
the count data are truncated, as the data are observed only
over part of the range of the response variable.
• Examples of truncated counts include the number of bus trips
made per week in surveys taken on buses, the number of
shopping trips made by individuals sampled at a mall, and the
number of unemployment spells among a pool of
unemployed.
• In all these cases we do not observe zero counts, so the data
are said to be zero-truncated, or more generally left-
truncated.
Truncation
• Truncation leads to inconsistent parameter estimates unless
the likelihood function is suitably modified. Consider the case
of zero truncation.
• Let f(y|θ) denote the density function and F(y|θ) = Pr[Y ≤ y]
denote the cumulative distribution function of the discrete
random variable, where θ is a parameter vector. If realizations
of y less than the positive integer 1 are omitted, the zero-
truncated density is given by:
f (y|θ, y ≥ 1) = f (y|θ)/ [1 − F(0|θ)], y = 1, 2, . . .
• In the zero-truncated Poisson case, for example, this
specializes to f(y|μ, y ≥ 1) = exp(−μ)μ^y/[y!(1 − exp(−μ))].
• It is possible to construct a log-likelihood based on this density
and to obtain maximum likelihood estimates.
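A minimal sketch of the zero-truncated Poisson density and the corresponding log-likelihood for a sample of positive counts. The function names and the value of μ are hypothetical; a regression version would replace μ with exp(x'β) for each observation.

```python
import math

def ztp_logpmf(y: int, mu: float) -> float:
    """ln f(y | mu, y >= 1) = -mu + y ln(mu) - ln(y!) - ln(1 - exp(-mu))."""
    if y < 1:
        raise ValueError("zero-truncated counts start at 1")
    # -expm1(-mu) equals 1 - exp(-mu) and is accurate for small mu.
    return -mu + y * math.log(mu) - math.lgamma(y + 1) - math.log(-math.expm1(-mu))

def ztp_loglik(mu: float, ys) -> float:
    """Log-likelihood of a sample of positive counts with a common mu."""
    return sum(ztp_logpmf(y, mu) for y in ys)

mu = 1.2  # illustrative value (assumption)
total = sum(math.exp(ztp_logpmf(y, mu)) for y in range(1, 200))
mean = sum(y * math.exp(ztp_logpmf(y, mu)) for y in range(1, 200))

# The truncated density sums to one and its mean, mu / (1 - exp(-mu)), exceeds mu.
print(total, mean, mu / (1 - math.exp(-mu)))
```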
Excess of zeros
• In some applications, the data contain many zeros, more than a
standard count model predicts.
• Ex.: number of homicides in a municipality, number of times
you are robbed in a year
• The zero-inflated model supplements a count density f2(·)
with a binary process with density f1(·).
• If the binary process takes value 0, with probability f1 (0), then
y = 0. If the binary process takes value 1, with probability f1(1),
then y takes count values 0, 1, 2, . . . from the count density
f2(·).
• This lets zero counts occur in two ways: as a realization of the
binary process and as a realization of the count process when
the binary random variable takes value 1.
Excess of zeros
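• The density is given by:
$$g(y) = \begin{cases} f_1(0) + [1 - f_1(0)]\, f_2(0) & \text{if } y = 0 \\ [1 - f_1(0)]\, f_2(y) & \text{if } y \ge 1 \end{cases}$$

• Regression models let f1(·) be a logit model and f2(·) be a
Poisson or negative binomial density.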
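The two ways of generating a zero translate directly into a probability function. The sketch below assumes a Poisson count density for f2 and treats f1(0) as a fixed number `p_zero`; in a regression setting it would come from a logit model. All names and parameter values are hypothetical.

```python
import math

def poisson_pmf(y: int, mu: float) -> float:
    """Count density f2: Poisson probability of y."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))

def zip_pmf(y: int, p_zero: float, mu: float) -> float:
    """Zero-inflated Poisson: p_zero plays the role of f1(0)."""
    if y == 0:
        # A zero from the binary process, or a zero drawn from the count process.
        return p_zero + (1.0 - p_zero) * poisson_pmf(0, mu)
    return (1.0 - p_zero) * poisson_pmf(y, mu)

p_zero, mu = 0.4, 2.0  # illustrative values (assumption)
probs = [zip_pmf(y, p_zero, mu) for y in range(0, 200)]

total = sum(probs)
mean = sum(y * p for y, p in enumerate(probs))

# More zeros than the plain Poisson, and a mean of (1 - p_zero) * mu.
print(total, probs[0], poisson_pmf(0, mu), mean)
```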
Application

https://www.anpec.org.br/encontro/2014/submissao/files_I/i12-967168f1c1bc02480e2a256adddcd66b.pdf
Application
• Cross-sectional data from the Special Supplements on Food
Security, Victimization and Justice included in the 2009 National
Household Sample Survey (PNAD, its Portuguese acronym),
carried out by the Brazilian Institute of Geography and
Statistics (IBGE)
• Advantages of this dataset compared with official figures:
1) its coverage is nationwide;
2) the response variable (i.e., crime) is free from the bias caused by
measurement error due to under-reporting;
3) it allows the effects of household income, education, and
other factors on the number of crimes to be identified based on
non-victimized individuals
Application
• Dependent variable: the number of times an individual was
victimized during one year.
• Models for four types of crime were estimated separately:
robbery (roubo), theft (furto), attempted theft/robbery, and
assault (agressão)
Application
They estimated a Negative Binomial model and a Zero-Inflated
Negative Binomial (ZINB) model. They reject the Poisson model after testing for
equidispersion and also discuss the ZINB because there are many zeros.
Men are victims of theft 0.111 times more than women, ceteris paribus.

In general, the effects are small, but the incidence is also very low.
