Demand system estimation
Objective: Estimate own- and cross-price elasticities of demand, a fundamental tool of competition
policy analysis (the focus is on price competition, but the analysis can be extended or modified if
non-price variables are more important)
Types of model: Continuous (situation in which consumer decides ‘how much of a good to
consume’ e.g. electricity) vs. discrete choice model (consumer decides ‘whether or not to buy a
good’ e.g. particular type of car)
Continuous choice: Single product demand
Estimating demand for a homogeneous product: In some markets consumers do not care about
the brand of the product, so long as it fulfils certain standard specifications, at least to a
reasonable approximation (e.g. sugar, oil, corn, or steel). Such a market is effectively composed
of one homogeneous product. Here the market demand function does not depend on the price of
any substitute product.
Demand equation (log-linear): $Q_t = D(P_t) = e^{a + \xi_t} P_t^{-b}$
where P is the market price and ξ is the component of demand unexplained by the model. We need to
estimate a and b by making a suitable assumption on ξ. Note that P and ξ may be correlated, because
what is unknown to the researcher may be known to the firms in the industry.
This specification provides the following demand equation:
$\ln Q_t = a - b \ln P_t + \xi_t$
Own-price elasticity of demand:
$\eta_{PED} = \frac{\partial \ln Q}{\partial \ln P} = -b$
To estimate this equation we only need data on market prices and quantities sold.
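As a rough illustration of the log-linear specification, the sketch below simulates data and recovers a and b by OLS. The parameter values, sample size, and the assumption that prices are independent of the demand shock are all hypothetical choices made for the example, not features of any real sugar data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "true" parameters, chosen only for illustration.
a_true, b_true = 5.0, 0.8
n = 500

# Simulated log prices are drawn independently of the demand shock,
# so the OLS exogeneity condition holds by construction.
ln_p = rng.normal(loc=3.0, scale=0.3, size=n)
xi = rng.normal(scale=0.1, size=n)
ln_q = a_true - b_true * ln_p + xi

# Regress ln Q on a constant and ln P.
X = np.column_stack([np.ones(n), ln_p])
coef, *_ = np.linalg.lstsq(X, ln_q, rcond=None)
a_hat, slope_hat = coef
b_hat = -slope_hat  # the demand equation is written with a minus sign on b

print(f"a_hat = {a_hat:.3f}, own-price elasticity = {-b_hat:.3f}")
```

In the log-log form the slope on ln P is itself the own-price elasticity, so no further transformation is needed.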
Suppose we want to estimate the demand for sugar. We have data on the quantity of sugar sold in
millions of pounds and the price at which it was sold in cents per pound for the period
1992–2006. In our model, only prices drive systematic variation in demand. However, the model
will be misspecified, since there are substantial quarterly variations in the level of
deliveries without any corresponding variations in the level of prices.
Similarly, demand may be shifting for other nonprice reasons during this time. For example,
consumers may have become more health conscious during the period and as a result the demand
of sugar may have decreased. Demand may have increased perhaps as consumers got wealthier
or busier. If any of these effects are at work, then our model as currently written down will
incorrectly ascribe the increase in demand solely to the decrease in prices and as a result
incorrectly estimate the effect of price on the demand for sugar.
The analyst's objective is to investigate the factors that are understood to affect demand during
the period of study and to incorporate the substantial ones into the analysis. An analyst simply
cannot do that if she is only looking at regression results; she needs to look at the data and study
the industry.
Not only demand, but supply factors may also affect deliveries. We can retrieve the demand
function (quantity demanded as a function of price) from these data if the model is correctly
specified, i.e. it satisfies the assumptions required to justify our estimation technique. For OLS to
be justified we need our unobserved component of demand to be uncorrelated with our included
regressors, i.e. $E[\xi_t(\theta^*) \mid P_t] = 0$, where $\theta^* = (a^*, b^*)$ denotes the true
parameter values.
One way to examine the validity of this condition is to plot the estimated residuals
against (each of) the regressors and look for patterns. It is also useful to plot the
residuals over time; in this example, we would see a seasonal pattern in the residuals,
which suggests a first potential avenue for improving on our initial specification.
Since the data present seasonality, we introduce quarterly indicator variables, omitting the fourth
quarter (otherwise the four quarterly indicators and the constant would be collinear). Our model
of demand becomes:
$\ln Q_t = a - b \ln P_t + \gamma_1 q_1 + \gamma_2 q_2 + \gamma_3 q_3 + \xi_t$,
where $q_i = 1$ in quarter $i$ and 0 otherwise.
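The collinearity point can be checked numerically. The sketch below builds quarterly indicators for 15 years of data (assuming, for illustration, that the 1992–2006 sample is quarterly) and compares the rank of the regressor matrix with and without the fourth-quarter dummy.

```python
import numpy as np

# Quarter index 1..4 repeated for 15 years (assumed quarterly sample).
quarter = np.tile(np.arange(1, 5), 15)
n = quarter.size

# One 0/1 indicator column per quarter.
q = np.column_stack([(quarter == k).astype(float) for k in range(1, 5)])
const = np.ones((n, 1))

X_all = np.hstack([const, q])           # constant + q1..q4: collinear
X_drop = np.hstack([const, q[:, :3]])   # constant + q1..q3: fourth quarter omitted

rank_all = np.linalg.matrix_rank(X_all)
rank_drop = np.linalg.matrix_rank(X_drop)
print(rank_all, X_all.shape[1])    # rank is below the number of columns
print(rank_drop, X_drop.shape[1])  # full column rank
```

With all four dummies the indicator columns sum to the constant, so one column is redundant and the OLS coefficients are not separately identified.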
We face the question of whether we should genuinely base policy on the resulting estimate. For
instance, if we were applying a hypothetical monopolist test for market definition, we would conclude
on the basis of this estimate that such a monopolist of the market for sugar could indeed profitably
increase its price by 5% above the competitive market price, and so we should consider a market
which is no wider than sugar for the purposes of antitrust. Indeed, an estimate of the elasticity of
demand of -0.38 suggests an optimal gross margin for a monopolist of 268%. Of course, even a
monopolist would in reality need to be able to cover her fixed costs from that margin.
Instrumental Variable Estimation: The OLS identification condition requires that, at the true
parameter values, $E[\xi_t(\theta^*) \mid (1, q_1, q_2, q_3, \ln P_t)] = 0$. This condition can fail
for numerous reasons.
First, non-price drivers of the unobserved component causing demand variation may also cause
market prices to rise. If so, then the “demand shocks” will cause variation in prices and therefore
demand shocks and prices will be correlated. Second, the model can be misspecified because of
omitted variables that are correlated with prices so that the misspecified model introduces a
correlation between the model’s error term and prices.
Each case results in an “endogeneity problem” in which the unobserved component of the demand
model and the price data are correlated. As a result, the OLS estimate of the price coefficient
is likely to be biased. To address these concerns, we consider instrumental variable (IV)
techniques to control for endogeneity. Basic requirements for an instrumental variable: it must be
correlated with the potentially endogenous regressor and uncorrelated with the unobserved
component of demand.
One popular estimator which addresses the endogeneity concern is the “two-stage least-squares”
(2SLS) estimator. If price is the endogenous variable then the two stages are (1) run a regression
of ln(prices) on the exogenous variables in the demand curve plus the instrument and (2) use
predicted ln(prices) instead of the actual ln(price) data to estimate the demand curve. The 2SLS
technique gets its name from the fact that the estimator can be obtained by using the predicted
explanatory variable from the first-stage regression in the estimation of the model instead of the
original variable.
The 2SLS estimator itself can also be obtained in one step, but it is usually helpful to look at the
output from both steps. The first necessary condition for an instrument to be valid can be tested
by running a regression of the endogenous explanatory variable (here prices) on the other
variables included in the demand model and treated as exogenous plus the instrument.
This is known as the “first-stage” regression because it is exactly the regression used as the first
step in constructing the 2SLS estimator. If the instrument appears to be statistically significant in
the first-stage regression, we conclude that it is conditionally correlated with prices in a way
which is potentially helpful for solving the endogeneity problem.
The second condition for an instrument to be valid is that it is not correlated with the demand
shock. Usually, the assessment of this second condition is harder, but one approach, albeit an
imperfect one, is to plot the estimated residuals against the instrument to check for correlation.
To illustrate, consider estimating sugar demand and suppose we consider using quarterly farm
wages as a potential instrument for prices. Farm wages are a cost of producing sugar and will
therefore ordinarily affect observed prices according to economic theory (and also farmers!). On
the other hand, given that farmers are a small minority of the population and that the increase in
their wages is not likely to translate into material increases in sugar consumption, farm wages are
unlikely to materially affect the aggregate demand for sugar.
The 2SLS estimation proceeds in two stages:
1st-stage regression: $\ln P_t = a - b \ln W_t + \gamma_1 q_{1t} + \gamma_2 q_{2t} + \gamma_3 q_{3t} + \varepsilon_t$,
2nd-stage regression: $\ln Q_t = a - b \widehat{\ln P_t} + \gamma_1 q_{1t} + \gamma_2 q_{2t} + \gamma_3 q_{3t} + \nu_t$,
where $W_t$ is the farm wage at time $t$ and $\widehat{\ln P_t}$ is the estimated log of price obtained
from the first-stage regression (Stata command ivreg).
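The two stages can be sketched by hand on simulated data. Everything below is an illustrative assumption: the data-generating process, the parameter values, and the way the demand shock feeds into price. The point is only to show OLS drifting away from the true elasticity when price responds to the demand shock, and the two-stage procedure recovering it.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4000
a_true, b_true = 5.0, 1.2            # hypothetical demand parameters

quarter = np.arange(n) % 4
Qd = np.column_stack([(quarter == k).astype(float) for k in range(3)])  # q1..q3
gamma = np.array([0.10, -0.05, 0.20])

ln_w = rng.normal(size=n)            # instrument: log farm wage (cost shifter)
xi = rng.normal(scale=0.3, size=n)   # demand shock
# Price responds to the cost shifter AND to the demand shock (endogeneity).
ln_p = 1.0 + 0.5 * ln_w + 0.8 * xi + rng.normal(scale=0.1, size=n)
ln_q = a_true - b_true * ln_p + Qd @ gamma + xi

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Naive OLS of ln Q on ln P and the quarterly dummies.
b_ols = -ols(np.column_stack([const, ln_p, Qd]), ln_q)[1]

# Stage 1: regress ln P on the instrument and the exogenous regressors.
Z = np.column_stack([const, ln_w, Qd])
ln_p_hat = Z @ ols(Z, ln_p)

# Stage 2: replace ln P by its first-stage fitted value.
b_2sls = -ols(np.column_stack([const, ln_p_hat, Qd]), ln_q)[1]

print(f"true b = {b_true}, OLS = {b_ols:.3f}, 2SLS = {b_2sls:.3f}")
```

Running the second stage by hand gives the right point estimate but not the right standard errors, which is one reason to use a packaged routine in practice.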
The quarterly dummies are also included in the first-stage regression since the requirement for an
instrument to be valid is that it is correlated with an endogenous variable conditional on the
included exogenous variables. Demand is itself seasonal, so that the quarterly dummies are not
correlated with prices conditional on the included exogenous variables and hence are not valid
instruments for prices, even if they are valid instruments for themselves, i.e., can be treated as
exogenous.
IV estimation results should be carefully scrutinized. They will only be reliable if the instrument
chosen for the first-stage regression satisfies the conditions $E[\xi_t \mid (X_t, W_t)] = 0$ and
$E[\ln P_t \mid (X_t, W_t)] \neq 0$, where $X_t = (1, q_1, q_2, q_3)$ are the exogenous regressors in the
demand equation and $W_t$ is the instrument, farm wages.
The first of these conditions is difficult to test. One way to evaluate whether it holds is to
examine a plot of the estimated residuals against the regressors. We should see no systematic
patterns in the graphs: whatever the value of $X_t$ or $W_t$, the error term should on average be
zero around those values. Such tests can be formalized (RESET). But there are limits to the
extent to which this assumption can be tested, since the model imposes this assumption on the
data in order to derive the IV estimates.
A variety of potential IV results can certainly be tested against each other and against
specifications which use more instruments than strictly necessary to achieve identification. But in
reality the first assumption is difficult to test convincingly. We likely rely on economic theory, to
the extent that the theory robustly tells us that, for example, a cost driver will generally not affect
consumer demand behavior and so will have no reason to be correlated with the unobserved
component of demand.
To evaluate the second condition we run a regression of the potentially endogenous variable
(here $\ln P_t$) on all the exogenous explanatory variables in the demand equation and the
instruments (here $\ln W_t$).
1st-stage regression: $\ln P_t = a - b \ln W_t + \gamma_1 q_{1t} + \gamma_2 q_{2t} + \gamma_3 q_{3t} + \varepsilon_t$
We want $\hat{b}$ to be robustly and significantly different from zero for the farm wage to be a good instrument.
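A minimal sketch of this relevance check computes the t-statistic on the instrument in the first-stage regression. The data are simulated and the coefficient values are hypothetical. Note that the code estimates the coefficient on ln W directly, which corresponds to -b in the notation of the first-stage equation.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200

quarter = np.arange(n) % 4
Qd = np.column_stack([(quarter == k).astype(float) for k in range(3)])
ln_w = rng.normal(size=n)  # hypothetical log farm wage
ln_p = (1.0 + 0.5 * ln_w + Qd @ np.array([0.1, 0.0, -0.1])
        + rng.normal(scale=0.5, size=n))

# First-stage regression: ln P on constant, instrument, quarterly dummies.
Z = np.column_stack([np.ones(n), ln_w, Qd])
coef = np.linalg.lstsq(Z, ln_p, rcond=None)[0]
resid = ln_p - Z @ coef

# Classical (homoskedastic) OLS covariance matrix of the coefficients.
dof = n - Z.shape[1]
sigma2 = resid @ resid / dof
cov = sigma2 * np.linalg.inv(Z.T @ Z)

t_wage = coef[1] / np.sqrt(cov[1, 1])
f_stat = t_wage ** 2  # with a single instrument, the first-stage F equals t squared

print(f"coefficient on ln W = {coef[1]:.3f}, t = {t_wage:.2f}, F = {f_stat:.1f}")
```

A common rule of thumb treats a first-stage F-statistic below about 10 as a warning sign of a weak instrument; the classical standard errors used here are a simplifying assumption.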
Even with “good instruments”, we will expect the coefficient of an instrumented variable in an
IV regression to be less precisely estimated (have a higher standard error) than the analogous
coefficient estimated using OLS (with the latter a meaningful comparison only if in fact the OLS
estimate is a valid one). IV estimation relaxes the assumptions required to get valid estimates but
it does so at a price: lower precision. There will be cases where the OLS estimates cannot be
rejected when compared with the IV estimates and may as a result be preferred.
Differentiated Products Demand Systems:
Most markets do not consist of a single homogeneous product but are rather composed of
similar but differentiated goods that compete for customers. For instance, in the market for
shampoos there is not a single type of generic shampoo. Rather there is a variety of brands and
types of shampoo which consumers do not consider absolutely equivalent. We must take such
demand characteristics into account when attempting to estimate demand in differentiated
product markets.
In particular, we need to take account of the fact that consumers are choosing among different
products for which they have different relative preferences and which will usually have different
prices. Differentiated product demand systems are therefore estimated as a system of individual
product demand equations, where the demand for a product depends on its own price but also
on the price of the other products in the market.
Log-Linear Demand Models:
One popular differentiated product demand system is the log-linear demand system, which is
simply a set of log-linear demand functions, one for each product available in the market. In each
case, the quantity of the good purchased potentially depends on the prices of all the goods in the
market and also income y (Deaton and Muellbauer 1980b). Formally, we have the following
system of J equations:
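$\ln Q_{1t} = a_1 - b_{11} \ln P_{1t} + b_{12} \ln P_{2t} + \dots + b_{1J} \ln P_{Jt} + \gamma_1 \ln y_t + \xi_{1t}$,
$\ln Q_{2t} = a_2 - b_{21} \ln P_{1t} + b_{22} \ln P_{2t} + \dots + b_{2J} \ln P_{Jt} + \gamma_2 \ln y_t + \xi_{2t}$,
...
$\ln Q_{Jt} = a_J - b_{J1} \ln P_{1t} + b_{J2} \ln P_{2t} + \dots + b_{JJ} \ln P_{Jt} + \gamma_J \ln y_t + \xi_{Jt}$.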
Maximizing utility subject to a budget constraint will generically provide demand equations
which depend on the set of all prices and income. Clearly, with aggregate data we might use
aggregate income as the relevant variable for the demand equations (e.g., GDP). However, since
many studies focus on a particular sector of the economy, the consumer’s problem is often recast
and considered as a two-stage problem.
At the first stage, we posit that consumers decide how much money to spend on a category of
goods (for example, beer), and at the second stage we posit that the chosen level of expenditure is
allocated across the various products that the consumer must choose between, perhaps the
different brands of beer. Under particular assumptions on the shape of the utility function, this
two-stage process can be shown to be equivalent to solving a single one-stage
utility-maximization problem (see Deaton and Muellbauer 1980b; Gorman 1959; Hausman et al. 1994).
Using the two-stage interpretation, “expenditure” may be used instead of income in the demand
equations but the demand equations will then be termed “conditional” demand equations as we
are conditioning on a given level of expenditure.
Hausman et al. (1994) estimate a three-level choice model where consumers choose (1) the level
of expenditure on beer, (2) how to allocate that expenditure between three broad categories of
beer (respectively termed premium beer, popular beer, and light beer) which marketing studies
had identified as market segments, and (3) how to allocate expenditure between the various
brands of beer within each of the segments.
At level (3), we use the observed product-level price and quantity data to estimate our
differentiated product demand system. However, since level (3) is modeled as a choice of
brands (Coors, Budweiser, Molson, etc.), at levels (1), (2), and (3) we would in fact need to use
price and quantity indices constructed from underlying product-level data to give measures of price
and quantity for each of the brands or segments of the beer industry. For example, we might use
a price index with expenditure share weights for the underlying prices within each segment $s$ to
produce a segment-level price index $P_{st} = \sum_{j \in s} w_{jt} p_{jt}$. Similarly, we might choose
to use volumes of liquid to help aggregate over the brands to give segment-level quantity indices.
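The index construction can be sketched for a single segment and period as follows. The three brands, their prices, and their volumes are made-up numbers used only to show the arithmetic.

```python
import numpy as np

# Hypothetical prices and volumes for three brands in one segment, one period.
p = np.array([4.0, 5.0, 6.5])
q = np.array([100.0, 60.0, 20.0])

expenditure = p * q
w = expenditure / expenditure.sum()  # expenditure share weights within the segment

P_s = float(w @ p)    # share-weighted segment price index
Q_s = float(q.sum())  # volume-based segment quantity index

print(f"weights = {np.round(w, 3)}, P_s = {P_s:.3f}, Q_s = {Q_s:.0f}")
```

Because the weights sum to one, the index always lies between the lowest and highest brand price in the segment.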
At the second level of the choice tree, the demand system is a conditional demand system
because the amount of money to be spent on beer has already been chosen at stage 1. In a
log-linear model, the $b_{jj}$ coefficients provide estimates of the own-price elasticity of demand while
the $b_{jk}$ ($j \neq k$) parameters provide estimates of the cross-price elasticities of demand. If we are
using segment-level data, we must be careful to place the correct interpretation on the elasticities.
These price elasticities could be used as important evidence toward a formal test of the
hypothesis that each beer segment is a market in itself by performing a SSNIP test. That said,
generally, the price elasticity relevant for such a test would include the indirect effect of prices
through their effect on the total amount of expenditure on beer. If the price of premium beer goes
up, some consumption will be reallocated to other beer segments but the total consumption of
beer might also fall as people either switch to other products such as wine or reduce consumption
altogether. The estimated elasticities from the equation are conditional on the level of
expenditure on beer.
Thus, if we use expenditure levels and price indices to perform market definition tests, we must
be careful to trace the effect of a price change back through its effect on total expenditure on
beer. To do so, Hausman et al. (1994) also estimate a single top-level equation so that the demand
for beer in total is expressed as a function of prices and income. In this case, the equation
estimated depended on income (GDP) and also a price index constructed to capture the general price
of beer, as well as demographics, $Z_t$:
$\ln Q_t^{Beer} = \beta_0 + \beta_1 \ln y_t^{GDP} + \beta_2 \ln P_t^{Beer} + \delta Z_t + \varepsilon_t$
The choice of instruments in differentiated product demand systems is generically difficult. First,
we may need a lot of them. We need at least one instrument for every product whose price is
considered potentially endogenous in a demand function (although sometimes a given
instrument may in fact be used to estimate more than one equation). Second, a natural source of
instruments involves cost data. However, since products are often produced in a very similar
way, and cost data are often recorded less frequently than prices are set, at least in financial or
management accounts, we are often unable to find cost variables that are genuinely sufficiently
helpful for identification of each of the demand curves. Data such as exchange rates and wages
are often useful in homogeneous product demand estimation, but fundamentally such data are
not product (or here segment) specific and so will face difficulties as instruments in the
differentiated product context.
The reality is that there are no entirely persuasive solutions to this problem. Hausman et al.
(1994) use prices in other cities as instruments for the prices in a given city. The logic is that if,
and it is often a very big “if,” (1) demand shocks are city specific and independent across cities
and (2) cost shocks are correlated across markets, then any correlation between the price in this
market and the prices in other markets will be due to cost movements. In that case, the prices in
other cities will be valid instruments for the price in this city.
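One way such an instrument might be constructed is as the average price of the same brand in all other cities in the same period. The sketch below does this for a small hypothetical panel. The averaging rule is an assumption for illustration, not a description of the exact construction in Hausman et al. (1994).

```python
import numpy as np

# Hypothetical panel of log prices for one brand:
# rows are periods, columns are cities.
ln_p = np.array([
    [1.00, 1.10, 0.95],
    [1.05, 1.12, 1.01],
    [1.02, 1.20, 0.98],
])
n_cities = ln_p.shape[1]

# For each (period, city) cell, average the log price over all OTHER cities
# in the same period (a leave-one-out mean).
other_city_mean = (ln_p.sum(axis=1, keepdims=True) - ln_p) / (n_cities - 1)

print(other_city_mean)
```

The own-city price is excluded from each average so that the instrument does not mechanically contain the variable it is instrumenting.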
Obviously, these are strong assumptions. For example, there must not be any effect of, say,
national advertising campaigns in the demand shocks since then they would not be independent
across cities. Alternatively, another potentially satisfactory instrument would be the price of a
good that shares the costs but which is not a substitute or complement. For example, if a product
under study had costs that were each heavily influenced by the oil price, then the price of
another good also subject to a similar sensitivity might be used. Of course, in such a situation it
would be easier to use the oil price so examples where this approach would genuinely be useful
are perhaps hard to think of.
Indirect Utility and Expenditure Shares Models
A log-linear demand system is easy to estimate because all the equations are linear in the
parameters. However, such systems also impose considerable assumptions on the nature of consumer
preferences. For example, they impose constant own- and cross-price elasticities of demand. In
addition, there is a potentially serious internal consistency issue that we face when estimating
log-linear demand functions using aggregate data. Namely, the aggregate demand function may
well depend on more than aggregate income. If we only include an aggregate income variable,
estimates may suffer from “aggregation bias.” For example, even if there is no heterogeneity across
individuals other than in their income, estimating a log-linear demand equation using aggregate
data will involve estimating a misspecified model.
The economics profession searched for models which were internally consistent, either in the sense
that they depended only on exactly the analogous aggregate data, or in the weaker sense that they
depended only on aggregate data (perhaps aggregate income but also the variance of income
in the population). This study of “aggregability conditions” provided the motivation for many of
the most popular demand system models in use today, which satisfy these “aggregability”
conditions. One such example is the almost ideal demand system (AIDS) due to Deaton and
Muellbauer (1980a).
Almost Ideal Demand System (AIDS) is perhaps the most commonly used differentiated product
demand system. It satisfies a nice aggregability condition: if we take a lot of consumers behaving
as predicted by an AIDS model and aggregate their demand systems, the result is itself an AIDS
demand system. The relevant parameters of an AIDS specification are also quite easy to estimate
and the estimation process requires data that are normally available to the analyst, namely prices
and expenditure shares.
In practice, an AIDS system can be implemented in the following way:
1. Calculate $w_{jt}$, the expenditure share of good $j$ at time $t$, using the price of $j$ at time $t$, $p_{jt}$,
the quantity demanded of $j$ at time $t$, $q_{jt}$, and total expenditure $y_t = \sum_{j=1}^{J} p_{jt} q_{jt}$.
2. Calculate the Stone price index: $\ln P_t = \sum_{j=1}^{J} w_{jt} \ln p_{jt}$
3. Run the following regression: $w_{jt} = \alpha_j + \sum_{k=1}^{J} \gamma_{jk} \ln p_{kt} + \beta_j \ln\left(\frac{y_t}{P_t}\right) + \xi_{jt}$,
where the $p_{kt}$ are the own price and the prices of the goods that are substitutes, and $\xi_{jt}$ is
the error term.
4. Retrieve the $J+2$ parameters of interest $(\alpha_j, \gamma_{j1}, \dots, \gamma_{jJ}, \beta_j)$.
The own- and cross-price elasticities can be retrieved from the AIDS parameters by noting that
$\ln w_j = \ln p_j + \ln q_j - \ln y \iff \ln q_j = \ln w_j - \ln p_j + \ln y$
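The four steps can be sketched on simulated data as follows. The prices and quantities are random numbers, so the estimates have no economic meaning. The elasticity formula in the last step is a commonly used approximation for the linearized AIDS with a Stone index. It treats the shares inside the index as fixed, and it is an assumption of this sketch rather than something stated in the text above.

```python
import numpy as np

rng = np.random.default_rng(3)
T, J = 120, 3

# Hypothetical price and quantity data (rows: periods, columns: goods).
p = np.exp(rng.normal(scale=0.2, size=(T, J)))
q = np.exp(rng.normal(loc=1.0, scale=0.2, size=(T, J)))

# Step 1: total expenditure and expenditure shares.
y = (p * q).sum(axis=1)
w = p * q / y[:, None]

# Step 2: Stone price index, ln P_t = sum_j w_jt ln p_jt.
ln_P = (w * np.log(p)).sum(axis=1)

# Step 3: one share regression per good, all with the same regressors
# (constant, J log prices, log real expenditure).
X = np.column_stack([np.ones(T), np.log(p), np.log(y) - ln_P])
coef = np.linalg.lstsq(X, w, rcond=None)[0]  # shape (J + 2, J), one column per equation

# Step 4: unpack the parameters of each equation j.
alpha = coef[0]             # alpha[j]
gamma = coef[1:J + 1].T     # gamma[j, k]: effect of ln p_k on the share of good j
beta = coef[J + 1]          # beta[j]

# Approximate uncompensated elasticities at mean shares:
# e_jk = -1{j = k} + (gamma_jk - beta_j * w_k) / w_j
w_bar = w.mean(axis=0)
elasticity = -np.eye(J) + (gamma - np.outer(beta, w_bar)) / w_bar[:, None]

print(np.round(elasticity, 3))
```

Because the shares sum to one and every equation uses the same regressors, the estimated intercepts sum to one and the remaining coefficients sum to zero across equations, which is a quick sanity check on the implementation.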
Parameter Restrictions on Demand Systems: to be included (general discussion and application:
Hausman paper)
Demand System Estimation: Discrete Choice Models
Discrete choice demand models represent choice situations in which consumers choose (one
option) from a list of options. For example, a consumer may choose which type of car to buy.
Discrete choice models typically impose considerable structure on consumers’ preferences. The
main advantage of doing so is that it reduces the number of parameters we need to estimate in
markets with a multitude of products.
For example, in the AIDS model, before the restrictions of choice theory are imposed, there are a
total of $J^2$ parameters on prices ($J$ per equation) to estimate. To be clear, a demand system with
200 products, such as that needed for a product-level demand system of a market like the car
market, would generate a base model with 40,000 parameters on prices that we would need to
estimate. Analogously, there are 40,000 own- and cross-price elasticities to be estimated. This is
clearly impossible with the kinds of data sets we usually have, and so it became clear that some
structure would need to be placed on those 40,000 own- and cross-price elasticities.
The multilevel model used by Hausman et al. (1994) is one way to impose structure on the set of
elasticities. An alternative is to use “characteristics” based models. Historically, the discrete
choice demand literature followed the characteristics approach while the continuous choice
demand literature followed the “product”-level approach, although there are some recent
exceptions, most notably Slade et al. (2002).
There is no obvious practical reason why we cannot have “characteristics” and “product”-level
models of both continuous choice and discrete choice varieties. In the future, therefore, the main
distinguishing feature of these classes of models may revert to the only real source of difference:
the nature of consumer choice. For the moment, however, most of the discrete choice literature is
characteristics based while the continuous choice models are product-level models. In this
section we discuss the most popular discrete choice models currently in use.
Discrete Choice Demand Systems
The foundation of discrete choice demand functions is not fundamentally different from our
usual utility maximization framework with the exception that in this context our consumer faces
constraints on her choice set: discrete goods can only be consumed as 0,1 choices. For each of
these discrete goods, consumers either buy one or they do not buy one. Below, we follow the
literature in building such models by first considering an individual’s choice problem and then
deriving a model of aggregate demand by aggregating over individuals.
Chaudhuri, Goldberg and Jia (CGJ) Paper
Question: The WTO has imposed rules on patent protection (both duration and enforcement) on
member countries. There is a large debate over whether foreign multinationals should be allowed
to extend their drug patents in poor countries such as India, which would raise prices considerably.
Increase in IP rights raises the profits of patented drug firms, giving them greater
incentives to innovate and create new drugs (or formulations such as long shelf life, which
could be quite useful in a country like India).
Lower consumer surplus due to generic drugs being taken off the market.
To understand the tradeoff inherent in patent protection, we need to estimate the magnitude of
these two effects.
Market
Indian Market for antibiotics.
Foreign and Domestic, Licensed and Non-Licensed producers.
Different types of Antibiotics, in particular CGJ look at a particular class: Quinolones.
Different brands, packages, dosages etc.
Question: What would prices and quantities look like if there were no unlicensed firms
selling this product in the market? (One of the reasons I.O. economists use structural
models is that there is often no experiment in the data, i.e. a case where some markets
have this regulation and others don’t.)
Data
The Data come from a market research firm. This is often the case for demand data since
the firms in this market are willing to pay large amounts of money to track how well they
are doing with respect to their competitors. However, prying data from these guys when
they sell it for 10 000 a month to firms in the industry involves a lot of work and
emailing.
Monthly sales data for 4 regions, by product (down to the SKU level) and prices.
The data come from audits of pharmacies, i.e. people go to a sample of pharmacies and
collect the data.
Problem for the AIDS model: Over 300 different products, i.e. 90000 cross product
interaction terms to estimate. CGJ need to do some aggregating of products to get rid of
this problem: they will aggregate products by therapeutic class into 4 of these, interacted
with the nationality of the producer (not if they are licensed or not).
Results
CGJ estimate the AIDS specification with the aggregation of different brands to product
level.
We can get upper and lower bounds on marginal costs by assuming either that firms are
perfect competitors within the segment (i.e. $p = mc$) or that firms are
operating a cartel which can price at the monopoly level (i.e. $p = mc / (1 + 1/\eta_{jj})$).
Use the estimated demand system to compute the prices of domestic producers of unlicensed
products that make expenditures on these products 0 (this is what “virtual prices” means).
Figure out what producer profits would be in the world without unlicensed firms (just
$(p - c)q$ in this setup).
Compute the change in consumer surplus (think of integrating under the demand curve).
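The marginal cost bounds from the two polar pricing assumptions can be sketched as below. The function name and the example numbers are hypothetical. The monopoly formula only makes sense when own-price demand is elastic, so the function rejects elasticities at or above -1.

```python
def marginal_cost_bounds(price, own_elasticity):
    """Return (lower, upper) bounds on marginal cost for an observed price.

    Upper bound: perfect competition, p = mc.
    Lower bound: monopoly (cartel) pricing, p = mc / (1 + 1/eta),
    which rearranges to mc = p * (1 + 1/eta).
    """
    if own_elasticity >= -1.0:
        raise ValueError("monopoly pricing requires elastic demand (eta < -1)")
    mc_upper = price
    mc_lower = price * (1.0 + 1.0 / own_elasticity)
    return mc_lower, mc_upper


# Hypothetical example: price of 10 and an own-price elasticity of -2.5.
lower, upper = marginal_cost_bounds(10.0, -2.5)
print(lower, upper)
```

The more elastic the demand, the closer the two bounds are to each other, since a monopolist facing very elastic demand can only sustain a small margin.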
Characteristic Space Approaches to Demand Estimation
Basic approach:
Consider products as bundles of characteristics
Define consumer preferences over characteristics
Let each consumer choose the bundle which maximizes their utility. We restrict the
consumer to choosing only one bundle. You will see why we do this as we develop the
formal model: multiple purchases are easy to incorporate conceptually but incur a big
computational cost and require more detailed data than we usually have. Working on
elegant ways around this problem is an open area for research.
Since we normally have aggregate demand data we get the aggregate demand implied by
the model by summing over the consumers.
Formal Treatment
Utility of the individual: $U_{ij} = U(x_j, p_j, \nu_i; \theta)$ for $j = 0, 1, 2, \dots, J$.
Good 0 is generally referred to as the outside good. It represents the option chosen when
none of the observed goods are chosen. A maintained assumption is that the price of the
outside good is set exogenously. $J$ is the number of goods in the industry, $x_j$ are non-price
characteristics of good $j$, $p_j$ is the price, $\nu_i$ are characteristics of consumer $i$, and $\theta$
are the parameters of the model.
Note that the product characteristics do not vary over consumers; this is most commonly a
problem when the choice sets of consumers are different and we do not observe the
differences in the choice sets.
Consumer $i$ chooses good $j$ when $U_{ij} > U_{ik}$ for all $k \neq j$ (note that all preference
relations are assumed to be strict).
This means that the set of consumers that choose good $j$ is given by
$S_j(\theta) = \{ \nu \mid U_{ij} > U_{ik} \ \forall k \neq j \}$ and, given a distribution over the $\nu$'s,
$f(\nu)$, we can recover the share of good $j$ as $s_j(x, p \mid \theta) = \int_{\nu \in S_j(\theta)} f(d\nu)$.
Obviously, if we let the market size be $M$ then the total demand is $M s_j(x, p \mid \theta)$.
This is the formal analog of the basic approach outlined above. The rest of our discussion
of the characteristic space approach to demand will consider the steps involved in making
this operational for the purposes of estimation.
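The aggregation step can be sketched by simulation: draw consumers from an assumed distribution, let each pick their utility-maximizing good, and count. The utility function, the lognormal distribution for the consumer characteristic, the product characteristics, and the market size are all hypothetical choices for this illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical products; index 0 is the outside good with utility normalized to zero.
x = np.array([0.0, 1.0, 2.0])  # non-price characteristic (quality)
p = np.array([0.0, 0.5, 1.5])  # prices


def utility(x, p, nu, theta):
    """Hypothetical U_ij = theta * x_j - nu_i * p_j, where nu_i is price sensitivity."""
    return theta * x[None, :] - nu[:, None] * p[None, :]


def simulated_shares(x, p, theta, n_draws=100_000):
    """Approximate s_j by the fraction of simulated consumers choosing good j."""
    nu = rng.lognormal(mean=0.0, sigma=0.5, size=n_draws)  # assumed f(nu)
    choice = utility(x, p, nu, theta).argmax(axis=1)
    return np.bincount(choice, minlength=x.size) / n_draws


shares = simulated_shares(x, p, theta=1.0)
M = 1_000_000  # hypothetical market size
demand = M * shares

print(np.round(shares, 3), demand)
```

In this example consumers with price sensitivity below 1 choose the high-quality good, so its share should be close to one half, the probability that a lognormal draw with zero log-mean falls below 1.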
Aside on utility functions
Recall from basic micro that ordinal rankings of choices are invariant to positive affine
transformations of the underlying utility function. More specifically, choices are invariant
to multiplication of U( · ) by a positive number and the addition of any constant.
This means that in modeling utility we need to make some normalization; that is, we need
to bolt down a zero to measure things against. Normally we do the following:
Normalize the mean utility of the outside good to zero.
Normalize the coefficient on the idiosyncratic error term to 1.
Horizontally Differentiated vs. Vertically Differentiated: Horizontally differentiated means that,
setting aside price, people disagree over which product is best. Vertically differentiated means
that, price aside, everyone agrees on which good is best; they just differ in how much they value
additional quality.
Pure Horizontal Model
This is the Hotelling model (n ice-cream sellers on the beach, with
consumers distributed along the beach)
Utility for a consumer at some point captured by $\nu_i$ is $U_{ij} = u - p_j + \theta (\delta_j - \nu_i)^2$,
where the $(\delta_j - \nu_i)^2$ term captures a quadratic "transportation cost".
It is a standard workhorse for theory models exploring ideas to do with
product location.
Pure Vertical Model
Used by Shaked and Sutton, Mussa-Rosen (monopoly pricing, slightly
different), Bresnahan (demand for autos) and many others.
Utility is given by $U_{ij} = u - \nu_i p_j + \delta_j$.
This model is used most commonly in screening problems such as
Mussa-Rosen, where the problem is to set (p,q) tuples that induce high-value and
low-value customers to self-select (2nd degree price discrimination). The
model has also been used to consider product development issues, notably
in computational work.
Logit
This model assumes everyone has the same taste for quality but a
different idiosyncratic taste for each product. Utility is given by $U_{ij} = \delta_j + \epsilon_{ij}$,
with $\epsilon_{ij} \sim$ extreme value type I [$F(\epsilon) = e^{-e^{-\epsilon}}$]. This is a very helpful
assumption as it allows the aggregate shares to have an analytical form.
This ease of aggregation comes at a cost: the embedded assumption on the
distribution of tastes creates more structure than we would like on the
aggregate substitution matrix.
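Under this error distribution, with the mean utility of the outside good normalized to zero, the shares take the familiar closed form $s_j = e^{\delta_j} / (1 + \sum_k e^{\delta_k})$. The sketch below computes it. The function name and example mean utilities are hypothetical, and the subtraction of the maximum is only a numerical safeguard.

```python
import numpy as np


def logit_shares(delta):
    """Return (inside shares, outside share) for mean utilities delta_1..delta_J.

    The outside good has mean utility normalized to zero.
    """
    delta = np.asarray(delta, dtype=float)
    m = max(delta.max(), 0.0)      # shift by the largest utility to avoid overflow
    exp_inside = np.exp(delta - m)
    exp_outside = np.exp(-m)
    denom = exp_outside + exp_inside.sum()
    return exp_inside / denom, exp_outside / denom


s, s0 = logit_shares([1.0, 0.0, -1.0])
print(np.round(s, 4), round(float(s0), 4))
```

A good with mean utility zero gets exactly the same share as the outside good, and taking logs of share ratios returns the mean utilities.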
Nested Logit
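As in the AIDS Model, we need to make some “ex-ante” classification of
goods into different segments, so each good $j$ belongs to a segment $S(j)$.
The joint distribution of the idiosyncratic errors is given by:
$F(\epsilon_n) = \exp\left( -\sum_{k=1}^{K} \left( \sum_{j \in S_k} e^{-\epsilon_{nj}/\lambda_k} \right)^{\lambda_k} \right)$
For two goods $i$ and $m$ in different segments $k$ and $l$, the relative choice
probabilities are:
$\frac{P_{ni}}{P_{nm}} = \frac{e^{V_{ni}/\lambda_k} \left( \sum_{j \in S_k} e^{V_{nj}/\lambda_k} \right)^{\lambda_k - 1}}{e^{V_{nm}/\lambda_l} \left( \sum_{j \in S_l} e^{V_{nj}/\lambda_l} \right)^{\lambda_l - 1}}$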
The best example of using nested logit for an IO application is Goldberg
(1995) Econometrica. One can classify goods into a hierarchy of nests (car
or truck, foreign or domestic, Nissan or Toyota, Camry or Corolla).
Problems with Estimates from Simple Models
Each model has its own problems and they share one problem in common:
Vertical Model:
Cross-price elasticities are only with respect to neighboring goods - highly
constrained substitution matrix.
Own-price elasticities are often not smaller for high priced goods, even though we
might think this makes more sense (higher income → less price sensitivity).
Logit Model:
Own-price derivative is $\partial s_j / \partial p_j = -s_j (1 - s_j)$ (up to the price coefficient). That is,
the own-price derivative only depends on shares, which in turn means that if we see two products
with the same share, they must have the same mark-up under most pricing models.
Cross-price derivatives are $\partial s_j / \partial p_k = s_j s_k$ (again up to the price coefficient). This
means that the substitution matrix is solely a function of shares and not of the relative proximity
of products in characteristic space.
This is a bit crazy for products like cars. It is a consequence of the IIA assumption.
Note: if you run logit, and your results do not generate these results you have bad
code. This is a helpful diagnostic for programming.
Simultaneity: No way to control for endogeneity via simultaneity.
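The diagnostic mentioned in the note can be sketched as a comparison between analytic logit price derivatives and numerical ones. The mean utilities, prices, and price coefficient alpha below are hypothetical. With alpha written explicitly, the analytic derivatives are $-\alpha s_j (1 - s_j)$ on the diagonal and $\alpha s_j s_k$ off it.

```python
import numpy as np


def shares(p, x_beta, alpha):
    """Logit inside-good shares with mean utility delta_j = x_beta_j - alpha * p_j."""
    e = np.exp(x_beta - alpha * p)
    return e / (1.0 + e.sum())


alpha = 1.0                          # hypothetical price coefficient
x_beta = np.array([1.0, 0.5, 0.2])   # hypothetical non-price part of mean utility
p = np.array([1.0, 0.8, 0.6])
s = shares(p, x_beta, alpha)
J = p.size

# Analytic Jacobian: entry [j, k] is ds_j / dp_k.
analytic = alpha * (np.outer(s, s) - np.diag(s))

# Numerical Jacobian by central differences.
h = 1e-6
numeric = np.empty((J, J))
for k in range(J):
    step = np.zeros(J)
    step[k] = h
    numeric[:, k] = (shares(p + step, x_beta, alpha)
                     - shares(p - step, x_beta, alpha)) / (2 * h)

print(np.max(np.abs(analytic - numeric)))
```

If the two matrices disagree by more than rounding error, the share function or the derivative code contains a bug.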
Dealing with Simultaneity
The problem formally is that the regressors are correlated with an unobservable (we can’t
separate variation due to cost shocks from variation due to demand shocks), so to deal with this
we need to have an unobservable component in the model.
Let product quality be $\delta_j = \sum_k \beta_k x_{kj} - \alpha p_j + \xi_j$, where the elements of $\xi$
are unobserved product characteristics.
Estimation Strategy
Assume n large, so that the observed shares equal the model shares: $s_j^o = s_j(\xi_1, \dots, \xi_J \mid \theta)$
For each θ there exists a ξ such that the model shares and observed shares are equal.
Thus we invert the model to find ξ as a function of the parameters.
This allows us to construct moments to drive estimation (we are going to run everything
using GMM)
Note: inversion is not always easy.
Example: The Logit Model
Logit is the easiest inversion to do, since
$\ln(s_j / s_0) = \delta_j = \sum_k \beta_k x_{kj} - \alpha p_j + \xi_j$, so $\xi_j = \ln(s_j / s_0) - \left( \sum_k \beta_k x_{kj} - \alpha p_j \right)$
Note: As far as estimation goes, we now are in a linear world where we can run things in the
same way as we run OLS or IV. The precise routine to run will depend, as always, on what we
think are the properties of ξ.
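The inversion and the resulting linear regression can be sketched on simulated data. For simplicity the sketch assumes price is exogenous, so OLS is appropriate. That is a strong assumption made only to keep the example short, and all parameter values are hypothetical. With endogenous prices the same inverted equation would be estimated by IV instead.

```python
import numpy as np

rng = np.random.default_rng(5)
J = 500
beta_true, alpha_true = 1.0, 2.0    # hypothetical taste and price coefficients

x = rng.normal(size=J)              # one observed characteristic
p = rng.uniform(0.5, 1.5, size=J)   # price, exogenous here purely for simplicity
xi = rng.normal(scale=0.2, size=J)  # unobserved product characteristic
delta = beta_true * x - alpha_true * p + xi

# Model shares implied by the logit, with the outside good's utility set to zero.
e = np.exp(delta)
s = e / (1.0 + e.sum())
s0 = 1.0 / (1.0 + e.sum())

# Inversion: ln(s_j) - ln(s_0) recovers the mean utility delta_j exactly.
delta_hat = np.log(s) - np.log(s0)

# Linear regression of the inverted shares on characteristics and price.
X = np.column_stack([np.ones(J), x, p])
coef = np.linalg.lstsq(X, delta_hat, rcond=None)[0]
beta_hat, alpha_hat = coef[1], -coef[2]
xi_hat = delta_hat - X @ coef       # residual: estimate of the unobservable

print(f"beta_hat = {beta_hat:.3f}, alpha_hat = {alpha_hat:.3f}")
```

The regression residual plays the role of ξ, which is what the moment conditions are then built on.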
More on Estimation
Regardless of the model we now have to choose the moment restriction we are going to
use for estimation.
This is where we can now properly deal with simultaneity in our model.
Since consumers know $\xi_j$ we should probably assume the firms do as well. Thus in
standard pricing models we will have $p_j = p(x_j, \xi_j, x_{-j}, \xi_{-j})$.
Since $p$ is a function of the unobservable, $\xi$, we should not use a moment restriction
which interacts $p$ and $\xi$. This is the standard endogeneity problem in demand estimation.
It implies we need some instruments.
There is nothing special about $p$ in this context: if $\xi$ is correlated with $x$, i.e.
$E(\xi x) \neq 0$, then we need an instrument for $x$ as well.
Some assumptions used for identification in the literature:
$E(\xi \mid x, w) = 0$, where $x$ contains the vector of characteristics other than price and $w$
contains cost-side variables. Note that they are all valid instruments for price so long as the
structure of the model implies they are correlated with $p_j$.
Multiple markets: here assume something like $\xi_{jr} = \xi_j + u_{jr}$ and put assumptions on $u_{jr}$.
Essentially treat the problem as a panel data problem, with the panel across regions, not
time.