UNIT 3: INTRODUCTION TO SIMULTANEOUS EQUATIONS
3.0 AIMS AND OBJECTIVES
The purpose of this unit is to introduce the student briefly to the concept of
simultaneous dependence of economic variables. After completing this unit, the student will:
- understand the concept of simultaneous equations
- distinguish between endogenous and exogenous variables in a model
- be able to derive reduced-form equations from structural equations
- understand the concepts of under identified, exactly identified and over identified equations
- be able to conduct a test of simultaneity
3.1 INTRODUCTION
The application of least squares to a single equation assumes, among other things, that the
explanatory variables are truly exogenous, that is, that causation runs one way only, from the
explanatory variables (X) to the dependent variable (Y). When this assumption fails, so that Y
also influences X, the function cannot be treated in isolation as a single-equation model; it
belongs to a wider system of equations that describes the relationships among all the relevant
variables. In such cases we must use a multi-equation model, with separate equations in which
Y and X both appear as endogenous variables. A system describing the joint dependence of
variables is called a system of simultaneous equations.
3.2. SIMULTANEOUS DEPENDENCE OF ECONOMIC VARIABLES
In the single-equation models discussed in the previous units, the cause-and-effect relationship
is unidirectional: the explanatory variables are the cause and the dependent variable is the
effect.
However, there are situations where there is a two-way flow of influence among economic
variables; that is, one economic variable affects another economic variable(s) and is, in turn,
affected by it (them).
In such cases we need to consider more than one equation, which leads to simultaneous-equation
models containing one equation for each of the mutually dependent (endogenous) variables. The
first question to answer is: what happens if the parameters of each equation are estimated by
applying, say, the method of OLS, disregarding the other equations in the system? Recall that
one of the crucial assumptions of the method of OLS is that the explanatory variables (the X's)
are either nonstochastic or, if stochastic (random), distributed independently of the stochastic
disturbance term. If neither of these conditions is met, the least-squares estimators are not
only biased but also inconsistent; that is, as the sample size increases indefinitely, the
estimators do not converge to their true (population) values.
For example, consider the following hypothetical system of equations:
$Y_{1i} = \beta_{10} + \beta_{12} Y_{2i} + \gamma_{11} X_{1i} + U_{1i}$    (3.1)
$Y_{2i} = \beta_{20} + \beta_{21} Y_{1i} + \gamma_{21} X_{1i} + U_{2i}$    (3.2)
where $Y_1$ and $Y_2$ are mutually dependent, or endogenous, variables (i.e., variables whose
values are determined within the model), $X_1$ is an exogenous variable (whose values are
determined outside the model), and $U_1$ and $U_2$ are stochastic disturbance terms. The
variables $Y_1$ and $Y_2$ are both stochastic. Therefore, unless it can be shown that the
stochastic explanatory variable $Y_2$ in (3.1) is distributed independently of $U_1$ and the
stochastic explanatory variable $Y_1$ in (3.2) is distributed independently of $U_2$,
application of classical OLS to these equations individually will lead to inconsistent estimates.
Example. Recall that the price of a commodity and the quantity bought and sold are determined
by the intersection of the demand and supply curves for that commodity. Consider the
following linear demand and supply model.
Demand function: $Q_t^d = \alpha_0 + \alpha_1 P_t + U_{1t}$    (3.3)
Supply function: $Q_t^s = \beta_0 + \beta_1 P_t + U_{2t}$    (3.4)
Equilibrium condition: $Q_t^d = Q_t^s$    (3.5)
where $Q_t^d$ = quantity demanded, $Q_t^s$ = quantity supplied, P = price and t = time.
Note that P and Q are jointly dependent variables. If $U_{1t}$ changes because of changes in
other variables affecting $Q_t^d$ (such as income and tastes), the demand curve shifts.
Recall that such a shift in demand changes both P and Q. Similarly, a change in $U_{2t}$
(because of changes in weather and the like) shifts supply, again affecting both P and Q.
Because of this simultaneous dependence between Q and P, $U_{1t}$ and $P_t$ in (3.3), and
$U_{2t}$ and $P_t$ in (3.4), cannot be independent. Therefore a regression of Q on P as in (3.3)
would violate an important assumption of the classical linear regression model, namely, the
assumption of no correlation between the explanatory variable(s) and the disturbance term.
In summary, the above discussion reveals that, in contrast to single-equation models, in
simultaneous-equation models more than one dependent, or endogenous, variable is involved,
necessitating as many equations as there are endogenous variables. As a consequence, an
endogenous variable that appears as an explanatory variable in another equation of the system
is stochastic and is usually correlated with the disturbance term of that equation.
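The correlation between price and the demand disturbance can be checked numerically. The following is a minimal sketch, not part of the original unit: the structural values (a0 = 10, a1 = -1, b0 = 2, b1 = 1), the independent standard normal disturbances and the seed are all illustrative assumptions. It solves the market-clearing condition for the price and computes the sample covariance between $P_t$ and $U_{1t}$.

```python
import random

def simulate_market(n, a0=10.0, a1=-1.0, b0=2.0, b1=1.0, seed=0):
    """Draw n equilibrium prices and demand shocks from (3.3)-(3.5).

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)
    prices, demand_shocks = [], []
    for _ in range(n):
        u1 = rng.gauss(0.0, 1.0)  # demand disturbance U1t
        u2 = rng.gauss(0.0, 1.0)  # supply disturbance U2t
        # Market clearing: a0 + a1*P + u1 = b0 + b1*P + u2, solved for P
        prices.append((b0 - a0 + u2 - u1) / (a1 - b1))
        demand_shocks.append(u1)
    return prices, demand_shocks

def covariance(x, y):
    """Sample covariance of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)

prices, demand_shocks = simulate_market(20000)
cov_p_u1 = covariance(prices, demand_shocks)
# Under these assumptions, theory gives Cov(P, U1) = -Var(U1) / (a1 - b1) = 0.5
print(round(cov_p_u1, 3))
```

With these assumed values the covariance should be close to 0.5 rather than zero, which is exactly the violation of the classical assumption described above.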
The variables entering a simultaneous-equation model are of two types: endogenous and
predetermined. Endogenous variables are those whose values are determined inside the model.
Predetermined variables, on the other hand, are those whose values are determined outside the
model or in an earlier period, so that they are already known at time t. Predetermined
variables are divided into exogenous and lagged endogenous variables. Although non-economic
variables such as rainfall and weather are clearly exogenous, the model builder must exercise
great care in classifying economic variables as endogenous or predetermined. Consider the
Keynesian model of income determination.
Consumption function: $C_t = \beta_0 + \beta_1 Y_t + U_t$, $0 < \beta_1 < 1$    (3.6)
Income identity: $Y_t = C_t + I_t$    (3.7)
In this model C (consumption) and Y (income) are endogenous variables. Investment (I), on the
other hand, is treated as an exogenous variable. Note that if lagged values of consumption and
income (i.e., $C_{t-1}$ and $Y_{t-1}$) appeared in the model, they would be treated as lagged
endogenous and hence predetermined variables.
Consider the problem of estimating the consumption function by regressing consumption on
income. Suppose the disturbance in the consumption function jumps up. This directly increases
consumption, which, through the income identity (3.7), increases income. But income is the
explanatory variable in the consumption function (3.6).
Thus, the disturbance in the consumption function and the regressor are positively correlated.
An increase in the disturbance term (directly implying an increase in consumption) is
accompanied by an increase in income (also implying an increase in consumption). When
estimating the influence of income on consumption, however, the OLS technique attributes
both of these increases in consumption (instead of just the latter) to the accompanying increase
in income. This implies that the OLS estimator of the marginal propensity to consume
($\beta_1$) is biased upward, even asymptotically.
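The upward bias can be illustrated with a small simulation. This is only a sketch under stated assumptions: the values beta0 = 2 and beta1 = 0.8, the normal distributions for investment and the disturbance, and the seed are hypothetical choices, and investment is assumed to be uncorrelated with the disturbance. Under those assumptions the OLS slope converges to $\beta_1 + (1-\beta_1)\,\mathrm{Var}(U)/(\mathrm{Var}(I)+\mathrm{Var}(U))$, which exceeds $\beta_1$.

```python
import random

def simulate_keynesian(n, beta0=2.0, beta1=0.8, seed=1):
    """Generate income and consumption from (3.6)-(3.7); values are illustrative."""
    rng = random.Random(seed)
    income, consumption = [], []
    for _ in range(n):
        invest = rng.gauss(10.0, 1.0)  # exogenous investment I_t
        u = rng.gauss(0.0, 1.0)        # consumption disturbance U_t
        y = (beta0 + invest + u) / (1.0 - beta1)   # income solved from (3.6)-(3.7)
        income.append(y)
        consumption.append(beta0 + beta1 * y + u)  # consumption function (3.6)
    return income, consumption

def ols_slope(x, y):
    """Slope of the simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

income, consumption = simulate_keynesian(20000)
mpc_ols = ols_slope(income, consumption)
# Large-sample value of the OLS slope with Var(I) = Var(U) = 1 and beta1 = 0.8
mpc_plim = 0.8 + (1.0 - 0.8) * 1.0 / (1.0 + 1.0)
print(round(mpc_ols, 3), round(mpc_plim, 3))
```

In this setup the estimated marginal propensity to consume settles near 0.9 instead of the true 0.8, and enlarging the sample does not remove the gap.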
Equations (3.6) and (3.7) are both structural equations because they portray the structure of
the economy: (3.6) is a behavioral equation and (3.7) is an identity. The $\beta$'s are known as
the structural parameters or coefficients. From the structural equations one can solve for the
endogenous variables and derive the reduced-form equations and the associated reduced-form
coefficients. A reduced-form equation is one that expresses an endogenous variable solely in
terms of the predetermined variables and the stochastic disturbances.
Substituting equation (3.6) into equation (3.7) and solving for Y, we obtain
$Y_t = \frac{\beta_0}{1-\beta_1} + \frac{1}{1-\beta_1} I_t + \frac{U_t}{1-\beta_1} = \pi_0 + \pi_1 I_t + W_t$    (3.8)
where $\pi_0 = \frac{\beta_0}{1-\beta_1}$, $\pi_1 = \frac{1}{1-\beta_1}$ and $W_t = \frac{U_t}{1-\beta_1}$.
Equation (3.8) is a reduced-form equation; it expresses the endogenous variable Y solely as a
function of the exogenous (or predetermined) variable I and the stochastic disturbance term U.
$\pi_0$ and $\pi_1$ are the associated reduced-form coefficients.
Substituting the value of Y from equation (3.8) into $Y_t$ of equation (3.6), we obtain another
reduced-form equation:
$C_t = \pi_2 + \pi_3 I_t + W_t$    (3.9)
where $\pi_2 = \frac{\beta_0}{1-\beta_1}$, $\pi_3 = \frac{\beta_1}{1-\beta_1}$ and $W_t = \frac{U_t}{1-\beta_1}$.
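For readers who want the intermediate algebra, the two reduced-form equations can be derived step by step as follows. This is a worked sketch using only the consumption function (3.6) and the income identity (3.7); no additional assumptions are introduced.

```latex
\begin{align*}
Y_t &= C_t + I_t \\
    &= \beta_0 + \beta_1 Y_t + U_t + I_t \\
(1-\beta_1)\,Y_t &= \beta_0 + I_t + U_t \\
Y_t &= \frac{\beta_0}{1-\beta_1} + \frac{1}{1-\beta_1}\,I_t + \frac{U_t}{1-\beta_1} \\[6pt]
C_t &= Y_t - I_t \\
    &= \frac{\beta_0}{1-\beta_1} + \left(\frac{1}{1-\beta_1} - 1\right) I_t + \frac{U_t}{1-\beta_1} \\
    &= \frac{\beta_0}{1-\beta_1} + \frac{\beta_1}{1-\beta_1}\,I_t + \frac{U_t}{1-\beta_1}
\end{align*}
```

The division by $1-\beta_1$ is legitimate because $0 < \beta_1 < 1$, and the two reduced forms share the same intercept and the same disturbance.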
The reduced-form coefficients (the $\pi$'s) are also known as impact, or short-run, multipliers,
because they measure the immediate impact on the endogenous variable of a unit change in the
value of the exogenous variable.
If in the preceding Keynesian model investment expenditure (I) is increased by, say, 1 dollar
and the marginal propensity to consume (i.e., $\beta_1$) is assumed to be 0.8, then from (3.8)
we obtain $\pi_1 = \frac{1}{1-0.8} = 5$. This result means that increasing investment by 1 dollar
will immediately (i.e., in the current time period) lead to an increase in income of 5 dollars,
that is, a fivefold increase.
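The multiplier arithmetic can be reproduced in a few lines. The helper name `impact_multipliers` is hypothetical and introduced only for this illustration; it evaluates the slope coefficients of the reduced forms (3.8) and (3.9) for a given marginal propensity to consume.

```python
def impact_multipliers(mpc):
    """Return (income multiplier, consumption multiplier) for 0 < mpc < 1."""
    if not 0.0 < mpc < 1.0:
        raise ValueError("mpc must lie strictly between 0 and 1")
    income_mult = 1.0 / (1.0 - mpc)       # slope of the reduced form (3.8)
    consumption_mult = mpc / (1.0 - mpc)  # slope of the reduced form (3.9)
    return income_mult, consumption_mult

income_mult, consumption_mult = impact_multipliers(0.8)
print(round(income_mult, 6), round(consumption_mult, 6))  # 5.0 4.0
```

The two multipliers differ by exactly one because income is consumption plus investment, so the extra unit of investment is itself part of the rise in income.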
Notice an interesting feature of the reduced-form equations. Since only the predetermined
variables and stochastic disturbances appear on the right side of these equations, and since the
predetermined variables are assumed to be uncorrelated with the disturbance terms, the OLS
method can be applied to estimate the coefficients of the reduced-form equations (the $\pi$'s).
This is sufficient if the researcher is interested only in predicting the endogenous variables
or in estimating the size of the multipliers (i.e., the $\pi$'s).
Note that since the reduced-form coefficients can be estimated by the OLS method and these
coefficients are combinations of the structural coefficients, the possibility exists that the
structural coefficients can be "retrieved" from the reduced-form coefficients, and it is in the
estimation of the structural parameters that we may be ultimately interested. Unfortunately,
retrieving the structural coefficients from the reduced-form coefficients is not always possible;
this problem is one way of viewing the identification problem.
3.3. THE IDENTIFICATION PROBLEM
By the identification problem we mean whether numerical estimates of the parameters of a
structural equation can be obtained from the estimated reduced-form coefficients. If this can be
done, we say that the particular equation is identified. If this cannot be done, then we say that
the equation under consideration is unidentified, or under identified.
Note that the identification problem is a mathematical (as opposed to statistical) problem
associated with simultaneous-equation systems. It is concerned with the question of whether it
is possible to obtain meaningful estimates of the structural parameters. An identified
equation may be either exactly (or fully, or just) identified or over identified.
It is said to be exactly identified if unique numerical values of the structural parameters can
be obtained, and over identified if more than one numerical value can be obtained for some of
the parameters of the structural equations. The circumstances under which each of these cases
occurs will be shown in the following discussion.
a) Under Identification
Consider the demand-and-supply model (3.3) and (3.4), together with the market-clearing, or
equilibrium, condition (3.5) that demand equals supply. By the equilibrium condition (i.e.,
$Q_t^d = Q_t^s$) we obtain
$\alpha_0 + \alpha_1 P_t + U_{1t} = \beta_0 + \beta_1 P_t + U_{2t}$    (3.10)
Solving (3.10) for $P_t$, in the same way as was done to obtain (3.8) and (3.9), we obtain the
equilibrium price
$P_t = \pi_0 + V_t$    (3.11)
where $\pi_0 = \frac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1}$ and $V_t = \frac{U_{2t} - U_{1t}}{\alpha_1 - \beta_1}$.
Substituting $P_t$ from (3.11) into (3.3) or (3.4), we obtain the following equilibrium quantity:
$Q_t = \pi_1 + W_t$    (3.12)
where $\pi_1 = \frac{\alpha_1 \beta_0 - \alpha_0 \beta_1}{\alpha_1 - \beta_1}$ and $W_t = \frac{\alpha_1 U_{2t} - \beta_1 U_{1t}}{\alpha_1 - \beta_1}$.
Note that $\pi_0$ and $\pi_1$ (the reduced-form coefficients) contain all four structural
parameters: $\alpha_0$, $\alpha_1$, $\beta_0$ and $\beta_1$. But there is no way in which the four
structural unknowns can be estimated from only two reduced-form coefficients. Recall from high
school algebra that to solve for four unknowns we must have four (independent) equations and,
in general, to solve for k unknowns we must have k (independent) equations. What all this means
is that, given time series data on P (price) and Q (quantity) and no other information, there is
no way the researcher can tell whether he/she is estimating the demand function or the supply
function. That is, a given pair $P_t$ and $Q_t$ represents simply the point of intersection of
the appropriate demand and supply curves because of the equilibrium condition that demand
equals supply.
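A short numerical check makes the point concrete. In this sketch the two sets of structural values are hypothetical; they were chosen so that both pairs of demand and supply curves pass through the same mean equilibrium point. The function evaluates the reduced-form coefficients of (3.11) and (3.12).

```python
def reduced_form(a0, a1, b0, b1):
    """Reduced-form coefficients (pi0, pi1) of (3.11) and (3.12)."""
    pi0 = (b0 - a0) / (a1 - b1)            # mean equilibrium price
    pi1 = (a1 * b0 - a0 * b1) / (a1 - b1)  # mean equilibrium quantity
    return pi0, pi1

# Two different hypothetical structures (a = demand, b = supply coefficients)
structure_a = dict(a0=10.0, a1=-1.0, b0=2.0, b1=1.0)
structure_b = dict(a0=14.0, a1=-2.0, b0=4.0, b1=0.5)

print(reduced_form(**structure_a))  # (4.0, 6.0)
print(reduced_form(**structure_b))  # (4.0, 6.0)
```

Both structures imply the same two reduced-form coefficients, so data on P and Q alone cannot distinguish between them.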
b) Just or Exact Identification
The reason we could not identify the preceding demand function or supply function was that
the same variables P and Q are present in both functions and there is no additional
information. But suppose we consider the following demand and supply model.
Demand function: $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + U_{1t}$, $\alpha_1 < 0$, $\alpha_2 > 0$    (3.13)
Supply function: $Q_t = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + U_{2t}$, $\beta_1 > 0$, $\beta_2 > 0$    (3.14)
where I = income of the consumer, an exogenous variable, and
$P_{t-1}$ = price lagged one period, usually incorporated in the model to explain the supply of
many agricultural commodities.
Note that $P_{t-1}$ is a predetermined variable because its value is known at time t.
By the market-clearing mechanism we have
$\alpha_0 + \alpha_1 P_t + \alpha_2 I_t + U_{1t} = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + U_{2t}$    (3.15)
Solving this equation, we obtain the following equilibrium price:
$P_t = \pi_0 + \pi_1 I_t + \pi_2 P_{t-1} + V_t$    (3.16)
where
$\pi_0 = \frac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1}$, $\pi_1 = -\frac{\alpha_2}{\alpha_1 - \beta_1}$, $\pi_2 = \frac{\beta_2}{\alpha_1 - \beta_1}$, $V_t = \frac{U_{2t} - U_{1t}}{\alpha_1 - \beta_1}$
Substituting the equilibrium price (3.16) into the demand equation (3.13) or the supply
equation (3.14), we obtain the corresponding equilibrium quantity:
$Q_t = \pi_3 + \pi_4 I_t + \pi_5 P_{t-1} + W_t$    (3.17)
where the reduced-form coefficients are
$\pi_3 = \frac{\alpha_1 \beta_0 - \alpha_0 \beta_1}{\alpha_1 - \beta_1}$, $\pi_4 = -\frac{\alpha_2 \beta_1}{\alpha_1 - \beta_1}$, $\pi_5 = \frac{\alpha_1 \beta_2}{\alpha_1 - \beta_1}$, $W_t = \frac{\alpha_1 U_{2t} - \beta_1 U_{1t}}{\alpha_1 - \beta_1}$
The demand-and-supply model given in equations (3.13) and (3.14) contains six structural
coefficients ($\alpha_0$, $\alpha_1$, $\alpha_2$, $\beta_0$, $\beta_1$ and $\beta_2$), and there are six
reduced-form coefficients ($\pi_0$, $\pi_1$, $\pi_2$, $\pi_3$, $\pi_4$ and $\pi_5$) from which to
estimate them.
Thus, we have six equations in six unknowns, and normally we should be able to obtain unique
estimates. Therefore, the parameters of both the demand and supply equations can be identified
and the system as a whole can be identified.
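The six equations in six unknowns can be solved explicitly. The sketch below uses hypothetical structural values and two illustrative helper names (`to_reduced_form`, `to_structural`) that are not from the original text; it maps the structural coefficients of (3.13) and (3.14) to the reduced-form coefficients of (3.16) and (3.17) and then recovers the structural coefficients uniquely.

```python
def to_reduced_form(a0, a1, a2, b0, b1, b2):
    """Reduced-form coefficients pi0..pi5 of (3.16) and (3.17)."""
    d = a1 - b1
    return ((b0 - a0) / d, -a2 / d, b2 / d,
            (a1 * b0 - a0 * b1) / d, -a2 * b1 / d, a1 * b2 / d)

def to_structural(pi0, pi1, pi2, pi3, pi4, pi5):
    """Recover (a0, a1, a2, b0, b1, b2); needs pi1 != 0 and pi2 != 0."""
    b1 = pi4 / pi1        # income shifts demand only, so it traces out supply
    a1 = pi5 / pi2        # lagged price shifts supply only, so it traces out demand
    b0 = pi3 - b1 * pi0
    a0 = pi3 - a1 * pi0
    a2 = pi4 - a1 * pi1
    b2 = pi5 - b1 * pi2
    return a0, a1, a2, b0, b1, b2

# Hypothetical structural values used only to check the round trip
true_values = (20.0, -1.0, 0.5, 2.0, 1.0, 0.3)
recovered = to_structural(*to_reduced_form(*true_values))
print([round(v, 6) for v in recovered])
```

Each structural coefficient is obtained from exactly one expression in the reduced-form coefficients, which is what exact identification means in this model.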
c) Over identification
Note that for certain goods and services, wealth of the consumer is another important
determinant of demand. Therefore, the demand function (3.13) can be modified as follows,
keeping the supply function as before:
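Demand function: $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + \alpha_3 R_t + U_{1t}$    (3.18)
Supply function: $Q_t = \beta_0 + \beta_1 P_t + \beta_2 P_{t-1} + U_{2t}$    (3.19)
where R represents wealth.
Equating demand to supply, we obtain the following equilibrium price and quantity:
$P_t = \pi_0 + \pi_1 I_t + \pi_2 R_t + \pi_3 P_{t-1} + V_t$    (3.20)
$Q_t = \pi_4 + \pi_5 I_t + \pi_6 R_t + \pi_7 P_{t-1} + W_t$    (3.21)
where
$\pi_0 = \frac{\beta_0 - \alpha_0}{\alpha_1 - \beta_1}$, $\pi_1 = -\frac{\alpha_2}{\alpha_1 - \beta_1}$,
$\pi_2 = -\frac{\alpha_3}{\alpha_1 - \beta_1}$, $\pi_3 = \frac{\beta_2}{\alpha_1 - \beta_1}$,
$\pi_4 = \frac{\alpha_1 \beta_0 - \alpha_0 \beta_1}{\alpha_1 - \beta_1}$, $\pi_5 = -\frac{\alpha_2 \beta_1}{\alpha_1 - \beta_1}$,
$\pi_6 = -\frac{\alpha_3 \beta_1}{\alpha_1 - \beta_1}$, $\pi_7 = \frac{\alpha_1 \beta_2}{\alpha_1 - \beta_1}$,
$W_t = \frac{\alpha_1 U_{2t} - \beta_1 U_{1t}}{\alpha_1 - \beta_1}$, $V_t = \frac{U_{2t} - U_{1t}}{\alpha_1 - \beta_1}$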
The demand-and-supply model in (3.18) and (3.19) contains seven structural coefficients, but
there are eight equations to estimate them: the eight reduced-form coefficients given above
(i.e., $\pi_0$ to $\pi_7$). The number of equations is greater than the number of unknowns. As
a result, unique estimation of all the parameters of the model is not possible. For example, one
can solve for $\beta_1$ in the following two ways:
$\beta_1 = \frac{\pi_6}{\pi_2}$ or $\beta_1 = \frac{\pi_5}{\pi_1}$
That is, there are two estimates of the price coefficient in the supply function, and there is no
guarantee that these two values will be identical. Moreover, since $\beta_1$ appears in the
denominators of all the reduced-form coefficients, the ambiguity in the estimation of $\beta_1$
will be transmitted to other estimates. Note that the supply function is exactly identified in
the system (3.13) and (3.14) but over identified in the system (3.18) and (3.19), although in
both cases the supply function remains the same. This is because we have "too much", or an
over sufficiency of, information to identify the supply curve. The over sufficiency of
information results from the fact that in the model (3.13) and (3.14) the exclusion of the
income variable from the supply function was enough to identify it, but in the model (3.18) and
(3.19) the supply function excludes not only the income variable but also the wealth variable.
In other words, in the latter model we put "too many" restrictions on the supply function by
requiring it to exclude more variables than necessary to identify it. However, this situation
does not imply that over identification is necessarily bad, since the problem of too much
information can be handled.
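The two competing estimates of the supply slope can be seen in simulated data. The following is a sketch under stated assumptions: the structural values, the normal distributions for income, wealth and the disturbances, the starting price and the seed are all hypothetical, and the small `solve` and `ols` helpers are written here only to keep the example free of external libraries. The reduced forms (3.20) and (3.21) are estimated by OLS and the two ratios are then compared.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[i][k] for i in range(k)]

def ols(regressors, y):
    """OLS coefficients (intercept first) from the normal equations X'X b = X'y."""
    X = [[1.0] + list(vals) for vals in zip(*regressors)]
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical structural values for (3.18)-(3.19); the true supply slope is 1.0
a0, a1, a2, a3 = 20.0, -1.0, 0.5, 0.4
b0, b1, b2 = 2.0, 1.0, 0.3

rng = random.Random(2)
income, wealth, lag_price, price, quantity = [], [], [], [], []
p_prev = 24.0  # assumed starting value for the lagged price
for _ in range(2000):
    i_t, r_t = rng.gauss(50.0, 5.0), rng.gauss(30.0, 5.0)
    u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    # Equilibrium price from equating (3.18) and (3.19)
    p_t = (b0 - a0 - a2 * i_t - a3 * r_t + b2 * p_prev + u2 - u1) / (a1 - b1)
    q_t = b0 + b1 * p_t + b2 * p_prev + u2  # quantity from the supply function
    income.append(i_t); wealth.append(r_t); lag_price.append(p_prev)
    price.append(p_t); quantity.append(q_t)
    p_prev = p_t

pi_p = ols([income, wealth, lag_price], price)     # estimates of pi0..pi3
pi_q = ols([income, wealth, lag_price], quantity)  # estimates of pi4..pi7
beta1_from_wealth = pi_q[2] / pi_p[2]  # pi6 / pi2
beta1_from_income = pi_q[1] / pi_p[1]  # pi5 / pi1
print(round(beta1_from_wealth, 4), round(beta1_from_income, 4))
```

Both ratios should lie near the assumed true value of 1.0, yet they do not coincide in a finite sample, which is the ambiguity that over identification creates for this approach.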
Notice that this situation is the opposite of under identification, where there is too
little information. The only way in which the structural parameters of unidentified (or under
identified) equations can be identified (and thus be capable of being estimated) is through
the imposition of further restrictions, or the use of more extraneous information. Such
restrictions, of course, must be imposed only if their validity can be defended.
In a simple example such as the foregoing, it is easy to check for identification by working
through the reduced form; in more complicated systems, however, this procedure is time
consuming. It can be avoided by resorting to either the order condition or the rank condition of
identification. Although the order condition is easy to apply, it provides only a necessary
condition for identification. On the other hand, the rank condition is both a necessary and
sufficient condition for identification. [Note: the order and rank conditions for identification
will not be discussed since the objective of this unit is only to introduce the reader briefly
to simultaneous equations. For a detailed and advanced discussion, readers can refer to the
reference list at the end of this unit.]
3.4 A TEST OF SIMULTANEITY
If there is no simultaneity problem, OLS produces consistent and efficient estimators.
On the other hand, if there is simultaneity, the OLS estimators are not even consistent, so that
alternative estimation methods must be sought. If we apply these alternative methods when there
is in fact no simultaneity, the resulting estimators will be consistent but not efficient. This
suggests that we should check for the simultaneity problem before we discard OLS in favor of
the alternatives.
A test of simultaneity is essentially a test of whether an (endogenous) regressor is correlated
with the error term. If it is, the simultaneity problem exists, in which case alternatives to OLS
must be found; if it is not, we can use OLS. To find out which is the case in a concrete
situation, we can use Hausman's specification error test.
Hausman Specification Test
Consider the following two-equation model:
Demand function: $Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + \alpha_3 R_t + U_{1t}$    (3.22)
Supply function: $Q_t = \beta_0 + \beta_1 P_t + U_{2t}$    (3.23)
Assume that I and R are exogenous. Of course, P and Q are endogenous.
Now consider the supply function (3.23). If there is no simultaneity problem (i.e., P and Q are
mutually independent), $P_t$ and $U_{2t}$ should be uncorrelated. On the other hand, if there is
simultaneity, $P_t$ and $U_{2t}$ will be correlated. To find out which is the case, the Hausman
test proceeds as follows.
First, from (3.22) and (3.23) we obtain the following reduced-form equations:
$P_t = \pi_0 + \pi_1 I_t + \pi_2 R_t + V_t$    (3.24)
$Q_t = \pi_3 + \pi_4 I_t + \pi_5 R_t + W_t$    (3.25)
where V and W are the reduced-form error terms. Estimating (3.24) by OLS we obtain
$\hat{P}_t = \hat{\pi}_0 + \hat{\pi}_1 I_t + \hat{\pi}_2 R_t$    (3.26)
Therefore
$P_t = \hat{P}_t + \hat{V}_t$    (3.27)
where $\hat{P}_t$ are the estimated values of $P_t$ and $\hat{V}_t$ are the estimated residuals.
Substituting (3.27) into (3.23) we get:
$Q_t = \beta_0 + \beta_1 \hat{P}_t + \beta_1 \hat{V}_t + U_{2t}$    (3.28)
Now, under the null hypothesis that there is no simultaneity, the correlation between
$\hat{V}_t$ and $U_{2t}$ should be zero, asymptotically. Thus, if we run the regression (3.28)
and find that the coefficient of $\hat{V}_t$ in (3.28) is statistically zero, we can conclude
that there is no simultaneity problem. This conclusion is reversed if the coefficient is
statistically significant.
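The two steps of the test can be sketched on simulated data. Several things here are assumptions and not part of the original unit: the structural values, the distributions and the seed are hypothetical, and the second step uses a commonly used variant that regresses $Q_t$ on the observed $P_t$ and $\hat{V}_t$ rather than on $\hat{P}_t$ and $\hat{V}_t$. In that variant the coefficient on $\hat{V}_t$ picks up only the correlation between $V_t$ and $U_{2t}$, so it is zero when there is no simultaneity. The `solve` and `ols` helpers are written only for this illustration.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[i][k] for i in range(k)]

def ols(regressors, y):
    """Return OLS coefficients (intercept first), standard errors and residuals."""
    X = [[1.0] + list(vals) for vals in zip(*regressors)]
    n, k = len(X), len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    coef = solve(XtX, Xty)
    resid = [yi - sum(c * x for c, x in zip(coef, row)) for row, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)  # residual variance
    # Diagonal of the inverse of X'X, obtained one column at a time
    diag = [solve(XtX, [float(i == j) for i in range(k)])[j] for j in range(k)]
    return coef, [math.sqrt(s2 * d) for d in diag], resid

# Hypothetical structural values for (3.22)-(3.23), with simultaneity present
a0, a1, a2, a3 = 20.0, -1.0, 0.5, 0.4
b0, b1 = 2.0, 1.0
rng = random.Random(3)
income, wealth, price, quantity = [], [], [], []
for _ in range(1000):
    i_t, r_t = rng.gauss(50.0, 5.0), rng.gauss(30.0, 5.0)
    u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    p_t = (b0 - a0 - a2 * i_t - a3 * r_t + u2 - u1) / (a1 - b1)  # equilibrium price
    income.append(i_t); wealth.append(r_t)
    price.append(p_t); quantity.append(b0 + b1 * p_t + u2)

# Step 1: reduced form (3.24) by OLS; keep the residuals V-hat
_, _, v_hat = ols([income, wealth], price)
# Step 2 (variant): regress Q on P and V-hat and test the coefficient on V-hat
coef, se, _ = ols([price, v_hat], quantity)
t_stat = coef[2] / se[2]
print(round(coef[2], 3), round(t_stat, 2))
```

Because the data are generated with simultaneity, the t statistic on $\hat{V}_t$ should be far from zero in absolute value, leading to rejection of the null hypothesis of no simultaneity.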
3.5 APPROACHES TO ESTIMATION
At the outset it may be noted that the estimation problem is rather complex because there are a
variety of estimation techniques with varying statistical properties. In view of the introductory
nature of this unit we shall consider very briefly the following techniques.
a) The method of Indirect Least Squares (ILS)
For a just (exactly) identified structural equation, the method of obtaining the estimates of
the structural coefficients from the OLS estimates of the reduced-form coefficients is known as
the method of indirect least squares (ILS). ILS involves the following three steps:
Step I: Obtain the reduced-form equations.
Step II: Apply OLS to the reduced-form equations individually.
Step III: Obtain estimates of the original structural coefficients from the estimated
reduced-form coefficients obtained in Step II.
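The three steps can be sketched for the Keynesian model (3.6)-(3.7), whose consumption function is exactly identified. The data are simulated under hypothetical values (beta0 = 2, beta1 = 0.8, normally distributed investment and disturbance, fixed seed), so this is an illustration and not an empirical result. Step III uses the relations $\beta_1 = \pi_3/\pi_1$ and $\beta_0 = \pi_2/\pi_1$, which follow from the reduced-form coefficients in (3.8) and (3.9).

```python
import random

def ols_line(x, y):
    """Intercept and slope of the simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
             / sum((xi - mx) ** 2 for xi in x))
    return my - slope * mx, slope

# Simulated data under hypothetical structural values
beta0_true, beta1_true = 2.0, 0.8
rng = random.Random(4)
investment, income, consumption = [], [], []
for _ in range(5000):
    i_t = rng.gauss(10.0, 2.0)  # exogenous investment
    u_t = rng.gauss(0.0, 1.0)   # consumption disturbance
    y_t = (beta0_true + i_t + u_t) / (1.0 - beta1_true)  # reduced form (3.8)
    investment.append(i_t)
    income.append(y_t)
    consumption.append(y_t - i_t)  # from the identity (3.7)

# Step I: the reduced forms are (3.8) for Y and (3.9) for C.
# Step II: apply OLS to each reduced-form equation separately.
pi0_hat, pi1_hat = ols_line(investment, income)
pi2_hat, pi3_hat = ols_line(investment, consumption)
# Step III: recover the structural coefficients from the reduced-form estimates.
beta1_ils = pi3_hat / pi1_hat
beta0_ils = pi2_hat / pi1_hat
print(round(beta0_ils, 3), round(beta1_ils, 3))
```

Unlike a direct OLS regression of consumption on income, the ILS estimate of the marginal propensity to consume should lie close to the assumed true value of 0.8.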
b) The method of two stage least squares (2SLS)
This method is applied in estimating an over identified equation. Theoretically, two-stage
least squares may be considered an extension of the ILS method. The 2SLS method boils down
to the application of ordinary least squares in two stages. In the first stage, we apply least
squares to the reduced-form equations in order to obtain estimates of the exact (systematic) and
random components of the endogenous variables. In the second stage, we replace the endogenous
variables appearing on the right-hand side of the equation with their estimated values and then
apply OLS to the transformed original equation to obtain estimates of the structural parameters.
Note, however, that since 2SLS is equivalent to ILS in the just-identified case, it is usually
applied uniformly to all identified equations in the system. [For a detailed discussion of this
method, readers may refer to the reference list at the end of this unit.]
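The two stages can be sketched for the over identified supply function (3.23), using income and wealth from the demand function (3.22) as the predetermined variables. As before, the structural values, distributions and seed are hypothetical, and the `solve` and `ols` helpers exist only to keep the example self-contained. Direct OLS on the supply function is computed alongside for comparison.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    k = len(b)
    M = [list(row) + [b[i]] for i, row in enumerate(A)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        pivot = M[c][c]
        M[c] = [v / pivot for v in M[c]]
        for r in range(k):
            if r != c:
                f = M[r][c]
                M[r] = [v - f * w for v, w in zip(M[r], M[c])]
    return [M[i][k] for i in range(k)]

def ols(regressors, y):
    """OLS coefficients (intercept first) from the normal equations X'X b = X'y."""
    X = [[1.0] + list(vals) for vals in zip(*regressors)]
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Hypothetical structural values for (3.22)-(3.23); the true supply slope is 1.0
a0, a1, a2, a3 = 20.0, -1.0, 0.5, 0.4
b0, b1 = 2.0, 1.0
rng = random.Random(5)
income, wealth, price, quantity = [], [], [], []
for _ in range(5000):
    i_t, r_t = rng.gauss(50.0, 5.0), rng.gauss(30.0, 5.0)
    u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    p_t = (b0 - a0 - a2 * i_t - a3 * r_t + u2 - u1) / (a1 - b1)  # equilibrium price
    income.append(i_t); wealth.append(r_t)
    price.append(p_t); quantity.append(b0 + b1 * p_t + u2)

# Direct OLS on the supply function, ignoring simultaneity
beta1_ols = ols([price], quantity)[1]

# Stage 1: regress the endogenous regressor P on the predetermined variables
g = ols([income, wealth], price)
p_hat = [g[0] + g[1] * i_t + g[2] * r_t for i_t, r_t in zip(income, wealth)]

# Stage 2: replace P by its fitted value and apply OLS to the supply function
beta0_2sls, beta1_2sls = ols([p_hat], quantity)
print(round(beta1_ols, 3), round(beta1_2sls, 3))
```

In this setup the 2SLS slope should be close to the assumed value of 1.0 while the direct OLS slope is pulled away from it. Note that the standard errors reported by a plain second-stage OLS regression are not the correct 2SLS standard errors, which is one reason to use a dedicated routine in applied work.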
3.9 REFERENCES
Harris, R. (1995). Using Cointegration Analysis in Econometric Modelling. Prentice Hall.
Gujarati, D. (1995). Basic Econometrics, 3rd ed. McGraw-Hill.
Kennedy, P. (1998). A Guide to Econometrics, 4th ed. Blackwell Publishers.