Journel1985 PDF
1, 1985
A. G. Journel 2
The probabilistic approach is but one language used by geostatisticians to characterize spatial
variability and to express a very simple criterion for goodness of estimation. Notions such as
stationarity and ergodicity are important for the consistency of the probabilistic language
but are irrelevant to the real problem, that of estimating a well-defined deterministic spatial
average. The kriging algorithm is established without any recourse to probabilistic modeling
or notation.
INTRODUCTION
The probabilistic interpretation of a natural phenomenon known to be unique
and the related hypothesis of stationarity do not always appeal to geologists'
intuition, even though the positive aspects of geostatistical applications may be
convincing. In fact the probabilistic approach and modeling are but one language
(and possibly not the simplest) to express very simple deterministic criteria for
estimation. Stationarity is not an intrinsic property of the deposit but is a prop-
erty of the probabilistic model. A model choice and its properties must be judged
on its efficiency in capturing and solving the problem at hand. Engineers are
correct in judging geostatistics by efficiency and practical records rather than by
the intricacy of its theoretical developments; if there existed a simpler language
leading to the same algorithms and results there is little doubt that it would be
adopted.
Probabilistic hypotheses such as stationarity and isotropy are shown to be
equivalent to an experimenter's decision to look at average spatial characteristics
over particular subareas and/or directions. Kriging is shown to be equivalent to a
deterministic process of averaging errors of the same type over a predefined field
where such averaging makes physical (geological) sense.
(i) The chemistry (or physics) of the phenomenon studied: oxide and sulfide
Cu mineralizations should be considered separately not only because they will
require different milling processes, but also because they correspond to clearly
different geological geneses. Conversely, for sulfide mineralization, it is not
essential to separate bornite from chalcopyrite mineralization.
(ii) The amount and type of information available: if many populations are con-
sidered, many more data are needed to characterize each of them. At an early
stage of exploration, the geologist-geostatistician has no choice but to consider
a single population; as more information becomes available, he/she can start
separating mineralizations. The distinction between bornite (very high Cu
grade) and chalcopyrite may always remain inaccessible if data are defined
on a large support, such as a core of 5 m length; moreover such distinction is
irrelevant because no mining or milling will be able to separate these two
sulfide mineralizations.
(1) since $\mathrm{Prob}\{Z(x) \le z\} = F(z)$ for all $x \in A$, the distribution function $F(z)$
can be estimated by the cumulative histogram of data $z(x_\alpha)$ taken at different
locations $x_\alpha \in A$.
The Deterministic Side of Geostatistics 3
(2) the variogram $2\gamma(h) = E\{[Z(x + h) - Z(x)]^2\}$ can be estimated by pooling
together pairs of data $z(x_\alpha + h)$, $z(x_\alpha)$ taken at different locations $x_\alpha \in A$.
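The two pooling operations in items (1) and (2) can be sketched numerically. This is only an illustration on a regular 1D grid: the function names and the `grades` values are hypothetical and do not come from the paper.

```python
def empirical_cdf(data, z):
    """Cumulative histogram at threshold z: proportion of data values <= z."""
    return sum(1 for v in data if v <= z) / len(data)


def experimental_variogram(data, lag):
    """2*gamma(lag): mean squared difference over all pairs `lag` grid steps apart."""
    pairs = [(data[i + lag], data[i]) for i in range(len(data) - lag)]
    return sum((a - b) ** 2 for a, b in pairs) / len(pairs)


# Hypothetical grades sampled at 5 consecutive nodes of a regular 1D grid.
grades = [1.0, 2.0, 4.0, 3.0, 5.0]

cdf_at_3 = empirical_cdf(grades, 3.0)             # pools all locations in A
two_gamma_1 = experimental_variogram(grades, 1)   # pools all pairs one step apart
```

Both statistics pool data taken at different locations, which is exactly what the decision of stationarity over A permits.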
Stationarity is a property of the probabilistic model, hence it is a decision
(a choice) of the experimenter, not an intrinsic property of the real phenomenon
studied. As indicated before, the choice of a stationary model may change de-
pending on the scale of the observation made and the amount of data available.
A statement such as "the sulfide mineralization is stationary" is incorrect and
should be reworded as "given the scale of the study, which neither requires nor
allows differentiation of sulfide types, a stationary random function model $Z(x)$
is considered to represent the spatial variability of copper grade within the pool
of all sulfide mineralizations."
Stationarity being a property of a model chosen by the experimenter and
not an intrinsic property of reality, it cannot be proved or rejected from data.
But as with any model choice, stationarity can be validated a posteriori by judg-
ing whether this choice has proven to be efficient in solving the particular prob-
lem at hand.
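experimental variance, with $m_A^*$ the experimental mean of the N data:
$$s_A^{*2} = \frac{1}{N} \sum_{\alpha=1}^{N} [z(x_\alpha) - m_A^*]^2 \qquad (1)$$
variance within A:
$$s_A^2 = \frac{1}{\{A\}} \int_A [z(x) - m_A]^2 \, dx \qquad (2)$$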
These spatial averages have a clear physical meaning and the corresponding moments
of the stationary RF model $Z_A(x)$ are identified to them: $Z_A(x)$ is such that
$$E\{Z_A(x)\} = m_A, \qquad \mathrm{Var}\{Z_A(x)\} = s_A^2, \qquad F(z) = \mathrm{Prob}\{Z(x) \le z\} = F_A(z) \qquad (3)$$
The identification process (3) allows considering the experimental statistics (1)
as estimates of the moments of the stationary model $Z_A(x)$.
Remarks
Existence of the moments of $Z_A(x)$ is ensured by definitions (3).
Questions about the ergodicity of the RF model $Z_A(x)$ are irrelevant, for
the moments of $Z_A(x)$ are defined from the spatial averages (2), not from their
"hypothetical" limits when $\{A\} \to \infty$. It is recalled, see, for example, Breiman
(1969), that a stationary RF $Z(x)$ is said to be "ergodic" if the limits of spatial
averages of type (2), when $\{A\} \to \infty$, exist and identify the corresponding RF
moments, e.g.
$$\lim_{\{A\} \to \infty} \frac{1}{\{A\}} \int_A z(x) \, dx = E\{Z(x)\}$$
$$2\gamma_A^*(h) = \frac{1}{n(h)} \sum_{\alpha=1}^{n(h)} [z(x_\alpha + h) - z(x_\alpha)]^2$$
$$F_A^*(h; z, z') = \frac{1}{n(h)} \sum_{\alpha=1}^{n(h)} i(x_\alpha + h; z) \cdot i(x_\alpha; z') \qquad (4)$$
with n(h) being the number of pairs of data, within A, approximately separated
by the vector h.
The limits of these h-bivariate statistics depend not only on the area A but
also on the vector h. Indeed, when $N \to \infty$ (exhaustive sampling of A)
$$\ldots = F_A(h; z, z')$$
Remarks
As opposed to the spatial averages (2), the spatial averages (5), that is, the
bivariate characteristics (6) of the RF model $Z_A(x)$, are not representative of the
whole area A but only of the subarea $A(h) \subset A$.
The spatial averages over $A(h)$ and $A(-h)$ are usually different. For example:
$F_A(h; z, z') \ne F_A(-h; z, z') = F_A(h; z', z)$; however, $\gamma_A(h) = \gamma_A(-h)$.
Thus the variogram $2\gamma_A(h)$ is representative of the area $A(h) \cup A(-h)$; see
hatched area on Fig. 1.
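The asymmetry of the indicator bivariate average and the symmetry of the variogram can be checked on a small example. The sketch below assumes a regular 1D grid standing in for A; the function names, the data values, and the two thresholds are illustrative choices, not taken from the paper.

```python
def indicator(value, threshold):
    """i(x; z): 1 if the value at x does not exceed the threshold z, else 0."""
    return 1 if value <= threshold else 0


def bivariate_cdf(data, h, z, z_prime):
    """Average of i(x+h; z) * i(x; z') over the nodes x with x and x+h both in A."""
    n = len(data)
    nodes = [x for x in range(n) if 0 <= x + h < n]
    total = sum(indicator(data[x + h], z) * indicator(data[x], z_prime) for x in nodes)
    return total / len(nodes)


def semivariogram(data, h):
    """Average of half squared differences over the nodes x with x and x+h both in A."""
    n = len(data)
    nodes = [x for x in range(n) if 0 <= x + h < n]
    return sum((data[x + h] - data[x]) ** 2 for x in nodes) / (2 * len(nodes))


data = [1.0, 5.0, 2.0, 4.0, 3.0, 6.0]   # hypothetical values on a 1D grid

f_plus = bivariate_cdf(data, 1, 2.5, 4.5)      # F_A(h; z, z')
f_minus = bivariate_cdf(data, -1, 2.5, 4.5)    # F_A(-h; z, z')
f_swapped = bivariate_cdf(data, 1, 4.5, 2.5)   # F_A(h; z', z)
gamma_plus = semivariogram(data, 1)
gamma_minus = semivariogram(data, -1)
```

On this data set `f_minus` equals `f_swapped` but differs from `f_plus`, while `gamma_plus` and `gamma_minus` coincide.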
If the area A is rectangular as in Fig. 1, and the vector h is parallel to one
of the sides of A with modulus $|h|$ less than one-half of this side, denoted by
$l_A(h)$, then:
$$A(h) \cup A(-h) = A \quad \text{for } |h| \le \tfrac{1}{2}\, l_A(h) \qquad (7)$$
For the spatial average $\gamma_A(h)$ to be representative of the whole area A, the
interdistance vector h must not be too large. Indeed, geostatisticians never consider
experimental variograms beyond one-half of the maximum experimental interdistance
available, cf. Journel and Huijbregts (1978, p. 194).
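The half-length rule of relation (7) can be illustrated on a discrete 1D analogue. In this sketch the area A is assumed to be a row of `n` grid nodes and A(h) is taken as the set of nodes x for which x + h is also in A; the names are hypothetical.

```python
def subarea(n, h):
    """A(h): nodes x of A = {0, ..., n-1} such that x + h also belongs to A."""
    return {x for x in range(n) if 0 <= x + h < n}


def covers_whole_area(n, h):
    """True when A(h) united with A(-h) gives back every node of A."""
    return (subarea(n, h) | subarea(n, -h)) == set(range(n))


n = 10
coverage = {h: covers_whole_area(n, h) for h in range(1, n)}
```

With `n = 10`, the union covers A for lags 1 to 5 and leaves a central gap for lags 6 to 9, which is the discrete counterpart of limiting experimental variograms to half the available interdistance.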
Regrouping Statistics
By restricting the area of stationarity, the experimenter can better focus on
the specific properties of that area, provided he has enough data to do so. By
extending the area of stationarity, the amount of data is increased and the estimates
of the spatial averages $m_A$, $F_A(h)$, $\gamma_A(h)$ are improved, provided that such
averaging makes sense.
A preliminary decision to split a deposit D, say into two stationary areas,
$D = A \cup A'$, may be reconsidered in view of the two local statistics. For example,
the two areas may have been defined on rock type considerations, but it may
turn out that the corresponding two local statistics of Cu grades do not differ
appreciably; these local statistics can then be regrouped with due respect to their
respective zones of influence. If the two areas A and A' do not overlap, and
$D = A \cup A'$, the spatial averages over D are written
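$$m_D = \frac{1}{\{D\}} \int_D z(x) \, dx = \frac{\{A\} \cdot m_A + \{A'\} \cdot m_{A'}}{\{A\} + \{A'\}}$$
$$\gamma_D(h) = \frac{1}{2\{D(h)\}} \int_{D(h)} [z(x + h) - z(x)]^2 \, dx = \frac{\{A(h)\} \cdot \gamma_A(h) + \{A'(h)\} \cdot \gamma_{A'}(h)}{\{A(h)\} + \{A'(h)\}} \qquad (8)$$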
with $n(h)$ and $n'(h)$ being the number of pairs of data used for $\gamma_A^*(h)$ and $\gamma_{A'}^*(h)$.
If the $N + N'$ data available are not regularly located over D, weighting by
areas of influence of the type (8) is warranted. Hence it appears that a split
$D = A \cup A'$ may be required, not for reasons due to the homogeneity of the underlying
phenomenon but because the data set is clustered; if clusters happen to be
in area A, this area being more sampled, naive weighted statistics of the type
(9) would overrepresent area A.
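The effect of clustering on regrouped statistics can be seen with two numbers. The sketch below contrasts weighting by areas, as in relation (8), with weighting by the number of data, which is how the "naive" weighting is read here (an assumption on our part); all names and values are hypothetical.

```python
def area_weighted_mean(mean_a, area_a, mean_b, area_b):
    """Regroup two local means with weights proportional to the areas, as in (8)."""
    return (area_a * mean_a + area_b * mean_b) / (area_a + area_b)


def count_weighted_mean(mean_a, n_a, mean_b, n_b):
    """Regroup two local means with weights proportional to the numbers of data."""
    return (n_a * mean_a + n_b * mean_b) / (n_a + n_b)


# Hypothetical case: A and A' have equal areas, but A holds a cluster of 90 data
# (local mean 2.0) while A' holds only 10 data (local mean 1.0).
by_area = area_weighted_mean(2.0, 50.0, 1.0, 50.0)
by_count = count_weighted_mean(2.0, 90, 1.0, 10)
```

Here `by_area` is 1.5 while `by_count` is 1.9: the count-weighted mean is pulled toward the heavily sampled area A.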
Grouping Directions
In addition to grouping areas, one may want to group directional vario-
grams. Consider an area A and the following variogram in direction w
the vector h being of modulus $|h|$ and direction w. Regrouping over all directions
of the three-dimensional space $R^3$ would provide the so-called "omnidirectional"
variogram
$$2\gamma_A^*(|h|) = \frac{\sum_{w=1}^{L} n(|h|, w) \cdot 2\gamma_A^*(|h|, w)}{\sum_{w=1}^{L} n(|h|, w)} = \frac{1}{n(|h|)} \sum_{\alpha=1}^{n(|h|)} [z(x_\alpha + h) - z(x_\alpha)]^2, \qquad \sum_{w=1}^{L} n(|h|, w) = n(|h|) \qquad (11)$$
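Regrouping directional variograms amounts to a pair-count weighted average, as relation (11) is read here. A minimal sketch for a single lag modulus follows; the function name and the three direction classes with their pair counts are hypothetical.

```python
def pool_directions(directional):
    """Pool directional variogram values for one lag modulus |h|.

    `directional` is a list of (n_pairs, variogram_value) tuples, one per
    direction class; each value is weighted by its number of pairs.
    """
    total_pairs = sum(n for n, _ in directional)
    pooled = sum(n * value for n, value in directional) / total_pairs
    return pooled, total_pairs


# Hypothetical directional results at one lag: (number of pairs, 2*gamma value).
directional = [(40, 1.2), (10, 2.0), (50, 0.8)]
pooled, total = pool_directions(directional)
```

The pooled value is the same as would be obtained by averaging the squared differences of all 100 pairs at once, irrespective of direction.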
Grouping Distances
The last step in grouping is that of grouping distances. Consider the omnidirectional
semi-variogram $\gamma_A(|h|)$ defined by relation (10) and proceed to a
grouping of all distances $|h|$ such that $(x, x + |h|) \in A$.
$= i(x; z)$ for all $x \in A$
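The corresponding nonstationary mean and variance are
$$E\{Z(x|A)\} = \int z \cdot d\Delta_{z(x)}(z) = z(x)$$
$$\mathrm{Var}\{Z(x|A)\} = \int z^2 \cdot d\Delta_{z(x)}(z) - z^2(x) = 0 \quad \text{for all } x \in A \qquad (14)$$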
Similarly, for any pair of locations $x, x' \in A$, the corresponding pair of conditioned
random variables $Z(x|A)$ and $Z(x'|A)$ has a Dirac bivariate cdf with
parameters $z(x)$, $z(x')$
$$\mathrm{Prob}\{Z(x|A) \le z \text{ and } Z(x'|A) \le z'\} = \Delta_{z(x), z(x')}(z, z') = \begin{cases} 1, & \text{if } z(x) \le z \text{ and } z(x') \le z' \\ 0, & \text{otherwise} \end{cases} = i(x; z) \cdot i(x'; z') \qquad (15)$$
The corresponding covariance and nonstationary variogram are
$$\mathrm{Cov}\{Z(x|A), Z(x'|A)\} = \iint z z' \cdot d^2\Delta_{z(x), z(x')}(z, z') - z(x) \cdot z(x') = 0 \qquad (16)$$
Remarks
If the two RF models $Y_A(x)$ and $Z_A(x)$ share a common bivariate cdf, they
necessarily share common univariate cdf and moments:
Ordinary kriging is thus the process of minimizing the average squared error
$[z(x) - z^*(x)]^2$ when the whole configuration of estimation at location x is
moved throughout the area A.
Remarks
The configuration of estimation is the relative geometry of the location x
and the locations $x_\alpha$ of the N data ($\alpha = 1, \ldots, N$) used for the estimate $z^*(x)$. Strictly
speaking, x can take only those locations $x' \in A$ such that all corresponding data
locations $x'_\alpha = x' + (x_\alpha - x)$, $\alpha = 1, \ldots, N$, still remain within A. Hence the area of
integration of formula (19) is not A but $A^*(x, N) = \{x' \in A : x'_\alpha = x' + (x_\alpha - x) \in A, \ \alpha = 1, \ldots, N\}$.
Just like the compositing process defining the RF model $Y_A(x)$, this moving
process over A can be considered whether the initial RF $Z(x)$ is stationary or
not. More precisely, stationarity (and ergodicity) of the initial RF model $Z(x)$ is
irrelevant to the definition of that moving process.
The preceding process of moving the configuration of estimation over the
area of interest was used by Matheron in his early work (1964, p. 83) to characterize
the error of estimation of a surface by a grid of holes, cf. Fig. 2. For each
location $x_0$ of the grid origin within the grid cell $(a_1, a_2)$ of size $a_1 a_2$, an estimate
$S^*(x_0)$ of the surface S is obtained by summing the positive cell areas. By varying
the origin $x_0$ within the basic cell $(a_1, a_2)$ a variance, called "transitive estimation
variance of S", is defined
$$\sigma^2(a_1, a_2) = \frac{1}{a_1 a_2} \int_{(a_1, a_2)} [S^*(x_0) - S]^2 \, dx_0$$
Fig. 2. Estimation of the same surface from two distinct origins of the data grid.
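The transitive estimation variance can be approximated numerically by moving the grid origin over one cell. The sketch below makes several assumptions that are not in the paper: the surface is a disc of known radius, a hole is "positive" when it falls inside the disc, and the integral over the cell is replaced by a midpoint rule with `steps` origins per side. All names are hypothetical.

```python
import math


def surface_estimate(origin, a1, a2, radius):
    """S*(x0): number of grid nodes falling inside the disc, times the cell area."""
    ox, oy = origin
    k1 = int(radius / a1) + 2   # enough nodes on each side to cover the disc
    k2 = int(radius / a2) + 2
    hits = 0
    for i in range(-k1, k1 + 1):
        for j in range(-k2, k2 + 1):
            if (ox + i * a1) ** 2 + (oy + j * a2) ** 2 <= radius ** 2:
                hits += 1
    return hits * a1 * a2


def transitive_variance(a1, a2, radius, steps=20):
    """Average of [S*(x0) - S]^2 over origins x0 spread regularly within one cell."""
    true_surface = math.pi * radius ** 2
    total = 0.0
    for p in range(steps):
        for q in range(steps):
            origin = ((p + 0.5) * a1 / steps, (q + 0.5) * a2 / steps)
            total += (surface_estimate(origin, a1, a2, radius) - true_surface) ** 2
    return total / steps ** 2


v_coarse = transitive_variance(1.5, 1.5, 3.0)   # coarse grid over a disc of radius 3
v_fine = transitive_variance(0.5, 0.5, 3.0)     # finer grid over the same disc
```

No probabilistic model enters the computation: the variance is defined purely by moving the grid origin, and it shrinks when the grid is refined.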
and in estimating the first terms of that expansion. In a more recent work, "Estimer
et Choisir" (1978), Matheron further developed his "transitive approach"
to provide a broad and philosophical discussion of the usage of probability in
the earth sciences.
An important question should be asked, "Does the previous minimization
criterion (19) make sense?" In particular, if A contains various mineralizations
and/or rock types, the configuration of estimation $(x, x_\alpha, \alpha = 1, \ldots, n)$ may have to
be limited to homogeneous subareas of A. This is the deterministic wording of
nonstationarity within A.
In the presence of selective mining where only the rich grades $z(x) > z_c$
(cutoff) will be recovered, it would make more sense to move the configuration
of estimation only over those rich subareas $A^*(z_c) \subset A$. The problem is that the
locations of these rich areas $A^*(z_c)$ are not known beforehand.
$$z^*(x_0) = \sum_{\alpha=1}^{n(x_0)} \lambda_\alpha(x_0) \cdot z(x_\alpha) \qquad (20)$$
where $[n(x_0)] \subset (N)$ is the subset of data locations actually used in the estimation
neighborhood of $z(x_0)$.
The weights $\lambda_\alpha(x_0)$ depend on each point $x_0$ being estimated and on the
particular configuration of the $n(x_0)$ data used. For convenience, this expression
can be rewritten with the origin of the coordinate system centered at the point
$x_0$ being estimated
$$z^*(x_0) = \sum_{\alpha=1}^{n(x_0)} \lambda_\alpha(x_0) \cdot z(x_0 + h_\alpha) \quad \text{with } h_\alpha = x_\alpha - x_0 \qquad (21)$$
The estimation configuration $[x, x + h_\alpha, \alpha = 1, \ldots, n(x_0)]$ can thus be moved within
$A^*$ without any datum spilling out of the area A. This moving process over A
generates the following average error characteristics
(22)
In the following we will make the approximation that $A^* \approx A$, that is, that
the dimensions $h_\alpha$ of the estimation neighborhood are much smaller than the dimensions
of A. In other terms, border effects are ignored in the present analysis.
A first step consists of setting the average error $E_A(x_0)$ to zero, the deterministic
equivalent of ensuring unbiasedness.
Whatever the usually known value $m_A$, the average error is zero if the sum
of the weights $\lambda_\alpha(x_0)$ equals one
$$E_A(x_0) = 0 \quad \text{if} \quad \sum_{\alpha=1}^{n(x_0)} \lambda_\alpha(x_0) = 1 \qquad (23)$$
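Condition (23) can be checked by actually moving a configuration over a data set. In the sketch below, border effects are ignored by letting the 1D area wrap around on itself, which is a simplifying assumption of this illustration and not part of the paper; the data, offsets, and weights are hypothetical.

```python
def average_error(z, offsets, weights):
    """Average of z(x) - z*(x) when the configuration is moved over every node x.

    The area is treated as periodic (index wrap-around) so that no datum of the
    moved configuration ever falls outside A.
    """
    n = len(z)
    total = 0.0
    for x in range(n):
        estimate = sum(w * z[(x + h) % n] for h, w in zip(offsets, weights))
        total += z[x] - estimate
    return total / n


z = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0, 1.5, 4.5]   # hypothetical grades
offsets = [-1, 1, 2]                            # data offsets relative to x
mean_z = sum(z) / len(z)

unbiased = average_error(z, offsets, [0.5, 0.3, 0.2])   # weights sum to 1
biased = average_error(z, offsets, [0.5, 0.3, 0.1])     # weights sum to 0.9
```

With weights summing to one the average error vanishes whatever the mean; with weights summing to 0.9 it equals one tenth of the mean of `z`.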
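The second step consists of minimizing the average squared error $V_A(x_0)$
under the previous constraint. Under condition (23), the squared error can be
developed as
$$[z(x) - z^*(x)]^2 = [z^2(x) - m_A^2] - 2 \sum_\alpha \lambda_\alpha(x) \cdot [z(x) \cdot z(x + h_\alpha) - m_A^2] + \sum_\alpha \sum_\beta \lambda_\alpha(x) \lambda_\beta(x) \cdot [z(x + h_\alpha) \cdot z(x + h_\beta) - m_A^2]$$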
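$$\sum_{\beta=1}^{n(x_0)} \lambda_\beta(x_0) \cdot \gamma_A(h_\alpha - h_\beta) - \mu(x_0) = \gamma_A(h_\alpha), \quad \alpha = 1, \ldots, n(x_0)$$
$$\sum_{\beta=1}^{n(x_0)} \lambda_\beta(x_0) = 1 \qquad (24)$$
$$\mathrm{Min}\{V_A(x_0)\} = \sum_{\alpha=1}^{n(x_0)} \lambda_\alpha(x_0) \cdot \gamma_A(h_\alpha) - \mu(x_0) \qquad (25)$$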
The system (24) is none other than the standard ordinary kriging system.
In practice, the spatial average $\gamma_A(h)$ is replaced by a model fitted from the discrete
estimate $\gamma_A^*(h)$ using whatever data are available over the "homogeneous"
area A.
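The deterministic reading of kriging can be verified on a toy data set: solve system (24) with the spatial-average variogram, then compare the value given by (25) with the average squared error obtained by moving the configuration. The sketch assumes a periodic 1D area (wrap-around indexing) so that border effects vanish exactly, uses the spatial average itself rather than a fitted model, and relies on a small hand-written linear solver; the data, offsets, and function names are hypothetical.

```python
def gamma_A(z, h):
    """Spatial-average semivariogram of z over a periodic 1D area."""
    n = len(z)
    return sum((z[(x + h) % n] - z[x]) ** 2 for x in range(n)) / (2 * n)


def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - known) / aug[r][r]
    return x


def ordinary_kriging(z, offsets):
    """Weights and mu solving system (24), plus the minimum value from (25)."""
    k = len(offsets)
    matrix = [[gamma_A(z, ha - hb) for hb in offsets] + [-1.0] for ha in offsets]
    matrix.append([1.0] * k + [0.0])            # constraint: weights sum to one
    rhs = [gamma_A(z, ha) for ha in offsets] + [1.0]
    solution = solve(matrix, rhs)
    weights, mu = solution[:k], solution[k]
    minimum = sum(w * gamma_A(z, h) for w, h in zip(weights, offsets)) - mu
    return weights, mu, minimum


def average_squared_error(z, offsets, weights):
    """Average of [z(x) - z*(x)]^2 when the configuration visits every node x."""
    n = len(z)
    total = 0.0
    for x in range(n):
        estimate = sum(w * z[(x + h) % n] for h, w in zip(offsets, weights))
        total += (z[x] - estimate) ** 2
    return total / n


z = [2.1, 3.7, 1.4, 3.0, 4.2, 1.6, 2.9, 0.8, 3.9, 2.3, 4.4, 1.8]   # hypothetical
offsets = [-2, -1, 1, 3]                                            # the h_alpha

weights, mu, v_min = ordinary_kriging(z, offsets)
v_direct = average_squared_error(z, offsets, weights)

# Any other weights summing to one should not do better than the kriging weights.
perturbed = [weights[0] + 0.05, weights[1] - 0.05] + weights[2:]
v_perturbed = average_squared_error(z, offsets, perturbed)
```

In this periodic setting `v_direct` agrees with `v_min` up to rounding, and the perturbed weights give an average squared error at least as large, without any random function being invoked.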
This kriging algorithm having been established without any recourse to
probabilistic modeling or notation, it appears that the probabilistic approach is
but one language to express the actual estimation criterion. Such notions as stationarity
or ergodicity are important for the consistency of the language, but are
irrelevant to the real problem addressed.
CONCLUSIONS
The probabilistic language used by geostatisticians may induce them into
nonrewarding debates over questions of language and mask some real problems
that deserve more of their attention.
It has been shown that repetitiveness of the error of estimation can be ob-
tained by moving the configuration of estimation throughout a predefined area,
with the variogram model being identified to the corresponding spatial average
of squared differences. In probabilistic language, this moving process corresponds
to a double process of conditioning then compositing over the area.
This analysis did not provide any new estimation algorithm, but may give a
more physical appreciation of
REFERENCES
Breiman, L., 1969, Probability and stochastic processes: Houghton Mifflin Co., Boston,
324 p.
Doob, J. L., 1953, Stochastic processes: John Wiley & Sons, New York, 654 p.
Journel, A. G. and Huijbregts, Ch. J., 1978, Mining geostatistics: Academic Press, London,
600 p.
Matheron, G., 1964, La théorie des variables régionalisées et ses applications: Masson, Paris,
305 p.
Matheron, G., 1978, Estimer et choisir: Les Cahiers du CGMM, Fontainebleau, 175 p.
Omre, H., 1984, The variogram and its estimation: in "Geostatistics for Natural Resources
Characterization", ed. Verly et al., publ. Reidel, Dordrecht, Holland, Part I, pp. 107-125.