Representation of Geographic Data
The Nature of Spatial Variation / Areal Differentiation
• Three principles of the nature of spatial variation:
• proximity effects are key to understanding spatial variation
• issues of geographic scale and level of detail are key to
building appropriate representations of the world
• different measures of the world co-vary, and understanding
the nature of co-variation can help us to predict
Representation
• Representing spatial and temporal phenomena in the real world:
• since the real world is complex, this task is difficult and error-prone!
• small things (e.g., human lives) are very intricate in detail
• viewed in aggregate, human activity exhibits structure across
geographic space
• Deciding what data/information can be discarded as inessential
while retaining the salient characteristics of the observable world
• It is helpful to distinguish between controlled variation, which
oscillates around a steady state, and uncontrolled variation:
• controlled variation - e.g., utility management
• uncontrolled variation - e.g., climate change
• Informally, temporal autocorrelation is the similarity between
observations as a function of the time lag between them
• Our behavior in space often reflects past patterns of behavior - thus
temporal context is one-dimensional: we need only look to the past
• However, spatial events can potentially have consequences anywhere
in two-dimensional or even three-dimensional space
• How and why does spatial and temporal context affect what we do?
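The idea of similarity as a function of time lag can be made concrete with a small sketch. The function below computes a standard sample autocorrelation at a given lag; the function name and the readings are hypothetical, chosen only to mimic controlled variation that oscillates around a steady state.

```python
def lag_autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at a positive time lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    # Total variation around the mean (the denominator).
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    # Co-variation between each observation and the one `lag` steps later.
    numer = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numer / denom


# Hypothetical readings oscillating around a steady state of 10
# with a period of four time steps.
controlled = [10, 12, 10, 8, 10, 12, 10, 8, 10, 12, 10, 8]

for lag in (1, 2, 4):
    print(lag, lag_autocorrelation(controlled, lag))
```

With this series, observations one full period apart (lag 4) are positively related, while observations half a period apart (lag 2) are negatively related.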
Tobler's First Law of Geography
• Everything is related to everything else, but near things are
more related than distant things (Tobler, 1970)
Spatial Autocorrelation
• Autocorrelation is the similarity between observations as a function
of the time lag between them
• Spatial autocorrelation is similarity in the location of spatial objects
and their attributes, i.e., a manifestation of Tobler's Law!
• It is a measure of the degree to which a set of spatial features and their
associated data values tend to be clustered (positive spatial
autocorrelation) or dispersed (negative spatial autocorrelation)
• Understanding spatial variation, the scale of spatial variation, and the
way in which geographic phenomena co-vary tells us:
• how we should represent the real world in our digital GIS
• Spatial autocorrelation is determined both by similarities in position,
and by similarities in attributes:
• positive, zero, or negative
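One common statistic for measuring this is Moran's I, which these slides do not name; it is used here only as an illustration. The sketch assumes a regular grid and "rook" adjacency (cells sharing an edge count as neighbors). Both helper functions and the two test patterns are hypothetical.

```python
def rook_weights(nrows, ncols):
    """Binary weights matrix: 1 where two grid cells share an edge."""
    n = nrows * ncols
    w = [[0] * n for _ in range(n)]
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < nrows and 0 <= cc < ncols:
                    w[i][rr * ncols + cc] = 1
    return w


def morans_i(values, weights):
    """Moran's I: weighted cross-products of deviations from the mean,
    scaled by the total weight and the overall variance."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_sum = sum(sum(row) for row in weights)
    cross = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    ss = sum(d * d for d in dev)
    return (n / w_sum) * (cross / ss)


w = rook_weights(4, 4)

# Similar values sit next to each other: positive autocorrelation.
clustered = [1, 1, 0, 0,
             1, 1, 0, 0,
             1, 1, 0, 0,
             1, 1, 0, 0]

# Every neighbor is different: negative autocorrelation.
checkerboard = [1, 0, 1, 0,
                0, 1, 0, 1,
                1, 0, 1, 0,
                0, 1, 0, 1]

print("clustered:   ", morans_i(clustered, w))
print("checkerboard:", morans_i(checkerboard, w))
```

The same attribute values give a different result when they are rearranged in space, which is why both position and attributes enter the calculation.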
• Distance-based Spatial Autocorrelation
• (A) linear distance decay
• (B) negative power distance decay
• (C) negative exponential distance decay
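The three decay forms (A) to (C) can be sketched as weight functions of distance d. The functional forms below are the usual textbook ones (a - b*d, d^-b, and exp(-b*d)); the parameter values are arbitrary assumptions, and clamping the linear form at zero is a choice made here so that weights never go negative.

```python
import math


def linear_decay(d, a=1.0, b=0.1):
    """(A) Weight falls by a constant amount per unit distance."""
    return max(0.0, a - b * d)  # clamped at zero (an assumption)


def negative_power_decay(d, b=2.0):
    """(B) Weight falls as distance raised to a negative power."""
    if d <= 0:
        raise ValueError("distance must be positive for a power law")
    return d ** -b


def negative_exponential_decay(d, b=0.5):
    """(C) Weight falls by a constant proportion per unit distance."""
    return math.exp(-b * d)


for d in (1.0, 2.0, 5.0, 10.0):
    print(
        d,
        round(linear_decay(d), 3),
        round(negative_power_decay(d), 3),
        round(negative_exponential_decay(d), 3),
    )
```

In all three cases near things receive more weight than distant things; the forms differ only in how quickly the weight falls off.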
Representation
• All representations:
• are needed to convey information
• fit information into a standard form or model
• almost always simplify the truth that is being represented
• Digital representation:
• digital & binary (1s and 0s)
• the basis of almost all modern human communication
The Fundamental Problem
• Geographic data are built up from atomic elements, or facts
about the geographic world
• At its most primitive, an atom of geographic data (strictly, a
datum) links a place, often a time, and some descriptive
property
• The fundamental problem: “the world is infinitely complex, but
computer systems are finite”
• Discrete Objects - the world is empty, except where it is
occupied by objects with well-defined boundaries that are
instances of generally recognized categories:
• objects can be counted
• objects have dimensionality:
• 0-dimension - points
• 1-dimension - lines
• 2-dimensions - areas
• Continuous Field - a finite number of variables, each one
defined at every possible position:
• omnipresent, everywhere dense
• can be distinguished by what varies, and how smoothly
• In this perspective, value (A) is a function of location (X):
• A = f (X)
• Contrast with the discrete object view - define the location of the
boundary of objects, or X = f (A)
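The two views can be contrasted in a short sketch. A field is modeled as a function that returns a value for any location, A = f(X); discrete objects are modeled as a countable list of things, each with a category, a dimensionality, and coordinates. The elevation formula, class name, and sample objects are all hypothetical.

```python
from dataclasses import dataclass


def elevation(x, y):
    """Hypothetical continuous field: defined at every (x, y)."""
    return 100.0 + 2.0 * x - 0.5 * y


@dataclass
class DiscreteObject:
    category: str      # generally recognized category, e.g. "lake"
    dimension: int     # 0 = point, 1 = line, 2 = area
    coordinates: list  # (x, y) pairs locating the object or its boundary


objects = [
    DiscreteObject("well", 0, [(2.0, 3.0)]),
    DiscreteObject("road", 1, [(0.0, 0.0), (5.0, 5.0)]),
    DiscreteObject("lake", 2, [(1.0, 1.0), (4.0, 1.0), (4.0, 3.0), (1.0, 3.0)]),
]

# Field view: any location can be queried for a value.
print(elevation(2.0, 3.0))

# Object view: the world is empty except where objects are, and
# the objects can be counted.
print(len(objects))
print(sum(1 for o in objects if o.dimension == 2))
```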
• Representation of Geographic Data Jake K.
Rasters vs Vectors
• There are two methods that are used to reduce geographic
phenomena to forms that can be coded in computer databases
• Each can be used to represent both fields and discrete objects:
• usually raster is used to represent fields and vector for discrete
objects
• “Raster is faster, but vector is correcter”
Raster
• In a raster representation geographic space is divided into an
array of cells, each of which is usually square, but sometimes
rectangular:
• all geographic variation is then expressed by assigning
properties or attributes to these cells
• cells are called pixels (short for picture elements)
• In the raster data model, each grid cell holds one value
that represents a single phenomenon
• Raster accuracy is limited by the resolution of the cell
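A minimal sketch of the raster idea, assuming square cells, an origin at the lower-left corner, and one value per cell sampled at the cell center. The field function and both helper names are hypothetical. Comparing two cell sizes shows how resolution limits accuracy.

```python
def rasterize(field, width, height, cell_size):
    """Sample `field` at each cell center; returns rows of cell values.
    Row 0 holds the cells with the smallest y (an assumption)."""
    ncols = int(width / cell_size)
    nrows = int(height / cell_size)
    return [
        [field((c + 0.5) * cell_size, (r + 0.5) * cell_size)
         for c in range(ncols)]
        for r in range(nrows)
    ]


def cell_value(grid, cell_size, x, y):
    """Look up the single value stored for the cell containing (x, y)."""
    return grid[int(y // cell_size)][int(x // cell_size)]


def field(x, y):
    """Hypothetical phenomenon that varies smoothly across space."""
    return x + y


coarse = rasterize(field, 4.0, 4.0, 2.0)  # 2 x 2 cells
fine = rasterize(field, 4.0, 4.0, 1.0)    # 4 x 4 cells

x, y = 0.2, 0.2
true_value = field(x, y)
print("coarse error:", abs(cell_value(coarse, 2.0, x, y) - true_value))
print("fine error:  ", abs(cell_value(fine, 1.0, x, y) - true_value))
```

Every location inside a cell reports the same value, so the smaller the cell, the closer the stored value can be to the true one, at the cost of storing more cells.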
Vector
• In a vector representation, features are captured as a series of
points or vertices connected by straight lines:
• areas are often called polygons
• lines are often called polylines
• In the vector data model, discrete features can have many
different attributes representing numerous phenomena
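A minimal sketch of the vector idea: a feature is a list of vertices joined by straight lines, plus any number of attributes. The dictionary layout, feature names, and attribute values are hypothetical; the area calculation uses the standard shoelace formula.

```python
import math


def polyline_length(vertices):
    """Total length of the straight segments joining successive vertices."""
    return sum(math.dist(p, q) for p, q in zip(vertices, vertices[1:]))


def polygon_area(vertices):
    """Area by the shoelace formula; the ring is closed implicitly."""
    n = len(vertices)
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


# One discrete feature can carry many attributes.
parcel = {
    "geometry": [(0, 0), (4, 0), (4, 3), (0, 3)],  # polygon
    "attributes": {"land_use": "residential", "owner_count": 2},
}
road = {
    "geometry": [(0, 0), (3, 4), (6, 4)],          # polyline
    "attributes": {"name": "Hypothetical Rd", "lanes": 2},
}

print(polygon_area(parcel["geometry"]))
print(polyline_length(road["geometry"]))
```

Unlike a raster cell, the geometry here is stored at whatever precision the coordinates allow, and the attribute list can grow without changing the geometry.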
Generalization
• Simplifying the view of the world:
• describe entire areas, attributing uniform characteristics to
them, even when areas are not strictly uniform
• identify features on the ground and describe their
characteristics, again assuming them to be uniform
• some degree of generalization is almost inevitable in all
geographic data
• A geographic database cannot contain a perfect description;
instead, its contents must be carefully selected to fit within the
limited capacity of computer storage devices!
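One simple form of generalization can be sketched as block aggregation: each block of cells is replaced by a single averaged value, so the area is described as uniform even though it is not. This is only one of many possible generalization methods; the function name and grid values are hypothetical.

```python
def aggregate(grid, factor):
    """Replace each factor x factor block of cells by its mean value."""
    nrows, ncols = len(grid), len(grid[0])
    if nrows % factor or ncols % factor:
        raise ValueError("grid dimensions must be multiples of factor")
    out = []
    for r in range(0, nrows, factor):
        row = []
        for c in range(0, ncols, factor):
            block = [
                grid[r + i][c + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


fine = [
    [1, 3, 10, 10],
    [1, 3, 10, 10],
    [5, 5, 0, 8],
    [5, 5, 8, 0],
]
coarse = aggregate(fine, 2)

print(coarse)
print("values stored:", sum(len(r) for r in fine), "->",
      sum(len(r) for r in coarse))
```

The coarse grid needs a quarter of the storage, but the lower-right block now reports one uniform value for an area whose cells actually ranged from 0 to 8.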