EM4 Lesson 3
GEOGRAPHIC
INFORMATION
SYSTEM
9/4/2025
MA. ANELI A. AUGUIS, M.Sc.
GEOSPATIAL DATA TYPES AND
FORMATS
Geospatial data can be represented in two primary forms:
vector and raster.
Understanding these data types is fundamental to using GIS effectively, as
each type has its strengths and is suited for different kinds of analysis and
representation.
3. Data
• refers to spatially referenced datasets, typically
composed of two types - geometric data and
attribute data
• There are also two kinds of data structures -
vector and raster
• Geometric data represents the spatial
component of a geographic feature (e.g.,
shape and position)
• It consists of two- or three-dimensional coordinates
that define the spatial distribution of
points, lines, and areas
• Attribute data describes the properties of a
feature (e.g., color, size, information)
• It is descriptive information about the
spatial data
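A minimal sketch of how geometric and attribute data sit together in one feature. The dictionary layout, field names, and coordinate values below are illustrative assumptions, not a format prescribed by this lesson.

```python
# Hypothetical feature records: keys and values are illustrative only.
tree = {
    # Geometric data: shape and position of the feature.
    "geometry": {"type": "Point", "coordinates": (124.2452, 8.2280)},
    # Attribute data: descriptive properties of the same feature.
    "attributes": {"species": "Narra", "height_m": 12.5, "health": "good"},
}
trail = {
    "geometry": {
        "type": "LineString",
        "coordinates": [(124.24, 8.22), (124.25, 8.23), (124.26, 8.23)],
    },
    "attributes": {"name": "Ridge trail", "surface": "dirt"},
}


def describe(feature):
    """Return a one-line summary combining geometry type and attributes."""
    geom = feature["geometry"]
    attrs = ", ".join(f"{k}={v}" for k, v in feature["attributes"].items())
    return f"{geom['type']} with attributes: {attrs}"


print(describe(tree))
print(describe(trail))
```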
Vector graphics are composed of vertices and paths.
Vector Data
surface.
• Each vector feature can have
associated attribute data that provides
Vector Data
Each point can have attributes like species type, height, and health status.
Lines: Line (or arc) data is used to represent linear features. Because a
line has only one dimension, it can only be used to measure length
from the starting point to the ending point.
Examples: roads, rivers, or trails within a forest. Lines can represent
pathways, water flow, or even migration routes of animals.
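Measuring the length of a line feature can be sketched as summing the distances between consecutive vertices. This assumes projected coordinates in meters; the trail vertices are hypothetical.

```python
import math


def polyline_length(vertices):
    """Sum the straight-line distance between consecutive vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


# Hypothetical trail in projected coordinates (meters).
trail = [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)]
print(polyline_length(trail))  # two segments: 5.0 + 6.0
```

With geographic coordinates (degrees), planar distance like this would not give meters; the data would need to be projected first.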
usually regularly spaced and
square
class
Raster Data
Raster data is represented as a grid
of cells or pixels, each with a
specific value that represents
information such as temperature,
elevation, or land cover type.
Types of Rasters
• Continuous rasters are grid cells with gradually changing data
(e.g. digital elevation model or DEM, temperature)
• Discrete rasters have distinct themes or categories (e.g. land
cover)
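The difference between the two raster types can be illustrated with two tiny grids. All values and the legend are made up for illustration.

```python
# Continuous raster: hypothetical elevation values in meters (a tiny DEM).
elevation = [
    [102.4, 103.1, 104.0],
    [101.8, 102.6, 103.5],
    [101.0, 101.9, 102.7],
]

# Discrete raster: each cell stores a category code (hypothetical legend).
LEGEND = {1: "forest", 2: "water", 3: "built-up"}
land_cover = [
    [1, 1, 3],
    [1, 2, 3],
    [2, 2, 3],
]

# Averaging is meaningful for continuous values...
cells = [value for row in elevation for value in row]
mean_elevation = sum(cells) / len(cells)

# ...while discrete categories are summarized by counting cells per class.
counts = {}
for row in land_cover:
    for code in row:
        counts[LEGEND[code]] = counts.get(LEGEND[code], 0) + 1

print(round(mean_elevation, 2))
print(counts)
```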
Satellite Imagery: Images from satellites are stored
as raster data, with each pixel representing a specific
area on the ground and containing information like
vegetation cover, moisture content, or thermal
properties.
Digital Elevation Models (DEMs): These are raster
datasets that represent the elevation of the land
surface, useful for terrain analysis, watershed
Raster Data
What are the disadvantages of using
vector?
• Continuous data is poorly stored and
displayed as vectors
• Any feature edits require updates on
topology
• With a lot of features, vector
manipulation algorithms are
complex
What are the advantages of using
raster?
• raster grid format is the data model for
satellite data and other remote sensing
data
• raster positions are simple
• Each cell position can be inferred with
cell size and a bottom-left coordinate
• Data analysis is usually quick and easy to
perform
• Quantitative analysis is intuitive
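Inferring a cell position from the cell size and a bottom-left coordinate can be sketched as below. The function names are illustrative, and the sketch assumes square cells with row 0 at the bottom of the grid.

```python
def cell_center(col, row, x_min, y_min, cell_size):
    """Center of the cell at (col, row), with row 0 at the bottom of the grid."""
    x = x_min + (col + 0.5) * cell_size
    y = y_min + (row + 0.5) * cell_size
    return x, y


def cell_index(x, y, x_min, y_min, cell_size):
    """Column and row of the cell that contains the point (x, y)."""
    col = int((x - x_min) // cell_size)
    row = int((y - y_min) // cell_size)
    return col, row


# Hypothetical grid: bottom-left corner at (500000, 900000), 30 m cells.
print(cell_center(2, 1, x_min=500000.0, y_min=900000.0, cell_size=30.0))
print(cell_index(500075.0, 900045.0, 500000.0, 900000.0, 30.0))
```

No coordinates need to be stored per cell; two numbers for the origin and one for the cell size are enough.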
What are the disadvantages of using
raster?
• cell size contributes to graphic quality; hence, it can have a
pixelated look and feel
• linear features and paths are difficult to display
• cannot create network datasets or perform topology rules
• As resolution increases, the cell size decreases and the number of
cells grows, which comes at a cost in processing speed and data storage
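The storage cost of finer resolution can be estimated with a rough sketch. It assumes a single band, 4 bytes per cell, and no compression; the 10 km by 10 km extent is hypothetical.

```python
def raster_cells(width_m, height_m, cell_size_m):
    """Number of cells needed to cover an extent at a given cell size."""
    cols = -(-width_m // cell_size_m)  # ceiling division
    rows = -(-height_m // cell_size_m)
    return int(cols * rows)


def uncompressed_bytes(cells, bytes_per_cell=4):
    """Storage for one band, assuming 4 bytes per cell (e.g., 32-bit floats)."""
    return cells * bytes_per_cell


# Same 10 km x 10 km extent at three cell sizes.
for size in (30, 10, 1):
    n = raster_cells(10_000, 10_000, size)
    print(size, "m cells:", n, "cells,", uncompressed_bytes(n) / 1e6, "MB")
```

Halving the cell size roughly quadruples the number of cells, so storage and processing time grow much faster than the resolution itself.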
Vector GIS Formats
Shapefile (.shp)
KML/KMZ (.kml/.kmz)
GDB (File Geodatabase)
Layer (LYR)
OpenStreetMap (OSM)
Vector GIS Formats
1. SHP (Shapefile)
• The most common geospatial file type in all commercial and
open-source GIS software.
• has become the industry standard
• Three mandatory files - SHP is the feature geometry, SHX is the
shape index position, and DBF is the attribute data
• Optional files - PRJ is the projection system, XML is the associated
metadata, SBN is the spatial index for optimizing queries, and SBX
helps with loading times
Vector GIS Formats
2. KMZ/KML (Keyhole Markup Language)
• is an XML-based format primarily used for Google Earth
• KML was developed by Keyhole Inc., which was acquired
by Google in 2004
• KMZ (KML-Zipped) is a compressed version of the file
• the longitude and latitude components (decimal degrees) are
defined by the World Geodetic System of 1984 (WGS84)
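A minimal sketch of a KML placemark and its zipped KMZ form, using only the Python standard library. The site name and coordinates are hypothetical, and naming the archive member doc.kml follows common convention rather than anything stated in this lesson.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name, lon, lat):
    """Build a one-placemark KML document as a string."""
    ET.register_namespace("", KML_NS)
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    pm = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    # KML writes coordinates as longitude,latitude (WGS84 decimal degrees).
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


def to_kmz(kml_text):
    """Zip the KML text into an in-memory KMZ archive."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as kmz:
        kmz.writestr("doc.kml", kml_text)
    return buf.getvalue()


kml_text = placemark_kml("Sample site", 124.2452, 8.2280)
kmz_bytes = to_kmz(kml_text)
print(kml_text)
```

Note the coordinate order: longitude comes first, which is the reverse of how latitude and longitude are usually spoken.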
Vector GIS Formats
3. GDB (File Geodatabase)
• ESRI created it as a container for storing multiple attribute
tables, vector datasets, and raster datasets
• offers structural and performance advantages
• each dataset can hold up to 1 TB by default
• Within a geodatabase, vector datasets (the counterpart of
shapefiles) are referred to as feature classes
Vector GIS Formats
4. MDB (Personal Geodatabase)
• manages multiple attribute tables, vectors and raster datasets
• is a Microsoft Access-based personal GDB
• has a 2 GB storage limit
Vector GIS Formats
5. LYR (Layer)
• is used to store the symbology for displaying data in a map
• does not contain the geographic data itself; it simply specifies
how the data will be displayed
• can represent polygons, polylines, points, or raster datasets
Vector GIS Formats
6. OSM (OpenStreetMap)
• is the largest crowdsourced GIS data project on Earth
• OSM is an XML-based file format; the smaller, more efficient
PBF (Protocolbuffer Binary Format) is an alternative to
the XML-based format
• QGIS can load native OSM files
Raster GIS Formats
ESRI Grid
GeoTIFF (Geographic Tagged Image File Format)
Raster GIS Formats
2. GeoTIFF (Geographic Tagged Image File Format)
• has become an industry standard image format for GIS and
satellite remote sensing applications
• can be accompanied by other files:
TFW is the world file that gives the raster its geolocation when the
TIFF does not embed georeferencing tags,
XML is the metadata,
AUX stores projections and other information
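How a world file gives a raster its geolocation can be sketched as follows. A world file holds six numbers, one per line, conventionally labeled A, D, B, E, C, F: pixel width, two rotation terms, pixel height (negative for north-up images), and the map coordinates of the center of the upper-left pixel. The sample values and function names are hypothetical.

```python
def parse_world_file(text):
    """Read the six world-file coefficients, one per line, in file order."""
    a, d, b, e, c, f = (float(line) for line in text.split())
    return a, d, b, e, c, f


def pixel_to_map(col, row, coeffs):
    """Map coordinates of the center of pixel (col, row), row 0 at the top."""
    a, d, b, e, c, f = coeffs
    x = a * col + b * row + c
    y = d * col + e * row + f
    return x, y


# Hypothetical .tfw content for a north-up image with 30 m pixels.
TFW = """30.0
0.0
0.0
-30.0
500015.0
900985.0
"""
coeffs = parse_world_file(TFW)
print(pixel_to_map(0, 0, coeffs))
print(pixel_to_map(2, 1, coeffs))
```

A world file carries no coordinate system information, which is why the projection is kept in a separate file.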
Raster GIS Formats
3. JPEG 2000 (Joint Photographic Experts Group)
• files typically have a JP2 extension and offer a choice of
lossy or lossless compression
• is a good choice for background imagery because lossy
compression keeps file sizes small
• can achieve a compression ratio of 20:1
• Lossless compression reduces file size while
retaining the original raster values exactly
(e.g., LZ77).
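Lossless compression can be demonstrated with Python's zlib module, whose DEFLATE method combines LZ77 with Huffman coding. The raster values below are hypothetical category codes; this is a sketch of the idea, not how any particular GIS format stores its data.

```python
import struct
import zlib

# Hypothetical land-cover raster flattened to a list of category codes.
values = [1] * 500 + [2] * 300 + [3] * 200

raw = struct.pack(f"{len(values)}B", *values)  # one byte per cell
packed = zlib.compress(raw, level=9)  # DEFLATE (LZ77 + Huffman coding)
restored = list(struct.unpack(f"{len(values)}B", zlib.decompress(packed)))

print(len(raw), "bytes before,", len(packed), "bytes after")
print("values identical after round trip:", restored == values)
```

Every cell value survives the round trip unchanged, which is what distinguishes lossless from lossy compression.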
Raster GIS Formats
4. ASCII (American Standard Code for
Information Interchange)
• encodes characters as numbers (0 to 127 in standard
ASCII, 0 to 255 in extended 8-bit variants) for
information storage and processing
• ASCII text files store GIS data in a
delimited format - this could be a
comma, space or tab delimited format
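Reading GIS data from a delimited text file can be sketched with the standard csv module. The column names and point records are hypothetical.

```python
import csv
import io

# Hypothetical comma-delimited point file; column names are illustrative.
TEXT = """id,longitude,latitude,species
1,124.2452,8.2280,Narra
2,124.2461,8.2275,Molave
"""


def read_points(text, delimiter=","):
    """Parse delimited text into a list of point records."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [
        {
            "id": int(r["id"]),
            "lon": float(r["longitude"]),
            "lat": float(r["latitude"]),
            "species": r["species"],
        }
        for r in reader
    ]


points = read_points(TEXT)
print(points[0])
```

The same function handles tab-delimited text by passing `delimiter="\t"`.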
Multi-Temporal GIS Formats
2. GRIB (GRIdded Binary, or General Regularly-distributed
Information in Binary form)
• commonly used in meteorology to store historical and
forecast weather data
• has advantages of self-description, flexibility, and expandability
• GRIB is standardized by the World Meteorological
Organization (WMO) and has been in use since 1985
Multi-Temporal GIS Formats
3. HDF (Hierarchical Data Format)
• designed by the National Center for Supercomputing
Applications (NCSA) to manage extremely large and complex
scientific data
• a versatile data model with no limit on the size of data objects in
the collection
• ArcGIS is capable of reading HDF4 and HDF5 data
Files Associated with a Shapefile
.shp (Shapefile Format)
• Mandatory
• Description: This file contains the actual geometry data of the
features (points, lines, or polygons). It defines the spatial
representation of the vector data, such as the vertices of polygons.
Without this file, there is no spatial data to display or analyze.
.shx (Shape Index Format)
• Mandatory
• Description: This is an index file that stores the offset of the feature
geometry in the .shp file. It allows the GIS software to quickly locate
and display the individual features from the .shp file, improving
performance when rendering maps or accessing specific records.
.dbf (Attribute Data File)
• Mandatory
• Description: The .dbf file stores attribute data in a tabular format,
with each row corresponding to a feature (e.g., a polygon) in the .shp
file. This file contains descriptive data (e.g., forest type, area, tree
species) and allows users to perform queries and analyses based on
attributes.
Files Associated with a Shapefile
.prj (Projection Format)
• Optional but Highly Recommended
• Description: This file contains information about the coordinate system and
map projection used by the shapefile. Having a .prj file ensures that the
shapefile aligns correctly with other spatial data sets and is essential for
accurate geographic analysis and map production.
.qix (Quadtree Spatial Index)
• Optional
• Description: This file is used to speed up spatial queries by creating a spatial
index. It is particularly useful for large shapefiles with numerous features, as
it optimizes spatial searches, making data display and query operations
more efficient.
.cpg (Code Page File)
• Optional
• Description: This file specifies the character encoding used by the .dbf file,
ensuring that special characters and non-English alphabets are displayed
correctly. It is particularly useful for international datasets.
Mandatory vs. Optional Files
Mandatory Files:
• .shp: Contains the geometry of the spatial features.
• .shx: Provides an index to the geometry.
• .dbf: Stores attribute data associated with each feature.
Optional Files:
• .prj: Highly recommended for coordinate system
information but not strictly necessary for
shapefile functionality.
• .qix, .sbn, .sbx: Useful for performance
optimization in large datasets but not essential
for basic shapefile use.
• .xml: Provides metadata, helpful for
documentation and data management.
• .cpg: Ensures correct character encoding for
attribute data, especially in multilingual
contexts.
Practical Considerations
File Management: When sharing or transferring shapefiles, it is important to
include all mandatory files (.shp, .shx, .dbf) and any optional files that are
relevant (.prj for projection information). Missing files can lead to errors or
incomplete data representation.
Projection Consistency: Always include a .prj file to maintain consistency across
different GIS projects. This helps avoid spatial misalignment when combining
data from various sources.
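Before sharing a shapefile, the presence of its companion files can be checked with a short script. This is a sketch using only the standard library; the function name and the temporary "forest" files are hypothetical, and the check is case-sensitive on file extensions.

```python
import tempfile
from pathlib import Path

MANDATORY = (".shp", ".shx", ".dbf")
RECOMMENDED = (".prj",)


def check_shapefile(shp_path):
    """Report which companion files are missing next to a .shp file."""
    path = Path(shp_path)
    missing = [ext for ext in MANDATORY if not path.with_suffix(ext).exists()]
    warnings = [ext for ext in RECOMMENDED if not path.with_suffix(ext).exists()]
    return {"missing": missing, "warnings": warnings}


# Demonstration with empty placeholder files in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    for ext in (".shp", ".dbf"):
        (Path(tmp) / f"forest{ext}").touch()
    report = check_shapefile(Path(tmp) / "forest.shp")

print(report)
```

Here the report flags the missing .shx as an error and the missing .prj as a warning, mirroring the mandatory and recommended categories above.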
Sources of Geospatial Data
Remotely Sensed Data
• Aerial Photography
• Satellite Imagery
• FAO Geonetwork is an open-source geospatial data platform
developed by the Food and Agriculture Organization (FAO) of the
United Nations. It provides access to a wide range of datasets related
to agriculture, forestry, fisheries, land use, climate, and natural
resources.
NASA’s Socioeconomic Data and Applications Center (SEDAC)
• One of NASA’s Distributed Active Archive Centers (DAACs), specializing in
integrating remote sensing data with socioeconomic and environmental
information. It is managed by the Center for International Earth Science
Information Network (CIESIN) at Columbia University.
• Hosts datasets such as the Gridded Population of the World (GPW)
• UNEP Environmental Data Explorer
• The United Nations Environment Programme (UNEP) Environmental Data
Explorer was the main database for environmental statistics, providing
global datasets on a wide range of environmental topics. However, it has
been discontinued, and UNEP has shifted its focus to newer platforms like
the UNEP Data and Maps portal and the Global Environment Outlook
(GEO) data platform.
• Held more than 500 variables covering themes such as freshwater, climate, and health.
• NASA Earth Observations (NEO) is a freely accessible
platform that provides global satellite imagery and
environmental data for scientific research, education, and
decision-making. It is managed by NASA’s Goddard Space
Flight Center (GSFC).
• 50+ global datasets, mostly climate-related
• Sentinel is a series of Earth observation satellites developed by the
European Space Agency (ESA) as part of the Copernicus program,
aimed at providing free and open access to global environmental
monitoring data. The Sentinel satellites carry a variety of instruments
to capture high-resolution imagery and data for diverse applications,
including land use, vegetation monitoring, oceanography, and
climate change.
• Sentinel-1 (Radar Imaging; Radio Detection and Ranging)
• Sentinel-2 (Optical Imaging)
• Sentinel-3 (Ocean and Land Monitoring)
• Sentinel-4 (Air Quality Monitoring)
• Sentinel-5 (Atmospheric Monitoring)
• Sentinel-6 (Global Ocean Monitoring)
SPATIAL REFERENCE
SYSTEMS AND MAP
PROJECTIONS