The file geodatabase
Presentation overview
Introduction
Comparisons and capabilities
Storage
  File system operations, data types, storage limits and requirements, configuration keywords
Compression
Performance vs. other formats
  Display, query, load, calculate
Performance tips
Migrating to the file geodatabase
Additional information
File geodatabase
How many of you have already seen or used them?
9.2 geodatabase options
Geodatabase options, in order of increasing size and/or functionality:
  Personal GDB
  File GDB
  ArcSDE GDB
    ArcSDE Personal (embeds ArcSDE & database engine)
    ArcSDE Workgroup (embeds ArcSDE & database engine)
    ArcSDE Enterprise (ArcSDE & RDBMS required)
Three geodatabase types
                       | Personal GDB                    | File GDB                        | ArcSDE GDB (3 levels)
Storage format         | MS Access                       | Folder of binary files          | RDBMS
Storage capacity       | 2 GB                            | No limits                       | Depends on server
Supported O/S platform | Windows                         | Any platform                    | Depends on RDBMS
Number of users        | Single editor, multiple readers | Single editor*, multiple readers | Multiple editors & readers
Versioning support     | None (check in / check out replication only) | None (check in / check out replication only) | Versioning, replication, archiving
Why create a new format?
Users asked for alternative to personal geodatabase
  Reduce storage requirements
  Eliminate personal gdb 2 GB limit
    Personal gdbs slow after ~ 500 MB
  Make available to non-Windows platforms
  Eliminate dependency on JET engine
Add ability to lock geodatabase data to ArcReader
Introducing the file geodatabase
New geodatabase format
  Stores a geodatabase in a folder of files
    Like a folder of shapefiles
    No access to individual datasets via file system
Alternative to Access-based personal gdbs
  High performance
  Reduced memory requirements
  Removes database size limits
  Works on additional operating systems (cross platform)
Similar to other geodatabases
Supports the full Geodatabase model
  Features, Annotation, Dimensions, Raster
  Networks, Topology
  Terrain, Geocoding, Representations
Work with file gdbs as you would personal gdbs
  Designated with a different extension (.gdb vs. .mdb)
  Single editor, no support for versioning
  New locking mechanism
Advantages over personal geodatabase
No storage size limit
Improved performance
Reduced storage requirements
Customize storage
  Compression of vector data
  Configuration keywords (similar to ArcSDE)
Additional raster data management functionality
More platforms supported
  Windows and UNIX (Solaris and Linux)
Migration from personal to file
Most users will migrate to take advantage of benefits
Personal geodatabases are not going away
  Only move if it helps
Three reasons some may not migrate
  Comfortable with personal gdbs and have small databases (< 500 MB)
  Some use Microsoft Access to perform operations
  Some store mature, historical, or archived data in personal gdbs
Editing file geodatabases
Like personal gdbs, single-user editing
  Does not support versioning
Personal gdbs (Access) lock at the database level
  mydata.mdb and mydata.ldb
Locking
File gdbs have a new locking model
  Not a database-wide lock
  More than one editor at a time, but on different data
Lock an entire feature dataset
  All feature classes in the feature dataset are locked
Lock a standalone feature class
Lock a standalone table
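The lock granularity above can be illustrated with a toy model. This is only a sketch of the rule "a feature dataset locks as a whole, standalone feature classes and tables lock individually"; it is not how ArcGIS implements locks, and every name in it (LockTable, start_editing, the dataset names) is hypothetical.

```python
# Toy model of file gdb lock granularity. Hypothetical names throughout;
# this is NOT the ArcGIS implementation.

class LockTable:
    def __init__(self, dataset_of):
        # dataset_of maps a feature class name to its feature dataset.
        # Standalone feature classes and tables are simply absent.
        self.dataset_of = dataset_of
        self.holders = {}  # lock unit -> editor currently holding it

    def lock_unit(self, name):
        # A feature class inside a feature dataset locks the whole dataset;
        # a standalone feature class or table locks only itself.
        return self.dataset_of.get(name, name)

    def start_editing(self, editor, name):
        unit = self.lock_unit(name)
        holder = self.holders.setdefault(unit, editor)
        return holder == editor  # False if someone else holds the lock


locks = LockTable({"Roads": "Transportation", "Rails": "Transportation"})
print(locks.start_editing("ann", "Roads"))    # True: ann locks Transportation
print(locks.start_editing("bob", "Rails"))    # False: same feature dataset
print(locks.start_editing("bob", "Parcels"))  # True: standalone feature class
```

Two editors can work at the same time as long as their data resolves to different lock units.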
Stores as a folder of files
Lock file
Other files may also be present, e.g. files that start with d if you are editing.
Contents deliberately vague to discourage use of the file system
Data file: consists of at least a .gdbtable and a .gdbtablx file
Attribute index and spatial index files: also present if there are indexes
Signature file
File system operations
Always use ArcGIS tools, not the file system
Possible folder operations (but discouraged)
  Copy the geodatabase to another location
  Rename the geodatabase
  Delete the geodatabase
  No one else should be connected
Individual file operations
  No operation is likely to be valid
  Possible data loss, or the data may be rendered unusable
  For example, if you move files to another file geodatabase, you won't be able to access the data
File geodatabase and permissions
In 9.2, there are no file gdb authentication / authorization capabilities
You should not set permissions on individual files
If you access a file geodatabase on a CD, the data is read-only
You can share a file geodatabase folder as read-only
  The read-only user will be able to query and display
  Users with write access can modify the data even when others are currently reading the data
Permissions example
Reader: looking at the Roads feature class
Writer: starts editing Roads
Writer: adds some new roads while the reader queries
Writer: saves their work
Reader: does not see the new roads
Reader: does a refresh and then sees the new roads
Storage limits
No database size limit
Per-table limit: 1 TB (default)
Per-table limit can be raised to 256 TB
  Available as a configuration keyword
  Provided for large rasters
The same data in a file geodatabase takes up less disk space than in personal gdbs or shapefiles
  Amount of reduction varies by dataset
  Storage on disk is generally reduced by 50 to 75%
Storage comparisons
Dataset                   | Shapefile | Personal gdb       | File gdb
US rivers and streams     | 2.19 GB   | Exceeds 2 GB limit | 878 MB
California roads          | 1.23 GB   | 684 MB             | 329 MB
US census block centroids | 838 MB    | 1.8 GB             | 705 MB
US traffic analysis zones | 249 MB    | 295 MB             | 68 MB
US counties               | 3.2 MB    | 3.2 MB             | 1.6 MB (50%)
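To relate these figures to the "50 to 75%" claim, the reduction for each dataset can be worked out directly. The sketch below uses the shapefile and file gdb sizes from this slide; the conversion 1 GB = 1024 MB is an assumption, since the slide does not say which convention it uses.

```python
# Sizes in MB, taken from the storage comparison on this slide.
# Assumption: 1 GB = 1024 MB (the slide does not state the convention).
shapefile_mb = {
    "US rivers and streams": 2.19 * 1024,
    "California roads": 1.23 * 1024,
    "US census block centroids": 838,
    "US traffic analysis zones": 249,
    "US counties": 3.2,
}
file_gdb_mb = {
    "US rivers and streams": 878,
    "California roads": 329,
    "US census block centroids": 705,
    "US traffic analysis zones": 68,
    "US counties": 1.6,
}


def reduction_percent(before_mb, after_mb):
    """Percentage of disk space saved when going from before to after."""
    return 100.0 * (before_mb - after_mb) / before_mb


for name, before in shapefile_mb.items():
    pct = reduction_percent(before, file_gdb_mb[name])
    print(f"{name}: {pct:.0f}% smaller than the shapefile")
```

The point-heavy census block centroids save the least, which is consistent with the note that the amount of reduction varies by dataset.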
Raster data
Unmanaged rasters are stored the same way as in a pgdb
[Diagram: under C:\Student, Riley.mdb and Riley.gdb each contain a raster catalog, Riley_catalog, with rows 1 to 3 referencing F:\Images\R01.sid, F:\Images\R02.tiff, and F:\Images\R03.img. A Riley.idb folder is also shown. The image files R01.sid, R02.tiff, and R03.img reside in F:\Images.]
Managed rasters in a pgdb
Stored as ERDAS Imagine files
IDB folder
  One subfolder per raster
Not really inside the .mdb file
  However, it works as if it were
  ArcCatalog copy, delete, or move
[Diagram: the Student folder contains Manhattan.mdb, holding the rasters MillerRanch and MillerDRG, and Manhattan.idb, with subfolders c1 and c2 that each contain m_1.img.]
Managed rasters in a fgdb
New fgdb
  Empty: no GIS data
  But lots of files
[Screenshot: contents of New File Geodatabase.gdb]
Add a raster to the empty fgdb
  Stored in the gdb folder
  Really inside the gdb folder
  Hard to tell which files are raster
[Screenshot: contents of New File Geodatabase.gdb with raster erDRG]
Configuration keywords
Predetermined keywords stored within the geodatabase; cannot be customized
Compared to ArcSDE: very few options
  None for specific datasets
Vast majority of users should use DEFAULTS
DEFAULTS
  1 TB per table
  UTF8 text attribute storage, optimal for Latin alphabets
TEXT_UTF16
  Use when there is a lot of text in a non-Latin alphabet
MAX_FILE_SIZE_4GB, MAX_FILE_SIZE_256TB
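Why UTF8 suits Latin text and TEXT_UTF16 suits non-Latin text comes down to bytes per character in each encoding. The sketch below only counts encoded bytes with Python's standard codecs; it says nothing about how a file geodatabase lays text out on disk, and encoded_sizes is a hypothetical helper name.

```python
def encoded_sizes(text):
    """Return (UTF-8 byte count, UTF-16 byte count) for text."""
    # utf-16-le is used so the 2-byte byte-order mark is not counted.
    return len(text.encode("utf-8")), len(text.encode("utf-16-le"))


print(encoded_sizes("Redlands"))  # (8, 16): Latin text is smaller in UTF-8
print(encoded_sizes("東京都"))      # (9, 6): this Japanese text is smaller in UTF-16
```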
Compression
Compress
  Entire gdb
  Feature dataset
  Standalone feature class
  Table
  Vector data (raster is usually maximally compressed already)
Advantage: further reduces storage requirements
Lossless compression
Based on Smart Data Compression (SDC)
Direct access format
  No uncompressing required
Compression tools
Compress / Uncompress tools
  Right-click context menu commands
  Geoprocessing tools: Data Management toolbox > File Geodatabase toolset
Compression ratios
Feature class compression varies
  From a minimal amount to ratios exceeding 4:1
  Key factor: average number of vertices per feature
  Attribute fields: text, integers, and dates compress better than floats and doubles
For tables, redundancy is the most important factor
  Up to ratios exceeding 4:1
  The more redundancy, the greater the compression
Finds and removes redundancy
  Repeating values, like run-length encoding
  Stores the value once with a count of how many times it occurs
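The "store the value once and a count" idea can be shown with a plain run-length encoder. This is a generic illustration of run-length encoding only; the actual SDC-based compression is not described on the slide, and the column values below are made up.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, count) pairs back into the original list (lossless)."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical attribute column with repeating values.
land_use = ["RES", "RES", "RES", "COM", "COM", "RES"]
encoded = run_length_encode(land_use)
print(encoded)  # [('RES', 3), ('COM', 2), ('RES', 1)]
assert run_length_decode(encoded) == land_use
```

The longer the runs of identical values, the fewer pairs are needed, which matches the note that more redundancy gives greater compression.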
Compression comparison
Dataset                   | Uncompressed | Compressed | Ratio
US census block centroids | 705 MB       | 162 MB     | 4.4
California roads          | 329 MB       | 83 MB      | 3.9
Calgary buildings         | 48 MB        | 20 MB      | 2.4
US rivers and streams     | 878 MB       | 442 MB     | 2.0
Mexico roads              | 3.5 MB       | 2.7 MB     | 1.3
Fewer vertices per feature = more compression
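The ratio column is simply uncompressed size divided by compressed size. As a quick check, the sketch below recomputes it from the sizes on this slide; the recomputed values agree with the listed ratios to within about 0.1, the difference presumably coming from rounding of the sizes shown.

```python
# (uncompressed MB, compressed MB, ratio as listed on the slide)
datasets = {
    "US census block centroids": (705, 162, 4.4),
    "California roads": (329, 83, 3.9),
    "Calgary buildings": (48, 20, 2.4),
    "US rivers and streams": (878, 442, 2.0),
    "Mexico roads": (3.5, 2.7, 1.3),
}


def compression_ratio(uncompressed_mb, compressed_mb):
    """Compression ratio expressed as N in N:1."""
    return uncompressed_mb / compressed_mb


for name, (before, after, listed) in datasets.items():
    ratio = compression_ratio(before, after)
    print(f"{name}: {ratio:.2f}:1 (slide lists {listed})")
```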
Compression implications on editing
Editing is not allowed on a compressed dataset
Mixed state: compressed and uncompressed feature classes in one feature dataset
  Compress a feature dataset
  Then make a new feature class
  The new feature class is uncompressed
  But you can't edit it
If a feature dataset or relationship class contains a compressed feature class, participating feature classes cannot be edited
Post-compression
Properties that cannot be modified after compress
  Coordinate system information, tolerance
  Subtypes, domains, default values
  Fields (add, delete, modify properties)
  Spatial index
  Representations
Properties that can be modified after compress
  Alias (for feature class / table name)
  Attribute indexes
  Metadata
More post-compression rules
Properties of a compressed feature dataset cannot be modified
  Coordinate system information
Cannot create a topology or geometric network from compressed feature classes
Cannot modify relationship class, topology, geometric network, or network dataset properties
Properties of a compressed geodatabase can be modified:
  Domains
Display and Query performance
Compared to shapefiles
  Generally comparable
  Shapefiles store geometries separately from attributes
    Sometimes faster for non-symbolized drawing
Compared to personal GDB
  Faster, both locally and over the network
  20% to > 10x faster is common
  Especially true for personal geodatabases over ~ 500 MB
Uncompressed vs. compressed
  Generally comparable
Load performance
Loading shapefiles into file geodatabases is faster than loading into any other type of geodatabase
  1.5 to 2x faster than loading into a personal geodatabase
  2 to 2.5x faster than loading into ArcSDE
Copy / Paste into a file gdb is also faster than into any other type of gdb
Performance: tips
Defragment the disk occasionally
Leave sufficient disk space
Spatial index grid sizes
  In rare cases may need adjustment
Compact the geodatabase
  On a regular basis if you frequently add / delete data
  After any large-scale change
XY resolution
  If the data is not as accurate as the default, set a larger resolution when you create the data
Migrating reason review
Reasons to migrate from personal gdbs
  No size limit
  Improved performance
  Reduced storage
  UNIX support
Reasons not to migrate
  Very small datasets only, no advantage to moving
  Require the ability to leverage Access
  Have mature data already in a pgdb
Most users will benefit from migrating
Standard data conversion tools
From a personal geodatabase
  Copy/Paste (for feature datasets, feature classes, and tables)
  Export to XML Workspace Document (for geodatabases)
  Existing GP conversion tools
From shapefiles, coverages, or other formats
  Right-click and Export
  Existing GP conversion tools
Models or scripts for moving many datasets
Creating new datasets
  Works the same as for personal geodatabases
SQL statement syntax differences
FGDB SQL is similar to shapefile and coverage SQL
FGDB SQL differs from personal geodatabase SQL
  Supports a subset of features and functions
  Syntax differs slightly
The dialogs you create SQL expressions with help you with the correct syntax
  Appropriate delimiters for fields and values
  Relevant keywords and operators
SQL statements for a personal gdb layer may not work after migration
  Definition queries, saved queries, label queries
  FGDB does not support some keywords, e.g. DISTINCT, GROUP BY, ORDER BY
Syntax differs from personal geodatabases
  Delimit fields with "field", not [field]
  Precede dates with date, not #
    PGDB syntax: [Birth] = #04-11-1963#
    FGDB syntax: "Birth" = date '1963-04-11'
  String searches are case sensitive; use UPPER and LOWER, not UCASE and LCASE
    PGDB syntax: [Name] = 'redlands'
    FGDB syntax: LOWER("Name") = 'redlands'
  Wildcards are _ and %, not ? and *
    PGDB syntax: [Name] Like '?edlands'
    FGDB syntax: "Name" Like '_edlands'
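For many saved queries, the mechanical parts of these rules can be scripted. The sketch below is a rough, regex-based rewrite and not an Esri tool: pgdb_to_fgdb is a hypothetical helper, it assumes fields are delimited with double quotes and literals with single quotes in the file gdb dialect, it assumes dates are written #MM-DD-YYYY# as in the slide example, and it ignores edge cases such as brackets inside string literals.

```python
import re


def pgdb_to_fgdb(where):
    """Rewrite common personal-gdb (Access) WHERE syntax for a file gdb."""
    # [Field] -> "Field"
    where = re.sub(r"\[([^\]]+)\]", r'"\1"', where)
    # #MM-DD-YYYY# -> date 'YYYY-MM-DD'
    where = re.sub(r"#(\d{2})-(\d{2})-(\d{4})#", r"date '\3-\1-\2'", where)
    # UCASE / LCASE -> UPPER / LOWER
    where = re.sub(r"\bUCASE\s*\(", "UPPER(", where, flags=re.IGNORECASE)
    where = re.sub(r"\bLCASE\s*\(", "LOWER(", where, flags=re.IGNORECASE)

    # Wildcards in the string literal that follows LIKE: ? -> _ and * -> %
    def fix_wildcards(match):
        pattern = match.group(2).replace("?", "_").replace("*", "%")
        return match.group(1) + pattern

    return re.sub(r"(\bLIKE\s+)('[^']*')", fix_wildcards, where,
                  flags=re.IGNORECASE)


print(pgdb_to_fgdb("[Birth] = #04-11-1963#"))  # "Birth" = date '1963-04-11'
print(pgdb_to_fgdb("[Name] Like '?edlands'"))  # "Name" Like '_edlands'
```

Case sensitivity is not handled automatically: a comparison such as [Name] = 'redlands' still has to be wrapped in LOWER or UPPER by hand if it relied on Access matching case-insensitively.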
Migrating ArcObjects applications
Update WorkspaceFactory to get the app working on a file gdb
  Change AccessWorkspaceFactory to FileGDBWorkspaceFactory
  Change extension from .mdb to .gdb
Update any SQL syntax
Use load-only mode to maximize data transfer performance
  Dim pFeatureClassLoad As IFeatureClassLoad
  Set pFeatureClassLoad = pFeatureClass
  pFeatureClassLoad.LoadOnlyMode = True
No other differences in ArcObjects
For more information
More useful online help topics
Use the Search tab to search for the topics
Try these:
  Types of geodatabases
  Migrating to the file geodatabase
  How raster data is stored in a geodatabase
  Configuration keywords for file geodatabases
  Setting spatial indexes
  About compressing file geodatabase data
  Compacting file and personal geodatabases