Jurnal 2 Risaldo
Abstract
Graph database models can be characterized as those where data
structures for the schema and instances are modeled as graphs or gen-
eralizations of them, and data manipulation is expressed by graph-
oriented operations and type constructors. These models flourished
in the eighties and early nineties in parallel to object oriented mod-
els and their influence gradually faded with the emergence of other
database models, particularly the geographical, spatial, semistructured
and XML.
Recently, the need to manage information with inherent graph-like
nature has brought back the relevance of the area. In fact, a whole
new wave of applications for graph databases emerged with the devel-
opment of huge networks (e.g. Web, geographical systems, transporta-
tion, telephones), and families of networks generated thanks to the
automation of the process of data gathering (e.g. social and biological
networks).
The main objective of this survey is to present in a single place
the work that has been done in the area of graph database modeling,
concentrating on data structures, query languages and integrity
constraints.
Contents
1 Introduction
1.1 Database Models Evolution – Brief Historical Overview
1.2 Graph Database Models – Brief Historical Overview
1.3 Scope and Organization of this Survey
1 Introduction
The term data model has been used in the information management com-
munity with different meanings and in diverse contexts. In its most general
sense, a data[base] model (db-model)¹ is a concept that describes a collection
of conceptual tools for representing real-world entities to be modeled and
the relationships among these entities [116]. Often this term denotes simply
a collection of data structure types, or even a mathematical framework to
represent knowledge [90].
From a database point of view, the conceptual tools defining a db-model
should address at least the structuring and description of the data, its main-
tainability and the form to retrieve or query the data. According to these
criteria, a db-model is defined as a combination of three components: first,
a collection of data structure types; second, a collection of operators or
inference rules; and third, a collection of general integrity rules [31]. Note that
several proposals of db-models define only the data structures, omitting
sometimes operators and/or integrity rules.
Due to the importance of modeling conceptually, philosophically and in
practice, db-models have become essential abstraction tools. Among the
purposes of a db-model are: a tool for specifying the kinds of data
permissible; a general design methodology for databases; coping with the
evolution of databases; development of families of high-level languages for
query and data manipulation; a focus for DBMS architecture; and a vehicle for
research into the behavioral properties of alternative organizations of data [31].
Since the emergence of database management systems, there has been an
ongoing debate about what the db-model for such a system should be. The
evolution and diversity of existing db-models show that there is no silver
bullet for data modeling. The parameters influencing their development are
manifold; among the most important are the characteristics or structure of
the domain to be modeled, the type of intellectual tools that appeal to the
user, and of course the hardware and software constraints imposed.
Additionally, each db-model proposal is grounded on certain theoretical
tools, and serves as a base for the development of related models.
Figure 1 sketches these influences.
¹ In the database literature the terms “data model” and “database model” (and
sometimes even “model”) usually denote the same concept. In the scope of this
survey we will consider them as synonyms and use the abbreviated expression
db-model.
[Figure 1 (diagram not reproduced). Timeline with a Year axis marked 1970, 1980, 1990 and 2000. Recoverable labels: Mathematical Logic, Hypertext, Knowledge Representation, Hierarchical, Relational, Network, Logic Programming, Semantic, Object Oriented Programming, Deductive, Graph, Object Oriented, Statistical Databases, WWW, Multidimensional, Semistructured, XML. Legend: Theoretical Basis, Data Model, Influence.]
1.1 Database Models Evolution – Brief Historical Overview
In the beginnings of the design of db-models, physical (hardware) constraints
were one of the fundamental parameters to be considered. Before the advent
of the relational model, most db-models focused essentially on the
specification of the structure of data in actual file systems. Kerschberg et al. [cite: 50130]
developed a taxonomy of db-models prior to 1976, comparing essentially
their mathematical structures and foundation, and the levels of abstraction
used.
Two representative db-models are the hierarchical [126] and network [122]
models, which emphasize the physical level, and offer the user the means to
navigate the database at the record level, thus providing low level operations
to derive more abstract structures.
The relational db-model was introduced by Codd [30, 32] and highlights
the concept of level of abstraction by introducing the idea of separation
between physical and logical levels. It is based on the notions of sets and
relations. Due to its simplicity of modeling, it gained wide popularity
among business applications.
Semantic db-models [100] allow database designers to represent objects
and their relations in a natural and clear manner to the user (as opposed
to previous models). They intended to provide the user with tools that
could capture faithfully the semantics of the information to be modeled. A
well-known example is the entity-relationship model [28].
Object oriented db-models [75] appeared in the eighties, when most of the
research was concerned with so-called “advanced systems for new types of
applications” [17]. These db-models are based on the object-oriented paradigm
and their goal is representing data as a collection of objects that are orga-
nized in classes and have complex values associated with them.
Semistructured db-models [23] are designed to model data with a flexi-
ble structure, e.g., documents and Web pages. Semistructured data (also
called unstructured data) is neither raw nor strictly typed as in conventional
database systems. Additionally, data is mixed with the schema, a feature
which allows extensible exchange of data. These db-models appeared in the
nineties and are currently in evolution.
The XML (eXtensible Markup Language) [21] model did not originate in
the database community. Although originally introduced as a standard to
exchange and model documents, it soon became a general-purpose model,
with a focus on information with tree-like structure. As in the semistructured
model, schema and data are mixed. See Section 2.3 for a more in-depth
comparison among these models.
Other Models and Frameworks. There are other important db-models
designed for particular applications, as well as modeling frameworks that do
not focus directly on database issues but indirectly concern graph database
modeling. Among the db-models are Spatial databases [43, 98, 109],
Geographical Information Systems (GIS) [112, 15], Temporal db-models [121,
29], and Multidimensional db-models [130]. Frameworks related to our topic but
not directly focused on database issues are Semantic Networks [119, 107, 56],
Conceptual Graphs [117, 118], and Knowledge Representation Systems:
G-Net Model [40], Topic Maps [101, 102, 1, 89], and Hypertext [33]. Due to the
size limitations of this survey they are not covered here.
[Figure (timeline diagram not reproduced). Recoverable entries: 1975 R&M; 1984 LDM; 1987 G-Base; 1988 O2; 1989 Tompa; 1991 GROOVY; 1996 GRAS; 1999 GOQL; 2002 GDM.]
influential graph-oriented object model, intended to be a theoretical basis
for a system in which manipulation as well as representation are transpar-
ently graph-based. Among the subsequent developments based on GOOD
are: GMOD [13] that proposes a number of concepts for graph-oriented
database user interfaces; Gram [11] which is an explicit graph db-model
for hypertext data; PaMaL [48] which extends GOOD with explicit rep-
resentation of tuples and sets; GOAL [69] that introduces the notion of
association nodes; G-Log [99] which facilitates working with end user inter-
faces; and GDM [68] that incorporates features from object-oriented, Entity-
Relationship and Semistructured models.
There were proposals that used generalizations of graphs for data
modeling purposes. Levene and Poulovassilis [84] introduced a db-model based
on nested graphs, called the Hypernode Model, on which subsequent work
was developed [104, 83]. The same idea was used for modeling multi-scaled
networks [86] and genome data [54]. GROOVY [85] is an object oriented
db-model which is formalized using hypergraphs. This generalization was
used in other contexts: query and visualization in the Hy+ system [34];
modeling of data instances and access to them [132]; representation of user
state and browsing [125].
There are several other proposals that deal with graph data models.
With a motivation coming from managing information in transport net-
works, Güting proposed the model GraphDB [58] intended for modeling
and querying graphs in object-oriented databases. Another general-purpose
project is Database Graph Views [57], which proposes an abstraction
mechanism to define and manipulate graphs stored in relational,
object-oriented or file systems. The project GRAS [73] uses attributed graphs
for modeling complex information from software engineering projects.
Finally, the well-known OEM [97] model aims at providing integrated access
to heterogeneous information sources, focusing on information exchange.
1.3 Scope and Organization of this Survey
The objective of this survey is to present in a single place the work that has
been done in the area of graph database modeling. We stress the fact that
the goal of the survey, rather than taking stock of the area, is to present
in a comprehensive way the different developments and relevant pointers to
help researchers find the sources. This obvious goal of a survey
is underscored in our case by the fact that the area has been overlooked and
has lately been quickly gaining relevance.
We concentrate on presenting the main aspects of modeling, that is, data
structures, query languages and integrity constraints. Considering that the
area does not yet have an identity of its own, we spend Section 2 surveying
different existing views on graph db-models, and the motivations and
applications that drive these developments. Section 3 surveys the most
relevant graph db-models and describes each of them in detail.
There is a substantial amount of work dealing with query languages and
graph interfaces. In fact, we think that transformation and query languages
for graphs are topics that deserve a thorough survey of their own. On the
other hand, not all the db-models treated in Section 3 consider the topic
of constraints, and several of them do not define a proper query language.
Section 4 gives a brief overview of these topics with the sole purpose of giving
a flavor of the area until such a survey appears.
We would like to warn the reader that there are several related areas that
fall out of the scope of this survey. Among the most important we can men-
tion graph visualization, graph data structures and algorithms for secondary
memory, graph methods for databases, and in general graph database sys-
tem implementation. Table 19 indicates which models covered in this survey
were implemented.
2 Graph Data Modeling
2.1 What is a Graph Data Model?
Although almost all papers on db-models cited in the previous section use
the term “graph data[base] model”, few of them define the notion explicitly.
Nevertheless their views on what a graph db-model is do not differ sub-
stantially. Usually the implicit definition is given by comparing the model
against other models where graphs are involved, like the semantic, object-
oriented and semi-structured models.
In what follows we will conceptualize the notion of graph db-model ac-
cording to the three basic components of a db-model, namely data struc-
tures, transformation language, and integrity constraints. A graph db-model
is characterized by:
On top of these descriptions, one could add the fact that sometimes the
schema and the data (instances) are difficult to differentiate in these
models, a fact that resembles closely semi-structured models. But in
most cases the schema and the instances are separated.
Type of      Abstraction        Base data           Main               Data complexity /
Model        level              structure           focus              homogeneity
Network      physical           pointers + records  records            simple/hom.
Relational   logical            relations           data/attributes    simple/hom.
Semantic     user               graphs              schema/relations   medium/hom.
Object-O     logical/physical   objects             object/methods     high/het.
Semistruct.  logical            tree                data/components    medium/het.
Graph        logical            graph               data/relations     medium/het.
can define some part of the database explicitly as a graph structure [58],
allowing encapsulation and context definition [84].
Second, queries can refer directly to this graph structure. Associated
with graphs are specific graph operations in the query language algebra,
such as finding shortest paths, determining certain subgraphs, and so forth.
Explicit graphs and graph operations allow a user to express a query at a
very high level. To some extent, this is in contrast to graph manipulation
in deductive databases, where often fairly complex rule programs need to
be written [58]. Last but not least, for purposes of browsing it may be
convenient to forget the schema [24].
Third, as far as implementation is concerned, graph databases may provide
special storage structures for the representation of graphs and the
most efficient graph algorithms available for realizing specific operations [58].
Although the data may have some structure, the structure is not as rigid,
regular or complete as in a traditional DBMS. Full knowledge of the structure
is not required to express meaningful queries [4]. The system
can use efficient graph algorithms designed to exploit the special graph data
structures [58].
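As a minimal sketch of what an explicit graph operation looks like, the following Python fragment stores a small directed graph as an adjacency dictionary and computes a shortest path by breadth-first search. The helper `shortest_path`, the node names and the child-to-parent edges are assumptions made for illustration; they are not taken from any of the surveyed systems.

```python
from collections import deque

# Hypothetical toy graph: each person points to his or her parents.
PARENT = {
    "David": ["James", "Julia"],
    "Mary": ["James", "Julia"],
    "Julia": ["George", "Ana"],
    "James": [],
    "George": [],
    "Ana": [],
}

def shortest_path(graph, source, target):
    """Return a shortest directed path from source to target, or None."""
    previous = {source: None}      # also serves as the visited set
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            # Walk the predecessor links back to the source.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in graph.get(node, []):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

print(shortest_path(PARENT, "David", "George"))
```

A query such as "how is David related to George" is then a single call, rather than a rule program or a data-dependent chain of joins.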
Physical db-models. They were the first ones to offer the possibility
of organizing large collections of data. Among the most important ones are
the hierarchical [126] and network [122] models. These models lack a good
level of abstraction and are very close to physical implementations. The data
structuring is not flexible and not apt for modeling non-traditional applications.
For our discussion they do not have much relevance.
The relational db-model [30, 32] was introduced by Codd to highlight the
concept of level of abstraction by introducing a clean separation between
physical and logical levels. Gradually the focus shifted to modeling data as
seen by applications and users [93]. This is the emphasis and the achievement
of the relational model, at a time when the application domains involved
basically simple data (banks, payments, commercial and administrative
applications).
The relational model was a landmark development because it provided
a mathematical basis to the discipline of data modeling. It is based on the
simple notion of relation, which together with its associated algebra and
logic, made the relational model a primary model for database research. In
particular, its standard query and transformation language, SQL, became a
paradigmatic language for querying.
The differences between graph db-models and the relational db-model
are manifold. Among the most relevant ones are the following. The relational
model was directed to simple record-type data with a structure known in
advance (airline reservations, accounting, inventories, etc.). The schema is
fixed and extensibility is a difficult task. Integration of different schemas
is neither easy nor automatable. The query language does not support paths,
neighborhoods and several other graph operations, like connectivity (an
exception is transitivity). There are no object identifiers, only values.
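To illustrate why path queries are awkward in a purely value-based, relational setting, here is a minimal Python sketch that treats a parent relation as a set of tuples and derives the ancestor relation by repeating a join until no new tuples appear. The relation contents and the name `transitive_closure` are assumptions for illustration. The point is that the number of joins depends on the data, so the query needs iteration or recursion rather than one fixed join expression.

```python
# Hypothetical parent relation as a set of (child, parent) tuples.
parent = {
    ("David", "Julia"), ("David", "James"),
    ("Mary", "Julia"), ("Mary", "James"),
    ("Julia", "George"), ("Julia", "Ana"),
}

def transitive_closure(relation):
    """Repeat the join of closure with relation until no new tuples appear."""
    closure = set(relation)
    while True:
        # Join closure(x, y) with relation(y, z) and project onto (x, z).
        derived = {
            (x, z)
            for (x, y) in closure
            for (y2, z) in relation
            if y == y2
        }
        if derived <= closure:     # fixpoint reached
            return closure
        closure |= derived

ancestor = transitive_closure(parent)
print(sorted(a for (d, a) in ancestor if d == "David"))
```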
Semantic db-models [100] have their origin in the necessity to provide
more expressiveness and incorporate a richer set of semantics into the database
from the user's point of view. They allow database designers to represent
objects and their relations in a natural and clear manner (similar to the way
the user views an application) by using high-level abstraction concepts such
as aggregation, classification and instantiation, sub- and super-classing,
attribute inheritance and hierarchies [93]. A well-known example is the
entity-relationship model [28]. It has become a basis for the early stages of
database design, but due to its lack of precision it cannot replace models like
the relational or object-oriented ones. Other examples of semantic db-models are
IFO [3] and SDM [63]. For graph db-models research, semantic db-models are
relevant because they are based on a graph-like structure which highlights the
relations between the entities to be modeled.
Object oriented (O-O) db-models [75] appeared in the eighties, when the
database community realized that the relational model was inadequate for
data-intensive domains (knowledge bases, engineering applications). O-O
databases were motivated by the emergence of non-conventional database
applications consisting of complex object systems with many semantically
interrelated components as in CAD/CAM, computer graphics or information
retrieval. According to the O-O programming paradigm on which they are
based, their objective is representing data as a collection of objects that are
organized in classes and have complex values and methods associated with
them. Although O-O db-models permit much richer structures than the
relational db-model, they still require that all data conform to a predefined
schema [4].
O-O db-models have been related to graph db-models due to the explicit
or implicit graph structure in their definitions [85, 13, 59]. Nevertheless,
there remain important differences rooted in the way that each of them
models the world. O-O db-models view the world as a set of complex objects
having a certain state (data) and interacting with one another through methods.
In contrast, graph db-models view the world as a network of relations,
emphasizing the interconnection of the data and the properties of these
relations. The emphasis of O-O db-models is on the dynamics of the objects,
their values and methods, whereas graph db-models emphasize the
interconnection while maintaining the structural and semantic complexity
of the data. A detailed comparison between these db-models can be found
in [17, 72, 93, 116].
Semistructured db-models [23, 2]. The need for semistructured data (also
called unstructured data) was motivated by the increased existence of
unstructured data, data exchange, and data browsing [23]. In semistructured
data the structure is irregular, implicit and partial; the schema does not
restrict the data, but only describes it, and is very large and rapidly
evolving; the information associated with a schema is contained within the
data (data contains data and its description, so it is self-describing) [2].
Among the most representative models are OEM [97], Lorel [4], UnQL [24],
ACeDB [120] and Strudel [44]. Generally, semistructured data is represented
by a tree-like structure. Nevertheless, cycles between data are possible,
establishing in this way a structural relation with graph db-models. Some
authors characterize semistructured data as rooted directed connected
graphs [24].
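The characterization as a rooted directed graph can be sketched in a few lines of Python. The encoding below (a dictionary from node to labeled outgoing edges) and the names `reachable` and `is_rooted` are assumptions for illustration, not the definition used by OEM or UnQL. Labels are stored with the data, which is what makes the structure self-describing, and the example deliberately contains a cycle.

```python
# Hypothetical edge-labeled graph: node -> list of (label, target) pairs.
data = {
    "root": [("person", "p1"), ("person", "p2")],
    "p1": [("name", "v1"), ("child", "p2")],
    "p2": [("name", "v2"), ("parent", "p1")],  # closes the cycle p1 -> p2 -> p1
    "v1": [],
    "v2": [],
}

def reachable(graph, root):
    """Return the set of nodes reachable from root by following edges."""
    seen, stack = {root}, [root]
    while stack:
        for _label, target in graph[stack.pop()]:
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen

def is_rooted(graph, root):
    """True when every node of the graph is reachable from root."""
    return reachable(graph, root) == set(graph)

print(is_rooted(data, "root"))
```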
2.4 Graph Data Model Motivations and Applications
Graph db-models are motivated by real-life applications where information
about the interconnectivity of its pieces is a salient feature. We will divide
these application areas into classical applications and complex networks.
Classical Applications. The applications that motivated the introduction
of the notion of graph databases were manifold:
2. Along the same lines, the observation that graphs have been an integral
part of the database design process in semantic and object-oriented
db-models brought the idea of introducing a model in which both
data manipulation and data representation were graph-based [59].
9. Lately, the emergence of on-line hypertext made evident the need for
other db-models [125, 132, 11]. Together with hypertext, the Web
created the need for a model more apt than classical ones for information
exchange. This was one of the main motivations for semistructured
models.
ways [41], a tutorial on Graph Data Management for Biology [96], and a
model for Chemistry [18].
It is important to stress that classical query languages offer little help
when dealing with the types of queries needed in the above areas. As examples,
data processing in GIS includes geometric operations (area or boundary,
intersection, inclusions, etc.), topological operations (connectedness, paths,
neighbors, etc.) and metric operations (distance between entities, diameter
of the network, etc.). In genetic regulatory networks, examples of measures
are connected components (interactions between proteins) and degrees of
nearest neighbors (strong pair correlations). In social networks, relevant
measures include distance, neighborhoods, clustering coefficient of a vertex,
clustering coefficient of a network, betweenness, size of giant connected
components, and size distribution of finite connected components [42].
Similar problems arise in the Semantic Web, where querying RDF data
increasingly needs graph features [14].
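Two of the measures just listed can be sketched directly over an adjacency mapping. The Python fragment below is illustrative only: the graph, the function names, and the convention of returning 0.0 for vertices with fewer than two neighbors are assumptions, and the clustering coefficient is taken in its usual local sense (the fraction of pairs of neighbors that are themselves adjacent).

```python
from itertools import combinations

# Hypothetical undirected network as a symmetric adjacency mapping.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
    "e": set(),
}

def clustering_coefficient(g, v):
    """Fraction of pairs of neighbors of v that are themselves adjacent."""
    neighbors = g[v]
    k = len(neighbors)
    if k < 2:
        return 0.0                 # assumed convention for degree 0 or 1
    links = sum(1 for x, y in combinations(neighbors, 2) if y in g[x])
    return links / (k * (k - 1) / 2)

def connected_components(g):
    """Return the connected components as a list of node sets."""
    seen, components = set(), []
    for start in g:
        if start in seen:
            continue
        component, stack = {start}, [start]
        while stack:
            for n in g[stack.pop()]:
                if n not in component:
                    component.add(n)
                    stack.append(n)
        seen |= component
        components.append(component)
    return components

print(clustering_coefficient(graph, "a"), len(connected_components(graph)))
```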
3 Representative Graph Database Models
In this section we describe in some detail the most representative graph
db-models, choosing those that define and use explicitly graph structures
or generalizations of them. Additionally, we describe other related models
that use graphs but do not fit properly as graph db-models. In them, graphs
are used, for example, for navigation, for defining views, or as language
representation.
For each proposal, we present its data structures and, when available,
its query language and integrity constraint rules. In general, there are few
implementations and no standard benchmarks, hence we avoid surveying this
issue. For information about the existence of implementations see Figure 19.
To give a flavor of the modeling in each proposal, we will use as a running
example the toy genealogy shown in Figure 3.
[Figure 4 (diagram not reproduced). Recoverable labels: schema nodes N, L, NL and PP (Person-Parent); instance tables I(N), I(L), I(NL) and I(PP), each with columns l and val(l).]
Figure 4: Logical Data Model. The schema (on the left) uses two basic type
nodes for representing data values (N and L), and two product type nodes
(NL and PP) to establish relations between data values in a relational style.
The instance (on the right) is a collection of tables, one for each node of the
schema. Note that internal nodes use pointers (names) to make reference to
basic and set data values defined by other nodes.
With the objective of avoiding cyclicity at the instance level, the model
proposes to keep a distinction between memory locations and their content.
Thus, instances consist of a set of l-values (the address space), plus an
r-value (the data space) assigned to each of them. These features make it
possible to model transitive relations like hierarchies and genealogies.
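The separation between l-values and r-values can be pictured with a small Python sketch. The table layout below follows the node names N, L, NL and PP used in the Figure 4 caption, but the concrete l-values, the stored data and the helper names are assumptions for illustration, not the formal LDM definition. Product nodes store l-values of other nodes, never nested contents, so a reference never needs to be expanded into its content.

```python
# Hypothetical LDM-style instance: one table per schema node, each table
# mapping an l-value (address) to its r-value (content).
instance = {
    "N": {1: "Julia", 2: "David"},       # basic node: names
    "L": {3: "Jones", 4: "Deville"},     # basic node: last names
    "NL": {5: (1, 3), 6: (2, 4)},        # product node: (l-value in N, l-value in L)
    "PP": {7: (6, 5)},                   # product node: (person, parent), l-values in NL
}

def deref_person(inst, l_value):
    """Follow the l-values stored in an NL tuple to the underlying data values."""
    n_ref, l_ref = inst["NL"][l_value]
    return inst["N"][n_ref], inst["L"][l_ref]

def parents_of(inst, person):
    """Return the dereferenced parents of the person stored at l-value person."""
    return [
        deref_person(inst, parent)
        for child, parent in inst["PP"].values()
        if child == person
    ]

print(parents_of(instance, 6))
```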
Over this structure a first order many-sorted language is defined. With
this language, a query language and integrity constraints are defined.
Finally, an algebraic language (equivalent to the logical language) is
proposed, providing operations for node and relation creation, transformation
and reduction of instances, and other operations like union, difference and
projection.
LDM is a complete db-model (i.e. data structures plus query languages
and integrity constraints). The model supports modeling of complex relations
(e.g. hierarchies, recursive relations). The notion of virtual records (pointers
to physical records) proves useful to avoid redundancy of data by allowing
cyclicity at the schema and instance level. Due to the fact that the model is a
generalization of other models (like the relational model), their techniques or
properties can be translated into the generalized model. A relevant example
is the definition of integrity constraints.
use compounded statements to produce HNQL programs.
The Hypernode is a complete db-model. It has a unique basic data structure
which is simple and extensible, allowing different levels of abstraction
(nesting levels) and modularity. It allows representation of flat, hierarchical,
composite, and cyclic objects, as well as functions (mappings) and relations
(records). It has a simple representation of multi-valued attributes, and a
simple and intuitive representation of nested and composition relations, in
the form of complex objects and sets of objects (objects represented as
hypernodes). The hypernode model can also be regarded as an object-oriented
db-model supporting object identity (unique labels), complex objects,
encapsulation (nesting of graphs), inheritance (structural), query completeness
and persistence.
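The idea of nesting can be sketched as follows, under the assumption that a hypernode is a uniquely labeled directed graph whose nodes are either primitive values or labels of other hypernodes. The `Hypernode` class, the `nesting_depth` helper and the sample data are hypothetical and only loosely follow the PERSON_n labels of the running example.

```python
from dataclasses import dataclass, field

@dataclass
class Hypernode:
    """A labeled directed graph whose nodes may be labels of other hypernodes."""
    label: str
    nodes: set = field(default_factory=set)
    edges: set = field(default_factory=set)   # pairs (source, target)

# Hypothetical database: unique label -> hypernode.
db = {
    "PERSON_1": Hypernode(
        "PERSON_1",
        {"name", "George", "lastname", "Jones"},
        {("name", "George"), ("lastname", "Jones")},
    ),
    "PERSON_3": Hypernode(
        "PERSON_3",
        {"name", "Julia", "lastname", "Jones", "parent", "PERSON_1"},
        {("name", "Julia"), ("lastname", "Jones"), ("parent", "PERSON_1")},
    ),
}

def nesting_depth(database, label, seen=frozenset()):
    """Nesting depth below a hypernode; labels already on the path are not re-entered."""
    if label in seen:
        return 0
    inner = [n for n in database[label].nodes if n in database]
    return 1 + max(
        (nesting_depth(database, n, seen | {label}) for n in inner),
        default=0,
    )

print(nesting_depth(db, "PERSON_3"))
```

Because nested hypernodes are referenced by label, cyclic objects can be stored without infinite expansion; the `seen` set only guards the traversal.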
On the less positive side, there are some issues that deserve mention.
Redundancy of data can be generated by its basic value labels. The
restrictions at the schema level are limited; for example, the specification of
restrictions for missing information or multivalued relations is not possible.
Nesting levels increase the complexity of processing. Hyperlog programs are
intractable in the general case.
[Figure 6 (diagram not reproduced). Recoverable labels: Abstraction Level 1 with nodes P1 to P6, edges name, lastname and parent, and values such as Ana, Julia, Jones, David, Deville, Mary and James; Abstraction Level 2 with nodes PERSON_1 to PERSON_6 linked by parent edges.]
Figure 6: Simatic-XT. Here schema and instance are mixed. The relations
Name-Lastname and Person-Parent are represented in two abstraction lev-
els. In the first level (the most general), the graph contains the relations
name and lastname to identify people (P1, ..., P6). In the second level we
use the abstraction of Person, to compress the attributes name and lastname
and represent only the relation parent between people.
A sequel paper [81] presents a set of graph operators divided into three
classes: basic operators, managing the notion of abstraction (the Develop
and Undevelop operators); elementary operators, managing the notion of
graph and sub-graph (Union, Concatenation, Selection and Difference); and
high-level operators (Paths, Inclusions and Intersections).
This proposal allows simple modeling and abstraction of complex objects
and paths, and encapsulation at node and edge levels. It improves the
representation and querying of paths between nodes, and the visualization
of complex nodes and paths. In its current state, it lacks a definition of
integrity constraints.
Figure 7: GGL. Schema and instances are mixed. Packaged graph vertices
(Person1, Person2, ...) are used to encapsulate information about the graph
defining a Person. Relations between these packages are established using
edges labeled with parent.
WEB [52], which defines queries as graphs with the same structure as the
model and returns the graphs in the database that match the query graph.
Finally, the model defines two database-independent integrity constraints:
Labels in a graph are uniquely named, and edges are composed of the labels
and vertices of the graph in which the edge occurs.
The model was designed to support the requirements of modeling genome
data, but it is also generic enough to support complex interconnected
structures. The distinction between schema and instance is blurred. Its nesting
levels increase the complexity of modeling and processing.
[Figure (diagram not reproduced). Recoverable labels: a schema with CHILD-PARENT, PERSON, NAME, LASTNAME and PARENTS; an instance with six PERSON objects numbered 1 to 6, carrying the NAME and LASTNAME values George Jones, Ana Stone, Julia Jones, James Deville, David Deville and Mary Deville, and PARENTS entries VAL(3) and VAL(4).]
is a set of value functional dependencies over N , and S is a set of subsets of
N (sub-object schemas) including N itself.
The previous structure is defined in terms of hypergraphs, establish-
ing a one-to-one correspondence between each object schema < N, F, S >
and a hypergraph interpreting N as nodes, F as directed hyperedges, and
S as undirected hyperedges. Note that at the instance level, objects over
object and class schemas can be represented as labeled hypergraphs.
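The correspondence between an object schema < N, F, S > and a hypergraph can be sketched in Python as follows. The attribute names, the sample dependency and the helpers `is_object_schema` and `satisfies` are assumptions for illustration; only the structural conditions (dependencies range over N, every element of S is a subset of N, and N itself belongs to S) come from the text above.

```python
# Hypothetical object schema <N, F, S> for a person, encoded as a hypergraph.
N = frozenset({"NAME", "LASTNAME", "PARENTS"})
# F: directed hyperedges (left-hand side, right-hand side), read as value
# functional dependencies over N.
F = {(frozenset({"NAME", "LASTNAME"}), frozenset({"PARENTS"}))}
# S: undirected hyperedges, i.e. sub-object schemas; N itself is included.
S = {N, frozenset({"NAME", "LASTNAME"})}

def is_object_schema(n, f, s):
    """Check the structural conditions stated in the text for <N, F, S>."""
    dependencies_ok = all(lhs <= n and rhs <= n for lhs, rhs in f)
    subobjects_ok = n in s and all(sub <= n for sub in s)
    return dependencies_ok and subobjects_ok

def satisfies(objects, lhs, rhs):
    """True when objects that agree on the lhs attributes also agree on rhs."""
    seen = {}
    for obj in objects:
        key = tuple(obj[a] for a in sorted(lhs))
        value = tuple(obj[a] for a in sorted(rhs))
        if seen.setdefault(key, value) != value:
            return False
    return True

people = [
    {"NAME": "David", "LASTNAME": "Deville", "PARENTS": ("James", "Julia")},
    {"NAME": "Mary", "LASTNAME": "Deville", "PARENTS": ("James", "Julia")},
]
print(is_object_schema(N, F, S), all(satisfies(people, l, r) for l, r in F))
```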
In addition, class schemas are defined to introduce the notions of class
and inheritance. A class schema corresponding to an object schema
< N, F, S > is a hypergraph < N, F, H >, where the H component indicates
all the super-class schemas of the class schema. A class over a class
schema is just an instance of an object schema.
A hypergraph manipulation language (HML) for querying and updating
hypergraphs is presented. It has two operators for querying hypergraphs by
identifier or by value, and eight operators for manipulation (insertion and
deletion) of hypergraphs and hyperedges.
The use of hypergraphs has several advantages. It introduces a single
formalism for both sub-object sharing and structural inheritance, avoiding
redundancy of data (values of common sub-objects are shared by their
super-objects). Hypergraphs allow the definition of complex objects (using
undirected hyperedges) and functional dependencies (using directed
hyperedges). They also support object-ID and (multiple) structural inheritance.
Value functional dependencies establish semantic integrity constraints for
object schemas.
The notion of hypergraphs is also used in other proposals:
[Figure (diagram not reproduced). Recoverable labels: schema and instance graphs with nodes Pe, N, L and CP; edge labels n (name), ln (lastname), parent and child; N values Ana, George, Julia, James, David and Mary; L values Stone, Jones and Deville.]
pattern to describe subgraphs in an object base instance.
GOOD presents other features like macros (for more succinct expression
of frequent operations), computational-completeness of the query language,
and simulation of object-oriented characteristics like encapsulation and in-
heritance.
The model introduced several useful features. The notion of
printable and non-printable nodes is relevant for the design of graphical
interfaces, although it introduces additional information that obscures the
semantics of relations. It has a simple definition of multivalued relations and
allows recursive relations. It solves the redundancy-of-data problem in a
balanced way. Nevertheless, the db-model is incomplete, currently lacking
integrity constraints.
[Figure 10 (diagram not reproduced). Recoverable labels: schema with a Person node, string nodes, and edges name, lastname and parent; instance with six Person nodes and string values "George", "Jones", "Julia", "Ana", "Stone", "David", "James", "Deville" and "Mary".]
Figure 10: GMOD. In the schema, nodes represent abstract objects (Person)
and labeled edges establish relations with primitive objects (properties name
and lastname) and other abstract objects (parent relation). For building an
instance, we instantiate the schema for each person by assigning values to
oval nodes.
simple modeling. It also allows incomplete information, and permits avoiding
redundancy of data. The issue of property-dependent identity and a not
completely transparent notion of object-ID introduce some complexity.
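[Figure 11 (diagram not reproduced). Recoverable labels: schema with class node Person, basic type node string, and edges name, lastname and val; reduced instance graph with instance nodes such as P2, P4 and P6, edges name, lastname, parent and val, and atomic values 'George', 'Jones', 'Julia', 'David', 'Ana', 'Stone', 'James', 'Deville' and 'Mary'.]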
Figure 11: PaMaL. The example shows all the nodes defined in PaMaL:
basic type (string), class (Person), tuple (⊗), set (⊛) nodes for the schema
level, and atomic (George, Ana, etc.), instance (P1, P2, etc), tuple and set
nodes for the instance level. Note the use of edges ∈ to indicate elements
in a set, and the edge typ to indicate the type of class Person (these edges
are changed to val in the instance level).
an object, an edge labeled val is used and represents the edge typ in the
schema.
PaMaL presents operators for addition and deletion (of nodes and edges)
and a special operation that reduces instance graphs. It incorporates
loop, procedure and program constructs that make it a computationally
complete language. Among the highlights of the model are the explicit
definition of sets and tuples, multiple inheritance, and the use of
graphics to describe queries.
Figure 12: GOAL (panels: Schema and Instance). The schema presented in
the example shows the use of the object node Person with properties Name
and Lastname. The association node Parent and the double headed edges
parent and child make it possible to express the relation Person-Parent.
At the instance level, we assign values to value nodes (string) and
create instances for object and association nodes. Note that nodes with
the same value were merged (e.g. Deville).
properties (single headed edges) and multi-valued properties (double headed
edges), as well as ISA relations (double unlabeled arrows). An instance in
GOAL assigns values to value nodes and creates instances for object and
association nodes.
GOAL introduces the notion of consistent schema to enforce that ob-
jects only belong to the class they are labeled with and its super-classes.
In addition GOAL presents a graph data manipulation language with oper-
ations for addition and deletion based on pattern matching. The addition
(deletion) operation adds (deletes) nodes and/or edges at the instance level.
A finite sequence of additions and deletions is called a transformation.
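The notion of a transformation can be illustrated with a minimal Python
sketch. It is not GOAL's actual language: the encoding of an instance as a
set of (source, label, target) edges, the hand-written pattern function
parent_chains, and all other names are assumptions made for illustration.

```python
from functools import reduce

# Assumption of this sketch: an instance graph is a set of
# (source, label, target) edges, and a pattern is a Python function
# that yields its matches in a given graph.

def addition(match, build):
    """Return an operation that adds build(m) edges for every match m."""
    def op(graph):
        result = set(graph)
        for m in match(graph):
            result |= set(build(m))
        return result
    return op

def deletion(match, build):
    """Return an operation that deletes build(m) edges for every match m."""
    def op(graph):
        result = set(graph)
        for m in match(graph):
            result -= set(build(m))
        return result
    return op

def transformation(operations, graph):
    """Apply a finite sequence of additions and deletions in order."""
    return reduce(lambda g, op: op(g), operations, graph)

def parent_chains(graph):
    """Hypothetical pattern: x -parent-> y -parent-> z, yielding (x, y, z)."""
    for (x, l1, y) in graph:
        for (y2, l2, z) in graph:
            if l1 == l2 == "parent" and y == y2:
                yield (x, y, z)

instance = {
    ("David", "parent", "Julia"),
    ("Julia", "parent", "George"),
    ("Julia", "parent", "Ana"),
}

steps = [
    # Add a grandparent edge for every two-step parent chain.
    addition(parent_chains, lambda m: [(m[0], "grandparent", m[2])]),
    # Then delete the first parent edge of every such chain.
    deletion(parent_chains, lambda m: [(m[0], "parent", m[1])]),
]
result = transformation(steps, instance)
```

Each operation works on a copy of the graph, so the matches are computed
on the state produced by the previous step of the sequence.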
This model introduces several novelties. Association nodes allow simple
definition of multi-attribute and multi-valued relations. In contrast to
the Entity Relationship model, GOAL supports relations between
associations. Properties are optional, so it is possible to model
incomplete information. Additionally, GOAL defines restrictions that
introduce the notions of consistent schema and weak instance.
Figure 13: G-Log (panels: Schema and Instance). The schema defines people
as objects Person, each one identified by the properties name, lastname
and parent. The latter establishes the child-parent relation. The
instance is obtained in a similar way as in GMOD.
Figure 14: GDM (panels: Schema and Well-formed graph). In the schema each
entity Person (object node represented as a square) is assigned the
attributes name and lastname (basic value nodes, drawn round and labeled
str). We use the composite-value node PC to establish the child-parent
relationship. Note the redundancy introduced by the node PC. The instance
is built by instantiating the schema for each person.
ends in n and starts in a class-labeled node; and (I-REA) composite-value
nodes have either exactly one incoming edge or are labeled with exactly
one class name, but not both. In addition the model considers the notion
of consistency defining extension relations which are many-to-many rela-
tions between the nodes in the data graph and nodes in the schema graph,
indicating correspondence between entities and classes.
The proposal includes a graph-based update language called GUL, which is
based on pattern matching. GUL permits addition and deletion operations,
plus a reduction operation that reduces well-formed data graphs to
instance graphs by merging similar basic-value nodes and similar
composite-value nodes.
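The reduction step can be illustrated with a small Python sketch. It
covers only basic-value nodes and assumes that "similar" means "same type
and same value"; the function name and the encoding of nodes and edges are
hypothetical, and the merging of composite-value nodes is left out.

```python
def reduce_basic_values(values, edges):
    """Merge basic-value nodes that carry the same (type, value) pair.

    values: dict mapping node id -> (type_name, value) for basic-value nodes
    edges:  set of (source, label, target) triples
    Returns (merged_values, redirected_edges).
    """
    representative = {}   # (type_name, value) -> surviving node id
    rename = {}           # original node id -> surviving node id
    for node in sorted(values):          # sorted for a deterministic choice
        key = values[node]
        representative.setdefault(key, node)
        rename[node] = representative[key]
    merged = {node: values[node] for node in set(rename.values())}
    # Redirect every edge endpoint that pointed at a merged-away node.
    redirected = {
        (rename.get(s, s), label, rename.get(t, t)) for (s, label, t) in edges
    }
    return merged, redirected

values = {
    "v1": ("str", "Deville"),
    "v2": ("str", "Deville"),
    "v3": ("str", "James"),
}
edges = {
    ("P4", "lastname", "v1"),
    ("P5", "lastname", "v2"),
    ("P4", "name", "v3"),
}
merged, redirected = reduce_basic_values(values, edges)
```

After the call, both lastname edges point to the single surviving node
holding Deville.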
The GDM model presents the following benefits. Because the notions of
schema and instance are defined independently, instances can exist
without a schema, which allows the representation of semi-structured
data. The model permits the explicit representation of complex values,
inheritance (using ISA edges) and the definition of n-ary symmetric
relationships. The composite-value nodes allow simple definition of
multi-attribute and multi-valued relations. Finally, let us remark that
this model introduces notions of consistency and well-formed graphs.
Figure 15: Gram (panels: Schema and Instance). At the schema level we use
generalized names for the definition of entities and relations. At the
instance level, we create instance labels (e.g. PERSON_1) to represent
entities, and use the edges (defined in the schema) to express relations
between data and entities.
3.13 Related Data Models
Besides the models reviewed, there are other proposals that present graph-
like features, although not explicitly designed to model the structure and
connectivity of the information. In this section we will describe the most
relevant of these.
3.13.1 GraphDB
Güting [58] proposes an explicit model named GraphDB, which allows simple
modeling of graphs in an object oriented environment. The model permits
an explicit representation of graphs by defining object classes whose objects
can be viewed as nodes, edges and explicitly stored paths of a graph (which
is the whole database instance).
A database in GraphDB is a collection of object classes partitioned into
three kinds of classes: simple, link and path classes. Also there are data
types, object types and tuple types. A simple class object has an object
type, object identity and attributes whose values are either of a datatype
(e.g. integer, string) or of an object type. An attribute may contain a
reference to another object. Object classes are organized in a hierarchy of
classes and there are related notions of subtyping among tuple, object and
data types.
There are four types of operators to query GraphDB data: derive
statements (selection, join, projection and function operators); rewrite
operations, which replace objects or subsequences by other (new) objects;
the union operator, designed for transforming heterogeneous sets of
objects into a homogeneous one; and graph operations (shortest path
search).
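As an illustration of the last kind of operator, the following Python
sketch computes a shortest path by breadth-first search. GraphDB's own
operator syntax is not reproduced here; the unweighted edge list, the
function name and the sample data are assumptions of this sketch.

```python
from collections import deque

def shortest_path(edges, start, goal):
    """Breadth-first search for a shortest path (by edge count).

    edges: iterable of directed (source, target) pairs.
    Returns the list of nodes from start to goal, or None if unreachable.
    """
    neighbours = {}
    for source, target in edges:
        neighbours.setdefault(source, []).append(target)
    previous = {start: None}          # node -> predecessor on the path
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:   # walk predecessors back to start
                path.append(node)
                node = previous[node]
            return path[::-1]
        for nxt in neighbours.get(node, []):
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None

# Hypothetical link objects: each pair points from a person to a parent.
parent_links = [
    ("David", "Julia"), ("David", "James"),
    ("Julia", "George"), ("Julia", "Ana"),
    ("Mary", "Julia"),
]
path = shortest_path(parent_links, "David", "Ana")
```

In GraphDB terms, the returned node sequence plays the role of an
explicitly stored path object.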
The idea of modeling graphs using object-oriented concepts is presented
in other proposals, generically called object-oriented graph models. A
typical example is GOQL [113], a graph query language for modeling and
querying multimedia application graphs (represented as DAGs). This
proposal defines an object-oriented db-model (similar to GraphDB) with
four types of objects: node, edge, path and graph. GOQL uses an SQL-like
syntax for construction, querying and manipulation of such objects.
OEM Syntax:

  { person : &p1 { name : "George" ,
                   lastname : "Jones" }
    person : &p2 { name : "Ana" ,
                   lastname : "Stone" }
    person : &p3 { name : "Julia" ,
                   lastname : "Jones" ,
                   parent : &p1 ,
                   parent : &p2 }
    person : &p4 { name : "James" ,
                   lastname : "Deville" }
    person : &p5 { name : "David" ,
                   lastname : "Deville" ,
                   parent : &p3 ,
                   parent : &p4 }
    person : &p6 { name : "Mary" ,
                   lastname : "Deville" ,
                   parent : &p3 ,
                   parent : &p4 } }

OEM Graph: root node &pp with person edges to the nodes &p1 to &p6, each
with name and lastname edges to string leaves, and parent edges between
person nodes.
Figure 16: Object Exchange Model (OEM). Schema and instance are mixed.
The data is modeled starting from a root node &pp, with children person
nodes, each of them identified by an Object-ID (e.g. &p2). These nodes
have children that contain data (name and lastname) or references to
other nodes (parent). Referencing makes it possible to establish
relations between distinct hierarchical levels. Note the tree structure
obtained if one forgets the pointers to OIDs, a characteristic of
semistructured data.
OEM defines objects with the structure <OID, Label, Type, Value>, where:
OID is a unique identifier for the object (or null); Label is a character
string that describes the object (expected to be human understandable);
Type is the datatype of the object’s value (atomic or set type); and
Value is a variable-length value for the object (either an atomic value
or a set of objects). Data represented in OEM can be thought of as a
graph with Object-IDs representing node-labels and OEM-labels
representing edge-labels. Atomic objects are leaf nodes where the
OEM-value is the node value.
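A minimal Python sketch of this structure and of its graph reading is
given below. The class and function names are hypothetical, and the
"ref" type used to encode a reference such as parent : &p1 is an
assumption of this sketch rather than part of the four-field definition.

```python
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class OEMObject:
    oid: Optional[str]   # unique identifier, or None (null)
    label: str           # human understandable description
    type: str            # "set", an atomic type name, or "ref" (assumed)
    value: Any           # atomic value, list of OEMObject, or referenced OID

def to_edges(obj):
    """Yield (source OID, label, target) edges for the graph view of obj."""
    if obj.type != "set":
        return
    for child in obj.value:
        if child.type == "set":
            # Complex child: the edge points to its OID, then recurse.
            yield (obj.oid, child.label, child.oid)
            yield from to_edges(child)
        else:
            # Atomic leaf or reference: the edge points to the value itself.
            yield (obj.oid, child.label, child.value)

p1 = OEMObject("&p1", "person", "set", [
    OEMObject(None, "name", "string", "George"),
    OEMObject(None, "lastname", "string", "Jones"),
])
p3 = OEMObject("&p3", "person", "set", [
    OEMObject(None, "name", "string", "Julia"),
    OEMObject(None, "lastname", "string", "Jones"),
    OEMObject(None, "parent", "ref", "&p1"),
])
root = OEMObject("&pp", "root", "set", [p1, p3])   # "root" label is assumed
edges = list(to_edges(root))
```

Dropping the edge produced by the reference leaves a tree rooted at &pp,
which matches the remark on semistructured data in the caption of
Figure 16.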
The main feature of OEM data is that it is self-describing, in the sense
that it can be parsed without resorting to an external schema, and it
uses human understandable labels that add semantic information about
objects. Because there is no notion of schema or object class (although
each object defines its own schema), OEM offers the flexibility needed in
heterogeneous dynamic environments.
OEM-QL is a declarative query language designed to query OEM objects.
The basic construct in OEM-QL is an SQL-like SELECT-FROM-WHERE
expression.
Figure 17: RDF. Schema and instance are mixed together. In the example,
the edges labeled type disconnect the instance from the schema. The instance
is built by the subgraphs obtained by instantiating the nodes of the schema,
and establishing the corresponding parent edges between these subgraphs.
etc., play a central role in this model.
Currently there is research work on storing information expressed in
RDF, but none of these works defines a graph db-model or even a db-model.
In addition, several languages for querying RDF data have been proposed
and implemented, which follow the lines of database query languages like
SQL, OQL, and XPath. A discussion of aspects related to querying RDF from
a graph database perspective is presented in [14].
SPARQL [105] is a proposal of Protocol and Query Language designed
for easy access to RDF stores. It defines a query language with a SQL-like
style, where a simple query is based on query patterns, and query processing
consists of binding of variables to generate pattern solutions (graph pattern
matching).
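The binding of variables to produce pattern solutions can be sketched in
Python as follows. This is neither SPARQL syntax nor the algorithm of any
particular RDF store: triples are plain tuples of strings, variables are
strings starting with "?", and all names are hypothetical.

```python
def match_pattern(pattern, triple, binding):
    """Extend binding so that pattern equals triple; return None on a clash."""
    extended = dict(binding)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable must keep the value it was bound to earlier.
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def solutions(patterns, triples):
    """Return every variable binding that satisfies all triple patterns."""
    bindings = [{}]
    for pattern in patterns:
        next_bindings = []
        for binding in bindings:
            for triple in triples:
                extended = match_pattern(pattern, triple, binding)
                if extended is not None:
                    next_bindings.append(extended)
        bindings = next_bindings
    return bindings

triples = [
    ("p3", "name", "Julia"),
    ("p3", "parent", "p1"),
    ("p1", "name", "George"),
    ("p5", "parent", "p3"),
    ("p5", "name", "David"),
]
# Pattern: ?child has parent ?parent, and ?parent has name ?pname.
query = [("?child", "parent", "?parent"), ("?parent", "name", "?pname")]
result = solutions(query, triples)
```

Each dictionary in result is one pattern solution; a SELECT clause would
then project the requested variables out of these bindings.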
4 Query Languages and Integrity Constraints
4.1 Graph Query Languages
A query language is a collection of operators or inferencing rules which can
be applied to any valid instances of the data structure types of the model,
with the objective of manipulating and querying data in those structures in
any combinations desired [31].
Many papers discuss the problems concerning the definition of a query
language for a db-model [129, 70, 106, 66, 2, 5]. A variety of query
languages and formal frameworks for studying them have also been proposed
and developed, covering the relational db-model [27], semantic databases
[16, 12], object-oriented databases [74], semistructured data [24, 2, 4]
and the Web [5, 47].
Among graph db-models, there is substantial work focused on query
languages, the problem of querying graphs and the visual presentation of
results. Particular emphasis has been given to graphical query languages
(see Figure 18 for an example). In the following, we describe the most
representative graph query languages.
• The Logical Database Model [79, 80] presents a logic very much in the
spirit of relational tuple calculus, which uses fixed sort variables and
atomic formulas to represent queries over a schema using the power of
full first order languages. The result of a query consists of those objects
over a valid instance that satisfy the query formula. In addition the
model presents an alternative algebraic query language proven to be
equivalent to the logical one.
• Cardelli et al. [26] introduce a spatial logic for reasoning about
graphs and define a query language based on pattern matching and
recursion. This Graph Logic combines first-order logic with additional
structural connectives. A query asks for a substitution of variables such
that a satisfaction relation determines which graphs satisfy which
formulae. The query language is based on queries that build new graphs
from old ones and transducers that relate input graphs to output graphs.
[Figure 18: two graphical queries, Query A (grandparent) and Query B
(maternal-grandmother); diagram not recoverable.]
• Aimed at searching the Web, Flesca and Greco [45] show how to use
partially ordered languages to define path queries to search databases
and present results on their computational complexity. In addition, a
query language based on the previous ideas is proposed in [46].
Additionally, GOAL includes the notion of fixpoints in order to handle
the recursion derived from a finite list of additions and deletions.
PaMaL proposed the inclusion of loop, procedure and program constructs,
and PaMaL and GUL presented an operator that reduces instance graphs by
deleting repeated data. Note that graph-oriented manipulation formalisms
based on patterns allow a syntax-directed way of working that is much
more natural than text-based interfaces.
• Glide [50] is a graph query language where queries are expressed using
a linear notation formed by labels and wild-cards (regular expressions).
Glide uses a method called GraphGrep, based on subgraph matching, to
solve the queries.
operators for manipulation (addition and deletion) of hypergraphs and
hyperedges. Watters and Shepherd [132] present a framework for general
data access based on hypergraphs that includes operators for creation of
edges and set operators like intersection, union and difference. In a
different context, Tomca [125] introduces basic operations over
hypergraph structures representing user state and views in page-oriented
hypertext data.
• The literature also includes proposals of query languages that deal
with hypernode structures. The Hypernode model [84] defines a logic-based
query and update language in which queries are expressed as sets of
hypernode rules (h-rules), called a hypernode program. The query language
defines an operator which infers new hypernodes from the instance using
the set of rules in a hypernode program. This query language was extended
by Hyperlog [104, 103], which includes deletions as well as insertions
and discusses implementation issues in more detail. A full Turing-machine
capability is obtained by adding composition, conditional constructs and
iteration.
In a procedural style, HNQL [83] defines a set of operators for declara-
tive querying and updating of hypernodes. It also includes assignment,
sequential composition, conditional (for making inferences), for loop,
and while loop constructs.
• In the area of Geographic Information Systems, the Simatic-XT model
[81] defines a query language. It includes basic operators that deal with
encapsulated data (nesting of hypernodes), set operators (union,
concatenation, selection and difference) and high level operators (paths,
inclusion and intersections).
• WEB [52, 53] is a declarative programming language based on a graph
logic and oriented to querying genome data. WEB programs de-
fine graph templates for creating, manipulating and querying objects
and relationships in the database. These operations are answered by
matching graphs in valid instances.
• Models like Gram [11] and GOQL [113] propose SQL-style query languages
with explicit path expressions. Gram presents a query algebra where
regular expressions over data types are used to select walks (paths) in a
graph. It uses a data model where walks are the basic objects. A walk
expression is a regular expression without union, whose language contains
only alternating sequences of node and edge types, starting and ending
with a node type. The query language is based on a hyperwalk algebra with
operations closed under the set of hyperwalks.
• Models like DGV [57] and GraphDB [58] define special operators for
functional definition and querying of graphs. For example, a query in
GraphDB consists of several steps, each of which computes operations that
specify argument subgraphs in the form of regular expressions over edges
that dynamically extend or restrict the database graph. GraphDB includes
a class of objects called path class, which is used to represent several
paths in the database.
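The walk expressions described for Gram in the list above can be
illustrated with a short Python sketch. It does not reproduce Gram's
syntax or its hyperwalk algebra: a walk is assumed to be given as a list
of type names, and the expression is written as an ordinary Python regular
expression over the space-separated type sequence.

```python
import re

def is_walk(types, node_types, edge_types):
    """Check for an alternating node/edge type sequence.

    The sequence must start and end with a node type, so its length is odd.
    """
    if len(types) % 2 == 0:
        return False
    return all(
        t in (node_types if i % 2 == 0 else edge_types)
        for i, t in enumerate(types)
    )

def matches(walk_expression, types):
    """Match a walk's type sequence against a union-free regular expression."""
    if "|" in walk_expression:
        raise ValueError("walk expressions do not use union")
    return re.fullmatch(walk_expression, " ".join(types)) is not None

NODE_TYPES = {"PERSON", "NAME", "LASTNAME"}
EDGE_TYPES = {"parent", "name", "lastname"}

# Hypothetical expression: one or more parent steps between persons.
ANCESTOR_WALKS = r"PERSON( parent PERSON)+"

walk = ["PERSON", "parent", "PERSON", "parent", "PERSON"]
selected = is_walk(walk, NODE_TYPES, EDGE_TYPES) and matches(ANCESTOR_WALKS, walk)
```

Here selected is true for the two-step ancestor walk, while a single
PERSON node is a valid walk that the expression does not select.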
The utilization, specification, and complexity of constraints is deter-
mined by the richness of the db-model. Integrity constraints have been
studied for the relational [38, 123], semantic [133, 124], object oriented [110],
and semistructured [25, 10] db-models. Thalheim [124] presents a unifying
framework for integrity constraints.
In the case of graph db-models, examples of integrity constraints include
identity and referential integrity constraints, functional and inclusion depen-
dencies, and schema-instance consistency. Next we describe some notions of
integrity constraints defined in graph database models.
LDM [80] defines a logic that uses variables to express well-formed
formulas over a schema. Integrity constraints are expressed in terms of
satisfaction of these LDM formulas.
The Hypernode Model [104] defines two integrity constraints: Entity
Integrity enforces that each hypernode is a unique real world entity
identified by its content; Referential Integrity requires that only
existing entities be referenced. In [83] the notion of semantic
constraints was considered. The concept of Hypernode functional
dependency, denoted by A → B, where A and B are sets of attributes, lets
us express that the set of attributes A determines the value of the set
of attributes B in all hypernodes of the database.
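A functional dependency A → B of this kind can be checked with a few
lines of Python. The sketch assumes, for illustration only, that each
hypernode is flattened into a dictionary from attribute names to values;
the function name and the sample data are hypothetical.

```python
def satisfies_fd(hypernodes, lhs, rhs):
    """Return True if hypernodes that agree on lhs also agree on rhs."""
    seen = {}   # values of the lhs attributes -> values of the rhs attributes
    for node in hypernodes:
        key = tuple(node.get(a) for a in lhs)
        val = tuple(node.get(a) for a in rhs)
        # setdefault stores val the first time a key is seen; a later
        # hypernode with the same key but a different val violates A -> B.
        if seen.setdefault(key, val) != val:
            return False
    return True

people = [
    {"name": "George", "lastname": "Jones"},
    {"name": "Julia", "lastname": "Jones"},
    {"name": "James", "lastname": "Deville"},
    {"name": "David", "lastname": "Deville"},
]
name_determines_lastname = satisfies_fd(people, ["name"], ["lastname"])
lastname_determines_name = satisfies_fd(people, ["lastname"], ["name"])
```

On this sample, name → lastname holds, whereas lastname → name fails
because two hypernodes share the lastname Jones.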
GGL [54] defines two integrity constraints: (1) labels in a graph are
uniquely named; (2) edges are composed of the labels and vertices of the
graph in which the edge occurs. These constraints are similar to primary key
and foreign key (referential) integrity constraints in the relational db-model.
GROOVY [85] uses directed hyperedges to represent Value Functional
Dependencies (VFDs), which are used in the value schema level to establish
semantic integrity constraints. A VFD asserts that the value of a set of
attributes uniquely determines the value of another attribute.
GDM [68] defines conditions enforcing that primitive nodes are only
leaves, that the real value of a node depends on its domain, and two
constraints regarding edges with class nodes (these were described in
Section 3).
The notion of Schema-Instance consistency is explicitly considered in
GOAL [69], G-Log [99], and GDM [68]. In the case of graph-based object
oriented db-models this notion is translated into the creation of Valid
Instances, for example in GMOD [13] and PaMaL [48].
Figure 19: Main proposals on Graph Database Models and their characteristics (“√” indicates support and “±”
partial support). LDM [79, 80], Hypernode [84], GOOD [59, 60], GROOVY [85], GMOD [13], Simatic-XT [86],
Gram [11], PaMaL [48], GOAL [69], Hypernode2 [104], Hypernode3 [83], GGL [54], G-Log [99], GDM [68].
References
[1] International Standard ISO/IEC 13250 Topic Maps, December 1999.
[2] S. Abiteboul. Querying Semi-Structured Data. In Proc. of the 6th
Int. Conf. on Database Theory (ICDT), volume 1186 of LNCS, pages
1–18. Springer, Jan 1997.
[3] S. Abiteboul and R. Hull. IFO: A Formal Semantic Database Model.
In Proc. of the 3rd Symposium on Principles of Database Systems
(PODS), pages 119–132. ACM Press, 1984.
[4] S. Abiteboul, D. Quass, J. McHugh, J. Widom, and J. L. Wiener. The
Lorel query language for semistructured data. International Journal
on Digital Libraries (JODL), 1(1):68–88, 1997.
[5] S. Abiteboul and V. Vianu. Queries and Computation on the Web. In
Proc. of the 6th Int. Conf. on Database Theory (ICDT), volume 1186
of LNCS, pages 262–275. Springer, Jan 1997.
[6] R. Agrawal and H. V. Jagadish. Efficient Search in Very Large
Databases. In Proc. of the 14th Int. Conf. on Very Large Data Bases
(VLDB), pages 407–418. Morgan Kaufmann, Aug - Sept 1988.
[7] R. Agrawal and H. V. Jagadish. Materialization and Incremental Up-
date of Path Information. In Proc. of the 5th Int. Conf. on Data En-
gineering (ICDE), pages 374–383. IEEE Computer Society, Feb 1989.
[8] R. Agrawal and H. V. Jagadish. Algorithms for Searching Massive
Graphs. IEEE Transactions on Knowledge and Data Engineering
(TKDE), 6(2):225–238, 1994.
[9] R. Albert and A.-L. Barabasi. Statistical mechanics of complex net-
works. Reviews of Modern Physics, 74:47, Jan 2002.
[10] N. Alechina, S. Demri, and M. de Rijke. A Modal Perspective on Path
Constraints. Journal of Logic and Computation, 13(6):939–956, 2003.
[11] B. Amann and M. Scholl. Gram: A Graph Data Model and Query
Language. In European Conference on Hypertext Technology (ECHT),
pages 201–211. ACM, Nov - Dec 1992.
[12] M. Andries and G. Engels. A Hybrid Query Language for an Extended
Entity-Relationship Model. Technical Report TR 93-15, Institute of
Advanced Computer Science, Universiteit Leiden, May 1993.
[13] M. Andries, M. Gemis, J. Paredaens, I. Thyssens, and J. V. den Buss-
che. Concepts for Graph-Oriented Object Manipulation. In Proc. of
the 3rd Int. Conf. on Extending Database Technology (EDBT), volume
580 of LNCS, pages 21–38. Springer, March 1992.
[14] R. Angles and C. Gutierrez. Querying RDF Data from a Graph
Database Perspective. In Proc. 2nd European Semantic Web Con-
ference (ESWC), number 3532 in LNCS, pages 346–360, 2005.
[15] M.-A. Aufaure-Portier and C. Trépied. A Survey of Query Languages
for Geographic Information Systems. In Proc. of the 3rd Int. Workshop
on Interfaces to Databases, pages 431–438, July 1996.
[16] M. Azmoodeh and H. Du. GQL, A Graphical Query Language for Se-
mantic Databases. In Proc. of the 4th Int. Conf. on Scientific and Sta-
tistical Database Management (SSDBM), volume 339 of LNCS, pages
259–277. Springer, June 1988.
[17] C. Beeri. Data Models and Languages for Databases. In Proc. of the
2nd Int. Conf. on Database Theory (ICDT), volume 326 of LNCS,
pages 19–40. Springer, Aug - Sept 1988.
[18] G. Benkö, C. Flamm, and P. F. Stadler. A Graph-Based Toy Model of
Chemistry. Journal of Chemical Information and Computer Sciences
(JCISD), 43(1):1085–1093, Jan 2003.
[19] C. Berge. Graphs and Hypergraphs. North-Holland, Amsterdam, 1973.
[20] U. Brandes. Network Analysis. Number 3418 in LNCS. Springer-
Verlag, 2005.
[21] T. Bray, J. Paoli, and C. M. Sperberg-McQueen. Extensible Markup
Language (XML) 1.0, W3C Recommendation 10 February 1998.
http://www.w3.org/TR/1998/REC-xml-19980210.
[22] A. Broder, R. Kumar, F. Maghoul, P. Raghavan, S. Rajagopalan,
R. Stata, A. Tomkins, and J. Wiener. Graph structure in the Web. In
Proc. of the 9th Int. World Wide Web conference on Computer net-
works : the international journal of computer and telecommunications
networking, pages 309–320. North-Holland Publishing Co., 2000.
[23] P. Buneman. Semistructured Data. In Proc. of the 16th Symposium on
Principles of Database Systems (PODS), pages 117–121. ACM Press,
May 1997.
[24] P. Buneman, S. Davidson, G. Hillebrand, and D. Suciu. A Query Lan-
guage and Optimization Techniques for Unstructured Data. SIGMOD
Record, 25(2):505–516, 1996.
[25] P. Buneman, W. Fan, and S. Weinstein. Path Constraints in
Semistructured and Structured Databases. In Proc. of the 17th Sym-
posium on Principles of Database Systems (PODS), pages 129–138.
ACM Press, June 1998.
[26] L. Cardelli, P. Gardner, and G. Ghelli. A Spatial Logic for Querying
Graphs. In Proc. of the 29th Int. Colloquium on Automata, Lan-
guages, and Programming (ICALP), LNCS, pages 597–610. Springer,
July 2002.
[27] A. K. Chandra. Theory of Database Queries. In Proc. of the 7th
Symposium on Principles of Database Systems (PODS), pages 1–9.
ACM Press, March 1988.
[28] P. P.-S. Chen. The Entity-Relationship Model - Toward a Unified View
of Data. ACM Transactions on Database Systems (TODS), 1(1):9–36,
1976.
[29] J. Chomicki. Temporal Query Languages: A Survey. In Proc. of the
First Int. Conf. on Temporal Logic (ICTL), pages 506–534. Springer-
Verlag, 1994.
[30] E. F. Codd. A Relational Model of Data for Large Shared Data Banks.
Communications of the ACM, 13(6):377–387, 1970.
[31] E. F. Codd. Data Models in Database Management. In Proc. of
the 1980 Workshop on Data abstraction, Databases and Conceptual
Modeling, pages 112–114. ACM Press, 1980.
[32] E. F. Codd. A Relational Model of Data for Large Shared Data Banks.
Communications of the ACM, 26(1):64–69, 1983.
[33] J. Conklin. Hypertext: An Introduction and Survey. IEEE Computer,
20(9):17–41, 1987.
[34] M. Consens and A. Mendelzon. Hy+: a Hygraph-based query and
visualization system. SIGMOD Record, 22(2):511–516, 1993.
[35] M. P. Consens and A. O. Mendelzon. Expressing Structural Hypertext
Queries in Graphlog. In Proc. of the 2nd Conf. on Hypertext, pages
269–292. ACM Press, 1989.
[36] I. F. Cruz, A. O. Mendelzon, and P. T. Wood. A Graphical Query Lan-
guage Supporting Recursion. In Proc. of the Association for Comput-
ing Machinery Special Interest Group on Management of Data, pages
323–330. ACM Press, May 1987.
[40] Y. Deng and S.-K. Chang. A G-Net Model for Knowledge Represen-
tation and Reasoning. IEEE Transactions on Knowledge and Data
Engineering (TKDE), 2(3):295–310, Dec 1990.
[46] S. Flesca and S. Greco. Querying Graph Databases. In Proc. of the 7th
Int. Conf. on Extending Database Technology - Advances in Database
Technology (EDBT), volume 1777 of LNCS, pages 510–524. Springer,
March 2000.
[56] R. L. Griffith. Three Principles of Representation for Semantic Net-
works. ACM Transactions on Database Systems (TODS), 7(3):417–
442, 1982.
[66] A. Heuer and M. H. Scholl. Principles of Object-Oriented Query
Languages. In Datenbanksysteme in Büro, Technik und Wissenschaft
(BTW), volume 270 of Informatik-Fachberichte, pages 178–197.
Springer, March 1991.
[67] J. Hidders. A Graph-based Update Language for Object-Oriented Data
Models. PhD thesis, Technische Universiteit Eindhoven, Dec 2001.
[68] J. Hidders. Typing Graph-Manipulation Operations. In Proc. of the
9th Int. Conf. on Database Theory (ICDT), pages 394–409. Springer-
Verlag, 2002.
[69] J. Hidders and J. Paredaens. GOAL, A Graph-Based Object and As-
sociation Language. Advances in Database Systems: Implementations
and Applications, CISM, pages 247–265, Sept 1993.
[70] R. Hull and R. King. Semantic Database Modeling: Survey, Applica-
tions, and Research Issues. ACM Computing Surveys, 19(3):201–260,
1987.
[71] H. V. Jagadish and F. Olken. Data Management for the Biosciences:
Report of the NLM Workshop on Data Management for Molecular
and Cell Biology. Technical Report LBNL-52767, National Library of
Medicine, 2003.
[72] L. Kerschberg, A. C. Klug, and D. Tsichritzis. A Taxonomy of Data
Models. In Proc. of Systems for Large Data Bases (VLDB), pages
43–64. North Holland and IFIP, Sept 1976.
[73] N. Kiesel, A. Schurr, and B. Westfechtel. GRAS: A Graph-Oriented
Software Engineering Database System. In IPSEN Book, pages 397–
425, 1996.
[74] M. Kifer, W. Kim, and Y. Sagiv. Querying Object-Oriented
Databases. In Proc. of the 1992 ACM SIGMOD Int. Conf. on Man-
agement of data, pages 393–402. ACM Press, 1992.
[75] W. Kim. Object-Oriented Databases: Definition and Research Di-
rections. IEEE Transactions on Knowledge and Data Engineering
(TKDE), 2(3):327–341, 1990.
[76] G. Klyne and J. Carroll. Resource Description Framework (RDF)
Concepts and Abstract Syntax. http://www.w3.org/TR/2004/REC-
rdf-concepts-20040210/, Feb 2004.
[77] R. Kumar, P. Raghavan, S. Rajagopalan, D. Sivakumar, A. Tomkins,
and E. Upfal. The Web as a Graph. In Proc. of the 19th Symposium
on Principles of Database Systems (PODS), pages 1–10. ACM Press,
May 2000.
[78] H. S. Kunii. DBMS with Graph Data Model for Knowledge Handling.
In Proc. of the 1987 Fall Joint Computer Conference on Exploring
technology: today and tomorrow, pages 138–142. IEEE Computer So-
ciety Press, 1987.
[80] G. M. Kuper and M. Y. Vardi. The Logical Data Model. ACM Trans-
actions on Database Systems (TODS), 18(3):379–413, 1993.
[83] M. Levene and G. Loizou. A Graph-Based Data Model and its Ram-
ifications. IEEE Transactions on Knowledge and Data Engineering
(TKDE), 7(5):809–823, 1995.
[84] M. Levene and A. Poulovassilis. The Hypernode Model and its Asso-
ciated Query Language. In Proc. of the 5th Jerusalem Conf. on In-
formation technology, pages 520–530. IEEE Computer Society Press,
1990.
[87] M. Mainguenaud. Modelling the Network Component of Geograph-
ical Information Systems. Int. Journal of Geographical Information
Systems (IJGIS), 9(6):575–593, 1995.
[99] J. Paredaens, P. Peelman, and L. Tanca. G-Log: A Graph-Based
Query Language. IEEE Transactions on Knowledge and Data Engi-
neering (TKDE), 7(3):436–453, 1995.
[100] J. Peckham and F. J. Maryanski. Semantic Data Models. ACM Com-
puting Surveys, 20(3):153–189, 1988.
[101] S. Pepper. The TAO of Topic Maps.
http://www.ontopia.net/topicmaps/materials/tao.html.
[102] S. Pepper and G. Moore. XML Topic Maps (XTM) 1.0 - Top-
icMaps.Org Specification. http://www.topicmaps.org/xtm/1.0/xtm1-
20010806.html, Feb 2001.
[103] A. Poulovassilis and S. G. Hild. Hyperlog: A Graph-Based System for
Database Browsing, Querying, and Update. IEEE Transactions on
Knowledge and Data Engineering (TKDE), 13(2):316–333, 2001.
[104] A. Poulovassilis and M. Levene. A Nested-Graph Model for the Repre-
sentation and Manipulation of Complex Objects. ACM Transactions
on Information Systems (TOIS), 12(1):35–68, 1994.
[105] E. Prud’hommeaux and A. Seaborne. SPARQL Query
Language for RDF, W3C Working Draft 21 July.
http://www.w3.org/TR/2005/WD-rdf-sparql-query-20050721/,
2005.
[106] R. Ramakrishnan and J. D. Ullman. A Survey of Research on De-
ductive Database Systems. Journal of Logic Programming (JLP),
23(2):125–149, 1993.
[107] N. Roussopoulos and J. Mylopoulos. Using Semantic Networks for
Database Management. In Proc. of the Int. Conf. on Very Large Data
Bases (VLDB), pages 144–172. ACM, Sept 1975.
[108] R. V. Guha, O. Lassila, E. Miller, and D. Brickley. Enabling
Inferencing. The Query Languages Workshop (QL), Dec 1998.
[109] H. Samet and W. G. Aref. Spatial Data Models and Query Processing.
In Modern Database Systems, pages 338–360. 1995.
[110] K.-D. Schewe, B. Thalheim, J. W. Schmidt, and I. Wetzel. Integrity
Enforcement in Object-Oriented Databases. In Proc. of the 4th Int.
Workshop on Foundations of Models and Languages for Data and Ob-
jects, Oct 1993.
[111] D. Shasha, J. T. L. Wang, and R. Giugno. Algorithmics and
Applications of Tree and Graph Searching. In Proc. of the 21st Symposium
on Principles of Database Systems (PODS), pages 39–52. ACM Press,
2002.
[112] S. Shekhar, M. Coyle, B. Goyal, D.-R. Liu, and S. Sarkar. Data Models
in Geographic Information Systems. Communications of the ACM,
40(4):103–111, 1997.
[115] D. W. Shipman. The Functional Data Model and the Data Lan-
guage DAPLEX. ACM Transactions on Database Systems (TODS),
6(1):140–173, 1981.
[122] R. W. Taylor and R. L. Frank. CODASYL Data-Base Management
Systems. ACM Computing Surveys, 8(1):67–103, 1976.
[134] B. Wellman, J. Salaff, D. Dimitrova, L. Garton, M. Gulia, and
C. Haythornthwaite. Computer Networks as Social Networks:
Collaborative Work, Telework, and Virtual Community. Annual Review
of Sociology, 22:213–238, 1996.