Module 5 Enhanced Data Models
Object-oriented databases and object-relational systems do provide features that allow users
to extend their systems by specifying additional abstract data types for each application.
However, it is quite useful to identify certain common features for some of these advanced
applications and to create models that can represent them. Additionally, specialized storage
structures and indexing methods can be implemented to improve the performance of these
common features. Then the features can be implemented as abstract data types or class
libraries and purchased separately from the basic DBMS software package. The term data
blade has been used in Informix, and cartridge in Oracle, to refer to such optional submodules
that can be included in a DBMS package. Users can utilize these features directly if they are
suitable for their applications, without having to reinvent, reimplement, and reprogram such
common features.
Active databases provide additional functionality for specifying active rules. These rules can be
automatically triggered by events that occur, such as database updates or certain times being
reached, and can initiate actions specified in the rule declaration if certain conditions are met.
Many commercial packages include some of the functionality provided by active databases in the
form of triggers.
Temporal databases permit the database system to store a history of changes and allow users to
query both current and past states of the database. Some temporal database models also allow
users to store future expected information, such as planned schedules. It is important to note
that many database applications are temporal, but they are often implemented without much
temporal support from the DBMS package; that is, the temporal concepts are implemented in
the application programs that access the database.
Spatial databases cover types of spatial data, different kinds of spatial analyses, operations on
spatial data, types of spatial queries, spatial data indexing, spatial data mining, and applications
of spatial databases.
Multimedia databases provide features that allow users to store and query different types of
multimedia information, which includes images (such as pictures and drawings), video
clips (such as movies, newsreels, and home videos), audio clips (such as songs, phone
messages, and speeches), and documents (such as books and articles).
A deductive database system includes capabilities to define (deductive) rules, which can deduce
or infer additional information from the facts that are stored in a database. Because part of the
theoretical foundation for some deductive database systems is mathematical logic, such systems
are often referred to as logic databases. Other types of systems, referred to as expert database
systems or knowledge-based systems, also incorporate reasoning and inferencing capabilities;
such systems use techniques that were developed in the field of artificial intelligence, including
semantic networks, frames, production systems, or rules for capturing domain-specific
knowledge.
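The event, condition, and action parts of an active rule can be illustrated with an ordinary trigger. The following is a minimal sketch using Python's built-in sqlite3 module; the account and low_balance_log tables and the threshold of 100 are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: an account table and a log table written by the rule.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
CREATE TABLE low_balance_log (account_id INTEGER, balance INTEGER);

-- Event: an UPDATE of balance. Condition: the new balance is below 100.
-- Action: record the account in the log table.
CREATE TRIGGER warn_low_balance
AFTER UPDATE OF balance ON account
WHEN NEW.balance < 100
BEGIN
    INSERT INTO low_balance_log VALUES (NEW.id, NEW.balance);
END;
""")

conn.execute("INSERT INTO account VALUES (1, 500)")
conn.execute("UPDATE account SET balance = 300 WHERE id = 1")  # condition is false
conn.execute("UPDATE account SET balance = 50 WHERE id = 1")   # condition is true

logged = conn.execute("SELECT account_id, balance FROM low_balance_log").fetchall()
print(logged)
```

Only the second update satisfies the condition, so only that update causes the action to run.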
Object-Oriented Databases
An object-oriented database combines object-oriented programming concepts with database
capabilities. Items created using object-oriented programming languages such as C++ and Java
can be stored in relational databases, but object-oriented databases are better suited to
storing them.
An object-oriented database is organized around objects rather than actions, and data rather
than logic. For example, a multimedia record in a relational database can be a definable data
object, as opposed to an alphanumeric value.
Relational database technology was not able to handle complex application systems such as
Computer Aided Design (CAD), Computer Aided Manufacturing (CAM), Computer
Integrated Manufacturing (CIM), and Computer Aided Software Engineering (CASE). The
limitation of relational databases is that they were designed to represent entities and
relationships in the form of two-dimensional tables. Any complex attribute, such as a
multi-valued or composite attribute, may result in the decomposition of a table into several
tables. Similarly, complex interrelationships result in a number of tables being created. Thus the
main asset of relational databases, namely their simplicity, is also one of their
weaknesses in the case of complex applications. The data domains in a relational system are
represented as the standard data types defined in SQL. However, the
relational model does not allow extending these data types or creating the user's own data
types, which limits the types of data that may be represented using relational databases.
Another major weakness of the RDBMS is that concepts like inheritance and hierarchy need to be
represented with a series of tables with the required referential constraints. Thus relational tables
are not very natural for objects requiring inheritance or hierarchy. However, one must remember that
relational databases have proved to be commercially successful for text-based applications and
have many standard features, including security, reliability, and easy access. Many commercial
DBMS products are basically relational but also support object-oriented concepts.
1 – Maintenance Problem
The maintenance of the relational database becomes difficult over time due to the increase in
the data. Developers and programmers have to spend a lot of time maintaining the database.
2 – Cost
The relational database system is costly to set up and maintain. The initial cost of the software
alone can be quite pricey for smaller businesses, but it gets worse when you factor in hiring a
professional technician who must also have expertise with that specific kind of program.
3 – Physical Storage
A relational database is comprised of rows and columns, which requires a lot of physical
memory because each operation performed depends on separate storage. The requirements of
physical memory may increase along with the increase of data.
4 – Lack of Scalability
When a relational database is spread over multiple servers, its structure changes and becomes
difficult to handle, especially when the quantity of data is large. As a result, the data does not
scale well across different physical storage servers. As the database becomes larger or more
distributed across a greater number of servers, negative effects such as latency and availability
issues affect overall performance.
5 – Complexity in Structure
Relational databases can only store data in tabular form, which makes it difficult to represent
complex relationships between objects. This is an issue because many applications require
more than one table to store all the data needed by their application logic.
A relational database can also become slower, and not only because of its reliance on multiple
tables. A large number of tables and a large volume of data increase the complexity of the system.
This can lead to slow query response times, or even to queries failing completely,
depending on how many people are logged into the server at a given time.
Objects that share similar characteristics are grouped in classes. Therefore, a class is a collection of
similar objects with attributes and methods. In this model, two or more objects are connected with the
help of links. This link is used to relate objects. It is explained in the below example.
E-R Model
The ER model is used to represent real-life scenarios as entities. The properties of these entities are
their attributes in the ER diagram, and their connections are shown in the form of
relationships. An ER model is generally considered a top-down approach to data design.
An example of ER model is −
Advantages of E-R model
The data requirements are easily understandable using an E-R model, as it uses clear
diagrams.
The E-R model can be easily converted into a relational database.
The E-R diagram is very easy to understand, as it has clearly defined entities and the
relations between them.
Disadvantages of E-R model
Advantages of Object Oriented Model
Due to inheritance, the data types can be reused in different objects. This reduces the
cost of maintaining the same data in multiple locations.
The object oriented model is quite flexible in most cases.
It is easier to extend the design in Object Oriented Model.
Disadvantages of Object Oriented Model
The objects may be complex, or they may consist of low-level objects (for example, a window
object may consist of many simpler objects like menu bars, scroll bars, etc.). However,
representing the data of these complex objects through relational database models requires many
tables: at least one for each inherited class and a table for the base class. In order to
ensure that these tables operate correctly, referential integrity constraints need to be set up
as well. On the other hand, object-oriented models represent such a system
very naturally through an inheritance hierarchy. Consider the design of a
class, say a Date class. The advantage of object-oriented database management in such
situations is that it allows representation of not only the structure but also the
operations on a new user-defined type, such as finding the difference between two dates.
Thus, object-oriented database technologies are ideal for implementing systems that
support complex inherited objects and user-defined data types (which require operations in
addition to the standard operations, including operations that support polymorphism).
Another major reason for the need for object-oriented database systems is the seamless
integration of this database technology with object-oriented applications. Software design is
mostly based on object-oriented technologies, so an object-oriented database may provide a
seamless interface for combining the two technologies. Object-oriented databases are also
required to manage complex, highly interrelated information. They provide a solution in the most
natural and easy way, one that is closer to our understanding of the system.
The concept of the object-oriented database was introduced in the late 1970s, but it became
significant only in the early 1980s. The initial commercial product offerings appeared in the late
1980s. Today, many object-oriented database products are available, such as Objectivity/DB
(developed by Objectivity, Inc.), ONTOS DB (developed by ONTOS, Inc.), VERSANT (developed by
Versant Object Technology Corp.), ObjectStore (developed by Object Design, Inc.), GemStone
(developed by Servio Corp.), and ObjectStore PSE Pro (developed by Object Design, Inc.).
Object-oriented databases are presently being used for various applications in areas such as
e-commerce and engineering product data management, and for special-purpose databases in
areas such as securities and medicine.
Object-relational database systems are relational database systems that have been
enhanced to include the features of the object-oriented paradigm.
Consider an example of a composite attribute: Address. The address of a person in an RDBMS can
be represented as: House-no, Apartment, Locality, City, State, Pin code. When using an RDBMS,
such information needs to be represented either as a set of attributes or
as just one string separated by commas or semicolons. The second approach is very inflexible,
as it would require complex string operations for extracting information. It also hides
the details of an address, and thus it is not suitable. If we represent the components of the
address as separate attributes, then the problem is with writing queries. For example, if
we need to find the address of a person, we need to specify all the attributes that we have
created for the address, i.e., House-no, Locality, etc. The following may be one possible
attempt at defining such an attribute as a type:
CREATE TYPE Address AS (
  House Char(20),
  Locality Char(20),
  City Char(12),
  State Char(15),
  Pincode Char(6)
);
Thus, Address is now a new type that can be used when defining a database schema, as in:
CREATE TABLE STUDENT (
  name Char(25),
  address Address,
  phone Char(12),
  programme Char(5),
  dob ???
);
Similarly, complex data types may be extended by including the date of birth field (dob), which
is shown in the scheme above with its type marked ???. This complex data type should then comprise
associated fields such as day, month and year. This data type should also permit
finding the difference between two dates, the day, and the year of birth.
Find the name and address of the students who are enrolled in the MCA programme.
SELECT name, address FROM student WHERE programme = 'MCA';
Note that the attribute 'address', although composite, is put only once in the query.
Find the name and address of all the MCA students of Mumbai.
SELECT name, address FROM student WHERE programme = 'MCA' AND address.city =
'Mumbai';
Such definitions allow us to handle a composite attribute as a single
attribute with a user-defined type. Any component of this attribute can still be referred to
without any problems, so the data definition of the attribute components remains
intact. Complex data types also allow us to model a table with multi-valued attributes, which
would require a new table in a relational database design.
For example, a library database system would require the representation of the following
information for a book.
Book table:
• ISBN number
• Book title
• Authors
• Published by
• Subject areas of the book
Clearly, in the table above, authors and subject areas are multi-valued attributes. Representing
them using tables requires (ISBN number, author) and (ISBN number, subject area) tables.
(Please note that our database is not considering the author position in the list of authors.)
Although this design solves the immediate problem, it is complex. This problem
may be most naturally represented using an object-oriented database system.
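The relational decomposition described above can be sketched as follows, using Python's built-in sqlite3 module. The table names (book, book_author, book_subject), column names, and sample rows are hypothetical and are not taken from the original notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE book (isbn TEXT PRIMARY KEY, title TEXT, publisher TEXT);
-- One extra table per multi-valued attribute, keyed by the ISBN number.
CREATE TABLE book_author (isbn TEXT REFERENCES book(isbn), author TEXT);
CREATE TABLE book_subject (isbn TEXT REFERENCES book(isbn), subject TEXT);
""")

conn.execute("INSERT INTO book VALUES ('111', 'Database Basics', 'Example Press')")
conn.executemany("INSERT INTO book_author VALUES (?, ?)",
                 [("111", "A. Author"), ("111", "B. Writer")])
conn.executemany("INSERT INTO book_subject VALUES (?, ?)",
                 [("111", "Databases"), ("111", "Computing")])

# Reassembling one book needs a separate query per multi-valued attribute.
authors = [row[0] for row in conn.execute(
    "SELECT author FROM book_author WHERE isbn = ? ORDER BY author", ("111",))]
subjects = [row[0] for row in conn.execute(
    "SELECT subject FROM book_subject WHERE isbn = ? ORDER BY subject", ("111",))]
print(authors, subjects)
```

A single logical book is spread over three tables here, which is the complexity the text refers to.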
Thus, the types such as those given above, can be represented as:
CREATE TYPE Name AS (
  given-name Char(20),
  middle-name Char(15),
  sur-name Char(20)
) FINAL
CREATE TYPE Address AS (
  add-det Char(20),
  city Char(20),
  state Char(20),
  pincode Char(6)
) NOT FINAL
CREATE TYPE Date AS (
  dd Number(2),
  mm Number(2),
  yy Number(4)
) FINAL
CREATE INSTANCE METHOD difference (present Date) RETURNS INTERVAL days FOR Date
BEGIN
  // Code to calculate difference of the present date to the date stored in the object. //
  // The data of the object will be used with a prefix SELF as: SELF.yy, SELF.mm etc. //
  // The last statement will be RETURN days that would return the number of days //
END
CREATE TYPE Student AS (
  name Name,
  address Address,
  dob Date
)
The 'FINAL' and 'NOT FINAL' keywords have the same meaning as in Java: a final type cannot be
inherited further.
There also exists the possibility of using constructors.
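The idea of a user-defined type that carries its own operation can be sketched in Python. This is only an analogy to the Date type and its difference method above, not SQL. The field names dd, mm and yy follow the type definition, while the use of Python's datetime module to do the arithmetic is an assumption of this sketch.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Date:
    """Mirrors the Date type above: day, month, and four-digit year."""
    dd: int
    mm: int
    yy: int

    def difference(self, present: "Date") -> int:
        """Return the number of days from the stored date to `present`."""
        stored = date(self.yy, self.mm, self.dd)
        other = date(present.yy, present.mm, present.dd)
        return (other - stored).days


dob = Date(dd=15, mm=8, yy=2000)
today = Date(dd=15, mm=8, yy=2001)
days = dob.difference(today)
print(days)
```

The structure (three fields) and the behaviour (difference) live in one definition, which is the property the text attributes to user-defined types.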
Type Inheritance
In the present standard of SQL one can define inheritance. Let us explain this with the help of
an example.
Now, a base type such as University-person can be inherited by the Staff type or the Student type.
For example, the Student type, if inherited from the type given above, would be:
Both inherited types inherit the name and address attributes from the type
University-person. Methods can also be inherited in a similar way; however, they can be
overridden if the need arises.
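The inheritance of attributes and the overriding of methods can be sketched with Python classes. The names UniversityPerson, Student and Staff follow the text; the extra attributes (programme, designation) and the label method are hypothetical, added only to show inheritance and overriding.

```python
from dataclasses import dataclass


@dataclass
class UniversityPerson:
    name: str
    address: str

    def label(self) -> str:
        return f"{self.name}, {self.address}"


@dataclass
class Student(UniversityPerson):
    programme: str  # hypothetical attribute added by the sub-type

    def label(self) -> str:
        # Overrides the inherited method and reuses the parent's version.
        return f"{super().label()} ({self.programme})"


@dataclass
class Staff(UniversityPerson):
    designation: str  # hypothetical attribute added by the sub-type


student = Student("Anand", "Mumbai", "MCA")
staff = Staff("Meera", "Pune", "Lecturer")
print(student.label())
print(staff.label())
```

Staff uses the inherited label method unchanged, while Student overrides it.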
Table Inheritance
Table inheritance allows us to create sub-tables for such tables, subject to the following:
• The type that is associated with the sub-table must be a sub-type of the type of the parent
table. This is a major requirement for table inheritance.
• All the attributes of the parent table (university-members in our case) should be present in
the inherited tables.
• The three tables may be handled separately; however, any record present in the
inherited tables is also implicitly present in the base table. For example, any record inserted in
the student-list table will be implicitly present in the university-members table.
• A query on the parent table (such as university-members) would find the records from the
parent table and all the inherited tables (in our case all three tables); however, the
attributes of the result table would be the same as the attributes of the parent table.
• One can restrict a query to only the parent table by using the keyword ONLY. For
example, SELECT NAME FROM university-members ONLY
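The query behaviour listed above can be simulated in plain Python. This is not SQL and not a DBMS feature: the three lists stand in for the parent table and its two sub-tables, the name staff_list and all sample rows are hypothetical, and select_from_parent is an illustrative helper.

```python
# Each "table" is a list of rows; sub-tables carry extra attributes.
university_members = [{"name": "Ravi", "address": "Delhi"}]
student_list = [{"name": "Anand", "address": "Mumbai", "programme": "MCA"}]
staff_list = [{"name": "Meera", "address": "Pune", "designation": "Lecturer"}]

PARENT_ATTRIBUTES = ("name", "address")


def select_from_parent(only: bool = False) -> list:
    """Query the parent table; sub-table rows are included unless only=True."""
    if only:
        tables = [university_members]
    else:
        tables = [university_members, student_list, staff_list]
    # The result always has the parent's attributes, whatever the source table.
    return [{attr: row[attr] for attr in PARENT_ATTRIBUTES}
            for table in tables for row in table]


all_names = [row["name"] for row in select_from_parent()]
only_names = [row["name"] for row in select_from_parent(only=True)]
print(all_names, only_names)
```

The first call corresponds to a query on the parent table, and the second to the same query restricted with ONLY.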
Object-oriented database systems apply object-oriented concepts to the database
system model to create an object-oriented database model.
Object Model: The Object Database Management Group (ODMG) has designed the object
model for the object-oriented database management system. The Object Definition Language
(ODL) and Object Manipulation Language (OML) are based on this object model. Let us briefly
define the concepts and terminology related to the object model.
Objects and Literals: These are the basic building elements of the object model. An object has the
following four characteristics:
A unique identifier
A name
A lifetime defining whether it is persistent or not, and
A structure that may be created using a type constructor. The structure in OODBMS can
be classified as atomic or collection objects (like Set, List, Array, etc.).
A literal does not have an identifier but has a value that may be constant. The structure of a
literal does not change. Literals can be atomic, such that they correspond to basic data types
like int, short, long, float etc., or structured literals (for example, current date, time etc.), or
collection literals defining values for some collection object.
Interface: Interfaces define the operations that can be inherited by a user-defined object.
Interfaces are non-instantiable. All objects inherit basic operations (like copy object, delete
object) from the interface of Objects. A collection object inherits operations, such as an
operation to determine whether a collection is empty, from the basic collection interface.
Atomic Objects: An atomic object is an object that is not of a collection type. Atomic objects are
user-defined objects that are specified using the class keyword. The properties of an atomic
object can be defined by its attributes and relationships.
Inheritance: The interfaces specify the abstract operations that can be inherited by classes. This
is called behavioural inheritance and is represented using the ":" symbol. Sub-classes can inherit
the state and behaviour of super-class(es) using the keyword EXTENDS.
Extents: The extent of a class contains all the persistent objects of that class. A class
having an extent can have a key.
Here, for each object of the class Student there is a reference to a Book object, and the set
of references is called receives.
But if it is required to access the students based on the book, then the “inverse relationship”
could be specified as
relationship set <Student> receivedby
The connection between the relationships receives and receivedby may need to be specified
by using the keyword “inverse” in each declaration. If the relationship is in a different class,
it is referred to by the name of that class followed by a double colon (::) and the name of
the other relationship.
Methods could be specified with the classes along with input/output types. These
declarations are called “signatures”. The method parameters could be in, out or inout.
Here, the first parameter is passed by value whereas the next two parameters are passed
by reference. Exceptions could also be associated with these methods.
Types in ODL could be atomic types or class names. The basic types can be combined using
type constructors such as set, bag, list, array, dictionary and structure.
Inheritance is implemented in ODL using subclasses with the keyword “extends”.
Multiple inheritance is implemented by using extends with the superclass names separated by a
colon (:).
Corresponding to the difference between relation schema and relation instance, ODL uses the
class and its extent (set of existing objects). The extent is declared with the keyword “extent”.
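The inverse relationship pair receives/receivedby and the notion of an extent can be sketched in Python. This is an in-memory analogy only: the class-level lists stand in for ODL extents (which hold persistent objects), and the receive method is a hypothetical helper that keeps both sides of the inverse pair consistent.

```python
class Book:
    extent = []  # stands in for the ODL extent: all Book objects created so far

    def __init__(self, title):
        self.title = title
        self.receivedby = set()  # inverse of Student.receives
        Book.extent.append(self)


class Student:
    extent = []  # all Student objects created so far

    def __init__(self, name):
        self.name = name
        self.receives = set()  # inverse of Book.receivedby
        Student.extent.append(self)

    def receive(self, book):
        """Update both sides so the inverse pair stays consistent."""
        self.receives.add(book)
        book.receivedby.add(self)


anand = Student("Anand")
book = Book("The suitable boy")
anand.receive(book)

titles_for_anand = sorted(b.title for b in anand.receives)
students_for_book = sorted(s.name for s in book.receivedby)
print(titles_for_anand, students_for_book)
```

Navigation works in both directions: from the student to the books received, and from the book to the students who received it.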
The major considerations while converting ODL designs into relational designs are as follows:
a) It is not essential to declare keys for a class in ODL, but in relational design new
attributes may have to be created in order to work as a key.
c) Methods could be part of a design in ODL, but they cannot be directly converted into a
relational schema (although SQL supports them), as they are not a property of a relational
schema.
d) Relationships are defined in inverse pairs in ODL but, in the case of relational design,
only one pair is defined.
Object Query Language (OQL) is a standard query language that combines the high-level,
declarative programming style of SQL with the object-oriented features of OOP.
Find the list of authors for the book titled “The suitable boy”
A more complex query, to display the title of the book which has been issued to the
student whose name is Anand, could be
SELECT b.TITLE FROM Book b, Student s WHERE s.NAME = "Anand"
In the previous case, the query creates a bag of strings, but when the keyword DISTINCT is
used, the query returns a set.
In case of complex output the keyword “Struct” is used. If we want to display the pair of
titles from the same publishers then the proposed query is:
Aggregate operators like SUM, AVG, COUNT, MAX, MIN could be used in OQL. If we want to
calculate the maximum marks obtained by any student then the OQL command is
GROUP BY is used with a set of structures called the “intermediate collection”.
HAVING is used to eliminate some of the groups created by the GROUP BY command.
SELECT cour, publ, AVG(SELECT p.b.PRICE FROM partition p)
FROM Book b
GROUP BY cour: b.receivedby.COURSE, publ: b.PUBLISHER
HAVING AVG(SELECT p.b.PRICE FROM partition p) >= 60
Union, intersection and difference operators are applied to set or bag types with the
keywords UNION, INTERSECT and EXCEPT. If we want to display the details of suppliers from
PATNA and SURAT then the OQL is
The result of an OQL expression could be assigned to host language variables. If
costlyBooks is a set <book> variable to store the list of books whose price is above Rs. 200,
then
costlyBooks = SELECT DISTINCT b FROM Book b WHERE b.PRICE > 200
In order to find a single element of a collection, the keyword “ELEMENT” is used.
If costlySBook is a variable then
costlySBook = ELEMENT (SELECT DISTINCT b FROM Book b WHERE b.PRICE > 200)
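The difference between a bag and a set result, the aggregate operators, and ELEMENT can be mimicked with ordinary Python collections. This is an analogy rather than OQL: the Book class and its sample data are hypothetical, and the element helper is assumed to fail when the collection does not contain exactly one member.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Book:
    title: str
    publisher: str
    price: int


books = [
    Book("Title A", "Pub X", 250),
    Book("Title B", "Pub X", 150),
    Book("Title C", "Pub Y", 250),
]

# Without DISTINCT the result is a bag: duplicates are kept.
publisher_bag = [b.publisher for b in books]
# With DISTINCT the result is a set: duplicates are removed.
publisher_set = {b.publisher for b in books}
# An aggregate such as MAX applied to a collection of values.
max_price = max(b.price for b in books)
# Counterpart of: costlyBooks = SELECT DISTINCT b FROM Book b WHERE b.PRICE > 200
costly_books = {b for b in books if b.price > 200}


def element(collection):
    """Return the single member of a collection; raise ValueError otherwise."""
    (only,) = collection
    return only


cheap_book = element({b for b in books if b.price < 200})
print(publisher_bag, publisher_set, max_price, cheap_book.title)
```

Calling element on costly_books would raise an error here, because that set holds two books.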
An object-oriented database management system is created on the basis of the persistent
programming paradigm, whereas an object-relational system is built by creating object-oriented
extensions of a relational system. In fact, both kinds of product have clearly defined objectives.
Big Data involves huge volume, high velocity, and an extensible variety of data. The data is of
three types: structured data, semi-structured data, and unstructured data.
1. Structured data – Structured data is data whose elements are addressable for effective
analysis. It has been organized into a formatted repository that is typically a database. It
covers all data that can be stored in an SQL database in tables with rows and columns.
Such data have relational keys and can easily be mapped into pre-designed fields. Today,
structured data is the most commonly processed kind of data and the simplest to manage.
Example: Relational data.
2. Semi-structured data – Semi-structured data is information that does not reside in a
relational database but that has some organizational properties that make it easier to
analyze. With some processing, it can be stored in a relational database (though this can be
very hard for some kinds of semi-structured data), but the semi-structured form exists to save
space. Example: XML data.
3. Unstructured data – Unstructured data is data which is not organized in a predefined
manner or does not have a predefined data model; thus it is not a good fit for a mainstream
relational database. For unstructured data there are alternative platforms for storing and
managing it. It is increasingly prevalent in IT systems and is used by organizations in a variety
of business intelligence and analytics applications. Example: Word, PDF, text, media logs.
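Comparison of structured, semi-structured, and unstructured data:
Transaction management
Structured data: Matured transaction and various concurrency techniques
Semi-structured data: Transaction is adapted from DBMS, not matured
Unstructured data: No transaction management and no concurrency
Query performance
Structured data: Structured query allows complex joining
Semi-structured data: Queries over anonymous nodes are possible
Unstructured data: Only textual queries are possible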
In the above diagram, there is a root element named <company>. Inside it, there is one
more element, <Employee>. Inside the <Employee> element, there are five branches named
<FirstName>, <LastName>, <ContactNo>, <Email>, and <Address>. Inside the <Address>
element, there are three sub-branches, named <City>, <State>, and <Zip>.
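The tree described above can be built and inspected with Python's standard xml.etree.ElementTree module. The element names follow the text; the text values placed inside the elements are hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Root element and its single child.
company = ET.Element("company")
employee = ET.SubElement(company, "Employee")

# Four simple branches of <Employee>; the values are made-up sample data.
for tag, text in [("FirstName", "Asha"), ("LastName", "Rao"),
                  ("ContactNo", "0000000000"), ("Email", "asha@example.com")]:
    ET.SubElement(employee, tag).text = text

# The fifth branch, <Address>, has three sub-branches of its own.
address = ET.SubElement(employee, "Address")
for tag, text in [("City", "Mumbai"), ("State", "Maharashtra"), ("Zip", "400001")]:
    ET.SubElement(address, tag).text = text

employee_children = [child.tag for child in employee]
address_children = [child.tag for child in address]
xml_text = ET.tostring(company, encoding="unicode")
print(employee_children, address_children)
```

Serializing the tree with ET.tostring gives the nested markup that the diagram represents.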
XML - DTDs
The XML Document Type Definition, commonly known as DTD, is a way to describe an XML
language precisely. DTDs check the vocabulary and the validity of the structure of XML documents
against the grammatical rules of the appropriate XML language.
An XML DTD can be either specified inside the document, or it can be kept in a separate
document and then linked separately.
Syntax
Basic syntax of a DTD is as follows −
<!DOCTYPE element DTD identifier
[
declaration1
declaration2
........
]>
In the above syntax,
The DTD starts with the <!DOCTYPE delimiter.
The element tells the parser to parse the document from the specified root element.
The DTD identifier is an identifier for the document type definition, which may be the path
to a file on the system or the URL of a file on the internet. If the DTD points to an external
path, it is called an external subset.
The square brackets [ ] enclose an optional list of entity declarations called the internal
subset.
Internal DTD
A DTD is referred to as an internal DTD if elements are declared within the XML file. To reference
it as an internal DTD, the standalone attribute in the XML declaration must be set to yes. This
means the declaration works independently of an external source.
Syntax
Following is the syntax of internal DTD −
<!DOCTYPE root-element [element-declarations]>
where root-element is the name of root element and element-declarations is where you
declare the elements.
Rules
The document type declaration must appear at the start of the document (preceded
only by the XML header) − it is not permitted anywhere else within the document.
Similar to the DOCTYPE declaration, the element declarations must start with an
exclamation mark.
The Name in the document type declaration must match the element type of the root
element.
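A document with an internal DTD can be parsed with Python's standard xml.dom.minidom module to check the last rule above. This is a minimal sketch: the address document and its element declarations are hypothetical, and minidom only reads the document type declaration, it does not validate the document against the DTD.

```python
from xml.dom import minidom

# Hypothetical document: XML header first, then the document type declaration,
# then the root element whose name matches the DOCTYPE name.
document_text = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE address [
  <!ELEMENT address (name, phone)>
  <!ELEMENT name (#PCDATA)>
  <!ELEMENT phone (#PCDATA)>
]>
<address>
  <name>Example Name</name>
  <phone>000-0000</phone>
</address>"""

doc = minidom.parseString(document_text)
doctype_name = doc.doctype.name           # the Name in the DOCTYPE declaration
root_name = doc.documentElement.tagName   # the element type of the root element
names_match = doctype_name == root_name
print(doctype_name, root_name, names_match)
```

Full validation against the declared elements would need a validating parser, which the Python standard library does not provide.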
External DTD
In an external DTD, elements are declared outside the XML file. They are accessed by specifying
the SYSTEM attribute, which may be either a legal .dtd file or a valid URL. To reference it as an
external DTD, the standalone attribute in the XML declaration must be set to no. This means the
declaration includes information from the external source.
Syntax
Following is the syntax for external DTD −
<!DOCTYPE root-element SYSTEM "file-name">
where file-name is the file with .dtd extension.
Types
One can refer to an external DTD by using either system identifiers or public identifiers.
System Identifiers
A system identifier enables you to specify the location of an external file containing DTD
declarations. The syntax is as follows −
<!DOCTYPE name SYSTEM "address.dtd" [...]>
It contains keyword SYSTEM and a URI reference pointing to the location of the document.
Public Identifiers
Public identifiers provide a mechanism to locate DTD resources and are written as follows −
<!DOCTYPE name PUBLIC "-//Beginning XML//DTD Address Example//EN">
As one can see, it begins with the keyword PUBLIC, followed by a specialized identifier. Public
identifiers are used to identify an entry in a catalog. Public identifiers can follow any format;
however, a commonly used format is called Formal Public Identifiers, or FPIs.
XML – Schemas
XML Schema is commonly known as XML Schema Definition (XSD). It is used to describe and
validate the structure and the content of XML data. XML schema defines the elements,
attributes and data types. Schema element supports Namespaces. It is similar to a database
schema that describes the data in a database.
The basic idea behind XML Schemas is that they describe the legitimate format that an XML
document can take.
Elements
Elements are the building blocks of XML document. An element can be defined within an XSD
as follows −
<xs:element name = "x" type = "y"/>
Definition Types
One can define XML schema elements in the following ways −
Simple Type
Simple type element is used only in the context of the text. Some of the predefined simple
types are: xs:integer, xs:boolean, xs:string, xs:date. For example −
<xs:element name = "phone_number" type = "xs:int" />
Complex Type
A complex type is a container for other element definitions. This allows you to specify which child
elements an element can contain and to provide some structure within XML documents.
Global Types
With the global type, one can define a single type in the document, which can be used by all
other references. For example, suppose one wants to generalize the person and company for
different addresses of the company.
XML - Document
An XML document is a basic unit of XML information composed of elements and other markup
in an orderly package. An XML document can contain a wide variety of data, for example, a
database of numbers, numbers representing a molecular structure, or a mathematical equation.
The following image depicts the parts of XML document.
XML declaration
Document type declaration
XML declaration: contains details that prepare an XML processor to parse the XML document.
It is optional, but when used, it must appear in the first line of the XML document.
Syntax
Following syntax shows XML declaration −
<?xml
version = "version_number"
encoding = "encoding_declaration"
standalone = "standalone_status"
?>
Each parameter consists of a parameter name, an equals sign (=), and a parameter value
inside quotes.
Document Elements Section
Document Elements are the building blocks of XML. These divide the document into a
hierarchy of sections, each serving a specific purpose. You can separate a document into
multiple sections so that they can be rendered differently, or used by a search engine. The
elements can be containers, with a combination of text and other elements.
XML - Databases
An XML database is used to store huge amounts of information in the XML format. As the use of
XML is increasing in every field, a secure place is required to store XML
documents. The data stored in the database can be queried using XQuery, serialized, and
exported into a desired format.
There are two major types of XML databases:
XML-enabled
Native XML (NXD)
3. Above two are closely related, and handled by the same tools.
XPath is used to address (select) parts of documents using path expressions. A path expression
is a sequence of steps separated by “/”. The result of a path expression is the set of values that,
along with their containing elements/attributes, match the specified path. The initial “/”
denotes the root of the document. Path expressions are evaluated left to right. Each step operates
on the set of instances produced by the previous step. Selection predicates may follow any step
in a path, in [ ]. Attributes are accessed using “@”.
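Path steps, predicates in [ ], and attribute access with “@” can be tried out with Python's standard xml.etree.ElementTree module. Note that ElementTree supports only a limited subset of XPath and evaluates paths relative to an element, so the paths below start with “.” at the root element rather than with an initial “/”. The sample document and its id attribute are hypothetical.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("""
<company>
  <Employee id="1">
    <FirstName>Asha</FirstName>
    <Address><City>Mumbai</City></Address>
  </Employee>
  <Employee id="2">
    <FirstName>Ravi</FirstName>
    <Address><City>Pune</City></Address>
  </Employee>
</company>
""")

# A sequence of steps separated by "/": each step works on the previous result.
cities = [city.text for city in root.findall("./Employee/Address/City")]

# A selection predicate on an attribute, written with "@" inside [ ].
second_first_name = root.find("./Employee[@id='2']/FirstName").text

# A selection predicate on the text of a child element.
asha_ids = [emp.get("id") for emp in root.findall("./Employee[FirstName='Asha']")]

print(cities, second_first_name, asha_ids)
```

A full XPath engine accepts many more forms (axes, functions, absolute paths), which this subset does not cover.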
XML data can be stored in:
1. Non-relational data stores
Flat files
- Natural for storing XML
- But has all problems (no concurrency, no recovery, …)
XML database
- Database built specifically for storing XML data, supporting DOM model and
declarative querying
- Currently no commercial-grade systems
2. Relational databases
- Advantage: mature database systems
- Disadvantages: overhead of translating data and queries
String Representation
Store each top-level element as a string field of a tuple in a relational database.
Use a single relation to store all elements, or
Use a separate relation for each top-level element type
- E.g. account, customer, depositor relations
- Each with a string-valued attribute to store the element
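The single-relation variant of the string representation can be sketched with Python's standard sqlite3 and xml.etree.ElementTree modules. The bank document, the elements table and its columns are hypothetical; here the children of the root element are treated as the top-level elements.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical document whose top-level elements are accounts and customers.
document = """<bank>
  <account><number>A-101</number><balance>500</balance></account>
  <customer><name>Asha</name></customer>
  <account><number>A-102</number><balance>900</balance></account>
</bank>"""

conn = sqlite3.connect(":memory:")
# One relation for all elements: the element type plus the element as a string.
conn.execute("CREATE TABLE elements (tag TEXT, data TEXT)")
for child in ET.fromstring(document):
    serialized = ET.tostring(child, encoding="unicode").strip()
    conn.execute("INSERT INTO elements VALUES (?, ?)", (child.tag, serialized))

# The database only sees strings, so looking inside an element
# means parsing the stored string again in the application.
balances = []
for (data,) in conn.execute(
        "SELECT data FROM elements WHERE tag = 'account' ORDER BY rowid"):
    balances.append(int(ET.fromstring(data).findtext("balance")))

row_count = conn.execute("SELECT COUNT(*) FROM elements").fetchone()[0]
print(row_count, balances)
```

The re-parsing step in the loop is an example of the translation overhead mentioned for relational storage.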
************************************