Data Model - Important - Concepts
A data model is a collection of concepts that can be used to describe the structure of a
database. It is a type of data abstraction that provides a conceptual representation of the data.
It uses logical concepts that can be easily understood rather than computer storage concepts. A
data model hides the details of data storage and implementation from the user.
Describing the structure of a database means specifying its data types, relationships and
data constraints. A data model also includes a set of basic operations for specifying retrievals
and updates on the database. The dynamic aspect or behavior of a database application may
also be included in a data model.
Data models are categorized based on the types of concepts that they provide to describe the
database structure.
The conceptual (high-level) data model provides concepts that are close to the way users
view data, so end users can understand them. It uses concepts such as entities, attributes and
relationships.
An entity represents a real world object or concept such as an employee or a project that is
stored in a database.
An attribute represents some property of interest that further describes an entity, such as
employee’s name or salary.
The physical (low-level) data model provides concepts that are close to the computer's view
of storage. It is meant for computer specialists. It describes how data is stored as files in the computer by
representing information such as record formats, record ordering and access paths. An access
path is a structure that makes the search for particular database records efficient.
The representational (or implementation) data model lies between the two extremes of the
physical model and the conceptual model. It does not hide all the storage details from the user,
and it can be implemented on a computer system directly. It represents data by using record
structures and hence is sometimes called a record-based data model.
Names:………………………………………Reg No:………………………………Group:…….Day:……………………Page 1/ 24
Data Model
• Data Model: A set of concepts to describe the structure of a database, and certain
constraints that the database should obey.
• Integrated collection of concepts for describing data, relationships between data, and
constraints on the data in an organization.
• Data Model Operations: Operations for specifying database retrievals and updates by
referring to the concepts of the data model. Operations on the data model may include
basic operations and user-defined operations.
• Purpose
• Conceptual Data Models (CDM): Provide concepts that are close to the way many users perceive
data. A conceptual model represents the complete and accurate data requirements of the
enterprise. It uses concepts such as entities, attributes, and relationships.
• A conceptual model is sometimes also referred to as a logical model. However, the conceptual
model is independent of all implementation details, whereas a logical model assumes knowledge
of the underlying data model of the target DBMS.
• Physical Data Models: Provide concepts that describe how data is stored in the computer,
representing information such as record structures, record orderings, and access paths.
Conceptual Modelling
ER Model - Basic Concepts
The ER model defines the conceptual view of a database. It works around real-world entities and
the associations among them. At view level, the ER model is considered a good option for
designing databases.
Entity
An entity is a real-world object, either animate or inanimate, that is easily identifiable.
For example, in a school database, students, teachers, classes, and courses offered can be
considered as entities. All these entities have some attributes or properties that give them their
identity.
An entity set is a collection of entities of the same type. An entity set may contain entities
whose attributes share similar values. For example, a Students set may contain all the students
of a school; likewise a Teachers set may contain all the teachers of a school from all faculties.
Entity sets need not be disjoint.
Attributes
Entities are represented by means of their properties, called attributes. All attributes have
values. For example, a student entity may have name, class, and age as attributes.
There exists a domain or range of values that can be assigned to attributes. For example, a
student's name cannot be a numeric value. It has to be alphabetic. A student's age cannot be
negative, etc.
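The idea of a domain can be sketched in code. The following is only an illustration, assuming a hypothetical Student class and an assumed rule that names consist of letters and spaces; a real design would state its domains much more carefully.

```python
from dataclasses import dataclass


@dataclass
class Student:
    name: str
    age: int

    def __post_init__(self):
        # Assumed domain of name: non-empty, letters and spaces only.
        if not self.name.replace(" ", "").isalpha():
            raise ValueError(f"name outside its domain: {self.name!r}")
        # Domain of age: non-negative integers.
        if self.age < 0:
            raise ValueError(f"age outside its domain: {self.age}")


def in_domain(name, age):
    """Return True if (name, age) are acceptable values for a Student."""
    try:
        Student(name, age)
        return True
    except ValueError:
        return False
```

Here in_domain("12345", 19) and in_domain("Asha", -1) are both rejected, mirroring the two restrictions mentioned above.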
Types of Attributes
Simple attribute − Simple attributes hold atomic values, which cannot be divided further.
For example, a student's phone number is an atomic value of 10 digits.
Composite attribute − Composite attributes are made of more than one simple attribute.
For example, a student's complete name may have first_name and last_name.
Derived attribute − Derived attributes do not exist in the physical
database; their values are derived from other attributes present in the database. For
example, average_salary in a department should not be saved directly in the database;
instead it can be derived. As another example, age can be derived from date_of_birth.
Single-value attribute − Single-value attributes contain a single value. For example −
Social_Security_Number.
Multi-value attribute − Multi-value attributes may contain more than one value. For
example, a person can have more than one phone number, email_address, etc.
These attribute types can be combined:
simple single-valued attributes
simple multi-valued attributes
composite single-valued attributes
composite multi-valued attributes
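As a rough sketch of how the attribute types might look in code, the hypothetical Student class below has a simple attribute, a composite attribute, a multi-valued attribute, and a derived attribute. The class and field names are assumptions made for illustration, not part of the ER model itself.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Name:
    # Composite attribute: built from two simple attributes.
    first_name: str
    last_name: str


@dataclass
class Student:
    roll_number: int                                    # simple, single-valued
    name: Name                                          # composite, single-valued
    date_of_birth: date                                 # stored attribute
    phone_numbers: list = field(default_factory=list)   # simple, multi-valued

    def age(self, today):
        """Derived attribute: computed from date_of_birth, never stored."""
        years = today.year - self.date_of_birth.year
        # Subtract one if this year's birthday has not happened yet.
        birthday = (self.date_of_birth.month, self.date_of_birth.day)
        if (today.month, today.day) < birthday:
            years -= 1
        return years
```

Passing the current date in explicitly keeps the derived value reproducible, and it makes visible that age depends on something outside the stored attributes.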
A key is an attribute or collection of attributes that uniquely identifies an entity within an entity set.
For example, the roll_number of a student makes him/her identifiable among students.
Super Key − A set of attributes (one or more) that collectively identifies an entity in an
entity set.
Candidate Key − A minimal super key is called a candidate key. An entity set may have
more than one candidate key.
Primary Key − A primary key is one of the candidate keys, chosen by the database
designer to uniquely identify entities in the entity set.
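The difference between super keys and candidate keys can be explored with a small sketch. It tests attribute sets against a hypothetical sample of student rows. Keep in mind that a key is a design decision: sample data can show that a set of attributes is not a key, but it cannot prove that one is.

```python
from itertools import combinations


def superkeys(rows, attributes):
    """Attribute sets whose combined values differ for every row in the sample."""
    found = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(attributes, size):
            projected = [tuple(row[a] for a in combo) for row in rows]
            if len(set(projected)) == len(rows):
                found.append(frozenset(combo))
    return found


def candidate_keys(rows, attributes):
    """Super keys that contain no smaller super key (minimal super keys)."""
    keys = superkeys(rows, attributes)
    return [k for k in keys if not any(other < k for other in keys)]


# Hypothetical sample data.
students = [
    {"roll_number": 1, "name": "Asha", "class": "A"},
    {"roll_number": 2, "name": "Ravi", "class": "A"},
    {"roll_number": 3, "name": "Asha", "class": "B"},
]
attributes = ["roll_number", "name", "class"]
```

On this sample, roll_number alone and the pair (name, class) both come out as candidate keys. The designer would then pick one of them, most plausibly roll_number, as the primary key.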
Relationship
The association among entities is called a relationship. For example, an employee works_at a
department, a student enrolls in a course. Here, Works_at and Enrolls are called relationships.
Relationship Set
A set of relationships of similar type is called a relationship set. Like entities, a relationship too
can have attributes. These attributes are called descriptive attributes.
Degree of Relationship
The number of participating entities in a relationship defines the degree of the relationship.
Binary = degree 2
Ternary = degree 3
n-ary = degree n
Mapping Cardinalities
Cardinality defines how many entities of one entity set can be associated, via a relationship
set, with entities of the other entity set.
One-to-one − One entity from entity set A can be associated with at most one entity of
entity set B and vice versa.
One-to-many − One entity from entity set A can be associated with more than one
entity of entity set B; however, an entity from entity set B can be associated with at most
one entity of entity set A.
Many-to-one − More than one entity from entity set A can be associated with the same
entity of entity set B; however, an entity from entity set A can be associated with at most
one entity of entity set B.
Many-to-many − One entity from A can be associated with more than one entity from B
and vice versa.
BirthDate(Month, Day, Year)
Address(StreetAddr(StrNum, StrName, AptNum), City, State, Zip)
Single- vs. multi-valued attribute: Consider a PERSON entity. The person it represents
has (one) SSN, (one) date of birth, (one, although composite) name, etc. But that person may
have zero or more academic degrees, dependents, or (if the person is a male living in Utah)
spouses! How can we model this via attributes AcademicDegrees, Dependents, and Spouses?
One way is to allow such attributes to be multi-valued (perhaps set-valued is a better term),
which is to say that we assign to them a (possibly empty) set of values rather than a single value.
To distinguish a multi-valued attribute from a single-valued one, it is customary to
enclose the former within curly braces (which makes sense, as such an attribute has a value that
is a set, and curly braces are traditionally used to denote sets). Using the PERSON example from
above, we would depict its structure in text as
PERSON(SSN, Name, BirthDate(Month, Day, Year), { AcademicDegrees(School, Level,
Year) }, { Dependents }, ...)
Here we have taken the liberty to assume that each academic degree is described by a
school, level (e.g., H.S., B.S., Ph.D.), and year. Thus, AcademicDegrees is not only multi-valued
but also composite. We refer to an attribute that involves some combination of multi-valuedness
and compositeness as a complex attribute.
A more complicated example of a complex attribute is AddressPhone in Figure 7.5 (page
207). This attribute is for recording data regarding addresses and phone numbers of a business.
The structure of this attribute allows for the business to have several offices, each described by
an address and a set of phone numbers that ring into that office. Its structure is given by
{ AddressPhone( { Phone(AreaCode, Number) }, Address(StrAddr(StrNum, StrName,
AptNum), City, State, Zip)) }
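The nesting in this notation maps fairly directly onto nested record types. The sketch below is one possible reading, with class names taken from the notation above and all sample values invented: each composite attribute becomes a small class, and each pair of curly braces becomes a collection.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Phone:
    area_code: str
    number: str


@dataclass(frozen=True)
class StrAddr:
    str_num: str
    str_name: str
    apt_num: str


@dataclass(frozen=True)
class Address:
    str_addr: StrAddr
    city: str
    state: str
    zip: str


@dataclass
class AddressPhone:
    # One office: a single composite Address plus a set of Phone values.
    address: Address
    phones: set = field(default_factory=set)


# The outer braces make AddressPhone itself multi-valued, so a business holds
# a collection of them. A list is used here because AddressPhone contains a
# mutable set and is therefore not hashable.
office = AddressPhone(
    Address(StrAddr("12", "Main St", "3B"), "Springfield", "IL", "62701"),
    {Phone("217", "555-0100"), Phone("217", "555-0101")},
)
address_phones = [office]
```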
Stored vs. derived attribute: Perhaps independent and derivable would be better terms
for these (or non-redundant and redundant). In any case, a derived attribute is one whose value
can be calculated from the values of other attributes, and hence need not be stored. Examples:
Age can be calculated from BirthDate, assuming that the current date is accessible. GPA can be
calculated, assuming that the necessary data regarding courses and grades is accessible.
The Null value: In some cases a particular entity might not have an applicable value for a
particular attribute. Or that value may be unknown.
Example: The attribute DateOfDeath is not applicable to a living person and its correct
value may be unknown for some persons who have died.
In such cases, we use a special attribute value (non-value?), called null. There has been
some argument among database experts about whether a different approach (such as having
distinct values for not applicable and unknown) would be superior.
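Both approaches can be sketched side by side. In the hypothetical classes below, Person uses a single null (Python's None), while PersonV2 follows the alternative mentioned above and distinguishes "not applicable" from "unknown". The names Missing and PersonV2 are invented for this illustration.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import Optional, Union


@dataclass
class Person:
    name: str
    # None plays the role of null: it may mean "still alive" or "date unknown".
    date_of_death: Optional[date] = None


class Missing(Enum):
    NOT_APPLICABLE = "not applicable"   # e.g. the person is alive
    UNKNOWN = "unknown"                 # e.g. died, but the date was not recorded


@dataclass
class PersonV2:
    name: str
    date_of_death: Union[date, Missing] = Missing.NOT_APPLICABLE


def is_known_dead(person):
    """With distinct markers we can tell 'alive' apart from 'died, date unknown'."""
    return person.date_of_death is not Missing.NOT_APPLICABLE
```

With the single null, the question "has this person died?" cannot be answered from date_of_death alone; with two markers it can, at the cost of a more complicated attribute.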
Entity Types, Entity Sets, Keys, and Domains (Value Sets)
In ER modeling, we deal only with entity types, not with instances. In an ER diagram,
each entity type is denoted by a rectangular box.
An entity set is the collection of all entities of a particular type that exist, in a database, at
some moment in time.
Key Attributes of an Entity Type: A minimal collection of attributes (often only one)
that, by design, distinguishes any two (simultaneously-existing) entities of that type.
Domains (Value Sets) of Attributes: The domain of an attribute is the "universe of
values" from which its value can be drawn. In other words, an attribute's domain specifies its set
of allowable values. The concept is similar to data type.
Example Database Application: COMPANY
Suppose that Requirements Collection and Analysis results in the following (informal)
description of the COMPANY miniworld:
The company is organized as a collection of departments.
Each department
o has a unique name
o has a unique number
o is associated with a set of locations
o has a particular employee who acts as its manager (and who assumed that position
on some date)
o has a set of employees assigned to it
o controls a set of projects
Each project
o has a unique name
o has a unique number
o has a single location
o has a set of employees who work on it
o is controlled by a single department
Each employee
o has a name
o has a SSN that uniquely identifies her/him
o has an address
o has a salary
o has a sex
o has a birthdate
o has a direct supervisor
o has a set of dependents
o is assigned to one department
o works some number of hours per week on each of a set of projects (which need
not all be controlled by the same department)
Each dependent
o has a first name
o has a sex
o has a birthdate
o is related to a particular employee in a particular way (e.g., child, spouse, pet)
o is uniquely identified by the combination of her/his first name and the employee
of which (s)he is a dependent
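One way to get a feel for this description is to write it down as record types. The sketch below is a loose first pass, not a finished design: the field names are assumptions, and references between entities are represented simply by storing the other entity's identifying value.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Department:
    name: str                       # unique
    number: int                     # unique
    manager_ssn: str                # the employee who manages the department
    manager_start_date: date        # when that employee became manager
    locations: set = field(default_factory=set)


@dataclass
class Project:
    name: str                       # unique
    number: int                     # unique
    location: str
    controlling_dept: int           # number of the controlling department


@dataclass
class Employee:
    ssn: str                        # uniquely identifies the employee
    name: str
    address: str
    salary: float
    sex: str
    birthdate: date
    dept_number: int                # the one department the employee is assigned to
    supervisor_ssn: Optional[str] = None
    hours_per_project: dict = field(default_factory=dict)  # project number -> weekly hours


@dataclass
class Dependent:
    employee_ssn: str               # the employee this dependent belongs to
    first_name: str                 # unique only together with employee_ssn
    sex: str
    birthdate: date
    relationship: str               # e.g. child, spouse
```

Notice how many fields merely point at another entity (manager_ssn, controlling_dept, dept_number, supervisor_ssn, employee_ssn). In ER modeling these are better expressed as relationships than as attributes.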
Relationship Types, Sets, Roles, and Structural Constraints
Having presented a preliminary database schema for COMPANY, it is now convenient to
clarify the concept of a relationship (which is the last of the three main concepts involved in the
ER model).
Relationship: This is an association between two entities. As an example, one can
imagine a STUDENT entity being associated with an ACADEMIC_COURSE entity via, say, an
ENROLLED_IN relationship.
Whenever an attribute of one entity type refers to an entity (of the same or of a different
entity type), we say that a relationship exists between the two entity types.
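From our preliminary COMPANY schema, we identify the following relationship types
(using descriptive names and ordering the participating entity types so that the resulting phrase
will be in active voice rather than passive):
EMPLOYEE MANAGES DEPARTMENT
EMPLOYEE WORKS_FOR DEPARTMENT
DEPARTMENT CONTROLS PROJECT
EMPLOYEE WORKS_ON PROJECT
EMPLOYEE SUPERVISES EMPLOYEE
DEPENDENT DEPENDS_ON EMPLOYEE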
Degree of a relationship type: Note that, in our COMPANY example, all
relationship instances will be ordered pairs, as each relationship associates an instance from one
entity type with an instance of another (or the same, in the case of SUPERVISES) entity type. Such
relationships are said to be binary, or to have degree two. Relationships with degree three (called
ternary) or more are also possible, but they do not arise as often in practice.
Roles in relationships: Each entity that participates in a relationship plays a particular
role in that relationship, and it is often convenient to refer to that role using an appropriate name.
For example, in each instance of a WORKS_FOR relationship set, the employee entity plays the role
of worker or (surprise!) employee and each department plays the role of employer or (surprise!)
department. Indeed, as this example suggests, often it is best to use the same name for the role as
for the corresponding entity type.
An exception to this rule occurs when the same entity type plays two (or more) roles in
the same relationship. (Such relationships are said to be recursive, which I find to be a
misleading use of that term. A better term might be self-referential.) For example, in each
instance of a SUPERVISES relationship set, one employee plays the role of supervisor and the
other plays the role of supervisee.
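cardinality ratio: for a binary relationship type R between entity types A and B, this
constrains the number of instances of R in which an entity may participate. The cases are: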
1:1 (one-to-one): Under this constraint, no instance of A may participate in more than one
instance of R; similarly for instances of B. In other words, if (a1, b1) and (a2, b2) are
(distinct) instances of R, then neither a1 = a2 nor b1 = b2.
Example: Our informal description of COMPANY says that every department has one
employee who manages it. If we also stipulate that an employee may not
(simultaneously) play the role of manager for more than one department, it follows that
MANAGES is 1:1.
1:N (one-to-many): Under this constraint, no instance of B may participate in more than
one instance of R, but instances of A are under no such restriction. In other words, if (a1,
b1) and (a2, b2) are (distinct) instances of R, then it cannot be the case that b1 = b2.
Example: CONTROLS is 1:N because no project may be controlled by more than one
department. On the other hand, a department may control any number of projects, so
there is no restriction on the number of relationship instances in which a particular
department instance may participate. For similar reasons, SUPERVISES is also 1:N.
N:1 (many-to-one): This is just the same as 1:N but with roles of the two entity types
reversed.
Example: WORKS_FOR and DEPENDS_ON are N:1.
M:N (many-to-many): Under this constraint, there are no restrictions. (Hence, the term
applies to the absence of a constraint!)
Example: WORKS_ON is M:N, because an employee may work on any number of projects
and a project may have any number of employees who work on it.
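The four cases can be checked mechanically on a set of relationship instances. The sketch below reports the tightest cardinality ratio that a given snapshot of (a, b) pairs satisfies. This is only a check on sample data: the actual constraint is a design decision and cannot be inferred from one snapshot. The sample instances are invented.

```python
from collections import Counter


def cardinality_ratio(pairs):
    """Classify a set of (a, b) relationship instances as 1:1, 1:N, N:1 or M:N."""
    pairs = set(pairs)                        # instances of R are distinct
    a_counts = Counter(a for a, _ in pairs)   # instances each a participates in
    b_counts = Counter(b for _, b in pairs)   # instances each b participates in
    a_once = all(n == 1 for n in a_counts.values())
    b_once = all(n == 1 for n in b_counts.values())
    if a_once and b_once:
        return "1:1"
    if b_once:            # no b repeats, but some a does
        return "1:N"
    if a_once:            # no a repeats, but some b does
        return "N:1"
    return "M:N"


# Hypothetical snapshots of four COMPANY relationship sets.
manages = {("E1", "D1"), ("E2", "D2")}
controls = {("D1", "P1"), ("D1", "P2"), ("D2", "P3")}
works_for = {("E1", "D1"), ("E2", "D1")}
works_on = {("E1", "P1"), ("E1", "P2"), ("E2", "P1")}
```

For instance, the CONTROLS snapshot is classified as 1:N because department D1 appears in two instances while no project appears more than once.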
participation: specifies whether or not the existence of an entity depends upon its being
related to another entity via the relationship.
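total participation (or existence dependency): To say that entity type A is constrained
to participate totally in relationship R is to say that if (at some moment in time) R's
instance set is {(a1, b1), ..., (am, bm)}, then (at that same moment) A's instance set must be
{a1, ..., am}. In other words, every instance of A participates in at least one instance of R.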
partial participation: the absence of the total participation constraint! (E.g., not every
employee has to participate in MANAGES; hence we say that, with respect to
MANAGES, EMPLOYEE participates partially. This is not to say that for all employees
to be managers is not allowed; it only says that it need not be the case that all employees
are managers.)
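Total versus partial participation can be tested on a snapshot in the same spirit. The helper below is a minimal sketch with invented sample data. It asks whether every entity of a given type shows up in at least one relationship instance.

```python
def participates_totally(entities, pairs, position):
    """True if every entity appears at `position` (0 or 1) of some pair."""
    participants = {pair[position] for pair in pairs}
    return set(entities) <= participants


# Hypothetical snapshot: (employee, department) instances of MANAGES.
employees = {"E1", "E2", "E3"}
departments = {"D1", "D2"}
manages = {("E1", "D1"), ("E2", "D2")}
```

Here every department has a manager, so DEPARTMENT participates totally in MANAGES, while employee E3 manages nothing, so EMPLOYEE participates only partially.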
Relationship types, like entity types, can have attributes. A good example is WORKS_ON,
each instance of which identifies an employee and a project on which (s)he works. In
order to record (as the specifications indicate) how many hours are worked by each
employee on each project, we include Hours as an attribute of WORKS_ON. Attributes of a
1:1 relationship type can alternatively be placed on one of the participating entity types. For
example, the StartDate attribute of the MANAGES relationship type can be given to either the
EMPLOYEE or the DEPARTMENT entity type.
Weak Entity Types: An entity type that has no set of attributes that qualify as a key is
called weak. (Ones that do are strong.)
An entity of a weak entity type is uniquely identified by the specific entity to which it is
related (by a so-called identifying relationship that relates the weak entity type with its
so-called identifying or owner entity type) in combination with some set of its own
attributes (called a partial key).
Example: A DEPENDENT entity is identified by its first name together with the
EMPLOYEE entity to which it is related via DEPENDS_ON. (Note that this wouldn't work
for former heavyweight boxing champion George Foreman's sons, as they all have the
name "George"!)
Because an entity of a weak entity type cannot be identified otherwise, that type has a
total participation constraint (i.e., existence dependency) with respect to the
identifying relationship.
This should not be taken to mean that any entity type on which a total participation
constraint exists is weak. For example, DEPARTMENT has a total participation
constraint with respect to MANAGES, but it is not weak.
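The identification of a weak entity can be sketched as a composite identifier: the owner's key plus the partial key. The dictionaries and SSN values below are invented for illustration.

```python
def identifier(dependent):
    """Owner's key (employee SSN) combined with the partial key (first name)."""
    return (dependent["employee_ssn"], dependent["first_name"])


def uniquely_identified(dependents):
    """True if no two dependents share the same full identifier."""
    ids = [identifier(d) for d in dependents]
    return len(ids) == len(set(ids))


dependents = [
    {"employee_ssn": "111", "first_name": "Alice"},
    {"employee_ssn": "222", "first_name": "Alice"},   # same name, different owner
    {"employee_ssn": "111", "first_name": "Bob"},
]

# The problem case from the example: several dependents of one employee
# sharing the same first name.
same_name_siblings = [
    {"employee_ssn": "333", "first_name": "George"},
    {"employee_ssn": "333", "first_name": "George"},
]
```

The first list passes the check even though two dependents are both named Alice, because they belong to different employees. The second list fails, which is exactly the situation the partial key cannot handle.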
For more, see: http://jcsites.juniata.edu/faculty/rhodes/dbms/ermodel
Methodology:
1. Use E-R model to get a high-level graphical view of the essential components of the
enterprise and how these components are related
2. We then convert the E-R diagram to SQL DDL, or whatever database model you are
using
E-R Model is not SQL-based. It's not tied to any particular DBMS. It is a conceptual and
semantic model, which attempts to capture meanings rather than an actual implementation.
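To make step 2 of the methodology a little more concrete, here is a minimal sketch that turns a description of two entity types into SQL DDL and runs it against an in-memory SQLite database. The ENTITIES structure and the entity_to_ddl helper are hypothetical and cover only entity types with simple attributes; relationships, weak entities and multi-valued attributes need additional mapping rules.

```python
import sqlite3

# Hypothetical slice of an E-R design: entity name -> list of attributes.
# Each attribute is (name, SQL type, is_part_of_primary_key).
ENTITIES = {
    "department": [("number", "INTEGER", True), ("name", "TEXT", False)],
    "project": [("number", "INTEGER", True), ("name", "TEXT", False),
                ("location", "TEXT", False)],
}


def entity_to_ddl(entity, attributes):
    """Render one entity type as a CREATE TABLE statement."""
    columns = [f"{name} {sql_type}" for name, sql_type, _ in attributes]
    key = [name for name, _, is_key in attributes if is_key]
    columns.append(f"PRIMARY KEY ({', '.join(key)})")
    return f"CREATE TABLE {entity} ({', '.join(columns)})"


ddl = [entity_to_ddl(name, attrs) for name, attrs in ENTITIES.items()]

# Check that the generated statements are accepted by a real SQL engine.
conn = sqlite3.connect(":memory:")
for statement in ddl:
    conn.execute(statement)
```

The same E-R description could just as well be mapped to a different target, which is the point of keeping the conceptual model independent of any DBMS.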
Entities
Relationships among entities
Entity – rectangle
Attribute – oval
Relationship – diamond
Link - line
Entity Type or Set: set of similar objects or a category of entities; they are well defined
Attribute: describes one aspect of an entity type; usually [and best as] a single value and
indivisible (atomic)
Note that the value for an attribute can be a set or list of values, sometimes called "multi-
valued" attributes
This is in contrast to the pure relational model which requires atomic values
E.g., (111111, John, 123 Main St, (stamps, coins))
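The contrast with the relational model can be shown by flattening this record. In the sketch below, the last field is assumed to be the multi-valued attribute (a set of hobbies, judging by the values); the flatten helper is hypothetical.

```python
def flatten(record):
    """Split one record whose last field is multi-valued into atomic rows."""
    *scalars, values = record
    return [(*scalars, value) for value in values]


person = (111111, "John", "123 Main St", ("stamps", "coins"))
rows = flatten(person)
# rows now holds one tuple per hobby, each containing only atomic values.
```

This yields two rows, one per hobby, at the cost of repeating the id, name and address. A relational design would normally avoid that repetition by moving the hobbies into a separate table keyed by the id.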
Entity Schema:
The meta-information of entity type name, attributes (and associated domain), key constraints
Entity Types tend to correspond to nouns; attributes are also nouns albeit descriptions of the
parts of entities
We may have null values for some entity attribute instances – no mapping to domain for those
instances
Keys
Superkey: an attribute or set of attributes that uniquely identifies an entity--there can be many of
these
Candidate key: a superkey such that no proper subset of its attributes is also a superkey
(minimal superkey – has no unnecessary attributes)
Primary key: the candidate key chosen to be used for identifying entities and accessing
records. Unless otherwise noted "key" means "primary key"
Secondary key: attribute or set of attributes commonly used for accessing records, but not
necessarily unique
Foreign key: term used in relational databases (but not in the E-R model) for an attribute that
is the primary key of another table and is used to establish a relationship with that table where it
appears as an attribute also.
So a foreign key value occurs in the table and again in the other table. This conflicts with the
idea that a value is stored only once; however, the idea that a fact is stored once is not
undermined.
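A foreign key can be tried out with Python's built-in sqlite3 module. The table and column names below are assumptions loosely based on the COMPANY example, and the rows are invented. Note that SQLite only enforces foreign keys after PRAGMA foreign_keys = ON has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
conn.execute("CREATE TABLE department (number INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE employee ("
    " ssn TEXT PRIMARY KEY,"
    " name TEXT,"
    " dept_number INTEGER REFERENCES department(number))"   # the foreign key
)
conn.execute("INSERT INTO department VALUES (5, 'Research')")


def try_insert(ssn, name, dept_number):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)",
                     (ssn, name, dept_number))
        return True
    except sqlite3.IntegrityError:
        return False
```

An employee row with dept_number 5 is accepted because department 5 exists, whereas a row pointing at a nonexistent department 99 is rejected. The value 5 appears in both tables, but the fact that department 5 is called Research is stored only once.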
Rectangle -- Entity
Ellipses -- Attribute (underlined attributes are [part of] the primary key)
Dashed ellipses-- derived attribute, e.g. age is derivable from birthdate and current date.
[Drawing notes: keep all attributes above the entity. Lines have no arrows. Use straight lines
only]
Relationship Types may also have attributes in the E-R model. When they are mapped to the
relational model, the attributes become part of the relation. Represented by a diamond on E-R
diagram.
Relationships tend to be verbs or verb phrases; attributes of relationships are again nouns
[Drawing tips: relationship diamonds should connect off the left and right points; Dia can label
those points with cardinality; use Manhattan connecting line (horizontal/vertical zigzag)]
Roles
The role of a relationship type names one of the related entities. The name of the entity is usually
the role name.
e.g., "John" is value of Student role, "CS" value of Department role of MajorsIn
relationship type
e.g., ReportsTo relationship type relates two elements of Employee entity type:
We do not have distinct names for the roles. It is not clear who reports to whom.
Solution: the role name of relationship type need not be same as name of entity type from which
participants are drawn
Roles are edges labeled with role names (omitted if role name = name of entity set). Most
attributes have been omitted.
Think that entities are nouns; relationship types are often verbs
students and departments are the entities (nouns) and roles in relationship types
majors is the relationship type (verb)
Here we have equated the role name (Student) with the name of the entity type (Student) of the
participant in the relationship.
Degree of relationship
The number of roles in the relationship
Note: ternary relationships may sometimes be replaced by two or more binary relationships (see
book Figures 3.5 and 3.13). However, a ternary relationship and the binary relationships that
replace it are not necessarily semantically equivalent.
One-to-one: X-Y is 1:1 when each entity in X is associated with at most one entity in Y, and
each entity in Y is associated with at most one entity in X.
One-to-many: X-Y is 1:M when each entity in X can be associated with many entities in Y, but
each entity in Y is associated with at most one entity in X.
Many-to-many: X-Y is M:M if each entity in X can be associated with many entities in Y, and
each entity in Y can be associated with many entities in X ("many" => one or more, and sometimes
zero)
Key constraint
If every entity participates in exactly one relationship, both a total participation and a key
constraint hold
E.g., if a class is taught by only one faculty member.
Partial participation
A weak entity may have a partial key, called a discriminator, that distinguishes instances of the
weak entity that are related to the same strong entity
Use double rectangle for weak entity, with double diamond for relationship connecting it to its
associated strong entity
Note: not all existence dependent entities are weak – the lack of a key is essential to definition
Role names, Ri, and their corresponding entity sets. Roles must be single valued (the number of
roles is called its degree)
Attribute names, Aj, and their corresponding domains. Attributes in the E-R model may be set or
multi-valued.
Key: Minimum set of roles and attributes that uniquely identify a relationship