Lecture 5
Minimizing redundancy implies minimizing redundant storage of the same
information and reducing the need for multiple updates to maintain
consistency across multiple copies of the same information.
Imparting Clear Semantics to Attributes in Relations
Design a relation schema so that it is easy to explain its meaning.
Do not combine attributes from multiple entity types and
relationship types into a single relation.
Eliminating Redundant Information and Update Anomalies
In EMP_DEPT, the attribute values pertaining to a particular department
(Dnumber, Dname, Dmgr_ssn) are repeated for every employee who
works for that department.
In contrast, each department’s information appears only once in the
DEPARTMENT relation.
Update anomalies can be classified into insertion anomalies, deletion
anomalies, and modification anomalies.
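The anomalies can be made concrete with a small sketch. The rows below are a hypothetical EMP_DEPT state: the employee names and Ssn values are invented for illustration, and only Dnumber, Dname, and Dmgr_ssn come from the notes above.

```python
# Hypothetical EMP_DEPT state: department data is repeated for every employee.
emp_dept = [
    {"Ename": "Smith", "Ssn": "111", "Dnumber": 5, "Dname": "Research", "Dmgr_ssn": "222"},
    {"Ename": "Wong", "Ssn": "222", "Dnumber": 5, "Dname": "Research", "Dmgr_ssn": "222"},
    {"Ename": "Wallace", "Ssn": "333", "Dnumber": 4, "Dname": "Administration", "Dmgr_ssn": "333"},
]


def department_names(rows, dnumber):
    """Return the set of distinct names recorded for one department number."""
    return {row["Dname"] for row in rows if row["Dnumber"] == dnumber}


# Modification anomaly: renaming department 5 in only one tuple
# leaves two conflicting names for the same department.
emp_dept[0]["Dname"] = "R&D"
inconsistent = department_names(emp_dept, 5)

# Deletion anomaly: deleting the last employee of department 4
# also deletes the only record that department 4 exists.
remaining = [row for row in emp_dept if row["Ssn"] != "333"]
dept4_still_known = any(row["Dnumber"] == 4 for row in remaining)
```

An insertion anomaly is the mirror image of the deletion case: in this layout a new department with no employees can only be recorded by filling the employee attributes with NULLs.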
Avoiding NULL Values
If many of the attributes do not apply to all tuples in the relation, we end
up with many NULLs in those tuples.
This can waste space at the storage level and may also lead to problems
with understanding the meaning of the attributes.
SELECT and JOIN operations involve comparisons; if NULL values are
present, a comparison is neither true nor false, so tuples may be left
out of a result unexpectedly.
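A minimal sketch of the comparison problem, using Python's standard sqlite3 module with an in-memory database. The table names, column names, and rows are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ename TEXT, dno INTEGER)")
conn.execute("CREATE TABLE department (dnumber INTEGER, dname TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("Smith", 5), ("Borg", None)])
conn.executemany("INSERT INTO department VALUES (?, ?)",
                 [(5, "Research"), (None, "Unassigned")])

# JOIN: NULL = NULL is not true, so Borg matches no department row,
# not even the row whose dnumber is also NULL.
joined = conn.execute(
    "SELECT e.ename, d.dname FROM employee e "
    "JOIN department d ON e.dno = d.dnumber"
).fetchall()

# SELECT: Borg satisfies neither dno = 5 nor dno <> 5.
equal_5 = conn.execute("SELECT ename FROM employee WHERE dno = 5").fetchall()
not_5 = conn.execute("SELECT ename FROM employee WHERE dno <> 5").fetchall()

# Only an explicit IS NULL test finds the tuple.
is_null = conn.execute("SELECT ename FROM employee WHERE dno IS NULL").fetchall()
conn.close()
```

Here the two WHERE conditions look exhaustive, yet together they return only Smith; Borg appears only under the IS NULL test.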
A functional dependency, denoted by X → Y, between two sets of attributes
X and Y that are subsets of R specifies a constraint on the possible tuples
that can form a relation state r of R.
The constraint is that, for any two tuples t1 and t2 in r that have t1[X] =
t2[X], they must also have t1[Y] = t2[Y].
This means that the values of the Y component of a tuple in r depend on, or
are determined by, the values of the X component.
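The constraint can be checked against one relation state with a short sketch. The function name fd_holds and the sample rows are hypothetical. Note that a single state can only show that a dependency is violated or not violated in that state; the dependency itself is a property of the schema R.

```python
def fd_holds(rows, X, Y):
    """Return True if no two tuples in rows agree on X but differ on Y."""
    seen = {}
    for t in rows:
        x_val = tuple(t[a] for a in X)
        y_val = tuple(t[a] for a in Y)
        # Remember the first Y value seen for each X value and
        # fail as soon as a different Y value shows up for it.
        if seen.setdefault(x_val, y_val) != y_val:
            return False
    return True


# Illustrative relation state (values are made up).
r = [
    {"Ename": "Smith", "Dnumber": 5, "Dname": "Research"},
    {"Ename": "Wong", "Dnumber": 5, "Dname": "Research"},
    {"Ename": "Wallace", "Dnumber": 4, "Dname": "Administration"},
]

dnumber_determines_dname = fd_holds(r, ["Dnumber"], ["Dname"])
dnumber_determines_ename = fd_holds(r, ["Dnumber"], ["Ename"])
```

In this state Dnumber → Dname is not violated, while Dnumber → Ename is violated by the two tuples for department 5.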
Codd proposed the normalization process (1972) that takes a relation
schema through a series of tests to certify whether it satisfies a certain
normal form.
The normalization procedure provides database designers with the
following:
A formal framework for analyzing relation schemas based on their keys and on
the functional dependencies among their attributes.
A series of normal form tests that can be carried out on individual relation
schemas.
The normal form of a relation refers to the highest normal form condition
that it meets, and hence indicates the degree to which it has been
normalized.
First normal form (1NF) requires that all the attributes in a relation have
atomic domains.
The values in an atomic domain are indivisible units.
In addition, the value of any attribute in a tuple must be a single value
from the domain of that attribute.
First normal form also disallows multivalued attributes that are
themselves composite.
These are called nested relations because each tuple can have a
relation within it.
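One common way to remove a multivalued attribute is to emit one tuple per value. The sketch below assumes a hypothetical DEPARTMENT state with a multivalued Dlocations attribute; the helper name flatten and the data are illustrative only.

```python
def flatten(rows, attr):
    """Replace a multivalued attribute with one tuple per single value."""
    return [{**row, attr: value} for row in rows for value in row[attr]]


# Not in 1NF: Dlocations holds a set of values in each tuple.
department = [
    {"Dnumber": 5, "Dlocations": ["Bellaire", "Sugarland", "Houston"]},
    {"Dnumber": 4, "Dlocations": ["Stafford"]},
]

# In 1NF: every attribute value is a single atomic value.
# The key of the flattened relation becomes (Dnumber, Dlocations).
dept_locations = flatten(department, "Dlocations")
```

Flattening in place repeats Dnumber for every location, so in practice the multivalued attribute is usually moved into its own relation together with the original key.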
A relation schema R is in 2NF if every nonprime attribute A in R is
fully functionally dependent on the primary key of R.
In other words, a relation schema R is in 2NF if no nonprime
attribute A in R is partially dependent on any key of R.
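The 2NF test can be sketched by computing attribute closures of the proper subsets of a key. The schema, its dependencies, and the function names below are assumptions chosen for illustration (an EMP_PROJ-style relation with key {Ssn, Pnumber}), and the check covers a single given key only.

```python
from itertools import combinations


def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def partial_dependencies(key, nonprime, fds):
    """Map each proper, nonempty subset of key to the nonprime attributes it determines."""
    found = {}
    for size in range(1, len(key)):
        for subset in combinations(sorted(key), size):
            determined = closure(subset, fds) & nonprime
            if determined:
                found[subset] = determined
    return found


# Hypothetical schema and dependencies.
fds = [
    ({"Ssn", "Pnumber"}, {"Hours"}),
    ({"Ssn"}, {"Ename"}),
    ({"Pnumber"}, {"Pname", "Plocation"}),
]
key = {"Ssn", "Pnumber"}
nonprime = {"Hours", "Ename", "Pname", "Plocation"}

violations = partial_dependencies(key, nonprime, fds)
```

An empty result means the relation passes the test for that key. Here Ename depends on Ssn alone and Pname and Plocation depend on Pnumber alone, so the relation is not in 2NF; only Hours is fully dependent on the key.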
A relation schema R is in 3NF if it satisfies 2NF and no nonprime
attribute of R is transitively dependent on the primary key.
A functional dependency X → Y in a relation schema R is a transitive
dependency if there exists a set of attributes Z in R that is neither a
candidate key nor a subset of any key of R, and both X → Z and Z → Y
hold.
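A minimal sketch of the transitive dependency test for a relation with a single key. The dependencies below are an assumed EMP_DEPT-style example (Ssn and Ename are not named in these notes), and the function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs of sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def transitive_dependencies(key, fds):
    """Return (Z, Y) pairs where key -> Z and Z -> Y hold, and Z is
    neither the key nor a subset of it (single-key case only)."""
    reachable = closure(key, fds)
    found = []
    for z, y in fds:
        if z <= key:
            continue  # Z is the key or part of it, so not transitive
        dependent = y - key - z
        if z <= reachable and dependent:
            found.append((z, dependent))
    return found


# Assumed dependencies for an EMP_DEPT-style relation with key {Ssn}.
fds = [
    ({"Ssn"}, {"Ename", "Dnumber"}),
    ({"Dnumber"}, {"Dname", "Dmgr_ssn"}),
]
result = transitive_dependencies({"Ssn"}, fds)
```

Here Z = {Dnumber}: Ssn → Dnumber and Dnumber → {Dname, Dmgr_ssn} both hold, so Dname and Dmgr_ssn depend transitively on the key and the relation is not in 3NF.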