Normalization

Uploaded by

qqito07

Anastasia Sulukhia

DB1
Notes on Normalization
--------------------------------------------------------------------------------------------------------------------------
Why is normalization needed at all in the DBMS?

Prevents Anomalies / Improves Data Integrity:

Normalization helps prevent data anomalies:
• Insertion Anomalies: Arise when a new fact cannot be recorded without also supplying
unrelated data that may not exist yet.
• Update Anomalies: Occur when the same fact is stored in several rows and only some of
them are changed, leaving the data inconsistent.
• Deletion Anomalies: Happen when deleting a record unintentionally removes
additional, needed data.

Example of insertion anomaly

It is difficult to insert a new department that has no employees yet into the EMP_DEPT
relation. The only way to do this is to place NULL values in the employee attributes.
This violates entity integrity for EMP_DEPT because its primary key SSN cannot be
null.

Example of deletion anomaly


If we delete from EMP_DEPT an employee record that happens to represent the last
employee working for a particular department, the information concerning that department
is inadvertently lost from the database.
In this case, a deletion anomaly occurs when Borg, James E. leaves the job, since he is the last
representative of department 1. If his row is deleted from the EMP_DEPT table, the information
about department 1 is lost, since it is not stored anywhere else.

Example of update anomaly

If we change the value of an attribute of a particular department in EMP_DEPT, say
the manager of department 5, we must update the tuples of all employees who work in
that department; otherwise, the database will become inconsistent.

Normalization Eliminates Redundancy: Without normalization, data can be duplicated in
multiple tables. This redundancy takes up unnecessary storage and makes it more
challenging to maintain consistency across records. Normalization reduces redundancy by
dividing data into related tables, where each piece of information appears only once.

Non-additive joins

In database terms, spurious tuples are rows that appear in the result of a join but were
not present in the original data, so they represent invalid information. To prevent spurious
tuples, tables must be decomposed so that joining them back is non-additive (lossless): the
join returns exactly the original rows and nothing extra.

Suppose R1 holds (student, course) pairs and R2 holds (course, professor) pairs. When we
join these tables on the course column, which is the only common attribute, we
get "spurious" rows. For each student-course combination in R1, the join
looks for matching course values in R2. Since there are multiple professors for the same
course (like "Programming" in this example), each student entry in R1 is repeated
once for each professor in R2.
Both students appear twice in the joined table, each time with a different professor for the
same course, so the result contains pairings that were never in the original data.

Now suppose R1 holds (student, professor) pairs and R2 holds (professor, course) pairs. This
time we join on the professor column, which is unique for each row in R2.
This prevents duplicate rows because each professor in R1 links directly to a single course
in R2. Since each professor is associated with only one course, the join correctly adds the
course information to each student’s record without creating duplicates. For example, the
join adds "Programming" for Professor Hugh, "Usability" for Professor Seager, etc. No
unnecessary duplications occur because each professor has only one matching course in R2.
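
The two joins described above can be reproduced in a few lines of Python. This is only a minimal sketch: the student names and the second Programming professor ("Lee") are invented for illustration, and the helper names project, natural_join and same_rows are assumptions, not part of the notes.

```python
def project(rows, cols):
    """Keep only the given columns and drop duplicate rows."""
    unique = {tuple(row[c] for c in cols) for row in rows}
    return [dict(zip(cols, values)) for values in unique]


def natural_join(r, s):
    """Join two lists of dict rows on every column name they share."""
    joined = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in a.keys() & b.keys()):
                joined.append({**a, **b})
    return joined


def same_rows(x, y):
    """Compare two relations as sets of rows, ignoring row and column order."""
    as_set = lambda rows: {tuple(sorted(row.items())) for row in rows}
    return as_set(x) == as_set(y)


# Hypothetical original relation: each student has exactly one professor.
original = [
    {"student": "Ann", "course": "Programming", "professor": "Hugh"},
    {"student": "Bob", "course": "Programming", "professor": "Lee"},
    {"student": "Cy", "course": "Usability", "professor": "Seager"},
]

# Decomposition 1: the common attribute is course (not unique in R2).
bad = natural_join(
    project(original, ["student", "course"]),
    project(original, ["course", "professor"]),
)

# Decomposition 2: the common attribute is professor (unique in R2).
good = natural_join(
    project(original, ["student", "professor"]),
    project(original, ["professor", "course"]),
)

print(len(original), len(bad), len(good))
print(same_rows(good, original), same_rows(bad, original))
```

The join on course returns more rows than the original relation (Ann and Bob are each paired with both Programming professors), while the join on professor returns exactly the original rows.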

Functional dependencies

We say that attribute A functionally determines attribute B if, for each
value of A, there is exactly one value of B associated with it.
So, given a value of A, we know the value of B.
This is written as A -> B. (A and B can each be a single attribute or a set of attributes.)

Example:
The ID number of a person determines their first name. This means that if you
know a person’s ID number, you can uniquely identify their first name because
ID numbers are unique for each person. In terms of functional dependency, we
can say that “First Name is functionally dependent on ID”, written ID -> First Name.
This is valid because there’s only one possible first name associated with each unique ID.
The reverse does not hold: several people can share a first name, so First Name does
not determine ID.
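
A functional dependency can be tested against a concrete set of rows. The sketch below is an illustration only: the function name fd_holds and the sample people are made up. Note that sample data can only disprove a dependency; whether it holds in general depends on the meaning of the attributes.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if lhs -> rhs is not violated by the given rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first time we see a key we remember its value; afterwards
        # the same key must always come with the same value.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Hypothetical sample data: two different people share a first name.
people = [
    {"id": 1, "first_name": "Anna"},
    {"id": 2, "first_name": "Anna"},
    {"id": 3, "first_name": "Luka"},
]

print(fd_holds(people, ["id"], ["first_name"]))
print(fd_holds(people, ["first_name"], ["id"]))
```

Here id -> first_name is not violated, while first_name -> id is, because "Anna" appears with two different IDs.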
Dictionary

1. non-key / non-prime attributes – attributes that are not part of any
candidate key in the table

2. key / prime attributes – attributes that are part of at least one candidate key in the
table

3. multivalued attributes, grouping – when a column is not atomic and holds
multiple values in a single cell.

4. nested tables – when a column in a table consists of multiple sub
columns. For example, a Skill column that contains both a skill and a qualification
should be divided into separate skill and qualification columns.

Normal forms

1st Normal Form


• we have a primary key
• attributes are atomic (no multivalued attributes / repeating groups)
• no nested tables

Every table that we work on in this course should be at least in 1st normal
form.

2nd Normal Form
• it is in 1st normal form
• No partial dependency: every non-prime attribute is determined by the
whole of every candidate key, never by just a part of one.

A partial functional dependency exists when a non-prime attribute depends on
only part of a composite candidate key.

For example:
PK (A, B)
non-prime attribute – C

if A->C or B->C, it is a partial dependency

If every candidate key consists of a single attribute, you don’t even have to check for 2nd
normal form, since a partial dependency requires a composite key.
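
The 2nd normal form check can be sketched with attribute closures: take every proper part of every candidate key and see whether it determines a non-prime attribute. This is a minimal sketch under stated assumptions; the function names and the tuple-based representation of dependencies are illustrative choices, not something given in the notes.

```python
from itertools import combinations


def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def partial_dependencies(candidate_keys, fds, all_attrs):
    """Return (key_part, attribute) pairs that violate 2nd normal form."""
    prime = set().union(*candidate_keys)
    non_prime = set(all_attrs) - prime
    found = set()
    for key in candidate_keys:
        # Proper, non-empty parts of the key: sizes 1 .. len(key) - 1.
        for size in range(1, len(key)):
            for part in combinations(sorted(key), size):
                for attr in closure(part, fds) & non_prime:
                    found.add((part, attr))
    return found


# The example above: PK (A, B), non-prime attribute C, and A -> C.
fds = [(("A",), ("C",))]
print(partial_dependencies([("A", "B")], fds, ("A", "B", "C")))

# With a single-attribute key there is nothing to check.
print(partial_dependencies([("A",)], fds, ("A", "C")))
```

The first call reports that the key part (A) determines C; the second returns an empty set because a one-attribute key has no proper parts.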

3rd normal form


• it is in 2nd normal form
• No transitivity – no non-key attribute determines another non-key
attribute, so no non-key attribute is transitively dependent on the
primary key.

For example:

PK (A, B)
Non-prime attributes – C, D

If C->D or D->C, we have transitivity.

(For example, (A, B)->C and C->D, so (A, B)->D holds only through C.)

BCNF

• It is in 3rd normal form
• Every determinant is a candidate key of the relation (can uniquely
identify all attributes in the table)

Determinant – the attribute or set of attributes on the left side of a functional
dependency, i.e. the attributes that determine the others.

3rd normal form and BCNF can differ only when the relation has several candidate keys
that overlap (share an attribute). So, if you have only one
candidate key, or several candidate keys with no attribute in common, and you are in
3rd NF, you are also in BCNF.
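
The BCNF condition can be sketched as a closure test: a non-trivial dependency violates BCNF when its left side does not determine all attributes of the relation. The example reuses the student / course / professor relation; professor -> course follows the earlier text (each professor teaches one course), while {student, course} -> professor is an assumption added here for illustration. Function names are illustrative.

```python
def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(attrs, fds):
    """Return the non-trivial dependencies whose left side is not a superkey."""
    violations = []
    for lhs, rhs in fds:
        if set(rhs) <= set(lhs):
            continue  # trivial dependency, always allowed
        if closure(lhs, fds) != set(attrs):
            violations.append((lhs, rhs))
    return violations


attrs = ("student", "course", "professor")
fds = [
    (("student", "course"), ("professor",)),  # assumed for illustration
    (("professor",), ("course",)),            # one course per professor
]

print(bcnf_violations(attrs, fds))
```

This relation has the overlapping candidate keys {student, course} and {student, professor}. It is in 3rd normal form because every attribute is prime, yet professor -> course is reported because professor alone is not a candidate key.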

Example of normalization from last year's midterm.

BookingID  ArrivalDate  Nights  RoomID  roomPrice  GuestID  GuestName
2024-001   2024-01-06   1       3       65.00      1        Miller
2024-002   2024-01-06   1       11      65.00      2        Summer
2024-002   2024-01-06   1       21      65.00      2        Summer
2024-002   2024-01-06   1       30      35.00      2        Summer
2024-003   2024-01-07   2       30      35.00      3        Mais
2024-003   2024-01-07   2       11      65.00      3        Mais
2024-004   2024-01-07   3       3       65.00      3        Mais

1. Write down 2 candidate keys of the relation.
2. Write down all functional dependencies.
3. Transform the relation step by step into 3rd NF / BCNF. Explain.

Candidate keys: {bookingID, roomID} and {arrivalDate, roomID}

Prime attributes: {bookingID, arrivalDate, roomID}

Functional dependencies:

{bookingID, roomID} → {arrivalDate, nights, roomPrice, guestID, guestName}
{arrivalDate, roomID} → {bookingID, nights, roomPrice, guestID, guestName}
roomID → roomPrice
guestID → guestName
bookingID → {arrivalDate, nights, guestID, guestName}
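
The candidate keys can be double-checked mechanically with attribute closures. The following is a minimal sketch, assuming the dependencies listed above; the two full-key dependencies are reduced to {arrivalDate, roomID} → bookingID, since everything else they state follows from the other three dependencies. The function names are illustrative.

```python
from itertools import combinations

ATTRS = {"bookingID", "arrivalDate", "nights", "roomID",
         "roomPrice", "guestID", "guestName"}

FDS = [
    (("roomID",), ("roomPrice",)),
    (("guestID",), ("guestName",)),
    (("bookingID",), ("arrivalDate", "nights", "guestID", "guestName")),
    (("arrivalDate", "roomID"), ("bookingID",)),
]


def closure(attrs, fds):
    """All attributes determined by attrs under the given dependencies."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attrs, fds):
    """Brute-force search for minimal attribute sets that determine everything."""
    keys = []
    for size in range(1, len(attrs) + 1):
        for combo in combinations(sorted(attrs), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == set(attrs):
                keys.append(candidate)
    return keys


keys = candidate_keys(ATTRS, FDS)
print(keys)
print(set().union(*keys))  # the prime attributes
```

The search finds exactly the two candidate keys given above, and their union gives the prime attributes bookingID, arrivalDate and roomID.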

We choose {arrivalDate, roomID} as the primary key.

This table is already in 1st normal form: it has a primary key, all values are atomic,
and there are no nested tables.

But we are not in 2nd normal form, because the non-prime attribute roomPrice is functionally
dependent on part of the primary key (roomID → roomPrice). The non-prime attributes nights,
guestID and guestName also depend on bookingID, which is only part of the candidate key
{bookingID, roomID}. (arrivalDate depends on bookingID too, but it is a prime attribute, so it
does not count as a 2nd normal form violation.) For these reasons we have partial dependencies.

Transform it into 2nd normal form. We should separate the non-prime attributes that are
partially dependent on a key into new tables with their determinant as the PK.

booking: {bookingID, nights, guestID, guestName}
room: {roomID, roomPrice}
stay: {arrivalDate, roomID, bookingID}

Now, no non-prime attribute is partially dependent on a key in any table.

We use foreign keys to be able to join these tables when needed. We set the FKs in a way that
maintains the non-additive join property.

Are we in 3rd NF?

guestID → guestName is a transitive dependency in the relation booking: {bookingID, nights,
guestID, guestName}, since a non-prime attribute determines another non-prime attribute
(bookingID → guestID and guestID → guestName).

To fix this, we separate the columns involved in the transitive dependency from the
original table and put them in a separate table, making the determinant the PK.

guest: {guestID, guestName} (table created because of the transitively dependent columns)
room: {roomID, roomPrice}
booking: {bookingID, nights, guestID}
stay: {arrivalDate, roomID, bookingID}

No transitivity is left in any table.

Is it in BCNF? Since we had two composite candidate keys with an overlapping attribute, we
know that reaching 3rd normal form does not automatically put us in BCNF.
In BCNF every determinant should be a candidate key of its table. Let’s check.

In relation stay we have the FDs

{bookingID, roomID} → arrivalDate
{arrivalDate, roomID} → bookingID
bookingID → arrivalDate

bookingID determines arrivalDate but not roomID, so bookingID is a determinant that is not a
candidate key of relation stay. Therefore we are not in BCNF.

To fix this, we move arrivalDate out of stay into a new table with its determinant
bookingID as the PK:

guest: {guestID, guestName}
room: {roomID, roomPrice}
booking: {bookingID, nights, guestID}
stay: {roomID, bookingID}
extra: {bookingID, arrivalDate}

Since booking and extra have the same PK, and adding arrivalDate to booking would cause
neither a partial nor a transitive dependency, we can merge extra and booking into one relation.

Finally, we get these relations:

guest: {guestID, guestName}
room: {roomID, roomPrice}
booking: {bookingID, nights, guestID, arrivalDate}
stay: {roomID, bookingID}
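
As a sanity check, the sample rows from the midterm table can be split into the four final relations and joined back together. This is a minimal sketch; the helper names project, natural_join and as_set are illustrative. If the decomposition is non-additive, the rebuilt table should contain exactly the original rows.

```python
COLUMNS = ["bookingID", "arrivalDate", "nights", "roomID",
           "roomPrice", "guestID", "guestName"]

DATA = [
    ("2024-001", "2024-01-06", 1, 3, 65.00, 1, "Miller"),
    ("2024-002", "2024-01-06", 1, 11, 65.00, 2, "Summer"),
    ("2024-002", "2024-01-06", 1, 21, 65.00, 2, "Summer"),
    ("2024-002", "2024-01-06", 1, 30, 35.00, 2, "Summer"),
    ("2024-003", "2024-01-07", 2, 30, 35.00, 3, "Mais"),
    ("2024-003", "2024-01-07", 2, 11, 65.00, 3, "Mais"),
    ("2024-004", "2024-01-07", 3, 3, 65.00, 3, "Mais"),
]
rows = [dict(zip(COLUMNS, values)) for values in DATA]


def project(rows, cols):
    """Keep only the given columns and drop duplicate rows."""
    unique = {tuple(row[c] for c in cols) for row in rows}
    return [dict(zip(cols, values)) for values in unique]


def natural_join(r, s):
    """Join two lists of dict rows on every column name they share."""
    joined = []
    for a in r:
        for b in s:
            if all(a[c] == b[c] for c in a.keys() & b.keys()):
                joined.append({**a, **b})
    return joined


def as_set(rows):
    """Order-independent view of a relation, for comparison."""
    return {tuple(sorted(row.items())) for row in rows}


guest = project(rows, ["guestID", "guestName"])
room = project(rows, ["roomID", "roomPrice"])
booking = project(rows, ["bookingID", "nights", "guestID", "arrivalDate"])
stay = project(rows, ["roomID", "bookingID"])

rebuilt = natural_join(natural_join(natural_join(stay, booking), guest), room)

print(len(guest), len(room), len(booking), len(stay))
print(as_set(rebuilt) == as_set(rows))
```

Each fact is now stored once (3 guests, 4 rooms, 4 bookings, 7 stays), and joining along the foreign keys bookingID, guestID and roomID gives back the seven original rows with no spurious tuples.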
