Database - Vocational Training Council



Chapter 7 – Logical Design for Relational Systems

1. Dependency

A dependency is a relationship among attributes (not a relationship among entities) usually in the same table.

1.1 Functional Dependency

Y is functionally dependent on X ( X ( Y ), if for a given value of X a unique value of Y can be determined.

In relational tables, each attribute should be functionally dependent on the primary key.

Examples:

CUSTOMER ( SEQ_ID, NAME, ADDRESS , PHONE )

NAME, ADDRESS and PHONE are each functionally dependent on SEQ_ID.

SEQ_ID ( NAME

SEQ_ID ( ADDRESS

SEQ_ID ( PHONE

or

SEQ_ID ( NAME, ADDRESS , PHONE

1.2 Full Functional Dependency

Each of the non-key attributes is functionally dependent on the entire primary key and not on a part of the key.

Examples:

CUSTOMER ( SEQ_ID, NAME, ADDRESS , PHONE )

NAME, ADDRESS and PHONE are fully functionally dependent on SEQ_ID (the primary key).

ORDER ( ORDER_NUM, PROD_ID, PROD_NAME, ORDER_QTY )

PROD_NAME is dependent on PROD_ID only, not fully functionally dependent on the entire primary key

PROD_ID ( PROD_NAME

ORDER_QTY is dependent on ORDER_NAME and PROD_ID, fully functionally dependent on the entire primary key

ORDER_NUM, PROD_ID ( ORDER_QTY

2. Normalization

• A process that analyzes the dependency between attributes, and usually results in decomposing a table into two or more tables.

• A JOIN of the decomposed tables must produce the original table with no loss (decomposed tables should not be Join Dependent)

• Normalized tables are flexible, stable and easier to maintain than de-normalized tables. A well-structured relation (table) contains minimal redundancy and allows users to insert, modify, and delete the rows in a table without errors or inconsistencies.

• Decomposing tables required the use of JOIN to retrieve the data from the decomposed tables. This may affect performance.

3. Database Anomalies with De-normalized Tables

Problems occur when tables are not in the proper normal form.

Example:

ORDER is a de-normalized table.

ORDER ( ORDER_NUM, CUST_NO, CUST_NAME, CUST_PHONE, ORDER_DATE, PART_ID, PART_NAME, ORDER_QTY )

Insert

• Insert problems when part of the primary key is unknown

• For example, an attempt to enter a new customer that does not have an ORDER_NUM.

Delete

• Delete problems when a row contains important information

• For example, an attempt to delete order number 7027 will delete the row and also the customer information

Update

• Update problems when there are unnecessary redundancy in the data

• For example, an attempt to update a customer name (change John to Johnson ) requires updates to all rows where the customer name is John.

4. Normalization Process

4.1 First Normal Form (1NF)

The object of First Normal Form is to eliminate repeating columns for a single row in the table.

• In 1NF table must be flat, a cell in a table can contain single value only

• No repeating groups in tables.

Normalization process

• Remove the repeating group from the table and form two tables, one with the non-repeating data and the other with the repeating data.

• Add one to many relationships between the two tables

Example:

De-normalized

ORDER ( ORDER_NUM, CUST_NO, CUST_NAME, CUST_PHONE, ORDER_DATE, PART_ID, PART_NAME, ORDER_QTY )

← ORDER contains many parts, so PART_ID, PART_NAME and ORDER_QTY is a repeating group.

1NF

ORDER ( ORDER_NUM, CUST_NO, CUST_NAME, CUST_PHONE, ORDER_DATE )

PART_ORDER ( ORDER_NUM, PART_ID, PART_NAME, ORDER_QTY )

4.2 Second Normal Form (2NF)

• The table is in 1NF.

• Each of the non-key attributes is functionally dependent on the entire primary key (Full Functional Dependency) and not on part of the key.

• Table with single primary key attribute is automatically in 2NF

Normalization process

• Remove attributes that are not fully functionally dependent from the table and form two or more tables, one with fully functionally dependent attributes and the other(s) with the partly functionally dependent attributes.

• Add relationships between the tables(1:1 or 1:M)

Example:

De-normalized

PART_ORDER ( ORDER_NUM, PART_ID, PART_NAME, ORDER_QTY )

← Attribute PART_NAME, is dependent on part of the primary key only, the PART_ID.

PART_ID ( PART_NAME

2NF

PART_ORDER ( ORDER_NUM, PART_ID, ORDER_QTY )

PART ( PART_ID, PART_NAME )

4.3 Third Normal Form (3NF)

• The table is in 2NF.

• No non-key attribute is dependent on another non-key attribute (no Transitive Dependency)

Normalization Process

Remove attributes that are transitive dependent from the table and form two tables, one with the non-transitive dependent attributes and the other with the transitive dependent attributes

Add relationships between the table (1:1 or 1:M)

Example:

De-normalized

ORDER ( ORDER_NUM, CUST_NO, CUST_NAME, CUST_PHONE, ORDER_DATE )

← Attributes CUST_NAME, CUST_PHONE, are transitive dependent on CUST_ID

CUST_NO ( CUST_NAME, CUST_PHONE

3NF

ORDER ( ORDER_NUM, CUST_NO, ORDER_DATE )

CUSTOMER ( CUST_NO, CUST_NAME, CUST_PHONE )

4.4 Boyce-Codd Normal Form (BCNF) *

• The new definition actually defines a normal form that is strictly stronger than the old 3NF, it is better to introduce a new name for it, instead of just continuing to call it 3NF, hence the term Boyce/Codd Normal Form (BCNF).

• Every relation in BCNF is also 3NF; however, a relation in 3NF is not necessarily in BCNF.

• No prime attribute is dependent on another non-key attribute.

• Every determinant in a table is a candidate key.

Normalization Process

Remove prime attributes that are dependent on non-key attribute and form two tables.

Add relationships between the tables (1:1 or 1:M)

Example:

De-normalized ( in 2NF - attributes AREA_CODE and COUNTRY added)

ORDER ( ORDER_NUM, CUST_NO, CUST_NAME, CUST_PHONE, AREA_CODE, COUNTRY_NAME, ORDER_DATE )

← Attributes CUST_NAME, CUST_PHONE, AREA_CODE, COUNTRY_NAME are transitive dependent on CUST_NO

CUST_NO ( CUST_NAME, CUST_PHONE, AREA_CODE, COUNTRY_NAME

3NF

ORDER ( ORDER_NUM, CUST_NO, ORDER_DATE )

CUSTOMER ( CUST_NO, CUST_NAME, CUST_PHONE, AREA_CODE, COUNTRY_NAME )

De-normalized ( in 3NF )

CUSTOMER ( CUST_NO, CUST_NAME, CUST_PHONE, AREA_CODE, COUNTRY_NAME )

← Assume attributes COUNTRY_NAME, CUST_PHONE is a candidate key in CUSTOMER table

COUNTRY_NAME, CUST_PHONE ( CUST_NO, CUST_NAME, AREA_CODE

← Non-key attribute AREA_CODE determines the prime attribute of candidate key, COUNTRY_NAME

AREA_CODE ( COUNTRY_NAME

(e.g. Area Code 604 for Vancouver, 416 for Toronto .. in country CANADA)

BCNF

CUSTOMER ( CUST_NO, CUST_NAME, CUST_PHONE, AREA_CODE )

CUST_AREA ( AREA_CODE, COUNTRY )

5. Normalization Process Example

An order system contains the following:

ORDER ( ORDER_NUM, SUPPLIER_ID, SUPPLIER_NAME,

ORDER_DATE, PRODUCT_ID, PRODUCT_NAME, ORDER_QTY, UNIT_COST, TOTAL_COST )

• An order is supplied by one supplier

• Order contains many products

1. Step#1 – Remove Unnecessary Attributes

← TOTAL_COST is a derived attribute (UNIT_COST*ORDER_QTY). This attribute can be removed without losing any information

ORDER ( ORDER_NUM, SUPPLIER_ID, SUPPLIER_NAME,

ORDER_DATE, PRODUCT_ID, PRODUCT_NAME, ORDER_QTY, UNIT_COST )

2. Step#2 – Normalize the System to 1NF

← ORDER contains many products, PRODUCT_ID, PRODUCT_NAME, ORDER_QTY, UNIT_COST attributes belong to a repeating group

ORDER ( ORDER_NUM, SUPPLIER_ID, SUPPLIER_NAME,

ORDER_DATE, PRODUCT_ID, PRODUCT_NAME, ORDER_QTY, UNIT_COST )

1NF

ORDER ( ORDER_NUM, SUPPLIER_ID, SUPPLIER_NAME,

ORDER_DATE )

ORDER_LINE ( ORDER_NUM, PRODUCT_ID, PRODUCT_NAME, ORDER_QTY, UNIT_COST )

3. Step#3 – Normalize the System to 2NF

← PRODUCT_NAME and UNIT_COST, in ORDER_LINE table, are not fully functionally dependent. They are dependent on PRODUCT_ID only (part of the primary key)

ORDER ( ORDER_NUM, SUPPLIER_ID, SUPPLIER_NAME,

ORDER_DATE )

ORDER_LINE ( ORDER_NUM, PRODUCT_ID, PRODUCT_NAME, ORDER_QTY, UNIT_COST )

2NF

ORDER ( ORDER_NUM, SUPPLIER_ID, SUPPLIER_NAME,

ORDER_DATE )

PRODUCT ( PRODUCT_ID, PRODUCT_NAME, UNIT_COST )

ORDER_LINE ( ORDER_NUM, PRODUCT_ID, ORDER_QTY )

5.5 Step#4 – Normalize the System to 3NF

← SUPPLIER_NAME, in ORDER table, is transitively dependent on SUPPLIER_ID

ORDER ( ORDER_NUM, SUPPLIER_ID, SUPPLIER_NAME,

ORDER_DATE )

PRODUCT ( PRODUCT_ID, PRODUCT_NAME, UNIT_COST )

ORDER_LINE ( ORDER_NUM, PRODUCT_ID, ORDER_QTY )

3NF

ORDER ( ORDER_NUM, SUPPLIER_ID, ORDER_DATE )

SUPPLIER ( SUPPLIER_ID, SUPPLIER_NAME )

PRODUCT ( PRODUCT_ID, PRODUCT_NAME, UNIT_COST )

ORDER_LINE ( ORDER_NUM, PRODUCT_ID, ORDER_QTY )

5.6 Step#5 – Normalize the System to BCNF

All determinants are candidate keys. All tables are in BCNF.

6. Logical Database Design Consideration

6.1 Synonyms

In some situations, two (or more) attributes may have different names but the same meaning, as when they describe the same characteristic of an entity. Such attributes are called synonyms.

← When merging the relations that contain synonyms, you should obtain agreement from users on a single, standardized name for the attribute and eliminate any other synonyms.

STUDENT1 ( STUDENT_ID, NAME )

STUDENT2 ( MATRICULATION_NO, NAME, ADDRESS)

Social Security Number (SSN) area identical attributes One possible resolution would be to use new attribute name and merge the two relations:

STUDENT ( SSN, NAME, ADDRESS)

6.2 Homonyms

An attribute that may have more than one meaning is called a homonym.

STUDENT1 ( STUDENT_ID, NAME, ADDRESS )

STUDENT2 ( STUDENT_ID, NAME, PHONE, ADDRESS)

The attribute ADDRESS in STUDENT1 refers to a student’s campus address, while in STUDENT2 the same attribute refers to a student’s home address.

← To resolve this conflict, we would probably need to create new attribute names, so that the merged relation would become

STUDENT ( STUDENT_ID, NAME, PHONE, CAMPUS_ADDRESS, HOME_ADDRESS)

6.3 Transitive Dependencies

When two 3NF relations are merged to form a single relation, transitive dependencies may result.

STUDENT1 ( STUDENT_ID, MAJOR )

STUDNET2 ( STUDENT_ID, ADVISOR )

Since STUDENT1 and STUDENT2 have the same primary key, the two relations may be merged:

STUDENT ( STUDENT_ID, MAJOR, ADVISOR )

← Suppose that each major has exactly one advisor. In this case, ADVICER is functionally dependent on MAJOR.

MAJOR ( ADVICER

If preceding functional dependency exists, then STUDENT is in 2NF. Remove transitive dependency to form 3NF.

STUDENT ( STUDENT_ID, MAJOR )

STUDENT ( MAJOR, ADVISOR )

6.4 Supertype/Subtype Relationship

Supertpe and Subtype:

The supertype PATIENT contains all patients information and the subtype INPATIENT and the subtype OUTPATIENT contain information of the patients stayed in hospital and patients consulting from outside accordingly.

← Supertype/subtype relationships may be hidden in user views or relations.

Suppose that we have following two hospital relations:

PATIENT1 ( PATIENT_ID, NAME, ADDRESS )

PATIENT2 ( PATIENT_ID, ROOM_NO )

Initially, two relations can be merged into a single PATIENT relation, because of same key PATIENT_ID. However, PATIENT1 actually contains attributes common to all patients. PATIENT2 contains an attribute ROOM_NO that is only of inpatients.

← We should create supertype/subtype relationships for these entities:

PATIENT ( PATIENT_ID, NAME, ADDRESS )

INPATIENT ( PATIENT_ID, ROOM_NO )

OUTPATIENT ( PATIENT_ID, DATE_TREATED, NEXT_APPOINTMENT)

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download