Chapter 5



Chapter 5

The Relational Model and Normalization

Answers to Group I Questions

5.1 What restrictions must be placed on a table for it to be considered a relation?

A cell contains one and only one data value; a column must contain data of the same type; no row can be duplicated; the order of the columns is immaterial; the order of the rows is immaterial.

5.2 Define the following terms: relation, tuple, attribute, file, record, field, table, row, column.

Relation is a two-dimensional table that meets the restrictions in question 5.1; tuple is a row of a relation; attribute is a column of a relation; file is often considered the same as relation; record same as tuple; file same as attribute.

5.3 Define functional dependency. Give an example of two attributes that have a functional dependency, and give an example of two attributes that do not have a functional dependency.

A functional dependency is a relationship between attributes such that given the value of one attribute it is possible to determine the value of the other attribute. Example of functional dependency: Name--->Phone#. Example of attributes that are not functionally dependent: Age and Address.

5.4 If SID functionally determines Activity, does this mean that only one value of SID can exist in the relation? Why or why not?

No, a particular value of SID can occur many times in the relation. A determinant (such as SID) is not necessarily unique within a relation. However, a particular value of SID will have only one corresponding value of Activity, no matter how many rows they appear in.

5.5 Define determinant.

A determinant is an attribute whose value enables us to obtain the value(s) of other related attributes. It appears on the left side of a functional dependency. Thus, in

A--->B, the determinant is A.

5.6 Give an example of a relation having a functional dependency in which the determinant has two or more attributes.

TOWN-RESIDENTS (Name, Age, Citizenship, Voter-eligibility) where (Age, Citizenship)--->Voter-eligibility.

5.7 Define key.

A key is a group of one or more attributes that uniquely identifies a tuple.

5.8 If SID is a key of a relation, is it a determinant? Can a given value of SID occur more than once in the relation?

Yes, SID is a determinant. No, multiple values of SID may not occur in a relation in which it is the key.

5.9 What is a deletion anomaly? Give an example other than one in this text.

A deletion anomaly occurs when facts about two themes are lost with one deletion. Example: the relation VENDOR-PART (VendorName, Phone, Part#), and assume Vendor 123 supplies only one part. When that part is deleted, information about the vendor is deleted from the database as well.

5.10 What is an insertion anomaly? Give an example other than one in this text.

An insertion anomaly occurs when insertion of a fact about one theme requires a fact about another theme. Example: in VENDOR-PART in the answer to question 5.9, suppose engineering is developing a product that requires a new part, but the vendor has not been selected. With an insertion anomaly you cannot add the part to the database until the vendor is also added.

5.11 Explain the relationship of first, second, third, Boyce–Codd, fourth, fifth, and domain/key normal forms.

Each higher normal form includes the one(s) preceding it. A relation in 3NF is also in 1NF and 2NF.

5.12 Define second normal form. Give an example of a relation in 1NF but not in 2NF. Transform the relation into relations in 2NF.

A relation is in 2NF if all nonkey attributes are dependent on all of the keys. This relation is not in 2NF:

LINE-ITEM (OrderNumber, ItemNumber, Description).

Key: (OrderNumber, ItemNumber).

Dependencies: (OrderNumber, ItemNumber)--->Description.

ItemNumber--->Description.

These equivalent relations are in 2NF:

LINE-ITEM (OrderNumber, ItemNumber).

STOCK (ItemNumber, Description).

5.13 Define third normal form. Give an example of a relation in 2NF but not in 3NF. Transform the relation into relations in 3NF.

A relation is in 3NF if it is in 2NF and has no transitive dependencies. This relation is not in 3NF:

DOG (Name, Breed, MaxSize)

Key: Name

Dependencies: Name--->Breed

Breed--->MaxSize

(transitively)Name--->MaxSize

These equivalent relations are in 3NF:

REQUIREMENT (Name, Breed)

OFFERINGS (Breed, MaxSize)

5.14 Define BCNF. Give an example of a relation in 3NF but not in BCNF. Transform the relation into relations in BCNF.

A relation is in BCNF if every determinant is a candidate key. Consider this relation: FAC-OFFICE (FID, Department, Building, Office). Assume faculty members in the same department have offices in the same building. The key is FID. FID determines Department, Building, and Office. But Department (not a candidate key) determines Building. These relations are in BCNF:

FACULTY (FID, Department, Office)

DEPARTMENT-LOCATIONS (Department, Building)

5.15 Define multi-value dependency. Give an example.

A multi-valued dependency exists if there are three attributes in a relation and an attribute is dependent on only one of the other attributes. An example is: PROF (Name, Hobby, Class-taught). Assume professors have many hobbies and teach many classes. The key of the relation is (Name, Hobby, Class-taught). When a professor takes up a new hobby, all of the class-taught values must be duplicated. This requirement makes no sense.

5.16 Why are multi-value dependencies not a problem in relations with only two attributes?

Multi-valued dependencies must exist in pairs because a multi-valued dependency calls for two independent attributes that can have multiple values.

5.17 Define fourth normal form. Give an example of a relation in BCNF but not in 4NF. Transform the relation into relations in 4NF.

A relation is in 4NF if it is in BCNF (every determinant is a candidate key) and it has no multi-valued dependencies. The following relation is in BCNF but not 4NF: EMPLOYEE-HISTORY (Name, Project, PublicServiceActivity). Project can be multi-valued because an employee could have worked on many projects. PublicServiceActivity can also be multi-valued. But Project and PublicServiceActivity are unrelated. These relations are in 4NF:

EMPLOYEE-WORK-HIST (Name, Project).

EMPLOYEE-SERVICE-HIST (Name, PublicServiceActivity)

5.18 Define domain/key normal form. Why is it important?

A relation is in DK/NF if every constraint on the relation is a logical consequence of the definition of keys and domains. DK/NF is important because if a relation is in DK/NF, then there will be NO modification anomalies.

5.19 Transform the following relation into DK/NF. Make and state the appropriate assumptions about functional dependencies and domains.

EQUIPMENT (Manufacturer, Model, AcquisitionDate, BuyerName, BuyerPhone, PlantLocation, City, State, ZIP)

Assumptions:

BuyerName --> BuyerPhone, PlantLocation, City, State, Zip

Zip --> City, State

(Manufacturer, Model, BuyerName) --> AcqDate

Relations:

BUYER (BuyerName, BuyerPhone, PlantLocation, City, State, Zip)

PURCHASE (Manufacturer, Model, BuyerName, AcqDate)

Ignore Zip functional dependency.

5.20 Transform the following relation into DK/NF. Make and state the appropriate assumptions about functional dependencies and domains.

INVOICE (Number, CustomerName, CustomerNumber,

CustomerAddress, ItemNumber, ItemPrice, ItemQuantity,

SalespersonNumber, SalespersonName, Subtotal, Tax, TotalDue)

Assumptions:

Number --> CustomerNumber, ItemNumber, ItemQuantity, SalespersonNumber, SubTotal, Tax, TotalDue

ItemNumber --> ItemPrice

CustomerNumber -- > CustomerAddress

SalespersonNumber --> SalespersonName

Relations:

INVOICE (Number, CustomerNumber, ItemNumber, ItemQuantity, SalespersonNumber, SubTotal, Tax, TotalDue)

ITEM (ItemNumber, ItemPrice)

CUSTOMER (CustomerNumber, CustomerAddress)

SALESPERSON (SalespersonNumber, SalespersonName)

5.21 Answer Question 5.20 again, but this time add attribute CustomerTaxStatus (0 if nonexempt, 1 if exempt). Also add the constraint that there will be no tax if CustomerTaxStatus 1.

CUSTOMER becomes two relations:.

EX-CUSTOMER (CustomerNumber, CustomerName, CustomerAddress, CustomerTaxStatus).

Constraint: CustomerTaxStatus = 1.

NOT-EX-CUSTOMER (CustomerNumber, CustomerName, CustomerAddress, CustomerTaxStatus).

Constraint: Customer-tax-status = 0.

5.22 Give an example, other than one in this text, in which you would judge normalization to not be worthwhile. Show the relations and justify your design.

BASKETBALL-PLAYER (Number, Name, Position, GameDate, PointsScored), with key (Number, GameDate). Not normalized because Number ( (Name, Position), but (Number, GameDate) ( PointsScored. People are used to looking at this data in non-normalized format.

5.23 Explain two situations in which database designers might intentionally choose to create data duplication. What is the risk of such designs?

Two reasons are de-normalization and controlled redundancy. De-normalization might be done because people are accustomed to seeing data in a particular way – the example of Zip ( (City, State) and the answer to question 5.22 are examples. Controlled normalization usually occurs to improve performance — creating a table that has a subset of the data for query or high-transaction applications. The first is less risky because people expect it — if a player changed her number, everyone would expect to update all of her score data — or would adjust it their minds. The second is unexpected — even unknown by the users.

Answers to Group II Questions

5.24 Answers for relation and questions starting on page 147.

a. False

b. False

c. True, but only because EmployeeName ( Employee Salary. Sometimes this is stated that (ProjectID, EmployeeName) is not a minimal key.

d. True

e. False

f. False

g. (ProjectID, EmployeeName)

h. No. EmployeeSalary is dependent only on EmployeeName

i. 1st but not in 2nd

j. Insertion: to give an employee a salary, we must first assign the employee to a project. Deletion: If project 200c is deleted we lose Parks’s salary.

k. No

l. Yes

m. Yes

n. No. (Actually, for the data given, it is a determinant. This is most likely happenstance unless the organization has a rule that only one employee can have a given salary. This seems unlikely. This illustrates the dangers of inferring dependencies from sample data. Ask the users!)

o. No

p. ASSIGNMENT (ProjectID, EmployeeName)

SALARY (EmployeeName, EmployeeSalary)

25. Answers for relation and questions on page 148:

This relation is a mess. Certainly, EmployeeName multi-determines ProjectID. From the hint “TaskID is the name of a standard task” it appears that EmployeeName multidetermines TaskID as well. It could be, however, that there is a transitive multidependency: EmployeeName multidetermines ProjectID and Project ID multidetermines TaskID. The answers below follow the hint and assume that ProjectID and TaskID are independent.

a. False

b. True

c. False

d. True

e. True

f. True

g. True, but not minimal (see answer to 5.24c)

h. False

i. False

j. False

k. Assuming there is not transitive multidependency, EmployeeName is the only determinant.

l. Assuming note at start, there is no transitive depedency.

m. Yes, the two unrelated attributes are ProjectID and TaskID. Note, too, that even if TaskID were not in the relation, there would still be a multivalued dependency because of the presence of Phone and TotalHours.

n. If you delete the second row, you must also delete the fourth row to preserve the multi-valued dependency.

o. Three: employees and their projects, employees and their tasks, and employees and their personal data (Phone, TotalHours). You could even say four if you split Phone and TotalHours, but that seems too fine, to me.

p. EMPLOYEE-PROJECT (EmployeeName, ProjectID)

EMPLOYEE-TASK (EmployeeName, TaskID)

EMPLOYEE-DATA (EmployeeName, Phone, TotalHours)

Three; one theme each.

26. In this question, EQUIPMENT is like a machine or medical equipment or something similar.

a. Modify the definitions to ass this constraint: An employee may no sign up for more than one equipment appointment.

To enforce this constraint, we need EmployeeName to determine EquipmentName. Think of this as a typical normalization problem and say that EmployeeName ( EquipmentName in APPOINTMENT. In this case, we split APPOINTMENT into EQUIP-APPT and EMP-EQUIP as follows:

EQUIP-APPT (Date, Time, EquipmentName) with key (Date, Time, EquipmentName)

And

EMP-EQUIP (EmployeeName, EquipmentName) with key (EmployeeName, EquipmentName).

b. Define nighttime to refer to the hours between 2100 and 0500. Add an attribute Employee Type whose value is 1 if the employee works during nighttime. Change this design to enforce the constraint that only employees who work at night can schedule nighttime appointments.

Make the following changes to the design:

1. Add EmployeeType to EMPLOYEE:

EMPLOYEE (EmployeeName, PhoneNumber, EmployeeType)

2. Define new attribute NightEmp with domain in (EmployeeName of EMPLOYEE, where EmployeeType = 1)

3. Define new attribute DayTime with domain HH >05 and ................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download