CS 4/53005 Database System Design



Final Exam Study Guide

The final exam is Monday, December 10 from 5:45pm to 8:00pm.

The test is closed book, but you may bring one sheet of notes.

Chapter 1 – Introduction

• Be able to define or use these terms. Compare and contrast related terms

o Database, Database Management System (DBMS)

o Schema, Instance

o Levels of Abstraction – User, Logical, Physical

o Two roles of DB language – Data Definition vs Data Manipulation

• What are some of the problems that DBs are intended to solve?

o Data redundancy

o Data inconsistency

o Non-standard data storage format

o Non-standard language for defining and accessing data

Chapter 2

• Contextual definitions for: relation, tuple, attribute, domain

That is, rather than giving a dictionary-style definition, understand what these terms mean when either you are or someone else is discussing database operations.

• Have a visual understanding of basic relational algebra operations:

select, project, union, set difference

(see last 2 slides from Ch. 2)

• Understand Cartesian product vs. natural join

o Cartesian product is non-selective (all possible combinations)

o Natural join is selective (only combinations where matched attributes have the same value)

o Choose which of these to use for an application

Chapters 3 and 4 – SQL

Data Manipulation Language (DML):

• Write a simple SELECT... FROM...WHERE query, or

given a table of data and a query, show the results.

o SELECT: How do you include specific columns? All columns?

o FROM: How do you do a Cartesian product? A natural join?

o WHERE: How do you write these basic predicates?

▪ Equality/inequality of numeric values

▪ Basic pattern matching for strings using LIKE

▪ Set membership testing using IN

• How do you rename a column or a table?

• Aggregation

o Use Sum(), Count(), Min(), Max), Average()

o When and how to use GROUP BY

o Distinguish the effects of HAVING vs. WHERE:

WHERE filters tuples (rows) before aggregation

HAVING filters groups after aggregation

• What does NULL mean and how does it respond to queries?

• NO Nested Queries, Derived Queries, or Views on the test

Data Definition Language (DDL)

• Three keywords for {add, modify, remove}:

o For the contents of a table: INSERT, UPDATE, DELETE

o For the definition (schema) of a table: CREATE, ALTER, DROP

o You won’t have to write these queries, but you might need to read and interpret them.

• Pick a reasonable data type (int, char, varchar, etc) for an attribute.

• How to specify a multi-attribute primary key.

• How to specify a foreign key relationship.

• When you have a foreign key relationship, what are the possible inter-table consequences of UPDATE or DELETE?

Chapter 6 – Entity-Relationship Model

• Understand Primary Key really, really well

o Formally: “minimal set of attributes that uniquely define each tuple”

o Implications / Recommendations:

▪ If I already have one record with PK value = p, then I could not possibly have another tuple with PK value = p.

If it actually could happen, then my PK is wrong.

▪ ALWAYS run through mental examples, imagining how much similarity between tuples could occur.

▪ Given a scenario, pick an appropriate PK.

• Understand candidate key, superkey, and foreign key

• You will not need to create an entire E-R model or reduce an entire model to a DB schema, but you may need to analyze particular relationships:

o Determine the proper cardinality of a binary relationship A-B.

Ask yourself:

▪ One entity A could correspond to one or several entities B?

▪ One entity B could correspond to one or several entities B?

• Reduce an E-R or UML diagram to a Database Schema:

o 1:many: Add the PK of the “1” entity into the “many” entity as a FK.

o many:many: Create a new table, using the PKs of both tables.

Chapter 7 – Functional Dependency and Normalization

• How are the meanings of FD and Key similar?

o Implication: For R(A,B,C,D), if A→B,C,D, then A is a superkey.

• What does “normalization” mean? What is its purpose?

• What is usually considered an acceptable degree of normalization?

• What is definition of Boyce-Codd Normal Form (BCNF)?

• Decomposition

o What is the claim of Heath’s Theorem?

o Given a table schema and an FD, apply Heath’s Theorem to decompose the table schema. (Caution: On HW #3, very few of you were able to explain what you were doing during decomposition. You must be able to walk the reader through your application of Heath’s Theorem to justify your actions.)

• Describe by example a Multi-valued Dependency.

• Demonstrate by example normalization of a MVD to 4NF

Chapters 11 and 12 – Physical Data Storage and Indexing

• Distinguish between data access latency and data bandwidth

• What is the bottleneck (slowest operation) in disk operations?

• Therefore, data record storage is block-based.

• What are some strategies and trade-offs for dealing with the addition or deletion of records to a table?

• How is a VARCHAR stored? (How much space does it require?)

• Ordered Indices

o What is characteristic of a Primary Index?

o Dense vs Sparse index?

o What benefits of a primary index are not available in a secondary index?

o How do you update the index when updates/deletions occur in the data records?

• B+ Tree indices

o Given the maximum size for a B+ node, draw a tree for a small data set.

o What is the advantage of B+ trees over ordered indices?

• Hash indices – what’s a hash function?

• Compare the performance of different indexing schemas:

o When insertions and deletions are frequent? Not frequent?

o When range queries are common? Not common?

Chapter 15 – Transactions

• What is a transaction?

• What are the ACID properties?

• In Conflict Serialization, which operations are defined to be in conflict?

• Given a pair of short schedules, determine if they are conflict equivalent.

• Given a short schedule, use a precedence graph to determine if it is conflict serializable

• What is a rollback? Why or when do they occur?

• What is a cascading rollback? Why or when do they occur?

................
................

In order to avoid copyright disputes, this page is only a partial summary.

Google Online Preview   Download