Third Normal Form



Database theory has developed over the years. We will NOT study this history! We will simply look at the current preferred method of database design, which is to create “relational” databases with tables in “third normal form.” The databases are called “relational” because of the relationships between the primary keys of some tables and the foreign keys of others. The concept of normal forms is too complicated to cover in a single topic (and too much for this intro course); suffice it to say that there are six normal forms defined, each representing a refinement in database tables. The higher the normal form, the better organized the data is, at least for the computer. In real databases in which data is frequently modified, the goal is generally to achieve third normal form. This prevents many common errors in data entry and retrieval.

What does a table look like that ISN’T in third normal form? Typically, it’s what everyone does the first time they design a table (usually in Excel). Here’s an example of a first try at a table for actors in a movie database: a single table keyed on Actor, with three spouse fields, three movie fields, and a director field for each movie (DirectorOfMovie1 and so on).

So what’s wrong with it, aside from the fact that Actor isn’t a good primary key (two actors can have the same name)? First, actors may have never been married, or they may have been married more than three times. So having space for three spouses means that for some actors there will be blanks, while for others there won’t be enough space for all of the marriages. The same applies to movies. Furthermore, putting the names of the directors of the three movies into the actors table will cause problems with data entry. Every time you enter a movie, you’ll have to enter the name of the director, making sure to spell it exactly the same way each time if you ever hope to be able to query this table by director.
Director is a property of the movie, not the actor directly.

The solution to these problems is to create additional tables in third normal form, each with a unique primary key that allows the tables to be joined. An obvious table would be Movies, with its own MovieID key and the fields that describe the movie itself, such as its director. Actors would be linked to movies through a correlation table which matches ActorIDs with MovieIDs. The Actors table would contain the ActorID, ActorName, RealName, Birthday, and other fields which apply DIRECTLY to the actor himself or herself. This would put the table in third normal form.

There is a golden rule for how to put tables in third normal form.

Golden Rule: All non-key fields should relate directly to the key, the whole key, and nothing but the key.

In the original Actors table above, the three Director fields relate to the movies, which are not key fields. Therefore, having these fields violates the Golden Rule, meaning the table is not in third normal form.

The spouses are a different issue. What table(s) might you add to the database to represent actors’ spouses? Remember that many actors are (or were) married to each other.

Simple ways to spot tables that aren’t in Third Normal Form

1. There are several columns with the same name followed by a number. For example, you try to include the classes that students take in the students table by adding fields Class1, Class2, Class3, etc. In this case, you are trying to represent a many-to-many relationship in a single table, which can’t be done while staying in third normal form. Add a Classes table and a StudentClasses correlation table to properly relate students and classes.

2. There are fields which refer to other non-key fields, not the key fields. The DirectorOfMovie1 field in the example above is an example of this (as well as of item 1). If you have detailed information to store about non-key fields (like Movie in the Actors table), it means that you need to create new tables for those fields (Movies).
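
The Actors, Movies, and correlation tables described above can be sketched with Python’s built-in sqlite3 module. This is only an illustration under assumptions, not the course’s actual design: the correlation table name ActorMovies, the Title and Director columns of Movies, the helper function names, and the placeholder rows are all hypothetical. Only ActorID, ActorName, RealName, Birthday, and MovieID come from the text.

```python
import sqlite3


def build_movie_db():
    """Create an in-memory database with hypothetical tables in third normal form."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Fields that apply directly to the actor.
        CREATE TABLE Actors (
            ActorID   INTEGER PRIMARY KEY,
            ActorName TEXT NOT NULL,
            RealName  TEXT,
            Birthday  TEXT
        );
        -- Director is a property of the movie, so it is stored once per movie.
        -- (Title and Director are assumed column names.)
        CREATE TABLE Movies (
            MovieID  INTEGER PRIMARY KEY,
            Title    TEXT NOT NULL,
            Director TEXT
        );
        -- Correlation table: one row per (actor, movie) pairing.
        CREATE TABLE ActorMovies (
            ActorID INTEGER NOT NULL REFERENCES Actors(ActorID),
            MovieID INTEGER NOT NULL REFERENCES Movies(MovieID),
            PRIMARY KEY (ActorID, MovieID)
        );
    """)
    # Placeholder rows, invented purely for illustration.
    conn.executemany(
        "INSERT INTO Actors (ActorID, ActorName) VALUES (?, ?)",
        [(1, "Actor A"), (2, "Actor B")],
    )
    conn.executemany(
        "INSERT INTO Movies (MovieID, Title, Director) VALUES (?, ?, ?)",
        [(10, "Movie X", "Director One"), (11, "Movie Y", "Director Two")],
    )
    conn.executemany(
        "INSERT INTO ActorMovies (ActorID, MovieID) VALUES (?, ?)",
        [(1, 10), (1, 11), (2, 10)],
    )
    conn.commit()
    return conn


def actors_directed_by(conn, director):
    """Return (ActorName, Title) pairs for movies by the given director."""
    return conn.execute(
        """
        SELECT a.ActorName, m.Title
        FROM Actors AS a
        JOIN ActorMovies AS am ON am.ActorID = a.ActorID
        JOIN Movies AS m ON m.MovieID = am.MovieID
        WHERE m.Director = ?
        ORDER BY a.ActorName, m.Title
        """,
        (director,),
    ).fetchall()


if __name__ == "__main__":
    conn = build_movie_db()
    print(actors_directed_by(conn, "Director One"))
```

Because each director’s name is entered once, in the Movies row, a query by director no longer depends on typing the name identically for every actor. An actor with no movies or with twenty movies simply has zero or twenty rows in the correlation table, with no blank or missing columns.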
