|
Without question, the most important task in any database development is the proper definition of the data model. It is important for the proper fast execution of queries in the database, and goes a long way to defining the business model which underlies the data. No matter what database manager you use, defining the data model properly is job 1. There are five main rules of data normalization. They all have to do with eliminating redundant and inconsistent data in the design of the table structure We begin with a listing of characteristics of a membership list which includes, members, their employers, their employers' addresses, and children. Zero Form members | name | company | address | kid1 | kid2 | Rob | ABC | Suite 17 Industrial Park | Harvey | Justin | Carol | XYZ | Main Street Office Park | Harvey | Justin | We say this data is not normalized because we have not applied any of the normalization rules to it yet. Look at the columns kid1, and kid2. It would be inconvenient to add another column in case one of the members decided to enlarge his/her family. So we will in the first rule decide to combine the kid columns. First Normal Form - Eliminate repeating groups in individual tables.
- Create a separate table for each set of related data.
- Identify each set of related data with a primary key.
In addition to annexing the kid fields, we need to set a unique identifier for each record, so that if another member named Rob joins we can identify him. When we apply the rules of the First Normal Form the result is with the following table: members | userId | name | company | address | kid | 1 | Rob | ABC | Suite 17 Industrial Park | Harvey | 1 | Rob | ABC | Suite 17 Industrial Park | Justin | 2 | Carol | XYZ | Main Street Office Park | Harvey | 2 | Carol | XYZ | Main Street Office Park | Justin | Second Normal Form - Create separate tables for sets of values that apply to multiple records.
- Relate these tables with a foreign key.
Annex the kid values to a second table, and add a column in that table to relate to the members table. That way, if anyone's family grows, we just need to add another record to the kid table, which relates to the member table- hence the term, relational database. Use the primary key value to relate these fields: members | userId | name | company | address | 1 | Rob | ABC | Suite 17 Industrial Park | 2 | Carol | XYZ | Main Street Office Park | kids | kidId | relUserId | kid | 1 | 1 | Harvey | 2 | 1 | Justin | 3 | 2 | Harvey | 4 | 2 | Justin | Now we have two tables: members and kids, and keys in each. The foreign key relUserId is used in the kids table to relate the kids to their parents. However, when we want to add another employee of an existing company, there is duplication with the company name and address. Third Normal Form - Eliminate fields that do not depend on the key.
Our Company Name and Address have nothing to do with the User Id, so they should have their own Company Id. These fields relate to the company. members | userId | name | relCompId |
| Rob | 1 | 2 | Carol | 2 | companies | compId | company | address | 1 | ABC | Suite 17 Industrial Park | 2 | XYZ | Main Street Office Park | kids | kidId | relUserId | kid | 1 | 1 | Harvey | 2 | 1 | Justin | 3 | 2 | Harvey | 4 | 2 | Justin | Annex the companies and their addresses to a separate table, add unique key fields (compId), and use the fierign key relCompId in the members table to relate those members to their company. Coincidently, Rob and Carol have kids with the same names. We could also make a table of possible kids' names, just as we did for the companies. In some cases this would be 'OK', but in cases where the universe of names is small, it would be preferable to normalize it as well. Data Relationships There are three basic data relationships: one-to-one, one-to-many, and many-to-many. If each company had one and only one address, we could make a table for each and link them, creating a one-to-one relationship. The relationship of the kids to the members is a one-to-many relationship, there each member can have several kids In our Third Normal Form example we have one user related to many kids. Now, we want to change that structure to allow many members to be related to many kids, and thus we want a many-to-many relationship. members | userId | name | relCompId | 1 | Rob | 1 | 2 | Carol | 2 | companies | compId | company | address | 1 | ABC | Suite 17 Industrial Park | 2 | XYZ | Main Street Office Park | kids | kidId | kid | 1 | Harvey | 2 | Justin | kid_relations | relationId | relatedKidId | relatedUserId | 1 | 1 | 1 | 2 | 1 | 2 | 3 | 2 | 1 | 4 | 2 | 2 | To minimize the duplication of data, now there is a table which contains only keys and foreign keys: the kid-relations table. Fourth Normal Form - In a many-to-many relationship, independent entities should not be stored in the same table.
Sometimes this rule is disregarded because it applies only to many-to-many relationships.. To give an example, we can select all of Rob's kids by invoking the following SQL statement: SELECT name, kid FROM members, kids, kid_relations WHERE kid_relations.relatedUserId = 1 AND members.userId = 1 AND kids.kidId = kid_relations.relatedKidId Fifth Normal Form The final form or normalization is a check on the deconstruction. It ensures that there have been no additional unnecessary fields or tables created - The original table must be reconstructed from the tables into which it has been broken down.
Metro NY / NJ SQL Server Consultants We specialize is custom database software. Call us for a free consultation (973) 635 0080
|