Database Normalization And Design Techniques
Without question, the most important task in any database development is the proper definition of the data model. It is important for the proper fast execution of queries in the database, and goes a long way to defining the business model which underlies the data. No matter what database manager you use, defining the data model properly is job 1.
There are five main rules of data normalization. They all have to do with eliminating redundant and inconsistent data in the design of the table structure
We begin with a listing of characteristics of a membership list which includes, members, their employers, their employers' addresses, and children.
Zero Form
members
name
company
address
kid1
kid2
Rob
ABC
Suite 17 Industrial Park
Harvey
Justin
Carol
XYZ
Main Street Office Park
Harvey
Justin
We say this data is not normalized because we have not applied any of the normalization rules to it yet. Look at the columns kid1, and kid2. It would be inconvenient to add another column in case one of the members decided to enlarge his/her family. So we will in the first rule decide to combine the kid columns.
First Normal Form
- Eliminate repeating groups in individual tables.
- Create a separate table for each set of related data.
- Identify each set of related data with a primary key.
In addition to annexing the kid fields, we need to set a unique identifier for each record, so that if another member named Rob joins we can identify him. When we apply the rules of the First Normal Form the result is with the following table:
members
userId
name
company
address
kid
1
Rob
ABC
Suite 17 Industrial Park
Harvey
1
Rob
ABC
Suite 17 Industrial Park
Justin
2
Carol
XYZ
Main Street Office Park
Harvey
2
Carol
XYZ
Main Street Office Park
Justin
Second Normal Form
- Create separate tables for sets of values that apply to multiple records.
- Relate these tables with a foreign key.
Annex the kid values to a second table, and add a column in that table to relate to the members table. That way, if anyone's family grows, we just need to add another record to the kid table, which relates to the member table- hence the term, relational database. Use the primary key value to relate these fields:
members
userId
name
company
address
1
Rob
ABC
Suite 17 Industrial Park
2
Carol
XYZ
Main Street Office Park
kids
kidId
relUserId
kid
1
1
Harvey
2
1
Justin
3
2
Harvey
4
2
Justin
Now we have two tables: members and kids, and keys in each. The foreign key relUserId is used in the kids table to relate the kids to their parents.
However, when we want to add another employee of an existing company, there is duplication with the company name and address.
Third Normal Form
- Eliminate fields that do not depend on the key.
Our Company Name and Address have nothing to do with the User Id, so they should have their own Company Id. These fields relate to the company.
members
userId
name
relCompId
Rob
1
2
Carol
2
companies
compId
company
address
1
ABC
Suite 17 Industrial Park
2
XYZ
Main Street Office Park
kids
kidId
relUserId
kid
1
1
Harvey
2
1
Justin
3
2
Harvey
4
2
Justin
Annex the companies and their addresses to a separate table, add unique key fields (compId), and use the fierign key relCompId in the members table to relate those members to their company.
Coincidently, Rob and Carol have kids with the same names. We could also make a table of possible kids' names, just as we did for the companies.
In some cases this would be 'OK', but in cases where the universe of names is small, it would be preferable to normalize it as well.
Data Relationships
There are three basic data relationships: one-to-one, one-to-many, and many-to-many. If each company had one and only one address, we could make a table for each and link them, creating a one-to-one relationship.
The relationship of the kids to the members is a one-to-many relationship, there each member can have several kids
In our Third Normal Form example we have one user related to many kids. Now, we want to change that structure to allow many members to be related to many kids, and thus we want a many-to-many relationship.
members
userId
name
relCompId
1
Rob
1
2
Carol
2
companies
compId
company
address
1
ABC
Suite 17 Industrial Park
2
XYZ
Main Street Office Park
kids
kidId
kid
1
Harvey
2
Justin
kid_relations
relationId
relatedKidId
relatedUserId
1
1
1
2
1
2
3
2
1
4
2
2
To minimize the duplication of data, now there is a table which contains only keys and foreign keys: the kid-relations table.
Fourth Normal Form
- In a many-to-many relationship, independent entities should not be stored in the same table.
Sometimes this rule is disregarded because it applies only to many-to-many relationships..
To give an example, we can select all of Rob's kids by invoking the following SQL statement:
SELECT name, kid FROM members, kids, kid_relations WHERE kid_relations.relatedUserId = 1 AND members.userId = 1 AND kids.kidId = kid_relations.relatedKidId
Fifth Normal Form
The final form or normalization is a check on the deconstruction. It ensures that there have been no additional unnecessary fields or tables created
- The original table must be reconstructed from the tables into which it has been broken down.
Metro NY / NJ SQL Server Consultants
We specialize is custom database software. Call us for a free consultation (973) 635 0080 or email us at paladn.com
Database programming: Database Development
Computer Programming: Web Site Services
Computer Programming: Custom Software
Computer Consulting: IT Consulting
Paladin Consultants, Inc. Home Page
Computer Consulting: Contact Us
|
![]() |
Copyright © 2001 by Paladin Consultants,Inc.
Chatham, NJ All Rights Reserved This page revised May 23, 2006 |
|
