Normalization is the process of efficiently organizing data in a
database. There are two goals of the normalization process: eliminating
redundant data (for example, storing the same data in more than one table) and ensuring data dependencies
make sense (only storing related data in a table). Both of these are
worthy goals as they reduce the amount of space a database consumes and
ensure that data is logically stored.
Now, are all of the columns fully dependent upon the primary key? The customer number varies with the order number and it doesn't appear to depend upon any of the other fields. What about the unit price? This field could be dependent upon the customer number in a situation where we charged each customer a set price. However, looking at the data above, it appears we sometimes charge the same customer different prices. Therefore, the unit price is fully dependent upon the order number. The quantity of items also varies from order to order, so we're OK there.
What about the total? It looks like we might be in trouble here. The total can be derived by multiplying the unit price by the quantity, therefore it's not fully dependent upon the primary key. We must remove it from the table to comply with the third normal form. Perhaps we use the following attributes:
Conisder a database table that stores
employee information and has the attributes employee_id, first_name,
last_name, title. In this table, the field employee_id determines
first_name and last_name. Similarly, the tuple (first_name, last_name)
determines employee_id.
• If an attribute is dependent on only part of the multivalued key, then remove it to a separate table.
In simple words,
• It should meet all the requirements of the first normal form.
. It should remove subsets of data that apply to multiple rows of a table and place them in separate tables.
. It create relationships between these new tables and their predecessors through the use of foreign keys.
The First Normal form deals with the atomicity whereas the Second Normal Form deals with the relationship between the composite key columns and non-key columns. To achieve the next progressive level your table should satisfy the requirement of First Normal Form then move towards the Second Normal Form.
Let’s introduce a Review table as an example :
Item Colors Price Tax
Pen red 2.0 0.20
Pen blue 2.0 0.20
Scale red 2.0 0.20
Scale yellow 2.0 0.20
Bag blue 150.00 7.80
Bag black 150.00 7.80
Table is not in Second Normal Form because the price and tax depends on the item, but not color.
Item Colors
Pen red
Pen blue
Scale red
Scale yellow
Bag blue
Bag black
Item Price Tax
Pen 2.0 0.20
Scale 2.0 0.20
Bag 150.00 7.80
Tables are now in Second Normal Form.
3. 3NF (eliminate Columns not dependent on the Key)
• If attributes do not contribute description of the key, then remove it to the separate table. All the attributes must be directly dependent on the primary key. In simple words,
• It should meet all the requirements of the second normal form.
• It should remove columns that are not dependent upon the primary key.
In the Third Normal Form all columns depend upon the primary key. When one column depends upon the other column, tables break the rule and turn into the dependency on the primary key.
Item Colors
Pen red
Pen blue
Scale red
Scale yellow
Bag blue
Bag black
Item Price Tax
Pen 2.0 0.20
Scale 2.0 0.20
Bag 150.00 7.80 Tables are not in Second Normal Form because tax depends on price, not item.
Item Colors
Pen red
Pen blue
Scale red
Scale yellow
Bag blue
Bag black
Item Price
Pen 2.0
Scale 2.0
Bag 150.00
Price Tax
2.0 0.20
150.00 7.80
Tables are now in Third Normal Form.
The Normal Forms
The database community has developed a series of guidelines for ensuring that databases are normalized. These are referred to as normal forms and are numbered from one (the lowest form of normalization, referred to as first normal form or 1NF) through five (fifth normal form or 5NF).First Normal Form (1NF)
First normal form (1NF) sets the very basic rules for an organized database:- Eliminate duplicative columns from the same table.
- Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
Second Normal Form (2NF)
- Second normal form (2NF) further addresses the concept of removing duplicative data:
- Meet all the requirements of the first normal form.
- Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
- Create relationships between these new tables and their predecessors through the use of foreign keys.
Third Normal Form (3NF)
Third normal form (3NF) goes one large step further:- Meet all the requirements of the second normal form.
- Remove columns that are not dependent upon the primary key.
- Imagine that we have a table of widget orders that contains the following attributes:
- Order Number
- Customer Number
- Unit Price
- Quantity
- Total
Now, are all of the columns fully dependent upon the primary key? The customer number varies with the order number and it doesn't appear to depend upon any of the other fields. What about the unit price? This field could be dependent upon the customer number in a situation where we charged each customer a set price. However, looking at the data above, it appears we sometimes charge the same customer different prices. Therefore, the unit price is fully dependent upon the order number. The quantity of items also varies from order to order, so we're OK there.
What about the total? It looks like we might be in trouble here. The total can be derived by multiplying the unit price by the quantity, therefore it's not fully dependent upon the primary key. We must remove it from the table to comply with the third normal form. Perhaps we use the following attributes:
- Order Number
- Customer Number
- Unit Price
- Quantity
SELECT OrderNumber, Total
FROM WidgetOrders
We can now use the following query: SELECT OrderNumber, UnitPrice * Quantity AS Total
FROM WidgetOrders
Boyce-Codd Normal Form (BCNF or 3.5NF)
The Boyce-Codd Normal Form, also referred to as the "third and half (3.5) normal form", adds one more requirement:- Meet all the requirements of the third normal form.
- Every determinant must be a candidate key.
Examples:
Examples with Explanation:
1. 1NF (Eliminating Repeating Groups)
• Make a separate table for each set of related attributes and give each table a primary key.
In simple words I can say it as
• There are no repeating or duplicate fields.
• Each cell contains only a single value.
• Each record is unique and identified by primary key
Let’s see the example below :
Prior to Normalization
Item Colors Price Tax
Pen red, blue 2.0 0.20
Scale red, yellow 2.0 0.20
Pen red, blue 2.0 0.20
Bag blue, black 150.00 7.80
This table is not in first normal form because:
• A. There are multiple fields in color lab.
• B. Records are repeating (Duplicate records) or no primary key.
First Normal Form (1NF)
Item Colors Price Tax
Pen red 2.0 0.20
Pen blue 2.0 0.20
Scale red 2.0 0.20
Scale yellow 2.0 0.20
Bag blue 150.00 7.80
Bag black 150.00 7.80
• This table is now in first normal form.
2. 2 NF (Eliminating Redundant Data)• If an attribute is dependent on only part of the multivalued key, then remove it to a separate table.
In simple words,
• It should meet all the requirements of the first normal form.
. It should remove subsets of data that apply to multiple rows of a table and place them in separate tables.
. It create relationships between these new tables and their predecessors through the use of foreign keys.
The First Normal form deals with the atomicity whereas the Second Normal Form deals with the relationship between the composite key columns and non-key columns. To achieve the next progressive level your table should satisfy the requirement of First Normal Form then move towards the Second Normal Form.
Let’s introduce a Review table as an example :
Item Colors Price Tax
Pen red 2.0 0.20
Pen blue 2.0 0.20
Scale red 2.0 0.20
Scale yellow 2.0 0.20
Bag blue 150.00 7.80
Bag black 150.00 7.80
Table is not in Second Normal Form because the price and tax depends on the item, but not color.
Item Colors
Pen red
Pen blue
Scale red
Scale yellow
Bag blue
Bag black
Item Price Tax
Pen 2.0 0.20
Scale 2.0 0.20
Bag 150.00 7.80
Tables are now in Second Normal Form.
3. 3NF (eliminate Columns not dependent on the Key)
• If attributes do not contribute description of the key, then remove it to the separate table. All the attributes must be directly dependent on the primary key. In simple words,
• It should meet all the requirements of the second normal form.
• It should remove columns that are not dependent upon the primary key.
In the Third Normal Form all columns depend upon the primary key. When one column depends upon the other column, tables break the rule and turn into the dependency on the primary key.
Item Colors
Pen red
Pen blue
Scale red
Scale yellow
Bag blue
Bag black
Item Price Tax
Pen 2.0 0.20
Scale 2.0 0.20
Bag 150.00 7.80 Tables are not in Second Normal Form because tax depends on price, not item.
Item Colors
Pen red
Pen blue
Scale red
Scale yellow
Bag blue
Bag black
Item Price
Pen 2.0
Scale 2.0
Bag 150.00
Price Tax
2.0 0.20
150.00 7.80
Tables are now in Third Normal Form.
No comments:
Post a Comment