Normalization of Tables: From 1NF to 2NF
Database normalization is the process of organizing data in a database to eliminate redundancy and improve data integrity. Normalization involves dividing a database into tables and establishing relationships between them based on their logical dependencies. The process of normalization is divided into several normal forms, each representing a higher level of data organization.
The first normal form (1NF) requires that each column in a table be atomic, meaning that it contains only a single value. This eliminates repeating groups of data and ensures that each piece of information is stored in only one place.
For example, suppose you have a table called "Students" that contains information about each student, including their name, address, phone number, and courses taken. The table might look like this:
Student ID | Name | Address | Phone Number | Courses Taken |
1 | John Doe | 123 Main Street | 555-1234 | Math, Science |
2 | Jane Smith | 456 Elm Street | 555-5678 | English, History |
This table violates the 1NF because the "Courses Taken" column contains multiple values separated by commas. To convert this table to 1NF, you would split the "Courses Taken" column into a separate table called "Courses", which would contain a row for each course taken by each student:
Student ID | Name | Address | Phone Number |
1 | John Doe | 123 Main Street | 555-1234 |
2 | Jane Smith | 456 Elm Street | 555-5678 |
Student ID | Course |
1 | Math |
1 | Science |
2 | English |
2 | History |
Once the table has been normalized to 1NF, you can proceed to the second normal form (2NF), which requires that every non-key column in a table be fully dependent on the primary key. This means that if a table has a composite primary key (i.e., a primary key that consists of multiple columns), each non-key column must be dependent on the entire composite key, not just on a part of it.
For example, suppose you have a table called "Orders" that contains information about each customer order, including the order ID, customer ID, customer name, product ID, product name, and quantity ordered. The table might look like this:
Order ID | Customer ID | Customer Name | Product ID | Product Name | Quantity |
1 | 100 | John Doe | 500 | Widget | 10 |
2 | 100 | John Doe | 600 | Gadget | 5 |
3 | 200 | Jane Smith | 500 | Widget | 2 |
4 | 200 | Jane Smith | 700 | Gizmo | 3 |
This table violates the 2NF because the non-key columns "Customer Name" and "Product Name" depend only on the customer ID and product ID, respectively, not on the entire composite key. To convert this table to 2NF, you would split it into two tables: "Customers", which would contain information about each customer, and "Products", which would contain information about each product:
Customer ID | Customer Name |
100 | John Doe |
200 | Jane Smith |