Database normalization is a process used to organize a database into tables and columns to minimize redundancy and dependency. The first three normalization forms (1NF, 2NF, and 3NF) are foundational in this process:
1st Normal Form (1NF):
- Objective: Ensure that the table structure is simple and clear.
- Requirements:
- Each table cell contains a single value (no repeating groups or arrays).
- Each record (row) must be unique.
In simple terms, 1NF ensures that the data is stored in a structured format with no duplicate rows and no multiple values in a single cell.
2nd Normal Form (2NF):
- Objective: Eliminate partial dependencies on the primary key.
- Requirements:
- The table must first meet all the requirements of 1NF.
- All non-key attributes (columns that are not part of the primary key) must be fully dependent on the entire primary key, not just part of it.
In simpler terms, 2NF ensures that each non-key column is dependent on the whole primary key, not just a part of it. This is relevant in tables with a composite primary key (a key made up of more than one column).
3rd Normal Form (3NF):
- Objective: Remove transitive dependencies.
- Requirements:
- The table must first meet all the requirements of 2NF.
- All the attributes must be directly dependent on the primary key. There should be no transitive dependency, where one non-key attribute depends on another non-key attribute.
In simple terms, 3NF ensures that non-key columns are only dependent on the primary key and not on other non-key columns, thus reducing redundancy and ensuring data integrity.
Examples
Here are examples to illustrate the first three normalization forms (1NF, 2NF, and 3NF):
Example: Consider a table containing information about students, courses, and instructors.
Unnormalized Table (Before 1NF):
StudentID | StudentName | Courses | Instructor |
---|---|---|---|
1 | Alice | Math, Science | Dr. Smith, Dr. Lee |
2 | Bob | English, History | Dr. Johnson, Dr. King |
3 | Charlie | Math, History, Science | Dr. Smith, Dr. King, Dr. Lee |
In this table, the Courses
and Instructor
columns have multiple values, which violates the 1NF rule.
1st Normal Form (1NF):
To convert to 1NF, we need to ensure that each column contains only atomic values (a single value per cell).
StudentID | StudentName | Course | Instructor |
---|---|---|---|
1 | Alice | Math | Dr. Smith |
1 | Alice | Science | Dr. Lee |
2 | Bob | English | Dr. Johnson |
2 | Bob | History | Dr. King |
3 | Charlie | Math | Dr. Smith |
3 | Charlie | History | Dr. King |
3 | Charlie | Science | Dr. Lee |
Now, each cell contains a single value, and there are no repeating groups. The table is in 1NF.
2nd Normal Form (2NF):
In 1NF, if the table has a composite primary key (e.g., StudentID
and Course
), we must ensure that non-key attributes are fully dependent on the entire primary key.
However, in this example, the table is already structured to avoid partial dependencies because Instructor
depends on both StudentID
and Course
. So, it satisfies 2NF. Let’s move on to a more common scenario to illustrate 2NF:
Imagine we add StudentAddress
:
StudentID | StudentName | StudentAddress | Course | Instructor |
---|---|---|---|---|
1 | Alice | 123 Apple St. | Math | Dr. Smith |
1 | Alice | 123 Apple St. | Science | Dr. Lee |
2 | Bob | 456 Orange St. | English | Dr. Johnson |
2 | Bob | 456 Orange St. | History | Dr. King |
In this case, StudentAddress
depends only on StudentID
, not on the entire composite key (StudentID
, Course
). Therefore, this table violates 2NF.
To convert to 2NF:
Student Table (2NF):
StudentID | StudentName | StudentAddress |
---|---|---|
1 | Alice | 123 Apple St. |
2 | Bob | 456 Orange St. |
Enrollment Table (2NF):
StudentID | Course | Instructor |
---|---|---|
1 | Math | Dr. Smith |
1 | Science | Dr. Lee |
2 | English | Dr. Johnson |
2 | History | Dr. King |
Now, StudentAddress
is in a separate table, fully dependent on StudentID
, ensuring the table meets 2NF.
3rd Normal Form (3NF):
To convert a table into 3NF, there should be no transitive dependencies (where a non-key attribute depends on another non-key attribute).
Suppose the Instructor
column can be broken down into InstructorName
and InstructorDepartment
:
StudentID | Course | InstructorName | InstructorDepartment |
---|---|---|---|
1 | Math | Dr. Smith | Math Department |
1 | Science | Dr. Lee | Science Department |
2 | English | Dr. Johnson | English Department |
2 | History | Dr. King | History Department |
Here, InstructorDepartment
is dependent on InstructorName
, not on the primary key (StudentID
and Course
), violating 3NF.
To convert this to 3NF, we separate the instructor information:
Instructor Table (3NF):
InstructorName | InstructorDepartment |
---|---|
Dr. Smith | Math Department |
Dr. Lee | Science Department |
Dr. Johnson | English Department |
Dr. King | History Department |
Enrollment Table (3NF):
StudentID | Course | InstructorName |
---|---|---|
1 | Math | Dr. Smith |
1 | Science | Dr. Lee |
2 | English | Dr. Johnson |
2 | History | Dr. King |
Now, all non-key attributes are directly dependent on the primary key, meeting the requirements of 3NF.
Leave a Reply