Database Normalization

Database normalization is a process used to organize a database into tables and columns to minimize redundancy and dependency. The first three normalization forms (1NF, 2NF, and 3NF) are foundational in this process:

1st Normal Form (1NF):

  • Objective: Ensure that the table structure is simple and clear.
  • Requirements:
    • Each table cell contains a single value (no repeating groups or arrays).
    • Each record (row) must be unique.

In simple terms, 1NF ensures that the data is stored in a structured format with no duplicate rows and no multiple values in a single cell.

2nd Normal Form (2NF):

  • Objective: Eliminate partial dependencies on the primary key.
  • Requirements:
    • The table must first meet all the requirements of 1NF.
    • All non-key attributes (columns that are not part of the primary key) must be fully dependent on the entire primary key, not just part of it.

In simpler terms, 2NF ensures that each non-key column is dependent on the whole primary key, not just a part of it. This is relevant in tables with a composite primary key (a key made up of more than one column).

3rd Normal Form (3NF):

  • Objective: Remove transitive dependencies.
  • Requirements:
    • The table must first meet all the requirements of 2NF.
    • All the attributes must be directly dependent on the primary key. There should be no transitive dependency, where one non-key attribute depends on another non-key attribute.

In simple terms, 3NF ensures that non-key columns are only dependent on the primary key and not on other non-key columns, thus reducing redundancy and ensuring data integrity.

Examples

Here are examples to illustrate the first three normalization forms (1NF, 2NF, and 3NF):

Example: Consider a table containing information about students, courses, and instructors.

Unnormalized Table (Before 1NF):

StudentIDStudentNameCoursesInstructor
1AliceMath, ScienceDr. Smith, Dr. Lee
2BobEnglish, HistoryDr. Johnson, Dr. King
3CharlieMath, History, ScienceDr. Smith, Dr. King, Dr. Lee

In this table, the Courses and Instructor columns have multiple values, which violates the 1NF rule.

1st Normal Form (1NF):

To convert to 1NF, we need to ensure that each column contains only atomic values (a single value per cell).

StudentIDStudentNameCourseInstructor
1AliceMathDr. Smith
1AliceScienceDr. Lee
2BobEnglishDr. Johnson
2BobHistoryDr. King
3CharlieMathDr. Smith
3CharlieHistoryDr. King
3CharlieScienceDr. Lee

Now, each cell contains a single value, and there are no repeating groups. The table is in 1NF.

2nd Normal Form (2NF):

In 1NF, if the table has a composite primary key (e.g., StudentID and Course), we must ensure that non-key attributes are fully dependent on the entire primary key.

However, in this example, the table is already structured to avoid partial dependencies because Instructor depends on both StudentID and Course. So, it satisfies 2NF. Let’s move on to a more common scenario to illustrate 2NF:

Imagine we add StudentAddress:

StudentIDStudentNameStudentAddressCourseInstructor
1Alice123 Apple St.MathDr. Smith
1Alice123 Apple St.ScienceDr. Lee
2Bob456 Orange St.EnglishDr. Johnson
2Bob456 Orange St.HistoryDr. King

In this case, StudentAddress depends only on StudentID, not on the entire composite key (StudentID, Course). Therefore, this table violates 2NF.

To convert to 2NF:

Student Table (2NF):

StudentIDStudentNameStudentAddress
1Alice123 Apple St.
2Bob456 Orange St.

Enrollment Table (2NF):

StudentIDCourseInstructor
1MathDr. Smith
1ScienceDr. Lee
2EnglishDr. Johnson
2HistoryDr. King

Now, StudentAddress is in a separate table, fully dependent on StudentID, ensuring the table meets 2NF.

3rd Normal Form (3NF):

To convert a table into 3NF, there should be no transitive dependencies (where a non-key attribute depends on another non-key attribute).

Suppose the Instructor column can be broken down into InstructorName and InstructorDepartment:

StudentIDCourseInstructorNameInstructorDepartment
1MathDr. SmithMath Department
1ScienceDr. LeeScience Department
2EnglishDr. JohnsonEnglish Department
2HistoryDr. KingHistory Department

Here, InstructorDepartment is dependent on InstructorName, not on the primary key (StudentID and Course), violating 3NF.

To convert this to 3NF, we separate the instructor information:

Instructor Table (3NF):

InstructorNameInstructorDepartment
Dr. SmithMath Department
Dr. LeeScience Department
Dr. JohnsonEnglish Department
Dr. KingHistory Department

Enrollment Table (3NF):

StudentIDCourseInstructorName
1MathDr. Smith
1ScienceDr. Lee
2EnglishDr. Johnson
2HistoryDr. King

Now, all non-key attributes are directly dependent on the primary key, meeting the requirements of 3NF.


Posted

in

by

Tags:

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *