Course Content
Relational Database and Normalization
Relational Database and Normalization
First Normal Form
Unnormalized Form
Normal forms establish guidelines for refining a database's design and structure. A database that doesn't adhere to these guidelines is termed an unnormalized database. Such databases may suffer from redundancy, inconsistency, and other data challenges.
Note
An Unnormalized Form describes a database that doesn't meet any of the normal form criteria.
Consider this example:
Suppose you have a Customer table with columns Customer and Phone Number. It's a bit messy. Time to normalize!
First Normal Form
Rules for the First Normal Form include:
- Every table should possess a unique primary key — the smallest set of columns that distinctly identifies a record. (Typically, an ID is used for this).
- Every attribute should contain only a single value, not multiple;
In the unnormalized sample, the Customer and Phone Number entities lack unique keys, which can lead to confusion with identical customer names. Additionally, the Phone Number attribute holds several values.
To transition the database to the first normal form:
Great! Now we've split it into two tables! We introduced a new table named "Phone Numbers" and incorporated a relation named "Customer ID" within it.
Moreover, the First Normal Form stipulates that every table must have a unique key (known as the primary key). There are primary keys called Customer ID in the Customer table and distinct phone numbers in the Phone Numbers table.
Note
The Customer ID column in the Phone Numbers table should be designated as a foreign key.
To reference Phone Numbers from the Customer table, you'd employ
JOIN
operations.
Achieving the First Normal Form makes data sorting and querying more streamlined: If a cell contains multiple phone numbers, sorting and categorizing them can become cumbersome. The process would entail extracting numbers from each cell, separating them, and then assembling a fresh dataset.
Note
An attribute, or a set of attributes, that distinctly labels each row is termed a primary key. All the attributes comprising the primary key are dubbed key attributes.
Everything was clear?