Introduction
A
relational database is a digital database based on the relational model of data
term as proposed by Edgar Codd in 1970 (https://en.wikipedia.org/wiki/Relational_database,
2018). Relational databases organize data stored in tables or relations. Each
table contains rows (also called records or tuple) and columns (also called
fields or attributes). The relationships among the tables enables a relational
database to efficiently store large amount of data, and effectively retrieve
specific data by using a language called Structured Query Language (SQL).
What is essential
to good database design?
Any
well-designed databases should follow certain principles. One principle is that
redundant information is bad because it wastes space and increases the
likelihood of encountering errors and experiencing inconsistencies (Microsoft).
Another principle is that the accuracy and completeness of information is
important. A good database design is one that divides information into
subject-based tables to reduce the likelihood of duplicate data and provide
users with the ability to join the information in the table together – by using
Primary Keys and Foreign Keys. These design principles help ensure the accuracy
and integrity of your data and accommodate your data processing and reporting
needs.
The database design
process
The
first step of the database design process is to determine what is the purpose
of your database. Then, you want to gather all the information that you want to
record in the database. After you know all the types of information you want to
put in the database, the next step is organizing all the information into
tables through a process known as normalization. All tables in a relational
database must adhere these specific normalization guidelines to achieve optimal
structure and performance.
What is database normalization?
Database
normalization is the process of organizing the columns (attributes) and tables
(relations) of a relational database to reduce data redundancy and improve data
integrity ( https://en.wikipedia.org/wiki/Database_normalization,
2018). Normalization is comprised of several rules known as normal forms. The
first three are considered the most important.
First Normal Form
The
first normal form states that each column contains values that cannot be
further divided without losing their meaning (INFO 330, Module 1 Notes). For
example, a person’s name can be further divided into first name, last name, and
even middle name if applicable. However, dividing a first name into individual
letters will cause the data to lose its meaning.
Second Normal Form
To
satisfy the second normal form, every row of data in a table needs to be unique.
You can accomplish this by choosing one or a combination of values in the row
to be the unique identifier and primary key. The second normal form states that
the relation must be in first normal form and every non-prime attribute of the
relation is dependent on the whole of every candidate key (https://en.wikipedia.org/wiki/Second_normal_form,
2018). In other words, every other column in the row must be dependent on the
unique identifier and determined by the primary key.
Third Normal Form
Being
in third normal form means you have to satisfy all previous normal forms and
that no other columns is determined by something other than the chosen primary
key (transitive dependency).
In
summary, all tables in a relational database should be about only one subject
or event and all tables should have a primary key to identify one row from
another. Furthermore, no table should have multi-part or multi-valued fields and
all columns must depend on the composite primary key when there is one.
Defining relationships between tables
After
you successfully divided all the information into separate tables, you need to
bring the information together by defining relationships between tables and
connecting each table to other tables using foreign keys. A foreign key is a
column in a table that uniquely identifies the row of another table. In other
words, foreign keys are used to link one table to another. Tables in a
relational database can have a one-to-one, one-to-many, or many-to-many
relationship. Many-to-many related tables need to be represented by a bridge
table (an associative identity) to create two one-to-many relationships.
Summary
Creating
a well-designed relational database requires multiple steps of careful planning
and execution. To start off, you need to recognize the primary purpose of your
database. Then, you need to figure out what information to collect and store in
the database. Subsequently, you have to decide on how you can split the
information into separate tables while following normalization rules. After
creating all the tables, you need to ask yourself how should I connected these
tables to one another? If there are any many-to-many relationships, you have to
use a bridge table. Following these basic steps will help you design a good
relational database.
Bibliography
“Database
Design Basics.” Access, Microsoft, support.office.com/en-us/article/Database-design-basics-EB2159CF-1E30-401A-8084-BD4F9C9CA1F5.
Hock-Chuan,
Chua. “A Quick-Start Tutorial on Relational Database Design.” Relational
Database Design, Sept. 2010, www.ntu.edu.sg/home/ehchua/programming/sql/relational_database_design.html.
Nguyen, Kim.
“Relational Database Schema Design Overview – Kim Nguyen –
Medium.” Medium, Medium, 4 Oct. 2017, medium.com/@kimtnguyen/relational-database-schema-design-overview-70e447ff66f9.
INFO 330,
Module 1 - Various Wikipedia articles