Monday, November 12, 2018

Designing a Relational Database



Image result for relational database design

Image From Lynda.com – FileMaker Pro: Relational Database Design (2011)

Introduction

A relational database is a digital database based on the relational model of data term as proposed by Edgar Codd in 1970 (https://en.wikipedia.org/wiki/Relational_database, 2018). Relational databases organize data stored in tables or relations. Each table contains rows (also called records or tuple) and columns (also called fields or attributes). The relationships among the tables enables a relational database to efficiently store large amount of data, and effectively retrieve specific data by using a language called Structured Query Language (SQL).

What is essential to good database design?

Any well-designed databases should follow certain principles. One principle is that redundant information is bad because it wastes space and increases the likelihood of encountering errors and experiencing inconsistencies (Microsoft). Another principle is that the accuracy and completeness of information is important. A good database design is one that divides information into subject-based tables to reduce the likelihood of duplicate data and provide users with the ability to join the information in the table together – by using Primary Keys and Foreign Keys. These design principles help ensure the accuracy and integrity of your data and accommodate your data processing and reporting needs.

The database design process

The first step of the database design process is to determine what is the purpose of your database. Then, you want to gather all the information that you want to record in the database. After you know all the types of information you want to put in the database, the next step is organizing all the information into tables through a process known as normalization. All tables in a relational database must adhere these specific normalization guidelines to achieve optimal structure and performance.

What is database normalization?

Database normalization is the process of organizing the columns (attributes) and tables (relations) of a relational database to reduce data redundancy and improve data integrity ( https://en.wikipedia.org/wiki/Database_normalization, 2018). Normalization is comprised of several rules known as normal forms. The first three are considered the most important.

First Normal Form

The first normal form states that each column contains values that cannot be further divided without losing their meaning (INFO 330, Module 1 Notes). For example, a person’s name can be further divided into first name, last name, and even middle name if applicable. However, dividing a first name into individual letters will cause the data to lose its meaning.

Second Normal Form

To satisfy the second normal form, every row of data in a table needs to be unique. You can accomplish this by choosing one or a combination of values in the row to be the unique identifier and primary key. The second normal form states that the relation must be in first normal form and every non-prime attribute of the relation is dependent on the whole of every candidate key (https://en.wikipedia.org/wiki/Second_normal_form, 2018). In other words, every other column in the row must be dependent on the unique identifier and determined by the primary key.

Third Normal Form

Being in third normal form means you have to satisfy all previous normal forms and that no other columns is determined by something other than the chosen primary key (transitive dependency).

In summary, all tables in a relational database should be about only one subject or event and all tables should have a primary key to identify one row from another. Furthermore, no table should have multi-part or multi-valued fields and all columns must depend on the composite primary key when there is one.

Defining relationships between tables

After you successfully divided all the information into separate tables, you need to bring the information together by defining relationships between tables and connecting each table to other tables using foreign keys. A foreign key is a column in a table that uniquely identifies the row of another table. In other words, foreign keys are used to link one table to another. Tables in a relational database can have a one-to-one, one-to-many, or many-to-many relationship. Many-to-many related tables need to be represented by a bridge table (an associative identity) to create two one-to-many relationships.

Summary

Creating a well-designed relational database requires multiple steps of careful planning and execution. To start off, you need to recognize the primary purpose of your database. Then, you need to figure out what information to collect and store in the database. Subsequently, you have to decide on how you can split the information into separate tables while following normalization rules. After creating all the tables, you need to ask yourself how should I connected these tables to one another? If there are any many-to-many relationships, you have to use a bridge table. Following these basic steps will help you design a good relational database.

Bibliography

“Database Design Basics.” Access, Microsoft, support.office.com/en-us/article/Database-design-basics-EB2159CF-1E30-401A-8084-BD4F9C9CA1F5.

Hock-Chuan, Chua. “A Quick-Start Tutorial on Relational Database Design.” Relational Database Design, Sept. 2010, www.ntu.edu.sg/home/ehchua/programming/sql/relational_database_design.html.

Nguyen, Kim. “Relational Database Schema Design Overview – Kim Nguyen – Medium.” Medium, Medium, 4 Oct. 2017, medium.com/@kimtnguyen/relational-database-schema-design-overview-70e447ff66f9.


INFO 330, Module 1 - Various Wikipedia articles

No comments:

Post a Comment