Skip to main content
Engineering LibreTexts

1.1: Relational Databases

  • Page ID
    15489
  • \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

    ( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\id}{\mathrm{id}}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\kernel}{\mathrm{null}\,}\)

    \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\)

    \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\)

    \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    \( \newcommand{\vectorA}[1]{\vec{#1}}      % arrow\)

    \( \newcommand{\vectorAt}[1]{\vec{\text{#1}}}      % arrow\)

    \( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vectorC}[1]{\textbf{#1}} \)

    \( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

    \( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

    \( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

    \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

    Relational Databases were introduced by E. F. Codd in 19691; Codd’s 1970 paper2 is considered one of the great papers in Computer Science.

    We begin with a very small example: a database with one relation, the list of employees shown in Table \(\PageIndex{1}\). You should notice this looks just like a two dimensional table of rows and columns. The name of the table is Employees, each column of the table has its own title, and each row has the same structure. Each row has a value for employee number, first name, last name, and gender. As tables of data appear in so many places (newspaper articles, text books, web pages, etc.) it is very likely you have seen and used this representation for data previously.

    Table \(\PageIndex{1}\): A list of employees.

    Employees

    Employee ID

    First Name

    Last Name

    Gender

    123

    Joe

    Smith

    Male

    333

    Jim

    Jones

    Male

    456

    April

    Smith

    Female

    842

    Jenny

    Jones

    Female

    777

    Tom

    Lee

    Male

    Let us assume the Employees table in Table \(\PageIndex{1}\). has one row for each employee who works for some hypothetical company. Data kept for each employee comprises their employee identification number, their first and last names, and their gender. Information structured in tables is very concise; at a glance we can obtain useful information.

    According to the database design methodology in Information Modeling and Relational Databases3, a database designer must be able to express structured information as verbalizations. A verbalization that fits the information in one row of the Employees table is:

    Employee with ID ... has a first name ..., a last name ..., and is of ... gender

    In verbalizations like this the ellipses are placeholders: we can use values from a single row to create complete statements that explain the meaning of a row. For example,

    Employee with ID 123 has a first name Joe, a last name Smith, and is of Male gender

    Employee with ID 333 has a first name Jim, a last name Jones, and is of Male gender

    A similar approach to organizing knowledge about data appears in the literature on literacy. In the Journal of Reading several articles by Kirsch and Mosenthal discuss the organization of information and its conceptualization as document sentences. In Building Documents by Combining Simple Lists4, Kirsch and Mosenthal present an example based on information from The World Almanac and Book of Facts: 1980 (Newspaper Enterprise Association, p. 427). That data is reproduced in Table \(\PageIndex{2}\).

    Table \(\PageIndex{2}\): Circulation of leading U.S. magazines.

    Circulation of Leading U.S. Magazines

    Magazines

    Circulation

    TV Guide

    19,547,763

    Reader's Digest

    18,094,192

    National Geographic

    10, 249,748

    Better Homes & Gardens

    8,007,202

    Family Circle

    7,611,578

    Woman's Day

    7,535,855

    McCall's

    6,502,880

    A major point the authors make is that such information can be re-conceptualized as a series of simple document sentences formed from a basic document sentence. This document sentence expresses an understanding of the tabular data in natural language. The document sentence for Table \(\PageIndex{2}\) is:

    Magazine X has a circulation of Y.

    Kirsch and Mosenthal use variables (X and Y) to stand for data that comes from a table. Taking values from a row, we plug values for X and Y into the document sentence to obtain sentence instantiations:

    TV Guide has a circulation of 19,547,763.

    Reader's Digest has a circulation of 18,094,192.

    National Geographic has a circulation of 10,249,748.

    Better Homes & Gardens has a circulation of 8,007,202.

    Family Circle has a circulation of 7,611,578.

    Woman's Day has a circulation of 7,535,855.

    McCall's has a circulation of 6,502,880.

    Document sentences and verbalization sentences are essentially the same. Both sentences use natural language to express in words the meaning of tabular data. Whether one is designing databases or reading structured information, it can be useful for understanding to re-formulate data as statements in natural language.

    Let us be a bit formal for a moment. Commercial relational database systems are systems where data is organized into relations. Figure \(\PageIndex{1}\) shows the general structure of a relation. We say a relation comprises a set of tuples where each tuple has the same number of attribute values, where each attribute value is taken from some corresponding domain, and where a domain represents a set of valid values for an attribute.

    General structure of a relation.
    Figure \(\PageIndex{1}\): General structure of a relation.

    The Employees table in Table \(\PageIndex{1}\) can be considered a relation of 5 tuples where each tuple has 4 values drawn from each of the employee identifier, first name, last name, and gender domains. Similarly we can say the lists comprising the Circulation of leading U.S. Magazines in Table \(\PageIndex{2}\) can be considered a relation with 7 tuples each having 2 attribute values.

    Relations are typically implemented in commercial databases as tabular structures comprising rows and a fixed number of columns. Everybody is familiar with tables as they are commonplace in textbooks, papers, magazines, etc. This simplicity of representation is one reason why relational databases have been very successful as repositories for important data.

    Exercises

    To design a database, a database engineer needs to find good representations of how an organization uses data. Good sources include: input forms, reports, web pages, etc. A challenge for database designers is to find these sources and interpret them.

    Exercise \(\PageIndex{1}\)

    Consider the following table of product information sold by ABC Foods. Verbalize the information presented.

    Product ID

    Product Name

    Unit Price

    Units In Stock

    1

    Black Tea

    $2.00

    44

    2

    Green Tea

    $3.00

    33

    3

    Vegetarian Lasagne

    $10.00

    20

    4

    Cajun Seasoning

    $11.00

    29

    5

    Cranberry Sauce

    $21.00

    0

    Exercise \(\PageIndex{2}\)

    Consider the following report that the Human Resources department of ABC Foods must produce. Verbalize the information in that report.

    Employee ID

    First Name

    Last Name

    Department

    1

    John

    Smith

    Receiving

    2

    Lee

    Daniels

    Sales

    3

    April

    Turner

    Sales

    4

    Thomas

    Trump

    Marketing

    5

    Lee

    Smith

    Marketing

    Exercise \(\PageIndex{3}\)

    Suppose the following input form is used to enter contact information. Verbalize the information that is being collected.

    Exercise 1.1.3: Contact Form


    Footnotes

    1. Derivability, Redundancy, and Consistency of Relations Stored in Large Data Banks, IBM Research Report, 1969.
    2. A Relational Model of Data for Large Shared Data Banks, CACM 13, No. 6, June 1970.
    3. Information Modeling and Relational Databases, 2nd edition, by Terry Halpin and Tony Morgan; Morgan Kaufmann Publishers; ISBN -13 978-0-12-373568-3.
    4. Irwin S. Kirsch and Peter B. Mosenthal. Building Documents by Combining Simple Lists. Journal of Reading, Vol. 33, No. 2, pp. 132-134.

    This page titled 1.1: Relational Databases is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Ron McFadyen.

    • Was this article helpful?