Skip to main content
Engineering LibreTexts

6.8: Inner and Outer Joins

  • Page ID
    23927
  • \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \( \newcommand{\id}{\mathrm{id}}\) \( \newcommand{\Span}{\mathrm{span}}\)

    ( \newcommand{\kernel}{\mathrm{null}\,}\) \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\) \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\) \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\id}{\mathrm{id}}\)

    \( \newcommand{\Span}{\mathrm{span}}\)

    \( \newcommand{\kernel}{\mathrm{null}\,}\)

    \( \newcommand{\range}{\mathrm{range}\,}\)

    \( \newcommand{\RealPart}{\mathrm{Re}}\)

    \( \newcommand{\ImaginaryPart}{\mathrm{Im}}\)

    \( \newcommand{\Argument}{\mathrm{Arg}}\)

    \( \newcommand{\norm}[1]{\| #1 \|}\)

    \( \newcommand{\inner}[2]{\langle #1, #2 \rangle}\)

    \( \newcommand{\Span}{\mathrm{span}}\) \( \newcommand{\AA}{\unicode[.8,0]{x212B}}\)

    \( \newcommand{\vectorA}[1]{\vec{#1}}      % arrow\)

    \( \newcommand{\vectorAt}[1]{\vec{\text{#1}}}      % arrow\)

    \( \newcommand{\vectorB}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vectorC}[1]{\textbf{#1}} \)

    \( \newcommand{\vectorD}[1]{\overrightarrow{#1}} \)

    \( \newcommand{\vectorDt}[1]{\overrightarrow{\text{#1}}} \)

    \( \newcommand{\vectE}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash{\mathbf {#1}}}} \)

    \( \newcommand{\vecs}[1]{\overset { \scriptstyle \rightharpoonup} {\mathbf{#1}} } \)

    \( \newcommand{\vecd}[1]{\overset{-\!-\!\rightharpoonup}{\vphantom{a}\smash {#1}}} \)

    \(\newcommand{\avec}{\mathbf a}\) \(\newcommand{\bvec}{\mathbf b}\) \(\newcommand{\cvec}{\mathbf c}\) \(\newcommand{\dvec}{\mathbf d}\) \(\newcommand{\dtil}{\widetilde{\mathbf d}}\) \(\newcommand{\evec}{\mathbf e}\) \(\newcommand{\fvec}{\mathbf f}\) \(\newcommand{\nvec}{\mathbf n}\) \(\newcommand{\pvec}{\mathbf p}\) \(\newcommand{\qvec}{\mathbf q}\) \(\newcommand{\svec}{\mathbf s}\) \(\newcommand{\tvec}{\mathbf t}\) \(\newcommand{\uvec}{\mathbf u}\) \(\newcommand{\vvec}{\mathbf v}\) \(\newcommand{\wvec}{\mathbf w}\) \(\newcommand{\xvec}{\mathbf x}\) \(\newcommand{\yvec}{\mathbf y}\) \(\newcommand{\zvec}{\mathbf z}\) \(\newcommand{\rvec}{\mathbf r}\) \(\newcommand{\mvec}{\mathbf m}\) \(\newcommand{\zerovec}{\mathbf 0}\) \(\newcommand{\onevec}{\mathbf 1}\) \(\newcommand{\real}{\mathbb R}\) \(\newcommand{\twovec}[2]{\left[\begin{array}{r}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\ctwovec}[2]{\left[\begin{array}{c}#1 \\ #2 \end{array}\right]}\) \(\newcommand{\threevec}[3]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\cthreevec}[3]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \end{array}\right]}\) \(\newcommand{\fourvec}[4]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\cfourvec}[4]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \end{array}\right]}\) \(\newcommand{\fivevec}[5]{\left[\begin{array}{r}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\cfivevec}[5]{\left[\begin{array}{c}#1 \\ #2 \\ #3 \\ #4 \\ #5 \\ \end{array}\right]}\) \(\newcommand{\mattwo}[4]{\left[\begin{array}{rr}#1 \amp #2 \\ #3 \amp #4 \\ \end{array}\right]}\) \(\newcommand{\laspan}[1]{\text{Span}\{#1\}}\) \(\newcommand{\bcal}{\cal B}\) \(\newcommand{\ccal}{\cal C}\) \(\newcommand{\scal}{\cal S}\) \(\newcommand{\wcal}{\cal W}\) \(\newcommand{\ecal}{\cal E}\) \(\newcommand{\coords}[2]{\left\{#1\right\}_{#2}}\) \(\newcommand{\gray}[1]{\color{gray}{#1}}\) \(\newcommand{\lgray}[1]{\color{lightgray}{#1}}\) \(\newcommand{\rank}{\operatorname{rank}}\) \(\newcommand{\row}{\text{Row}}\) \(\newcommand{\col}{\text{Col}}\) \(\renewcommand{\row}{\text{Row}}\) \(\newcommand{\nul}{\text{Nul}}\) \(\newcommand{\var}{\text{Var}}\) \(\newcommand{\corr}{\text{corr}}\) \(\newcommand{\len}[1]{\left|#1\right|}\) \(\newcommand{\bbar}{\overline{\bvec}}\) \(\newcommand{\bhat}{\widehat{\bvec}}\) \(\newcommand{\bperp}{\bvec^\perp}\) \(\newcommand{\xhat}{\widehat{\xvec}}\) \(\newcommand{\vhat}{\widehat{\vvec}}\) \(\newcommand{\uhat}{\widehat{\uvec}}\) \(\newcommand{\what}{\widehat{\wvec}}\) \(\newcommand{\Sighat}{\widehat{\Sigma}}\) \(\newcommand{\lt}{<}\) \(\newcommand{\gt}{>}\) \(\newcommand{\amp}{&}\) \(\definecolor{fillinmathshade}{gray}{0.9}\)

    Whenever we use a query to retrieve data from two or more tables, the database query processor performs an operation called a join. In this section, we discuss inner joins, outer joins, and Cartesian products. Following that we discuss some interesting special cases: self-join, anti-join, non-equi joins.

    If we have previously established relationships between tables, and if we have more than one table in a query, then MS Access will create joins based on those relationships. If necessary, we can alter, delete, or include new relationships.

    MS Access creates joins where rows join if the join fields are equal in value; such joins are called equi-joins. If we create a query for the University database and add the Department and Course tables to the relationships area of the query we have:

    Standard equi-join.
    Figure \(\PageIndex{1}\): Standard equi-join.

    If you edit the relationship line (double-click it), you see the join properties:

    Join properties.
    Figure \(\PageIndex{2}\): Join properties.

    Here, we can see the join is based on the common attribute deptCode. If you click on the Join Type button, you will get information on the type of join used. You will see (as the following diagram shows) that Access has selected the first of three options:

    Choosing inner join or outer join.
    Figure \(\PageIndex{3}\): Choosing inner join or outer join.

    Joins can be further characterized as inner or outer joins. Option 1 is an inner join.

    Options 2 and 3 are outer joins. One of these would also be called a Left Outer Join and the other a Right Outer Join. If you examine the SQL statement generated you will see which is used; Left and Right choices are related to the textual expression of the SQL statement – which table name is leftmost/rightmost in the From clause.


    Inner Join

    All of the joins we have seen up to this point have been inner joins. For a row of one table to be included in the result of an inner join, the row must match a row in the other table. Because all joins so far have also been equi-joins the matching is based on the values of the join fields of one table being equal to the values of the join fields of the other table. Consider the inner join between Department and Course based on deptCode:

    Inner join.
    Figure \(\PageIndex{4}\): Inner join.

    If the tables have the contents shown below:

    Table \(\PageIndex{1}\): Course table contents.

    Course

    Dept Code

    Course Number

    Title

    Description

    Credit Hours

    ACS

    1453

    Introduction to Computers

    This course will introduce students to the basic concepts of computers: types of computers, hardware, software, and types of application systems.

    3

    ACS

    1803

    Introduction to Information Systems

    This course examines applications of information technology to businesses and other organizations.

    3

    Table \(\PageIndex{2}\): Department table contents.

    Department

    Dept Code

    Dept Name

    Location

    Phone

    Chair

    ACS

    Applied Computer Science

    3D07

    (204) 786-0300

    Simon Lee

    ENG

    English

    3D05

    (204) 786-9999

    April Jones

    MATH

    Mathematics

    2R33

    (204) 786-0033

    Peter Smith

    then the result of running the query is

    Table \(\PageIndex{3}\): Query result.

    Dept Name

    Course Number

    Title

    Applied Computer Science

    1453

    Introduction to Computers

    Applied Computer Science

    1803

    Introduction to Information Systems

    In the above result notice there is no result line for English or Mathematics; this is because for the sample data there were no rows in Course that joined to the English or Mathematics rows in Department. Both rows in Course have a value of “ACS” in the deptCode field and so they joined to the ACS row in Department.

    This query demonstrates a distinguishing characteristic of the inner join: only rows that match other rows are included in the results.

    Exercises (Set 1)

    Exercise \(\PageIndex{1}\)

    Consider the Library database:

    1. Write a query that joins Loan and Member. List the member name and date due.
    2. Write a query that joins Loan and Book. List the book title and date due.
    3. Write a query that joins all three tables and lists the member name, book title, and date due.

    Exercise \(\PageIndex{2}\)

    Consider the two tables A and B below.

    Table A
    X Y Z
    1 3 5
    2 4 6
    4 9 9
    Table B
    X Y Q
    1 3 5
    1 4 6
    2 4 7
    3 4 5
    1. How many rows are in the result if A and B are joined based on the attribute X?
    2. How many rows are in the result if A and B are joined based on both attributes X and Y?

    Outer Join

    Consider the Company database. Suppose we wanted to produce a report that lists each department and its employees, and that we must include every department. The two tables would be joined based on equal values of the dept id field. We want all departments and we know that an inner join will not include a department if there are no employees for the department to join to. To get all departments included when we are joining two tables we must use an outer join.

    Consider the query that is started below:

    Initial query.
    Figure \(\PageIndex{5}\): Initial query.

    By default the join is an inner join, but with MS Access you can get an outer join if you edit the relationship and specify either option 2 or option 3, as shown in the dialogue below:

    Default property is option 1.
    Figure \(\PageIndex{6}\): Default property is option 1.

    By choosing option 2 your query will include all departments whether or not the department can join to an employee. If there is no employee for a department to join to, then the row is joined to a row of nulls. When you do this notice the change in the relationship line – it is now a directed line; this is how MS Access illustrates outer joins:

    Outer join – all rows of Department.
    Figure \(\PageIndex{7}\): Outer join – all rows of Department.

    The first few rows of the result are:

    Table \(\PageIndex{4}\): Query result.

    deptId

    department

    dept

    lastName

    4

    Special Operations

    3

    Sales

    3

    Long

    3

    Sales

    3

    Craft

    3

    Sales

    3

    Watkins

    Notice that the Special Operations department joined to a null row.

    Exercises (Set 2)

    Exercise \(\PageIndex{3}\)

    Consider the Company database and list each department and the number of employees in the department.

    Exercise \(\PageIndex{4}\)

    Consider the Orders database.

    1. Create a query to list each customer and their orders (order id and order date). Are there any customers who have not placed an order?
    2. Modify the above query to list each customer and the number of orders they have placed (include all customers).

    Exercise \(\PageIndex{5}\)

    Consider the library database.

    1. Create a query that will list every book and the date it was borrowed. Include all books in your result.
    2. Create a query to list every library member and the dates they borrowed books. Include all members
    3. Try creating a query that will list books that have never been borrowed.
    4. Try creating a query to list any members who have not borrowed a book.

    Cartesian Product

    Suppose you create a query, but without a join criteria. This is easily done by clicking on the relationship line and deleting it. When criteria for matching rows is not present, then each row of one table will join to each row of the other table.

    This type of join is called a Cartesian Product and these can easily have very large result sets. If Department has 4 rows and Employee has 100 rows then the Cartesian Product has (4x100=) 400 rows. Databases used in practice have hundreds, thousands, even millions of rows; a Cartesian Product may take a long, long time to run.

    Exercises (Set 3)

    Exercise \(\PageIndex{6}\)

    Consider the Sales database and its Store and Product tables. Construct a query to list the storeID and the productID. When you add Store and Product to the relationships area there is a line joining the two tables. Delete the join line. Run the query. Notice how many rows there are; the number of rows in the result set is the number of stores times the number of products.Exercise \(\PageIndex{1}\)

    Exercise \(\PageIndex{7}\)

    Consider the Sales database and its Store, Product, and Sales tables. Suppose we want to obtain a list that shows for each store and product the total quantity sold. Note that the end user wants to see every store and product combination.

    The second query is developed as a join between the query CP and the table Sales. CP is outer-joined to Sales in order that every combination of Store and Product is in the final result.

    Hint

    An approach you can use with MS Access is to create two queries. The first of these performs a cross product of store and product (call this CP).


    Self-Join

    A self-join, also called a recursive join, is a case where a table is joined to itself.

    Consider the Company database and suppose we must obtain a list of employees who report to another employee named Raphael Delaney (i.e. List the employees Raphael Delaney supervises). To do this we need to find the row in Employee for Raphael Delaney and then join that row to other rows of Employee where the supervisor field is equal to the empId field for Raphael. When we build the query in MS Access we simply add the Employee table to the relationships area twice. One copy of Employee will be named Employee_1. Consider the following query:

    Self-join.
    Figure \(\PageIndex{8}\): Self-join.

    Note the following:

    • the criteria specifies the row in Employee will be that of Raphael Delaney
    • the join line connects supervisor to empId and so rows of Employee_1 will be employees who report to Raphael.

    Exercises (Set 4)

    Exercise \(\PageIndex{8}\)

    Consider the Genealogy database and develop queries to obtain:

    1. the father of Peter Chan.
    2. the mother of Peter Chan.
    3. the father and mother of Peter Chan.
    4. the children of Peter Chan.
    5. the grandchildren of Peter Chan.

    Exercise \(\PageIndex{9}\)

    Consider the Orders database and the Employee table.

    1. Write a query to list the employee who does not report to anyone.
    2. Write a query to list each employee and the number of employees they supervise.

    Anti-Join

    Suppose we need to list persons in our Company database that are not supervising anyone. One way of looking at this problem is to say we need to find those people that do not join to someone else based on the supervises relationship. That is, we need to find those employees whose employee id does not appear in the supervisor field of any employee.

    To do this with MS Access we can construct a query that uses an outer join to connect an employee to another employee based on employeeID equaling supervisor, but where the supervisor value is null. That is, we are looking for an employee who, in an outer join, does not join to another employee. See the query below:

    Anti-join.
    Figure \(\PageIndex{9}\): Anti-join.

    This query involves a join, specifically an outer join, and because it retrieves that rows that do not join, it is sometimes referred to as a special case – an anti-join.

    Exercise \(\PageIndex{10}\)

    Consider the Genealogy database and develop a query to find people that do not have any sons.

    Exercise \(\PageIndex{11}\)

    Consider the Genealogy database and develop a query to find people that do not have any daughters.


    Non-Equi Join

    A non-equi join is any join where the join criteria does not specify equals, “=”.

    Suppose we wish to list all persons in the Genealogy database who are younger than, say, Peter Chan. One approach to getting the results is to join the row for Peter Chan to another row in Person where the birthdate of Peter Chan is greater than the birthdate of the other person. This type of join would be a “greater than” join as opposed to an equi-join. Proceed in the following way:

    1. Add Person to the relationships area twice so there is a Person table and a Person_1 table. If there are any relationship lines delete them.
    2. In the criteria line for Person fields: for firstName type “Peter” and for LastName type “Chan”.
    3. In the criteria line for birthDate in Person_1 type “> [Person].[birthDate]”
      Non-equi join.
      Figure \(\PageIndex{10}\): Non-equi join.

      In this way you are creating a “greater than” join.

    4. Include attributes from Person_1 to display these younger people.
    5. Run your query.

    Exercises (Set 5)

    Exercise \(\PageIndex{12}\)

    Consider the genealogy database.

    1. Run the example from above.
    2. Modify the example to list those people who are older than Peter Chan.

    This page titled 6.8: Inner and Outer Joins is shared under a CC BY-NC-SA license and was authored, remixed, and/or curated by Ron McFadyen.

    • Was this article helpful?