Contents / Previous / Next


Joins Between Tables

A query that accesses multiple rows of the same or different tables at one time is called a join query.

Storage in multitable relations and hence multitable queries involving more than on table are fundamental to relational databases.

To combine relations, i.e. to perform a join, we have to select data from the tables, and relate the tables to each other with conditons on some attributes (often keys attributes).


Table and Column References

When a query involves multiple tables, column names can become confusing. Sometimes two tables may use the same column name. For these reasons, SQL allows you to fully qualify column names by preceding the column name with the table name (separated by a period): > SELECT stud.id FROM stud WHERE id<3; Instead of specifying the table name, you can create a table alias to take the place of the table name in the query. The alias name follows the table name in the FROM clause: > SELECT s.id FROM stud s WHERE id<3; This features is useful in multitable queries.


Join Types

We use the following three tables for examples: SELECT * from teachers; id | name ----+-------- 1 | Volker 2 | Elke (3 rows) SELECT * from projects; id | name | duration | teacher ----+----------+----------+--------- 1 | compiler | 180 | 1 2 | xpaint | 120 | 1 3 | game | 250 | 2 4 | perl | 80 | 4 (4 rows) SELECT * FROM assign; project | stud | percentage ---------+------+------------ 1 | 2 | 10 1 | 4 | 60 1 | 1 | 30 2 | 1 | 50 2 | 4 | 50 3 | 2 | 70 3 | 4 | 30 (7 rows)


Cross Join:
Each row in teachers is arbitrarily combined with each row in projects. This is known as a cartesian product. In practical terms a cross join is a join without a join condition:

SELECT * FROM table-1, table-2 or SELECT * FROM table-1 CROSS JOIN table-2

Example:

> SELECT * FROM teachers, projects; id | name | id | name | duration | teacher ----+--------+----+----------+----------+--------- 1 | Volker | 1 | compiler | 180 | 1 1 | Volker | 2 | xpaint | 120 | 1 1 | Volker | 3 | game | 250 | 2 2 | Elke | 1 | compiler | 180 | 1 2 | Elke | 2 | xpaint | 120 | 1 2 | Elke | 3 | game | 250 | 2 (6 rows)


Theta Join: Equi join or non-equi join.

Inner Join = Equi Join:
Inner joins or equi joins are the most common type of joins, they use equality "=" of common attributes to join tables:

SELECT * FROM table-1, table-2 WHERE condition; or SELECT * FROM table-1 [ NATURAL ] INNER JOIN table-2 [ ON join_condition | USING ( join_column_list ) ] join_condition: A qualification condition similar to the WHERE condition.

join_column_list: A USING column list ( a, b, ... ) is shorthand for the ON condition left_table.a = right_table.a AND left_table.b = right_table.b ...

NATURAL: Shorthand for a USING list that mentions all similarly-named columns in the tables.

Example:

SELECT projects.name AS "Project Name", teachers.name AS "Teacher" FROM teachers, projects WHERE teachers.id = projects.teacher; Project Name | Teacher --------------+-------- compiler | Volker xpaint | Volker game | Elke perl | Elke (4 rows)


Not-Equi Join:
Like an inner join, but with an operator different from "=" in the condition, e.g., not equal "<>". Example:

SELECT projects.name AS "Project Name", teachers.name FROM teachers, projects WHERE teachers.id != projects.teacher; Project Name | name --------------+-------- game | Volker perl | Volker compiler | Elke xpaint | Elke (4 rows) Of cause, this result does not make too much sense.


LEFT/RIGHT/FULL Outer Join:
An inner join excludes rows from either table that do not have a matching row in the other table. An outer join combines the unmatched row in one of the tables with an artificial row (all columns set to null) for the other table:

SELECT * FROM table-1 [ NATURAL ] { LEFT | RIGHT | FULL } OUTER JOIN table-2 [ ON join_condition | USING ( join_column_list ) ] Join types:

LEFT OUTER JOIN returns all rows in the qualified Cartesian product (i.e., all combined rows that pass its ON condition), plus one copy of each row in the left-hand table for which there was no right-hand row that passed the ON condition. This left-hand row is extended to the full width of the joined table by inserting NULLs for the right-hand columns.

RIGHT OUTER JOIN returns all the joined rows, plus one row for each unmatched right-hand row (extended with nulls on the left). This is just a notational convenience, since you could convert it to a LEFT OUTER JOIN by switching the left and right inputs.

FULL OUTER JOIN returns all the joined rows, plus one row for each unmatched left-hand row (extended with nulls on the right), plus one row for each unmatched right-hand row (extended with nulls on the left).


(LEFT) OUTER JOIN example "project-percentage assignment":

SELECT assign.project, projects.name, assign.percentage FROM projects LEFT OUTER JOIN assign ON projects.id = assign.project ; project | name | percentage ---------+----------+------------ 1 | compiler | 10 1 | compiler | 60 1 | compiler | 30 2 | xpaint | 50 2 | xpaint | 50 3 | game | 70 3 | game | 30 | perl | (8 rows)


Self Join:
With table aliases, you can join a table to itself. In this case, the same table is given two different alias names. Each alias then represents a different instance of the table.
Example "projects where students 2 and 4 work together":

SELECT DISTINCT * FROM assign a, assign b WHERE a.stud = 2 AND b.stud = 4 AND a.project = b.project; project | stud | percentage | project | stud | percentage ---------+------+------------+---------+------+------------ 1 | 2 | 10 | 1 | 4 | 60 3 | 2 | 70 | 3 | 4 | 30 (2 rows)


Multi-Table Join

The number of tables involved in a join are not restricted, example "Who is working on project 1": SELECT DISTINCT assign.project, projects.name AS "Project", assign.percentage, stud.name AS "Student" FROM projects, assign, stud WHERE projects.id = 1 AND projects.id = assign.project AND assign.stud = stud.id ; project | Project | percentage | Student ---------+----------+------------+--------- 1 | compiler | 10 | lisa 1 | compiler | 30 | fred 1 | compiler | 60 | john (3 rows)


Unjoined Tables

When joining tables, it is necessary to join each table mentioned in the FROM clause by specifying joins in the WHERE clause. If you use a table name in the FROM clause but fail to join it in the WHERE clause, the table is marked as unjoined. It is then paired with every row in the query result.
This result is usually not intended. When a query returns many more rows than expected, look for an unjoined table in the query.