ansi joins in oracle 9i

Oracle has introduced ANSI-compliant joins into its SQL implementation in 9i Release One (9.0). This provides an alternative syntax to joining datasets together, which can be used in conjunction, or as an alternative to, existing Oracle syntax. This article briefly introduces the new syntax.

a note on the examples

The examples of ANSI-compliant joins in this article are mostly drawn from the EMP and DEPT tables. These tables are familiar to most developers and contain few records, which limits the output we will generate. Wherever possible, the equivalent Oracle syntax is included for reference.

general syntax

With Oracle, we are used to joining datasets by combinations of datasources in the FROM clause and a WHERE clause to join them together. Datasets are usually tables, but can also be views, in-inline views, subqueries, table functions, nested tables and so on. Oracle join syntax is generally as follows:

SELECT ...
FROM   dataset_one d1
,      dataset_two d2
WHERE  d1.column(s) = d2.column(s)
AND    ...

With this syntax we separate datasources by commas and code a single WHERE clause that will include the join predicates together with any filter predicates we might require. ANSI join syntax is slightly different on two counts. First, we specify the type of join we require and second we separate the join predicates from the filter predicates. ASNI syntax can notionally be expressed as follows:

SELECT ...
FROM   dataset_one d1
       JOIN TYPE
       dataset_two d2
       ON (d1.column(s) = d2.column(s)) --<-- can also use USING (column(s))
WHERE  filter_predicates...

As commented, the ON clause is where we specify our joins. If the column names are the same, we can replace this with a USING clause. We will see examples of both methods for expressing join predicates throughout this article.

Given this pseudo-syntax, we will examples of the following join types in this article.

inner join

When we join two tables or datasets together on an equality (i.e. column or set of columns) we are performing an inner join. The ANSI method for joining EMP and DEPT is as follows.

SQL> SELECT d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         INNER JOIN
  4         emp e
  5  USING (deptno);

DNAME          LOC           ENAME      JOB
-------------- ------------- ---------- ---------
RESEARCH       DALLAS        SMITH      CLERK
SALES          CHICAGO       ALLEN      SALESMAN
SALES          CHICAGO       WARD       SALESMAN
RESEARCH       DALLAS        JONES      MANAGER
SALES          CHICAGO       MARTIN     SALESMAN
SALES          CHICAGO       BLAKE      MANAGER
ACCOUNTING     NEW YORK      CLARK      MANAGER
RESEARCH       DALLAS        SCOTT      ANALYST
ACCOUNTING     NEW YORK      KING       PRESIDENT
SALES          CHICAGO       TURNER     SALESMAN
RESEARCH       DALLAS        ADAMS      CLERK
SALES          CHICAGO       JAMES      CLERK
RESEARCH       DALLAS        FORD       ANALYST
ACCOUNTING     NEW YORK      MILLER     CLERK

14 rows selected.

Note that the INNER keyword is optional and JOIN alone will work. In this example we used the USING(deptno) clause to specify our join predicate. We were able to use this because both tables have the same join-column name. When we use this clause, however, we cannot alias the join columns anywhere in our query as the following example demonstrates.

SQL> SELECT COUNT(d.deptno)
  2  FROM   dept d
  3         INNER JOIN
  4         emp e
  5  USING (deptno);
SELECT COUNT(d.deptno)
             *
ERROR at line 1:
ORA-25154: column part of USING clause cannot have qualifier

The error message is reasonably meaningful (though "alias" might have been a better choice of word than "qualifier"). The correct way to express this query is to simply remove the alias as follows.

SQL> SELECT COUNT(deptno)
  2  FROM   dept d
  3         INNER JOIN
  4         emp e
  5  USING (deptno);

COUNT(DEPTNO)
-------------
           14

1 row selected.

As stated in the syntax overview earlier, we can also use an ON clause to express our join predicates. This is necessary when the join columns have different names or when we wish to alias our columns. The parentheses around the ON clause are optional.

SQL> SELECT d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         INNER JOIN
  4         emp e
  5  ON    (d.deptno = e.deptno);

DNAME          LOC           ENAME      JOB
-------------- ------------- ---------- ---------
RESEARCH       DALLAS        SMITH      CLERK
SALES          CHICAGO       ALLEN      SALESMAN
SALES          CHICAGO       WARD       SALESMAN
RESEARCH       DALLAS        JONES      MANAGER
SALES          CHICAGO       MARTIN     SALESMAN
SALES          CHICAGO       BLAKE      MANAGER
ACCOUNTING     NEW YORK      CLARK      MANAGER
RESEARCH       DALLAS        SCOTT      ANALYST
ACCOUNTING     NEW YORK      KING       PRESIDENT
SALES          CHICAGO       TURNER     SALESMAN
RESEARCH       DALLAS        ADAMS      CLERK
SALES          CHICAGO       JAMES      CLERK
RESEARCH       DALLAS        FORD       ANALYST
ACCOUNTING     NEW YORK      MILLER     CLERK

14 rows selected.

The Oracle equivalent of the inner joins we have seen so far is as follows.

SELECT d.dname, d.loc, e.ename, e.job
FROM   dept d
,      emp e
WHERE  d.deptno = e.deptno;

One of the key differences between ANSI and Oracle syntax is that the former separates the join and filter predicates. On complex statements, this can be extremely useful. In the following example, we will join EMP and DEPT but add a filter predicate on DEPT using the standard WHERE clause.

SQL> SELECT d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         INNER JOIN
  4         emp e
  5  ON    (d.deptno = e.deptno)
  6  WHERE  d.loc = 'DALLAS';

DNAME          LOC           ENAME      JOB
-------------- ------------- ---------- ---------
RESEARCH       DALLAS        SMITH      CLERK
RESEARCH       DALLAS        JONES      MANAGER
RESEARCH       DALLAS        SCOTT      ANALYST
RESEARCH       DALLAS        ADAMS      CLERK
RESEARCH       DALLAS        FORD       ANALYST

5 rows selected.

We could add the filter to the ON clause but this makes no difference to performance, even if the filter is on the second table. Furthermore, adding filter predicates to the ON clause can cause data issues with outer joins (we will see an example of this later), so we should avoid this practice altogether.

For reference, the Oracle syntax for the inner join with an additional filter predicate is as follows.

SELECT d.dname, d.loc, e.ename, e.job
FROM   dept d
,      emp e
WHERE  d.deptno = e.deptno
AND    d.loc = 'DALLAS';

natural join

A natural join will join two datasets on all matching column names, regardless of whether the columns are actually related in anything other than name. For example, the EMP and DEPT tables share one common column name and a natural join between the two tables would be correct in this scenario.

The following example converts our INNER JOIN from previous examples to a NATURAL JOIN.

SQL> SELECT d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         NATURAL JOIN
  4         emp e;

DNAME          LOC           ENAME      JOB
-------------- ------------- ---------- ---------
RESEARCH       DALLAS        SMITH      CLERK
SALES          CHICAGO       ALLEN      SALESMAN
SALES          CHICAGO       WARD       SALESMAN
RESEARCH       DALLAS        JONES      MANAGER
SALES          CHICAGO       MARTIN     SALESMAN
SALES          CHICAGO       BLAKE      MANAGER
ACCOUNTING     NEW YORK      CLARK      MANAGER
RESEARCH       DALLAS        SCOTT      ANALYST
ACCOUNTING     NEW YORK      KING       PRESIDENT
SALES          CHICAGO       TURNER     SALESMAN
RESEARCH       DALLAS        ADAMS      CLERK
SALES          CHICAGO       JAMES      CLERK
RESEARCH       DALLAS        FORD       ANALYST
ACCOUNTING     NEW YORK      MILLER     CLERK

14 rows selected.

Note that the only identifiable benefit of NATURAL JOIN is that we do not need to specify a join predicate. Oracle determines all matching column names and uses these to join the two tables. We cannot alias any columns used in the natural join, as the following example demonstrates.

SQL> SELECT COUNT(d.deptno)
  2  FROM   dept d
  3         NATURAL JOIN
  4         emp e;
SELECT COUNT(d.deptno)
             *
ERROR at line 1:
ORA-25155: column used in NATURAL join cannot have qualifier

As we saw earlier with the USING clause for our INNER JOIN examples, the solution is to remove any aliases on shared column names.

SQL> SELECT COUNT(deptno)
  2  FROM   dept d
  3         NATURAL JOIN
  4         emp e;

COUNT(DEPTNO)
-------------
           14

1 row selected.

With EMP and DEPT as our demonstration tables, we are fortunate that the NATURAL JOIN gives us the correct answer. Commercial data models are not always so rigid in the naming of columns and the data they contain, especially as systems grow over time. We can imagine the consequences of adding another column to EMP that shares the same name as one in DEPT.

SQL> ALTER TABLE emp ADD loc VARCHAR2(10) DEFAULT 'LONDON';

Table altered.

SQL> SELECT *
  2  FROM   dept d
  3         NATURAL JOIN
  4         emp e;

no rows selected

Probably the best advice to offer regarding NATURAL JOIN is to avoid it! Note that there is no equivalent Oracle syntax.

left outer join

In traditional Oracle syntax, outer joins are indicated by (+) and this can sometimes cause issues when attempting to outer join multiple tables or includeg expressions in join predicates. Oracle outer joins have no concept of direction, whereas ANSI-compliant outer joins do. In the following example, we will outer join DEPT to EMP using the ANSI LEFT OUTER JOIN. The way to interpret this is to read the tables in the FROM clause from left to right. The left-hand table is the superset and the table on the right is the potentially deficient set.

SQL> SELECT deptno, d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         LEFT OUTER JOIN
  4         emp e
  5  USING (deptno);

    DEPTNO DNAME          LOC           ENAME      JOB
---------- -------------- ------------- ---------- ---------
        20 RESEARCH       DALLAS        SMITH      CLERK
        30 SALES          CHICAGO       ALLEN      SALESMAN
        30 SALES          CHICAGO       WARD       SALESMAN
        20 RESEARCH       DALLAS        JONES      MANAGER
        30 SALES          CHICAGO       MARTIN     SALESMAN
        30 SALES          CHICAGO       BLAKE      MANAGER
        10 ACCOUNTING     NEW YORK      CLARK      MANAGER
        20 RESEARCH       DALLAS        SCOTT      ANALYST
        10 ACCOUNTING     NEW YORK      KING       PRESIDENT
        30 SALES          CHICAGO       TURNER     SALESMAN
        20 RESEARCH       DALLAS        ADAMS      CLERK
        30 SALES          CHICAGO       JAMES      CLERK
        20 RESEARCH       DALLAS        FORD       ANALYST
        10 ACCOUNTING     NEW YORK      MILLER     CLERK
        40 OPERATIONS     BOSTON

15 rows selected.

The OUTER keyword is optional but due to the lack of (+) symbols, including it seems to be more descriptive. Note that this example included the USING clause for our outer join predicates, but the ON clause would also work as well. The Oracle syntax for this join is as follows.

SELECT d.deptno, d.dname, d.loc, e.ename, e.job
FROM   dept d
,      emp e
WHERE  d.deptno = e.deptno (+);

outer joins and expressions

One of the benefits of the ANSI outer join syntax is that it makes expressions much easier to work with (on the occasions where we join a column to an expression). By way of simplified example, we can multiply EMP.DEPTNO by 1 to create a small expression in our DEPT-EMP outer join. The Oracle outer join syntax in this case is uninituitive, as we can see below.

SQL> SELECT d.deptno, d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3  ,      emp e
  4  WHERE  d.deptno = e.deptno*1 (+);
WHERE  d.deptno = e.deptno*1 (+)
                             *
ERROR at line 4:
ORA-00933: SQL command not properly ended

The correct way to write this join predicate is as follows which can get tricky with more complicated expressions, especially those involving functions.

SQL> SELECT d.deptno, d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3  ,      emp e
  4  WHERE  d.deptno = e.deptno (+) * 1;
  
    DEPTNO DNAME          LOC           ENAME      JOB
---------- -------------- ------------- ---------- ---------
        20 RESEARCH       DALLAS        SMITH      CLERK
        30 SALES          CHICAGO       ALLEN      SALESMAN
        30 SALES          CHICAGO       WARD       SALESMAN
        20 RESEARCH       DALLAS        JONES      MANAGER
        30 SALES          CHICAGO       MARTIN     SALESMAN
        30 SALES          CHICAGO       BLAKE      MANAGER
        10 ACCOUNTING     NEW YORK      CLARK      MANAGER
        20 RESEARCH       DALLAS        SCOTT      ANALYST
        10 ACCOUNTING     NEW YORK      KING       PRESIDENT
        30 SALES          CHICAGO       TURNER     SALESMAN
        20 RESEARCH       DALLAS        ADAMS      CLERK
        30 SALES          CHICAGO       JAMES      CLERK
        20 RESEARCH       DALLAS        FORD       ANALYST
        10 ACCOUNTING     NEW YORK      MILLER     CLERK
        40 OPERATIONS     BOSTON

15 rows selected.

ANSI syntax makes this much easier as no special considerations are needed, as follows.

SQL> SELECT d.deptno, d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         LEFT OUTER JOIN
  4         emp e
  5         ON (d.deptno = e.deptno * 1);

    DEPTNO DNAME          LOC           ENAME      JOB
---------- -------------- ------------- ---------- ---------
        20 RESEARCH       DALLAS        SMITH      CLERK
        30 SALES          CHICAGO       ALLEN      SALESMAN
        30 SALES          CHICAGO       WARD       SALESMAN
        20 RESEARCH       DALLAS        JONES      MANAGER
        30 SALES          CHICAGO       MARTIN     SALESMAN
        30 SALES          CHICAGO       BLAKE      MANAGER
        10 ACCOUNTING     NEW YORK      CLARK      MANAGER
        20 RESEARCH       DALLAS        SCOTT      ANALYST
        10 ACCOUNTING     NEW YORK      KING       PRESIDENT
        30 SALES          CHICAGO       TURNER     SALESMAN
        20 RESEARCH       DALLAS        ADAMS      CLERK
        30 SALES          CHICAGO       JAMES      CLERK
        20 RESEARCH       DALLAS        FORD       ANALYST
        10 ACCOUNTING     NEW YORK      MILLER     CLERK
        40 OPERATIONS     BOSTON

15 rows selected.

outer joins and predicates

Remember that ANSI join syntax separates join predicates from filter predicates. In the following example, we outer join DEPT and EMP but limit the results to those employees working in DALLAS only.

SQL> SELECT d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         LEFT OUTER JOIN
  4         emp e
  5  ON    (d.deptno = e.deptno)
  6  WHERE  d.loc = 'DALLAS';

DNAME          LOC           ENAME      JOB
-------------- ------------- ---------- ---------
RESEARCH       DALLAS        SMITH      CLERK
RESEARCH       DALLAS        JONES      MANAGER
RESEARCH       DALLAS        SCOTT      ANALYST
RESEARCH       DALLAS        ADAMS      CLERK
RESEARCH       DALLAS        FORD       ANALYST

5 rows selected.

While it is syntactically possible to move the filter predicates into the join clause, in the case of an ANSI outer join, this can give incorrect results if the filter is on a column from the superset (or "driving table"). In the following example, we move the filter predicate from our previous query into the ON clause.

SQL> SELECT d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         LEFT OUTER JOIN
  4         emp e
  5  ON    (d.deptno = e.deptno
  6  AND    d.loc = 'DALLAS');

DNAME          LOC           ENAME      JOB
-------------- ------------- ---------- ---------
ACCOUNTING     NEW YORK
RESEARCH       DALLAS        SMITH      CLERK
RESEARCH       DALLAS        JONES      MANAGER
RESEARCH       DALLAS        SCOTT      ANALYST
RESEARCH       DALLAS        ADAMS      CLERK
RESEARCH       DALLAS        FORD       ANALYST
SALES          CHICAGO
OPERATIONS     BOSTON

8 rows selected.

We can see that these results are wrong and that the driving table has all of its data returned by this query. What if we wish to outer join to a subset of EMP? In this case, the filter predicates on EMP columns can be safely included in the join predicates, as follows.

SQL> SELECT d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         LEFT OUTER JOIN
  4         emp e
  5  ON    (d.deptno = e.deptno
  6  AND    e.job = 'SALESMAN');

DNAME          LOC           ENAME      JOB
-------------- ------------- ---------- ---------
SALES          CHICAGO       ALLEN      SALESMAN
SALES          CHICAGO       WARD       SALESMAN
SALES          CHICAGO       MARTIN     SALESMAN
SALES          CHICAGO       TURNER     SALESMAN
OPERATIONS     BOSTON
RESEARCH       DALLAS
ACCOUNTING     NEW YORK

7 rows selected.

If we move the EMP filter predicates to the WHERE clause, we see inconsistent results again, as follows. This is the semantic equivalent of an inner join because all outer joined records will have a NULL JOB.

SQL> SELECT d.dname, d.loc, e.ename, e.job
  2  FROM   dept d
  3         LEFT OUTER JOIN
  4         emp e
  5  ON    (d.deptno = e.deptno)
  6  WHERE  e.job = 'SALESMAN';

DNAME          LOC           ENAME      JOB
-------------- ------------- ---------- ---------
SALES          CHICAGO       ALLEN      SALESMAN
SALES          CHICAGO       WARD       SALESMAN
SALES          CHICAGO       MARTIN     SALESMAN
SALES          CHICAGO       TURNER     SALESMAN

4 rows selected.

For reference, the Oracle equivalent of this type of outer join is as follows.

SELECT d.dname, d.loc, e.ename, e.job
FROM   dept d
,     (SELECT ename, job, deptno
       FROM   emp
       WHERE  job = 'SALESMAN') e
WHERE  d.deptno = e.deptno (+);

outer joins as anti-joins

Outer joins are often used as an alternative to anti-joins, particularly NOT IN queries. The Oracle syntax for this is as follows.

SQL> SELECT d.dname, d.loc
  2  FROM   dept d
  3  ,      emp e
  4  WHERE  d.deptno = e.deptno (+)
  5  AND    e.deptno IS NULL;

DNAME          LOC
-------------- -------------
OPERATIONS     BOSTON

1 row selected.

This can be expressed as a NOT IN query as follows.

SQL> SELECT d.dname, d.loc
  2  FROM   dept d
  3  WHERE  d.deptno NOT IN (SELECT e.deptno FROM emp e);

DNAME          LOC
-------------- -------------
OPERATIONS     BOSTON

1 row selected.

The ANSI equivalent for this technique is shown below.

SQL> SELECT d.dname, d.loc
  2  FROM   dept d
  3         LEFT OUTER JOIN
  4         emp e
  5  ON    (d.deptno = e.deptno)
  6  WHERE  e.deptno IS NULL;

DNAME          LOC          
-------------- -------------
OPERATIONS     BOSTON

1 row selected.

right outer join

As its name suggests, a right outer join is an outer join where datasets are driven from right to left (i.e. related to the ordering in the query). There are a few occasions where we might want to use RIGHT OUTER JOIN, but generally, there is nothing to be gained from switching the ordering of the tables in this way.

Everything described in the left outer join section of this article applies equally to right outer joins so we will not spend too much time on these. The following example simply shows how to use this type of join. Again, the OUTER keyword is optional.

SQL> SELECT deptno, d.dname, d.loc, e.ename, e.job
  2  FROM   emp  e
  3         RIGHT OUTER JOIN
  4         dept d
  5  USING (deptno);

    DEPTNO DNAME          LOC           ENAME      JOB
---------- -------------- ------------- ---------- ---------
        20 RESEARCH       DALLAS        SMITH      CLERK
        30 SALES          CHICAGO       ALLEN      SALESMAN
        30 SALES          CHICAGO       WARD       SALESMAN
        20 RESEARCH       DALLAS        JONES      MANAGER
        30 SALES          CHICAGO       MARTIN     SALESMAN
        30 SALES          CHICAGO       BLAKE      MANAGER
        10 ACCOUNTING     NEW YORK      CLARK      MANAGER
        20 RESEARCH       DALLAS        SCOTT      ANALYST
        10 ACCOUNTING     NEW YORK      KING       PRESIDENT
        30 SALES          CHICAGO       TURNER     SALESMAN
        20 RESEARCH       DALLAS        ADAMS      CLERK
        30 SALES          CHICAGO       JAMES      CLERK
        20 RESEARCH       DALLAS        FORD       ANALYST
        10 ACCOUNTING     NEW YORK      MILLER     CLERK
        40 OPERATIONS     BOSTON

15 rows selected.

Note that all we have done is to switch the ordering of the tables in the FROM clause such that Oracle will join from right to left. The Oracle equivalent of this query is as follows.

SELECT d.deptno, d.dname, d.loc, e.ename, e.job
FROM   dept d
,      emp e
WHERE  e.deptno (+) = d.deptno;

full outer join

A full outer join will join two datasets from left-to-right and right-to-left. Records that join in both directions are output once to avoid duplication. In the following demonstration, we will create two different subsets of the EMP table. The two subsets will share a small number of common records but each will also have some unique data.

SQL> CREATE TABLE e1 AS SELECT * FROM emp WHERE deptno IN (10,20);

Table created.

SQL> CREATE TABLE e2 AS SELECT * FROM emp WHERE deptno IN (20,30);

Table created.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER,'E1');

PL/SQL procedure successfully completed.

SQL> exec DBMS_STATS.GATHER_TABLE_STATS(USER,'E2');

PL/SQL procedure successfully completed.

Using SQL Trace to generate an execution plan, we can now see an ANSI FULL OUTER JOIN between these two subsets of EMP.

SQL> ALTER SESSION SET SQL_TRACE = TRUE;

Session altered.

SQL> SELECT e1.ename, e1.deptno, e1.job
  2  ,      e2.ename, e2.deptno, e2.job
  3  FROM   e1
  4         FULL OUTER JOIN
  5         e2
  6  USING (empno);

ENAME          DEPTNO JOB       ENAME          DEPTNO JOB
---------- ---------- --------- ---------- ---------- ---------
SMITH              20 CLERK     SMITH              20 CLERK
JONES              20 MANAGER   JONES              20 MANAGER
SCOTT              20 ANALYST   SCOTT              20 ANALYST
ADAMS              20 CLERK     ADAMS              20 CLERK
FORD               20 ANALYST   FORD               20 ANALYST
KING               10 PRESIDENT
CLARK              10 MANAGER
MILLER             10 CLERK
                                TURNER             30 SALESMAN
                                BLAKE              30 MANAGER
                                ALLEN              30 SALESMAN
                                WARD               30 SALESMAN
                                MARTIN             30 SALESMAN
                                JAMES              30 CLERK

14 rows selected.

We can see that records from both datasets appear in the output, even if they do not have a match in the opposite table. This is what a full outer join does and can be useful when partial datasets (such as delta files) need to be joined.

From the trace file, the execution plan for this query is as follows.


Rows     Row Source Operation
-------  ---------------------------------------------------
     14  VIEW  
     14   UNION-ALL  
      8    HASH JOIN OUTER 
      8     TABLE ACCESS FULL E1 
     11     TABLE ACCESS FULL E2 
      6    HASH JOIN ANTI 
     11     TABLE ACCESS FULL E2 
      8     TABLE ACCESS FULL E1 

The full outer join has performed two separate query blocks (one outer join and one anti-join) and appended the results with UNION ALL as we can see in the execution plan.

This type of ANSI join does not have a direct equivalent in Oracle syntax, but we can simulate it with the following query.

SQL> SELECT e1.ename, e1.deptno, e1.job
  2  ,      e2.ename, e2.deptno, e2.job
  3  FROM   e1
  4  ,      e2
  5  WHERE  e1.empno (+) = e2.empno
  6  UNION
  7  SELECT e1.ename, e1.deptno, e1.job
  8  ,      e2.ename, e2.deptno, e2.job
  9  FROM   e1
 10  ,      e2
 11  WHERE  e1.empno = e2.empno (+);

ENAME          DEPTNO JOB       ENAME          DEPTNO JOB
---------- ---------- --------- ---------- ---------- ---------
ADAMS              20 CLERK     ADAMS              20 CLERK
CLARK              10 MANAGER
FORD               20 ANALYST   FORD               20 ANALYST
JONES              20 MANAGER   JONES              20 MANAGER
KING               10 PRESIDENT
MILLER             10 CLERK
SCOTT              20 ANALYST   SCOTT              20 ANALYST
SMITH              20 CLERK     SMITH              20 CLERK
                                ALLEN              30 SALESMAN
                                BLAKE              30 MANAGER
                                JAMES              30 CLERK
                                MARTIN             30 SALESMAN
                                TURNER             30 SALESMAN
                                WARD               30 SALESMAN

14 rows selected.

This time the execution plan looks slightly different, as follows.


Rows     Row Source Operation
-------  ---------------------------------------------------
     14  SORT UNIQUE 
     19   UNION-ALL  
     11    HASH JOIN OUTER 
     11     TABLE ACCESS FULL E2 
      8     TABLE ACCESS FULL E1 
      8    HASH JOIN OUTER 
      8     TABLE ACCESS FULL E1 
     11     TABLE ACCESS FULL E2 

Oracle has used two outer joins in opposite directions to perform this query, which is in fact what we asked it to do in the way we structured our query.

A final point to note is that full outer joins are "expensive" operations and can have a fine tipping-point at which performance degrades. This point can be at relatively low volumes for some systems (especially when outer joining more than two datasets) so is something for developers to be aware of while testing.

cross join

A cross join is what we call a Cartesian Product in Oracle. A Cartesian Product "joins" each row in the first dataset to every row in the second dataset. To understand how many records are generated by a Cartesian Product or cross join, simply multiply the number of rows in the two datasets together. In the following example, we will cross join two small datasets of 2 rows each, thereby outputting 4 rows. This will keep the screen output to a minimum for demonstration purposes.

SQL> WITH two_rows AS (
  2     SELECT 1 AS r FROM dual
  3     UNION ALL
  4     SELECT 2 AS r FROM dual
  5     )
  6  SELECT a.r, b.r
  7  FROM   two_rows a
  8         CROSS JOIN
  9         two_rows b;

         R          R
---------- ----------
         1          1
         2          1
         1          2
         2          2

4 rows selected.

We can see that there is no join predicate with a cross join, though we can add filter predicates (WHERE clause) if required. Note how each row in dataset A has joined to both rows in dataset B. The Oracle equivalent of this syntax is as follows.

SQL> WITH two_rows AS (
  2     SELECT 1 AS r FROM dual
  3     UNION ALL
  4     SELECT 2 AS r FROM dual
  5     )
  6  SELECT a.r, b.r
  7  FROM   two_rows a
  8  ,      two_rows b;

         R          R
---------- ----------
         1          1
         2          1
         1          2
         2          2

4 rows selected.

mutable join

Mutable joins are simply multi-table joins. These can include any combination of ANSI joins as required.

mutable inner join

We can see an example of this by joining DEPT to EMP and to EMP again as follows.

SQL> SELECT d.dname, d.loc, e.ename, e.job, e2.sal
  2  FROM   dept d
  3         INNER JOIN
  4         emp e
  5  USING (deptno)
  6         INNER JOIN
  7         emp e2
  8  USING (empno);

DNAME          LOC           ENAME      JOB              SAL
-------------- ------------- ---------- --------- ----------
RESEARCH       DALLAS        SMITH      CLERK            800
SALES          CHICAGO       ALLEN      SALESMAN        1600
SALES          CHICAGO       WARD       SALESMAN        1250
RESEARCH       DALLAS        JONES      MANAGER         2975
SALES          CHICAGO       MARTIN     SALESMAN        1250
SALES          CHICAGO       BLAKE      MANAGER         2850
ACCOUNTING     NEW YORK      CLARK      MANAGER         2450
RESEARCH       DALLAS        SCOTT      ANALYST         3000
ACCOUNTING     NEW YORK      KING       PRESIDENT       5000
SALES          CHICAGO       TURNER     SALESMAN        1500
RESEARCH       DALLAS        ADAMS      CLERK           1100
SALES          CHICAGO       JAMES      CLERK            950
RESEARCH       DALLAS        FORD       ANALYST         3000
ACCOUNTING     NEW YORK      MILLER     CLERK           1300

14 rows selected.

In ANSI syntax, joins are parsed from left to right and join conditions can only reference preceding tables in the FROM clause. We can see this in the following example where we attempt to include a reference to the second EMP join in the first DEPT-EMP join predicates.

SQL> SELECT d.dname, d.loc, e.ename, e.job, e2.sal
  2  FROM   dept d
  3         INNER JOIN
  4         emp e
  5  ON    (d.deptno = e.deptno
  6  AND    e.empno = e2.empno) --<-- E2 has not been joined to query yet
  7         INNER JOIN
  8         emp e2
  9  USING (empno);
AND    e.empno = e2.empno) --<-- E2 has not been joined to query yet
                 *
ERROR at line 6:
ORA-00904: "E2"."EMPNO": invalid identifier

Note that this restriction is syntactic only. It does not mean that the CBO has to physically join the datasets in the order in which they appear. To do so would render ANSI-compliant joins too restrictive. As with Oracle join syntax, the CBO is free to order the tables or datasets in the way it considers the "cheapest".

The Oracle syntax for mutable inner joins is as follows.

SELECT d1.dname, d1.loc, e1.ename, e1.job, e2.sal
FROM   dept d1
,      emp  e1
,      emp  e2
WHERE  d1.deptno = e1.deptno
AND    e1.empno = e2.empno;

mutable outer join

One of the benefits of ANSI syntax is that it makes outer joins easier to write and understand. It also helps us to workaround the minor restrictions of Oracle syntax (we saw an example earlier when attempting to outer join using an expression). Another restriction with Oracle syntax is that a table can only be outer joined once in any one query block.

We can setup an example of this. We will create three small datasets, D1, D2 and D3, as three variations on the contents of DEPT as follows.

SQL> CREATE TABLE d1 AS SELECT * FROM dept;

Table created.

SQL> CREATE TABLE d2 AS SELECT * FROM dept WHERE ROWNUM <= 3;

Table created.

SQL> CREATE TABLE d3 AS SELECT * FROM dept WHERE ROWNUM <= 2;

Table created.

If we are required to outer join both D1 and D2 to the D3 dataset, Oracle will not allow us to use a single query block for this, as we can see below.

SQL> SELECT d1.dname, d1.deptno, d2.deptno, d3.deptno
  2  FROM   d1
  3  ,      d2
  4  ,      d3
  5  WHERE  d1.deptno = d3.deptno(+)
  6  AND    d2.deptno = d3.deptno(+);
WHERE  d1.deptno = d3.deptno(+)
                 *
ERROR at line 5:
ORA-01417: a table may be outer joined to at most one other table

The Oracle workaround to this is to use an in-line view for one of the outer joins, as follows.

SQL> SELECT d1.dname, d1.deptno, d2_d3.d2_deptno, d2_d3.d3_deptno
  2  FROM   d1
  3  ,     (
  4         SELECT d2.deptno AS d2_deptno
  5         ,      d3.deptno AS d3_deptno
  6         FROM   d2
  7         ,      d3
  8         WHERE  d2.deptno = d3.deptno (+)
  9        ) d2_d3
 10  WHERE  d1.deptno = d2_d3.d3_deptno(+);

DNAME              DEPTNO  D2_DEPTNO  D3_DEPTNO
-------------- ---------- ---------- ----------
ACCOUNTING             10         10         10
RESEARCH               20         20         20
SALES                  30
OPERATIONS             40

4 rows selected.

ANSI syntax makes this easier as we can combine left and right outer joins in a single query as follows.

SQL> SELECT d1.dname, d1.deptno, d2.deptno, d3.deptno
  2  FROM   d2
  3         LEFT OUTER JOIN
  4         d3
  5         ON (d2.deptno = d3.deptno)
  6         RIGHT OUTER JOIN
  7         d1
  8         ON (d1.deptno = d3.deptno);

DNAME              DEPTNO     DEPTNO     DEPTNO
-------------- ---------- ---------- ----------
ACCOUNTING             10         10         10
RESEARCH               20         20         20
SALES                  30
OPERATIONS             40

4 rows selected.

mixed mutable joins

Finally, mutable joins can combine any of the ANSI joins. For completeness, we will run a query that combines everything we've seen in this article. The results will be completely useless of course!

SQL> SELECT COUNT(*)
  2  FROM   dept d1
  3         --
  4         INNER JOIN
  5         dept d2
  6         USING (deptno)
  7         --
  8         LEFT OUTER JOIN
  9         emp  e1
 10         USING (deptno)
 11         --
 12         RIGHT OUTER JOIN
 13         emp  e2
 14         ON (e1.empno = e2.empno)
 15         --
 16         NATURAL JOIN
 17         emp  e3
 18         --
 19         CROSS JOIN
 20         dept d4;

  COUNT(*)
----------
       784

1 row selected.

ansi joins and the cbo

As we saw in the mutable joins section, the ANSI FROM syntax is ordered from left-to-right and datasets/tables are added and joined to the query one-by-one. While this implies an explicit ordering in the application of join predicates to the SQL, this does not mean that this is how the optimiser (CBO) will deal with the joins. As with queries that use standard Oracle syntax, the CBO is free to shuffle the join orders according to its own costs and arithmetic.

In most cases, the execution plans for equivalent ANSI and Oracle syntax queries will be the same as they are based on the same underlying statistics. The obvious exception to this is the FULL OUTER JOIN that we saw earlier, though we could code an Oracle equivalent based on what we now know about Oracle's treatment of the new ANSI method.

To demonstrate that the order of the ANSI joins does not dictate the actual join order chosen by the CBO, we will run a simple example that joins the D1, D2 and D3 sample tables created earlier in this article. Using Autotrace, we will first see how Oracle might choose to optimise such a query written with ANSI joins as follows.

SQL> set autotrace traceonly explain

SQL> SELECT *
  2  FROM   d1
  3         --
  4         INNER JOIN
  5         d2
  6         USING (deptno)
  7         --
  8         LEFT OUTER JOIN
  9         d3
 10         USING (deptno);

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=3 Bytes=180)
   1    0   HASH JOIN (Cost=8 Card=3 Bytes=180)
   2    1     HASH JOIN (OUTER) (Cost=5 Card=3 Bytes=120)
   3    2       TABLE ACCESS (FULL) OF 'D2' (Cost=2 Card=3 Bytes=57)
   4    2       TABLE ACCESS (FULL) OF 'D3' (Cost=2 Card=2 Bytes=42)
   5    1     TABLE ACCESS (FULL) OF 'D1' (Cost=2 Card=4 Bytes=80)

We can see that the Explain Plan join order does not match the order of the tables in our SQL. The CBO has started with the LEFT OUTER JOIN between D2 and D3, hashing D3 first. If we write this statement using Oracle syntax, we can verify that the optimiser treats it the same as the ANSI equivalent.

SQL> SELECT *
  2  FROM   d1
  3  ,      d2
  4  ,      d3
  5  WHERE  d1.deptno = d2.deptno
  6  AND    d2.deptno = d3.deptno (+);

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=3 Bytes=180)
   1    0   HASH JOIN (Cost=8 Card=3 Bytes=180)
   2    1     HASH JOIN (OUTER) (Cost=5 Card=3 Bytes=120)
   3    2       TABLE ACCESS (FULL) OF 'D2' (Cost=2 Card=3 Bytes=57)
   4    2       TABLE ACCESS (FULL) OF 'D3' (Cost=2 Card=2 Bytes=42)
   5    1     TABLE ACCESS (FULL) OF 'D1' (Cost=2 Card=4 Bytes=80)

Finally, we can add a hint to change the order in which we wish the CBO to join the tables. In the following example, we will try to force the CBO to begin with the D1 table by using the LEADING hint.

SQL> SELECT /*+ LEADING(d1) */
  2         *
  3  FROM   d1
  4         --
  5         INNER JOIN
  6         d2
  7         USING (deptno)
  8         --
  9         LEFT OUTER JOIN
 10         d3
 11         USING (deptno);

Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=8 Card=3 Bytes=180)
   1    0   HASH JOIN (OUTER) (Cost=8 Card=3 Bytes=180)
   2    1     HASH JOIN (Cost=5 Card=3 Bytes=117)
   3    2       TABLE ACCESS (FULL) OF 'D1' (Cost=2 Card=4 Bytes=80)
   4    2       TABLE ACCESS (FULL) OF 'D2' (Cost=2 Card=3 Bytes=57)
   5    1     TABLE ACCESS (FULL) OF 'D3' (Cost=2 Card=2 Bytes=42)

This time, Explain Plan tells us that the CBO would start with D1 as we requested and the outer join is performed after D1 and D2 are hash joined.

further reading

For more information on ANSI join syntax, see the documentation for SELECT in the online SQL Reference.

source code

The source code for the examples in this article can be downloaded from here.

Adrian Billington, January 2003 (updated June 2007)

Back to Top