Outer Joining With ANSI SQL-89 and SQL-92

Outer Joining With ANSI SQL-89 and SQL-92

Introduction

I recently implemented a quick fix for the rule G-3130 to transform queries to ANSI SQL-92 syntax. One of the challenges was to handle Oracle’s outer-join syntax (+). In ANSI SQL-89, join and filter criteria are part of the WHERE clause. It’s not easy to distinguish them. In this blog post, I explain why it is important to do that when using outer joins, regardless of which join syntax you prefer.

Data Setup

All examples in this blog post are based on the tables DEPT and EMP, that were originally present in every Oracle Database instance as part of the SCOTT schema. Here’s a script to create these tables. The script requires an Oracle Database 23ai.

1) Create table DEPT and EMP
create table if not exists dept as select * from (values
          (10, 'ACCOUNTING', 'NEW YORK'),
          (20, 'RESEARCH',   'DALLAS'),
          (30, 'SALES',      'CHICAGO'),
          (40, 'OPERATIONS', 'BOSTON')
       ) s (deptno, dname, loc);

create table if not exists emp as select * from (values
          (7839, 'KING',   'PRESIDENT', null, date '1981-11-17', 5000, null, 10),
          (7566, 'JONES',  'MANAGER',   7839, date '1981-04-02', 2975, null, 20),
          (7698, 'BLAKE',  'MANAGER',   7839, date '1981-05-01', 2850, null, 30),
          (7782, 'CLARK',  'MANAGER',   7839, date '1981-06-09', 2450, null, 10),
          (7788, 'SCOTT',  'ANALYST',   7566, date '1987-04-19', 3000, null, 20),
          (7902, 'FORD',   'ANALYST',   7566, date '1981-12-03', 3000, null, 20),
          (7499, 'ALLEN',  'SALESMAN',  7698, date '1981-02-20', 1600,  300, 30),
          (7521, 'WARD',   'SALESMAN',  7698, date '1981-02-22', 1250,  500, 30),
          (7654, 'MARTIN', 'SALESMAN',  7698, date '1981-09-28', 1250, 1400, 30),
          (7844, 'TURNER', 'SALESMAN',  7698, date '1981-09-08', 1500,    0, 30),
          (7900, 'JAMES',  'CLERK',     7698, date '1981-12-03',  950, null, 30),
          (7934, 'MILLER', 'CLERK',     7782, date '1982-01-23', 1300, null, 10),
          (7369, 'SMITH',  'CLERK',     7902, date '1980-12-17',  800, null, 20),
          (7876, 'ADAMS',  'CLERK',     7788, date '1987-05-23', 1100, null, 20)
       ) s (empno, ename, job, mgr, hiredate, sal, comm, deptno);

Joins

The predecessor of the ANSI SQL-92 standard is the ANSI SQL-89 standard. Based on that standard, you could write joins only like this:

2) ANSI SQL-89 Join
select dept.deptno, emp.ename
  from dept, emp
 where emp.deptno = dept.deptno
 order by dept.deptno, emp.ename;
    DEPTNO ENAME 
---------- ------
        10 CLARK 
        10 KING  
        10 MILLER
        20 ADAMS 
        20 FORD  
        20 JONES 
        20 SCOTT 
        20 SMITH 
        30 ALLEN 
        30 BLAKE 
        30 JAMES 
        30 MARTIN
        30 TURNER
        30 WARD  

14 rows selected. 

In the FROM clause, we list the tables to be joined, and in the WHERE clause, we define the JOIN criteria. There are no filter criteria in this case.

There are several ways to transform this query to ANSI SQL-92. One rather silly way, but very similar to the original ANSI SQL-89 syntax, is the use of an explicit CROSS JOIN.

3) ANSI SQL-92 Cross Join
select dept.deptno, emp.ename
  from dept cross join emp
 where emp.deptno = dept.deptno
 order by dept.deptno, emp.ename;

While this is technically an ANSI SQL-92 compliant query, it’s not a good option. Why? – ANSI SQL-92 allows us to define the join criteria for each join. Using a CROSS JOIN implies that there is no join criterion between these tables, which is wrong.

A better option is to use an INNER JOIN, like in the next example.

4) ANSI SQL-92 Inner Join
select dept.deptno, emp.ename
  from dept join emp on emp.deptno = dept.deptno
 order by dept.deptno, emp.ename;

We see that the WHERE clause is gone. Why? – Because we do not have filter criteria in this query. No WHERE clause, no filter. As simple as that.

Outer Joins

Now let’s write an outer join with ANSI SQL-89. As I mentioned before, the standard back then only defined INNER JOINs. So, we need to work around it. One way is something like this:

5) ANSI SQL-89 Outer Join
select dept.deptno, emp.ename
  from dept, emp
 where emp.deptno = dept.deptno
union all
select dept.deptno, null
  from dept
 where not exists (
          select 1
            from emp
           where emp.deptno = dept.deptno
       )
 order by deptno, ename;
    DEPTNO ENAME 
---------- ------
        10 CLARK 
        10 KING  
        10 MILLER
        20 ADAMS 
        20 FORD  
        20 JONES 
        20 SCOTT 
        20 SMITH 
        30 ALLEN 
        30 BLAKE 
        30 JAMES 
        30 MARTIN
        30 TURNER
        30 WARD  
        40       

15 rows selected. 

Phew, that’s a lot of work. It’s no surprise that database vendors came up with proprietary solutions for outer joins. For example:

  • *= and =* operators in Sybase/SQL Server
  • (+) operator in Db2 and the Oracle Database

Let’s use the Oracle join syntax.

6) Oracle-style Outer Join
select dept.deptno, emp.ename
  from dept, emp
 where emp.deptno (+) = dept.deptno
 order by dept.deptno, emp.ename;

Much simpler. EMP is outer joined due to the (+) operator.

And here is an ANSI SQL-92 compliant variant:

7) ANSI SQL-92 Outer Join
select dept.deptno, emp.ename
  from dept left join emp on emp.deptno = dept.deptno
 order by dept.deptno, emp.ename;

No WHERE clause, no filter. So far, so good.

Outer Joins With Additional Filters

The problem starts as soon as you use outer join queries with filter predicates. Here’s an example.

8a) Outer Join With Filter – Oracle-style
select dept.deptno, emp.ename, emp.hiredate
  from dept, emp
 where emp.deptno (+) = dept.deptno
   and emp.hiredate (+) > date '1981-12-03'
 order by deptno, ename;
    DEPTNO ENAME  HIREDATE  
---------- ------ ----------
        10 MILLER 23.01.1982
        20 ADAMS  23.05.1987
        20 SCOTT  19.04.1987
        30                  
        40                  
8b) Outer Join With Filter – ANSI SQL-92
select dept.deptno, emp.ename, emp.hiredate
  from dept
  left join emp on emp.deptno = dept.deptno
   and emp.hiredate > date '1981-12-03'
 order by deptno, ename;
    DEPTNO ENAME  HIREDATE  
---------- ------ ----------
        10 MILLER 23.01.1982
        20 ADAMS  23.05.1987
        20 SCOTT  19.04.1987
        30                  
        40                  

Both query variants use the predicate emp.hiredate > date '1981-12-03' as part of the outer join criteria. As a result, the DEPT rows 30 and 40 are shown with empty EMP columns.

The result is correct. But is this the result we want?

I don’t think so. One option is that we don’t want an outer join at all. Something like this:

9a) Inner Join With Filter – ANSI SQL-89
select dept.deptno, emp.ename, emp.hiredate
  from dept, emp
 where emp.deptno = dept.deptno
   and emp.hiredate > date '1981-12-03'
 order by deptno, ename;
    DEPTNO ENAME  HIREDATE  
---------- ------ ----------
        10 MILLER 23.01.1982
        20 ADAMS  23.05.1987
        20 SCOTT  19.04.1987
9b) Inner Join With Filter – ANSI SQL-92
select dept.deptno, emp.ename, emp.hiredate
  from dept
  join emp on emp.deptno = dept.deptno
 where emp.hiredate > date '1981-12-03'
 order by deptno, ename;
    DEPTNO ENAME  HIREDATE  
---------- ------ ----------
        10 MILLER 23.01.1982
        20 ADAMS  23.05.1987
        20 SCOTT  19.04.1987

Now we use the predicate emp.hiredate > date '1981-12-03' as a filter criterion. It’s applied after the join. The result would be the same if we used a LEFT JOIN in 9b).

In most cases, join criteria are based only on the foreign key-primary key relationships. If I see something else, I typically need to dig deeper to understand what the requirements of the query are.

So, maybe we want to see only the employees that match the filter, including departments that have no employees. For that, we could rewrite the query as follows:

10a) Outer Join With Filter – Oracle-style
select dept.deptno, emp.ename, emp.hiredate
  from dept, emp
 where emp.deptno (+) = dept.deptno
   and (emp.hiredate > date '1981-12-03' or emp.deptno is null)
 order by deptno, ename;
    DEPTNO ENAME  HIREDATE  
---------- ------ ----------
        10 MILLER 23.01.1982
        20 ADAMS  23.05.1987
        20 SCOTT  19.04.1987
        40                  
10b) Outer Join With Filter – ANSI SQL-92
select dept.deptno, emp.ename, emp.hiredate
  from dept left join emp
    on emp.deptno = dept.deptno
 where (emp.hiredate > date '1981-12-03' or emp.deptno is null)
 order by deptno, ename;
    DEPTNO ENAME  HIREDATE  
---------- ------ ----------
        10 MILLER 23.01.1982
        20 ADAMS  23.05.1987
        20 SCOTT  19.04.1987
        40                  

This example shows why it’s important to distinguish between join criteria and filter criteria. The ANSI SQL-92 syntax makes this easier. Unfortunately, there are still a few bugs and limitations when using ANSI-92 syntax in every Oracle Database version. But there are fewer and fewer of these problems with each new version.

Therefore, I use the ANSI SQL-92 join syntax whenever possible and try to document the reason when I cannot.

Use dbLinter to Change Queries to ANSI SQL-92 Join Syntax

dbLinter can find queries that do not use ANSI SQL-92 join syntax. Simply put, it searches for multiple items in the FROM clause. It also provides a quick fix to convert ANSI SQL-89 and Oracle-style joins to ANSI SQL-92 syntax. Here’s a screenshot with the previously discussed queries:

dbLinter for VS Code showing G-3130 violations and the pop-up window to apply quick fixes

After applying the quick fix to all G-3130 violations in the file, the result looks like this:

dbLinter for VS Code after applying the quick fix for all G-3130 violations

The result looks quite good, except that the predicate in the 8a) example is not moved to the join condition (see red arrow in the screenshot above). Why is that the case? – The quick fix only moves equality predicates and predicates that are not part of a Boolean OR condition to the join condition. If there are leftover outer join operators (+) in the WHERE clause, they are surrounded by a TODO comment.

To get started with dbLinter, I recommend watching Philipp Harenfeller’s video. United Codes and Grisselbav are collaborating on this product.

Feedback is very much appreciated. Comment on this blog post or raise an issue on the dbLinter GitHub repository.

Thanks.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.