returning with insert..select

The RETURNING DML clause can be extremely useful at times. Unfortunately, most of the times I've wanted to use it, it has been to return data from an INSERT..SELECT. For versions up to and including 11g Release 1, Oracle doesn't support this construct.

the problem

For many developers, this restriction doesn't create too much of an issue, but recently I had to write some PL/SQL that really needed INSERT..SELECT..RETURNING. The scenario was as follows. A collection of an object type was being sent from the mid-tier application for INSERT into a table minus the surrogate key values for the new records. After populating the table, the package had to send back the collection but include the new PK values. If INSERT..SELECT..RETURNING was supported, this would be a simple, one-step operation, along the lines of the following pseudo-code.

   INSERT INTO table (cols)
   SELECT attrs FROM TABLE(p_incoming_collection)
   RETURNING object_type(cols) BULK COLLECT INTO p_outgoing_collection;

Unfortunately, the restriction meant that I was forced to code it as a two-step operation and fetch the sequence-generated primary keys after the insert and complete the collection accordingly.

setup

The workaround uses object and collection types. To replicate the above scenario, we need a target table, object type, collection type and a sequence.

SQL> CREATE TABLE t
  2  ( x INT
  3  , y CHAR(1)
  4  , z DATE );

Table created.

SQL> CREATE TYPE ot AS OBJECT
  2  ( x INT
  3  , y CHAR(1)
  4  , z DATE );
  5  /

Type created.

SQL> CREATE TYPE ntt AS TABLE OF ot;
  2  /

Type created.

SQL> CREATE SEQUENCE s;

Sequence created.

workaround

The following workaround supports 9i upwards as it makes use of the TREAT function. In the example, we emulate the work that INSERT..SELECT..RETURNING would theoretically do, but using FORALL..INSERT..VALUES..RETURNING instead. Note that the variable named "nt_passed_in" represents a collection parameter being passed in from a middle-tier application.

SQL> DECLARE
  2  
  3     nt_passed_in ntt;
  4     nt_to_return ntt;
  5  
  6     FUNCTION pretend_parameter RETURN ntt IS
  7        nt ntt;
  8     BEGIN
  9        SELECT ot(NULL, 'X', SYSDATE) BULK COLLECT INTO nt
 10        FROM   dual
 11        CONNECT BY ROWNUM <= 5;
 12        RETURN nt;
 13     END pretend_parameter;
 14  
 15  BEGIN
 16  
 17     nt_passed_in := pretend_parameter();
 18  
 19     FORALL i IN 1 .. nt_passed_in.COUNT
 20        INSERT INTO t (x, y, z)
 21        VALUES
 22        ( s.NEXTVAL
 23        , TREAT(nt_passed_in(i) AS ot).y
 24        , TREAT(nt_passed_in(i) AS ot).z
 25        )
 26        RETURNING ot(x, y, z)
 27        BULK COLLECT INTO nt_to_return;
 28  
 29     FOR i IN 1 .. nt_to_return.COUNT LOOP
 30        DBMS_OUTPUT.PUT_LINE(
 31           'Sequence value = [' || TO_CHAR(nt_to_return(i).x) || ']'
 32           );
 33     END LOOP;
 34  
 35  END;
 36  /

Sequence value = [1]
Sequence value = [2]
Sequence value = [3]
Sequence value = [4]
Sequence value = [5]

PL/SQL procedure successfully completed.

Note that the TREAT function enables us to reference individual attributes within a FORALL (despite the PLS-00436 implementation restriction). This enables us to build an INSERT statement that combines parts of the incoming collection with the sequence-generated primary key. And of course, FORALL does support RETURNING so we get most of the performance of an INSERT..SELECT, but without compromising on the solution.

This is yet another example of how object and collection types can extend PL/SQL and bail us out of a seemingly impossible situation.

update: an alternative example and workaround

David Routledge suggests an alternative workaround, based on a specific requirement he had to return an aggregate value from a set of data being inserted. David's requirement can be expressed in the following pseudo-code:

   INSERT INTO table (cols)
   SELECT cols FROM (tables)
   RETURNING SUM(numeric_column) INTO plsql_variable;

In other words, he needed to report the sum of a numeric column of some freshly-loaded data. If INSERT..SELECT..RETURNING was supported as in the above pseudo-code, this would be a simple one-pass statement. Without this support, a two-pass solution is inevitable, but as we will see below, David's workaround minimises the additional resources needed to calculate and report the aggregate value.

To begin, we will create the base table, as follows.

SQL> CREATE TABLE t
  2  ( x INT PRIMARY KEY
  3  , y NUMBER(10,2)
  4  , z VARCHAR2(80)
  5  );

Table created.

To simulate the data load requirement, we will generate 500,000 rows as the source data and load this into T. Once we have inserted the data, we will report the sum of column Y for the rows we loaded. Because INSERT..SELECT..RETURNING is not supported, we have two options:

Method 1 (typical): (Step 1) INSERT..SELECT the source data and (Step 2) SELECT the sum of column Y from table T for the data we loaded; or
Method 2 (workaround): (Step 1) INSERT ALL the source data into T and the sum of column Y into a global temporary table, then (Step 2) SELECT the pre-aggregated sum of column Y from the single-row global temporary table.

The typical method might look as follows.

SQL> DECLARE
  2     v_aggregate NUMBER;
  3  BEGIN
  4
  5     /* Step 1: load the data... */
  6     INSERT INTO t (x, y, z)
  7     SELECT ROWNUM                 AS x
  8     ,      ROWNUM/100             AS y
  9     ,      RPAD('x', 80)          AS z
 10     FROM   dual
 11     CONNECT BY ROWNUM <= 500000;
 12
 13     /* Step 2: query it back to get the reporting value... */
 14     SELECT SUM(y) INTO v_aggregate FROM t;
 15
 16     DBMS_OUTPUT.PUT_LINE(
 17        'Sum of column Y loaded = ' || TO_CHAR(v_aggregate)
 18        );
 19  END;
 20  /

Sum of column Y loaded = 1250002500

PL/SQL procedure successfully completed.

In this case, the cost of the missing INSERT..SELECT..RETURNING support is realised at line 14 (i.e. having to aggregate 500,000 rows from T after it has been loaded). We can see how an aggregated RETURNING construct (added in Oracle 10g for most other forms of DML) would have been extremely helpful in this scenario.

The workaround in this case is as follows.

SQL> CREATE GLOBAL TEMPORARY TABLE gtt
  2  ( aggregate_value NUMBER );

Table created.

SQL> DECLARE
  2     v_aggregate NUMBER;
  3  BEGIN
  4
  5     /* Step 1: load the data and the aggregate... */
  6     INSERT ALL
  7        --<>--
  8        WHEN 1=1 THEN
  9        INTO t (x, y, z)
 10        VALUES (x, y, z)
 11        --<>--
 12        WHEN rn=1 THEN
 13        INTO gtt (aggregate_value)
 14        VALUES (sum_of_y)
 15        --<>--
 16     SELECT ROWNUM                  AS rn
 17     ,      ROWNUM                  AS x
 18     ,      ROWNUM/100              AS y
 19     ,      RPAD('x', 80)           AS z
 20     ,      SUM(ROWNUM/100) OVER () AS sum_of_y
 21     FROM   dual
 22     CONNECT BY ROWNUM <= 500000;
 23
 24     /* Step 2: query the pre-aggregated value back... */
 25     SELECT aggregate_value INTO v_aggregate FROM gtt;
 26
 27     DBMS_OUTPUT.PUT_LINE(
 28        'Sum of column Y loaded = ' || TO_CHAR(v_aggregate)
 29        );
 30  END;
 31  /

Sum of column Y loaded = 1250002500

PL/SQL procedure successfully completed.

In this workaround, the INSERT ALL..SELECT statement is more complicated but essentially it removes the need to aggregate 500,000 rows after the load. The sum of column Y is pre-aggregated by the analytic function on line 20 and is inserted into a global temporary table on lines 12-14. The result is then fetched as a single row from the global temporary table on line 25.

This workaround is therefore predicated on the fact that it will need to access far less data than the typical method described above. Both methods will load the same volume of data into the base table, but they incur different costs in reporting the sum of column Y. In David's case, the typical method's additional I/O costs were far greater in performance impact than the cost of the analytic function in his workaround.

acknowledgements

Thanks to David Routledge for providing another good example of where support for INSERT..SELECT..RETURNING would have been extremely useful and also for suggesting an innovative workaround to reduce I/O costs.

source code

The source code for the examples in this article can be downloaded from here.

Adrian Billington, August 2005 (updated May 2009)

oracle-developer.net