decomposing sql%rowcount for merge

The MERGE statement (AKA "UPSERT") released in Oracle 9i is possibly one of the most useful ETL-enabling technologies built into the Oracle kernel. For those who have missed Oracle's headlines for the last year and a half and are unaware of what the MERGE statement does, it simply enables us to either UPDATE or INSERT a row into a target table in one statement. You simply tell Oracle your rules for determining whether a target row should updated or inserted from the source, and Oracle does the rest.

Prior to 9i, the alternative in SQL was to perform two individual DML statements (one UPDATE and one INSERT, each with opposing predicates). The alternative in PL/SQL was even worse - either try to INSERT a row and if it failed with a DUP_VAL_ON_INDEX exception, then UPDATE the row instead, or try to UPDATE a row, only inserting in the event of a SQL%NOTFOUND.

rowcounts

In the days of excessive auditing, many ETL tools keep track of the number of inserts and number of updates being performed during batch data loads. Unfortunately, Oracle is as yet unable to provide individual rowcounts for the UPDATE and INSERT components of the MERGE statement (who's betting this will be a 10g feature?). Instead, we still get the SQL%ROWCOUNT attribute, which tells us the overall number of records merged.

What I am demonstrating in this short article is my attempt at decomposing SQL%ROWCOUNT into its component DML counts. To enable this I am simply maintaining package variables to keep track of either the number of updates, inserts or both. Incrementing the package counters is done by simply "piggy-backing" one of the UPDATE SET columns or one of the INSERT VALUES columns. This will become clearer when you see the code.

demo setup

First, the setup. I am going to create two variations on the EMP table. EMP_SOURCE will be a full copy of the EMP table (all 14 rows of it!). EMP_TARGET is going to contain just eight of these rows. I am then going to MERGE EMP_SOURCE into EMP_TARGET, such that we will expect eight rows to be updated and six records to be inserted.

SQL> CREATE TABLE emp_source
  2  AS
  3     SELECT * FROM emp;

Table created.

SQL> SELECT COUNT(*) FROM emp_source;

  COUNT(*)
----------
        14

SQL> CREATE TABLE emp_target
  2  AS
  3     SELECT * FROM emp WHERE ROWNUM <= 8;

Table created.

SQL> SELECT COUNT(*) FROM emp_target;

  COUNT(*)
----------
         8

example merge

I'll now MERGE the EMP_SOURCE data into EMP_TARGET.

SQL> BEGIN
  2     MERGE INTO emp_target et
  3        USING ( SELECT * FROM emp_source ) es
  4        ON    ( et.empno = es.empno )
  5     WHEN MATCHED THEN
  6     UPDATE
  7        SET et.ename  = es.ename
  8        ,   et.sal    = es.sal
  9        ,   et.mgr    = es.mgr
 10        ,   et.deptno = es.deptno
 11     WHEN NOT MATCHED THEN
 12     INSERT
 13        ( et.empno, et.ename, et.sal, et.mgr, et.deptno )
 14        VALUES
 15        ( es.empno, es.ename, es.sal, es.mgr, es.deptno );
 16  
 17     DBMS_OUTPUT.PUT_LINE(TO_CHAR(SQL%ROWCOUNT) || ' rows merged.');
 18  END;
 19  /

14 rows merged.

PL/SQL procedure successfully completed.

An important caveat to note is that Oracle will generate an "ORA-30926: unable to get a stable set of rows in the source tables" error if there is either a many-to-one or many-to-many relationship between the source and target tables. This is not as serious as it sounds because you would normally have to MERGE a one-to-one or one-to-zero relationship as your join condition would be protected by the target's primary key.

etl package

As can be seen, the SQL%ROWCOUNT attribute provides us with the total MERGE count, but we have no idea of how many updates or inserts were performed, which doesn't help us when we have a production batch run to audit or debug. To enable us to keep track of this, I created a small package named ETL with three functions (plus two overloads) and one procedure. The source code for this is as follows.

CREATE OR REPLACE PACKAGE etl AS

   c_inserting CONSTANT PLS_INTEGER := 0;
   c_updating  CONSTANT PLS_INTEGER := 1;

   FUNCTION merge_counter (
            action_in IN PLS_INTEGER DEFAULT c_inserting
            ) RETURN PLS_INTEGER;

   FUNCTION get_merge_update_count RETURN PLS_INTEGER;

   FUNCTION get_merge_update_count (
            merge_count_in IN PLS_INTEGER
            ) RETURN PLS_INTEGER;

   FUNCTION get_merge_insert_count RETURN PLS_INTEGER;

   FUNCTION get_merge_insert_count (
            merge_count_in in PLS_INTEGER
            ) RETURN PLS_INTEGER;

   PROCEDURE reset_counters;

END etl;
/

CREATE OR REPLACE PACKAGE BODY etl AS

   g_update_counter PLS_INTEGER NOT NULL := 0;
   g_insert_counter PLS_INTEGER NOT NULL := 0;

   /*----------- FUNCTION merge_counter -----------*/
   FUNCTION merge_counter (
            action_in IN PLS_INTEGER DEFAULT c_inserting
            ) RETURN PLS_INTEGER IS
   BEGIN
      CASE action_in
         WHEN c_updating
            THEN g_update_counter := g_update_counter + 1;
         WHEN c_inserting
            THEN g_insert_counter := g_insert_counter + 1;
         ELSE
            RAISE PROGRAM_ERROR;
      END CASE;
      RETURN 0;
   END merge_counter;

   /*----------- FUNCTION get_merge_update_count V1 -----------*/
   FUNCTION get_merge_update_count
      RETURN PLS_INTEGER is
   BEGIN
      RETURN g_update_counter;
   END get_merge_update_count;

   /*----------- FUNCTION get_merge_update_count V2 -----------*/
   FUNCTION get_merge_update_count (
            merge_count_in IN PLS_INTEGER
            ) RETURN PLS_INTEGER IS
   BEGIN
      RETURN NVL( merge_count_in - g_insert_counter, 0 );
   END get_merge_update_count;

   /*----------- FUNCTION get_merge_insert_count V1 -----------*/
   FUNCTION get_merge_insert_count
      RETURN PLS_INTEGER IS
   BEGIN
      RETURN g_insert_counter;
   END get_merge_insert_count;

   /*----------- FUNCTION get_merge_insert_count V2 -----------*/
   FUNCTION get_merge_insert_count (
            merge_count_in IN PLS_INTEGER
            ) RETURN PLS_INTEGER IS
   BEGIN
      RETURN NVL( merge_count_in - g_update_counter, 0 );
   END get_merge_insert_count;

   /*----------- FUNCTION reset_counters -----------*/
   PROCEDURE reset_counters IS
   BEGIN
      g_update_counter := 0;
      g_insert_counter := 0;
   END reset_counters;

END etl;
/

Note that there is one function to set either the UPDATE or INSERT counter and two functions to retrieve INSERT or UPDATE counts (each with an overload - totalling four "get" functions). Finally, there is a small procedure to reset the counters.

decomposing sql%rowcount using the etl package

The following is an example of how we might "piggy-back" the earlier MERGE statement to decompose the SQL%ROWCOUNT.

SQL> ROLLBACK;

Rollback complete.

SQL> BEGIN
  2     MERGE INTO emp_target et
  3        USING ( SELECT * FROM emp_source ) es
  4        ON    ( et.empno = es.empno )
  5     WHEN MATCHED THEN
  6     UPDATE
  7        SET et.ename  = es.ename
  8        ,   et.sal    = es.sal
  9        ,   et.mgr    = es.mgr
 10        ,   et.deptno = es.deptno
 11     WHEN NOT MATCHED THEN
 12     INSERT
 13        ( et.empno
 14        , et.ename
 15        , et.sal
 16        , et.mgr
 17        , et.deptno
 18        )
 19        VALUES
 20        ( CASE etl.merge_counter(etl.c_inserting)
 21             WHEN 0 THEN es.empno
 22          END
 23        , es.ename
 24        , es.sal
 25        , es.mgr
 26        , es.deptno
 27        );
 28  
 29     DBMS_OUTPUT.PUT_LINE(
 30        TO_CHAR(SQL%ROWCOUNT) || ' rows merged.'
 31        );
 32     DBMS_OUTPUT.PUT_LINE(
 33        TO_CHAR(etl.get_merge_insert_count) || ' rows inserted.'
 34        );
 35     DBMS_OUTPUT.PUT_LINE(
 36        TO_CHAR(etl.get_merge_update_count( SQL%ROWCOUNT ))
 37        || ' rows updated.'
 38        );
 39     etl.reset_counters;
 40  END;
 41  /

14 rows merged.
6 rows inserted.
8 rows updated.

Some notes on the above PL/SQL block:

Line 20. I have wrapped the INSERT of es.empno in a simple CASE expression. This CASE expression calls the ETL function MERGE_COUNTER, telling it that I require an INSERT counter to be incremented. This function ALWAYS returns 0, so my first test is for 0, which of course guarantees that the es.empno value will be preserved in the overall INSERT statement.
Line 33. A simple call to the non-parameter overload of the ETL.GET_MERGE_INSERT_COUNT function returns me the value of the INSERT counter.
Line 36. I chose not to maintain an UPDATE counter in this example (because I know my data and I always get more updates than inserts, so to be as efficient as possible given the fact that I'm forcing all these extra function executions, I'll not bother with an UPDATE count). Instead, I supplied the overloaded ETL.GET_MERGE_UPDATE_COUNT function with the SQL%ROWCOUNT and this returned me the total MERGE count minus the retained INSERT count. You have the option of course to keep both counters going in one statement, but the overloads in the ETL package make this unnecessary.

conclusions

And that is my basic implementation of a MERGE INSERT and a MERGE UPDATE counter. There is a performance cost, naturally, but Oracle maintains that the cost of executing PL/SQL functions from SQL is continually falling. My tests in merging 146,000 rows, 99% of which were inserts showed very few identifiable resource costs other than CPU time, which was marginally increased as is to be expected. No doubt some of you will consider it a cost too much, but I would argue that in many cases, the extra cost of keeping a package variable updated will be marginal when set against a large, database-intensive MERGE statement. I will be happily using it in my ETL processing, until Oracle includes its own of course!

latest version of package

Note that since writing this article, a modified version of the MERGE counter package is available here.

source code

The source code for the examples in this article can be downloaded from here.

Adrian Billington, August 2003

oracle-developer.net