dml error logging performance in 10g release 2
This article is a follow-up to this overview of DML error logging, a major new feature of 10g Release 2 (10.2). DML error logging enables us to re-direct "bad data" to a log table during bulk DML operations without failing the overall statement. Readers who are unfamiliar with this new feature are advised to read the referenced article for an overview before continuing.
In this article, we will concentrate purely on the performance characteristics of DML error logging. In particular, we will examine the following:
- the general performance cost of invoking DML error logging;
- the cost of logging errors with the DML error logging feature; and
- the performance of DML error logging in comparison with its nearest related technology, FORALL SAVE EXCEPTIONS.
invoking dml error logging
We will begin by looking at the general costs associated with invoking DML error logging. Of course, costs are relative and depend on how we measure them, or rather, what we measure them against. We might decide that we absolutely must use this technology in our batch systems and are prepared to "take a hit" in performance to save the potential support costs that arise in the event of failure. In this article, of course, we define cost purely in performance terms: time and resources. We also assume that DML error logging will "cost" us something (nothing is free); we shall see below whether this is the case.
setup: sample data
For our performance tests we will create a source and target table. The source table will be loaded with approximately 50,000 records. The target table will have various check constraints to make data violations simple to manufacture. Our test tables are created as follows.
SQL> CREATE TABLE src
  2  NOLOGGING
  3  AS
  4  SELECT ROWNUM        AS n1
  5  ,      created       AS d1
  6  ,      'X'           AS v1
  7  ,      RPAD('x',100) AS v2
  8  FROM   dba_objects;
Table created.
SQL> SELECT COUNT(*) FROM src;

  COUNT(*)
----------
     50332

1 row selected.
SQL> CREATE TABLE tgt
  2  ( n1 NUMBER   NOT NULL
  3  , d1 DATE     CHECK (EXTRACT(YEAR FROM d1) > 2004)
  4  , v1 CHAR(1)  CHECK (v1 IN ('X','Y'))
  5  , v2 CHAR(100)
  6  );
Table created.
We will need an error log table (required for DML error logging), which is created via the DBMS_ERRLOG package as follows.
SQL> BEGIN
  2     DBMS_ERRLOG.CREATE_ERROR_LOG(
  3        dml_table_name     => 'TGT',
  4        err_log_table_name => 'TGT_EXCEPTIONS'
  5        );
  6  END;
  7  /
PL/SQL procedure successfully completed.
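Before moving on, it is worth seeing what CREATE_ERROR_LOG has generated for us. The log table comprises five mandatory ORA_ERR_% control columns followed by copies of the target table's columns with "relaxed" datatypes, so that invalid data can be captured. The following describe output is an indicative sketch only; the exact datatypes may vary by version.

```sql
-- Sketch of the generated error log table (output is indicative only).
-- The five ORA_ERR_% columns are mandatory; the remaining columns mirror
-- TGT but with relaxed datatypes to accommodate the "bad data".
SQL> desc tgt_exceptions

Name                 Null?    Type
-------------------- -------- ----------------
ORA_ERR_NUMBER$               NUMBER
ORA_ERR_MESG$                 VARCHAR2(2000)
ORA_ERR_ROWID$                UROWID
ORA_ERR_OPTYP$                VARCHAR2(2)
ORA_ERR_TAG$                  VARCHAR2(2000)
N1                            VARCHAR2(4000)
D1                            VARCHAR2(4000)
V1                            VARCHAR2(4000)
V2                            VARCHAR2(4000)
```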
invocation costs: conventional path
To begin, we will compare two completely successful INSERT..SELECT statements; the latter with DML error logging enabled. We'll simply use the "wall-clock" to compare the runtimes. Note that none of our test data is currently in violation of any of the constraints on the target table. We'll start with conventional path insert.
SQL> INSERT INTO tgt
  2  SELECT *
  3  FROM   src;

50332 rows created.

Elapsed: 00:00:00.23

SQL> INSERT INTO tgt
  2  SELECT *
  3  FROM   src
  4  LOG ERRORS INTO tgt_exceptions;

50332 rows created.

Elapsed: 00:00:01.09
We can see immediately that simply including the DML error logging clause increases the runtime by a factor of four, despite the fact that none of our data was re-directed to the error log table. Running the examples under SQL trace provides no clues as to why this might be, so we can probably conclude that it is a cost incurred by the kernel code that implements this feature.
We can look further at this "factor of four" using a larger dataset. In the following example, we scale up our source data to ten times its original volume and repeat the test. This time we will use Tom Kyte's RUNSTATS utility to report the differences in the resource costs between the two statements.
SQL> exec runstats_pkg.rs_start();
PL/SQL procedure successfully completed.
SQL> INSERT INTO tgt
  2  SELECT src.*
  3  FROM   src, (SELECT NULL FROM dual CONNECT BY ROWNUM <= 10) multi;
503320 rows created.
SQL> exec runstats_pkg.rs_middle();
PL/SQL procedure successfully completed.
SQL> INSERT INTO tgt
  2  SELECT src.*
  3  FROM   src, (SELECT NULL FROM dual CONNECT BY ROWNUM <= 10) multi
  4  LOG ERRORS INTO tgt_exceptions;
503320 rows created.
SQL> exec runstats_pkg.rs_stop(1000);
Run1 ran in 1190 hsecs
Run2 ran in 3522 hsecs
Run1 ran in 33.79% of the time

Name                                  Run1        Run2        Diff
LATCH.undo global data               8,652       9,663       1,011
STAT..user I/O wait time                34       1,199       1,165
STAT..table scan rows gotten        54,855      56,060       1,205
STAT..physical read IO request           1       1,232       1,231
STAT..physical reads                     1       1,232       1,231
STAT..physical reads cache               1       1,232       1,231
STAT..physical read total IO r          22       1,358       1,336
LATCH.enqueues                         439       1,856       1,417
LATCH.enqueue hash chains              469       1,992       1,523
STAT..DB time                        1,205       3,521       2,316
LATCH.row cache objects                893       3,211       2,318
STAT..consistent gets               17,715      14,237      -3,478
STAT..consistent gets from cac      17,715      14,237      -3,478
STAT..redo ordering marks              252       4,099       3,847
STAT..buffer is not pinned cou         158       4,272       4,114
STAT..no work - consistent rea       1,004       5,383       4,379
STAT..table scan blocks gotten         966       5,379       4,413
STAT..calls to kcmgas                  272       4,781       4,509
STAT..free buffer requested            268       5,998       5,730
STAT..free buffer inspected            128       7,415       7,287
STAT..calls to get snapshot sc       8,328         613      -7,715
STAT..hot buffers moved to hea          12       9,563       9,551
LATCH.checkpoint queue latch         9,464      20,325      10,861
LATCH.simulator lru latch           11,124      52,138      41,014
LATCH.simulator hash latch          11,188      53,606      42,418
LATCH.object queue header oper      16,576      59,074      42,498
LATCH.cache buffers lru chain       17,446      63,753      46,307
STAT..session pga memory          -262,144           0     262,144
STAT..session logical reads         80,503     577,836     497,333
STAT..db block gets                 62,788     563,599     500,811
STAT..db block gets from cache      62,788     563,599     500,811
STAT..redo entries                  36,485     537,305     500,820
STAT..db block changes              52,602   1,044,963     992,361
LATCH.cache buffers chains         338,003   2,962,511   2,624,508
STAT..physical read bytes            8,192  10,092,544  10,084,352
STAT..physical read total byte     335,872  12,058,624  11,722,752
STAT..physical write total byt   5,398,528  32,391,168  26,992,640
STAT..undo change vector size    1,995,448  32,153,796  30,158,348
STAT..redo size                 67,750,528 181,772,212 114,021,684

Run1 latches total versus run2 -- difference and pct
Run1        Run2        Diff       Pct
420,331   3,239,481   2,819,150  12.98%

PL/SQL procedure successfully completed.
This time we can see the costs that are incurred simply by invoking DML error logging. The logged statement took almost three times longer to run, used nearly eight times the latches and generated almost three times the redo. It is this last statistic that is most alarming. This increase in volume is what we would see if we turned a bulk insert into row-by-row inserts. In fact, a number of the statistics with the largest differences show similarities with row-by-row insert costs. The following example highlights this, where we compare DML error logging with row-by-row PL/SQL processing.
SQL> exec runstats_pkg.rs_start();
PL/SQL procedure successfully completed.
SQL> INSERT INTO tgt
  2  SELECT src.*
  3  FROM   src, (SELECT NULL FROM dual CONNECT BY ROWNUM <= 10) multi
  4  LOG ERRORS INTO tgt_exceptions;
503320 rows created.
SQL> exec runstats_pkg.rs_middle();
PL/SQL procedure successfully completed.
SQL> ALTER SESSION SET PLSQL_OPTIMIZE_LEVEL = 0;
Session altered.
SQL> BEGIN
  2     FOR r IN (SELECT src.*
  3               FROM   src
  4               ,      (SELECT NULL FROM dual CONNECT BY ROWNUM <= 10) multi)
  5     LOOP
  6        INSERT INTO tgt VALUES r;
  7     END LOOP;
  8  END;
  9  /
PL/SQL procedure successfully completed.
SQL> exec runstats_pkg.rs_stop(1000);
Run1 ran in 4354 hsecs
Run2 ran in 6799 hsecs
Run1 ran in 64.04% of the time

Name                                  Run1        Run2        Diff
STAT..consistent gets                9,396      10,435       1,039
STAT..consistent gets from cac       9,396      10,435       1,039
STAT..redo entries                 536,928     538,013       1,085
STAT..user I/O wait time             1,868         688      -1,180
STAT..db block changes           1,044,494   1,045,733       1,239
LATCH.undo global data               8,672       9,990       1,318
LATCH.enqueue hash chains            1,358       2,796       1,438
LATCH.enqueues                       1,296       2,856       1,560
LATCH.messages                       2,392       4,079       1,687
LATCH.checkpoint queue latch        20,371      22,295       1,924
LATCH.row cache objects              1,073       3,198       2,125
STAT..DB time                        4,342       6,812       2,470
STAT..db block gets                562,261     564,936       2,675
STAT..db block gets from cache     562,261     564,936       2,675
LATCH.session allocation               793       3,627       2,834
STAT..free buffer requested          8,693       5,766      -2,927
STAT..physical reads                 4,598       1,665      -2,933
STAT..physical reads cache           4,598       1,665      -2,933
STAT..hot buffers moved to hea       2,992           1      -2,991
STAT..physical read IO request       4,598       1,038      -3,560
STAT..physical read total IO r       4,598       1,038      -3,560
STAT..session logical reads        571,657     575,371       3,714
STAT..CPU used when call start       1,122       5,446       4,324
STAT..CPU used by this session       1,118       5,446       4,328
STAT..recursive cpu usage                1       4,356       4,355
LATCH.cache buffers lru chain       55,983      51,094      -4,889
LATCH.object queue header oper      60,524      54,899      -5,625
STAT..free buffer inspected         10,712       4,251      -6,461
STAT..undo change vector size   32,147,080  32,157,192      10,112
STAT..table scan rows gotten        50,512      61,008      10,496
LATCH.cache buffers chains       2,970,511   2,944,162     -26,349
STAT..session uga memory max        65,464           0     -65,464
STAT..session pga memory max        65,536           0     -65,536
STAT..redo size                181,734,580 181,836,844     102,264
STAT..session pga memory          -393,216    -196,608     196,608
LATCH.shared pool                    2,055     503,833     501,778
STAT..execute count                     27     503,535     503,508
LATCH.library cache                  2,168   1,008,287   1,006,119
LATCH.library cache pin              1,023   1,007,788   1,006,765
STAT..recursive calls                  940   1,009,228   1,008,288
STAT..physical read bytes       37,666,816  13,639,680 -24,027,136
STAT..physical read total byte  37,666,816  13,639,680 -24,027,136

Run1 latches total versus run2 -- difference and pct
Run1        Run2        Diff       Pct
3,236,553   5,729,111   2,492,558  56.49%

PL/SQL procedure successfully completed.
We can see a similar statistics profile between DML error logging and row-by-row processing, especially for redo size, although row-based processing is still significantly slower. It has long been the case that n rows inserted in bulk will generate less redo than n rows inserted one-by-one. With DML error logging enabled, it appears that we lose this bulk SQL benefit.
invocation costs: direct path
Continuing with error-free inserts, we can now look at a direct path load. One of the key restrictions of DML error logging is that it cannot handle unique constraint (or unique index) violations in direct path inserts; the statement simply fails. This is a critical restriction and one that will deter many developers from using the feature. We have seen above that conventional path DML error logging is comparatively "slow". In the following example, we will compare direct path inserts at higher volumes with and without DML error logging.
SQL> exec runstats_pkg.rs_start();
PL/SQL procedure successfully completed.
SQL> INSERT /*+ APPEND */ INTO tgt
  2  SELECT src.*
  3  FROM   src, (SELECT NULL FROM dual CONNECT BY ROWNUM <= 10) multi;
503320 rows created.
SQL> COMMIT;
Commit complete.
SQL> exec runstats_pkg.rs_middle();
PL/SQL procedure successfully completed.
SQL> INSERT /*+ APPEND */ INTO tgt
  2  SELECT src.*
  3  FROM   src, (SELECT NULL FROM dual CONNECT BY ROWNUM <= 10) multi
  4  LOG ERRORS INTO tgt_exceptions;
503320 rows created.
SQL> COMMIT;
Commit complete.
SQL> exec runstats_pkg.rs_stop(1000);
Run1 ran in 292 hsecs
Run2 ran in 267 hsecs
Run1 ran in 109.36% of the time

Name                                  Run1        Run2        Diff
STAT..redo size                     55,412      52,200      -3,212
STAT..session pga memory           -65,536           0      65,536

Run1 latches total versus run2 -- difference and pct
Run1        Run2        Diff       Pct
14,649      14,440        -209   101.45%

PL/SQL procedure successfully completed.
In direct path mode, we see little difference in time or resource usage and, in this example, DML error logging is actually faster. Repeated executions of this example show the timings and resource costs to be roughly equivalent for both statements. If the unique constraint restriction (or indeed any of the feature's few restrictions) is irrelevant to our requirements, then direct path DML error logging appears to be a viable solution.
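As a reminder of the unique constraint restriction discussed above, the following hypothetical sketch (the constraint name and duplicate data are invented for illustration) shows the behaviour we would expect if we added a unique constraint to TGT and attempted a direct path load containing duplicates.

```sql
-- Hypothetical sketch: DML error logging cannot capture unique
-- constraint/index violations in direct path mode.
ALTER TABLE tgt ADD CONSTRAINT tgt_uk UNIQUE (n1);

INSERT /*+ APPEND */ INTO tgt
SELECT *
FROM   src   -- assume src now contains duplicate n1 values
LOG ERRORS INTO tgt_exceptions ('DIRECT PATH DUPLICATES')
REJECT LIMIT UNLIMITED;

-- The duplicates are NOT written to TGT_EXCEPTIONS; instead the
-- statement itself fails (typically with a unique-constraint error)
-- and the entire load is rolled back.
```

In conventional path, by contrast, unique violations are logged like any other error, which is worth bearing in mind when choosing between the two modes.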
error logging costs
So far we have seen the performance characteristics of DML error logging without any errors. We can now move on to examine the costs of DML error logging when we actually have some exceptions. We will return to the original volume of 50,000 records and run a series of failed inserts as follows:
- invalid datatype;
- invalid date (check constraint violation);
- invalid value (check constraint violation);
- not null violation; and
- invalid data length.
error costs: conventional path
To measure the cost of DML error logging at its most extreme, we will make every record an exception and set a reject limit of unlimited. We will begin with conventional path inserts.
SQL> set timing on

SQL> INSERT INTO tgt
  2  SELECT 'not a number'
  3  ,      d1
  4  ,      v1
  5  ,      v2
  6  FROM   src
  7  LOG ERRORS INTO tgt_exceptions ('INVALID DATATYPE')
  8  REJECT LIMIT UNLIMITED;

0 rows created.

Elapsed: 00:03:12.25

SQL> INSERT INTO tgt
  2  SELECT n1
  3  ,      DATE '2004-01-01'
  4  ,      v1
  5  ,      v2
  6  FROM   src
  7  LOG ERRORS INTO tgt_exceptions ('INVALID DATE')
  8  REJECT LIMIT UNLIMITED;

0 rows created.

Elapsed: 00:02:40.70

SQL> INSERT INTO tgt
  2  SELECT n1
  3  ,      d1
  4  ,      'a'
  5  ,      v2
  6  FROM   src
  7  LOG ERRORS INTO tgt_exceptions ('INVALID VALUE')
  8  REJECT LIMIT UNLIMITED;

0 rows created.

Elapsed: 00:02:43.50

SQL> INSERT INTO tgt
  2  SELECT NULL
  3  ,      d1
  4  ,      v1
  5  ,      v2
  6  FROM   src
  7  LOG ERRORS INTO tgt_exceptions ('NOT NULL VIOLATION')
  8  REJECT LIMIT UNLIMITED;

0 rows created.

Elapsed: 00:02:35.46

SQL> INSERT INTO tgt
  2  SELECT n1
  3  ,      d1
  4  ,      v1
  5  ,      RPAD(v2,200)
  6  FROM   src
  7  LOG ERRORS INTO tgt_exceptions ('INVALID DATA LENGTH')
  8  REJECT LIMIT UNLIMITED;

0 rows created.

Elapsed: 00:01:48.01
We can see that no rows were inserted into our target table. We can check the error log table as follows.
SQL> SELECT ora_err_tag$
  2  ,      COUNT(*)
  3  FROM   tgt_exceptions
  4  GROUP  BY
  5         ora_err_tag$;

ORA_ERR_TAG$                     COUNT(*)
------------------------------ ----------
INVALID DATA LENGTH                 50332
INVALID DATATYPE                    50332
NOT NULL VIOLATION                  50332
INVALID VALUE                       50332
INVALID DATE                        50332

5 rows selected.
It is immediately apparent that DML error logging is "costly". With every row raising an exception, the load times have increased from sub-second to approximately two to three minutes. DML error logging writes to the error log table in an autonomous transaction, so the error data persists even if we roll back a failed insert. We can see how Oracle loads the error log table from a SQL trace file as follows.
********************************************************************************
INSERT INTO tgt
SELECT 'not a number'
,      d1
,      v1
,      v2
FROM   src
LOG ERRORS INTO tgt_exceptions ('INVALID DATATYPE')
REJECT LIMIT UNLIMITED

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.02          1         82          0           0
Execute      1    232.85     234.71          0      50337      50332           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2    232.87     234.73          1      50419      50332           0

Misses in library cache during parse: 1
********************************************************************************
INSERT INTO TGT_EXCEPTIONS (ORA_ERR_NUMBER$, ORA_ERR_MESG$, ORA_ERR_ROWID$,
  ORA_ERR_OPTYP$, ORA_ERR_TAG$, N1, D1, V1, V2)
VALUES (:1, :2, :3, :4, :5, :6, :7, :8, :9)

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0          0          0           0
Execute  50332      6.35       6.76         13         83     106119       50332
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total    50333      6.35       6.76         13         83     106119       50332

Misses in library cache during parse: 1
Misses in library cache during execute: 1
********************************************************************************
The first statement is our insert and the second is the SQL statement generated by Oracle to write to the error log table. The execute count for the generated statement tells us that all writes to the exceptions table are singletons: Oracle does not buffer or batch them. This proves quite costly, but at under 7 of the roughly 235 elapsed seconds it accounts for only a fraction of the overall statement time. The performance issue must lie elsewhere.
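The autonomous transaction behaviour noted earlier is simple to verify. The following sketch (re-using the same SRC and TGT tables, with a hypothetical tag) rolls back a failed load and confirms that the logged errors survive.

```sql
-- Sketch: error log rows are written in an autonomous transaction,
-- so they persist even when the parent DML is rolled back.
INSERT INTO tgt
SELECT n1, DATE '2004-01-01', v1, v2   -- every row fails the date check
FROM   src
LOG ERRORS INTO tgt_exceptions ('ROLLBACK TEST')
REJECT LIMIT UNLIMITED;

ROLLBACK;

SELECT COUNT(*)
FROM   tgt_exceptions
WHERE  ora_err_tag$ = 'ROLLBACK TEST';  -- the logged rows remain
```

This is worth remembering when re-running failed batch loads: the error log table must be cleared (or the tag changed) between attempts, or old errors will accumulate alongside new ones.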
error costs: direct path
To complete our examination of the general costs of error logging, we will re-run the slowest of the previous examples in direct path, as follows. This was the invalid datatype example, which took over 3 minutes to attempt an insert of 50,000 records in conventional path.
SQL> set timing on

SQL> INSERT /*+ APPEND */ INTO tgt
  2  SELECT 'not a number'
  3  ,      d1
  4  ,      v1
  5  ,      v2
  6  FROM   src
  7  LOG ERRORS INTO tgt_exceptions ('INVALID DATATYPE')
  8  REJECT LIMIT UNLIMITED;

0 rows created.

Elapsed: 00:02:01.35
In direct path, our DML error log insert is still relatively slow, but significantly less so than conventional path. Errors are written to the error log table in the same manner, but this time we see something new in the SQL trace output.
********************************************************************************
INSERT /*+ APPEND */ INTO tgt
SELECT 'not a number'
,      d1
,      v1
,      v2
FROM   src
LOG ERRORS INTO tgt_exceptions ('INVALID DATATYPE')
REJECT LIMIT UNLIMITED

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.00       0.00          0         82          0           0
Execute      1    118.95     119.24          0        856      50333           0
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------
total        2    118.95     119.25          0        938      50333           0

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 54

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  LOAD AS SELECT  (cr=964 pr=0 pw=0 time=122658230 us)
  50332   ERROR LOGGING  (cr=24142121 pr=0 pw=0 time=3086272860618 us)
  50332    TABLE ACCESS FULL SRC (cr=856 pr=0 pw=0 time=201389 us)
The execution plan above shows a new step (ERROR LOGGING) which indicates where the time is being spent. What Oracle is actually doing here is open to speculation, but we can see that the logical I/O for this step (over 24 million consistent gets) is extraordinarily high for a small table with no indexes. Note that this step does not appear in the trace file for the equivalent conventional path insert.
forall comparison
We can now move on to compare DML error logging with its nearest "relative", the FORALL SAVE EXCEPTIONS construct (introduced in Oracle 9i). FORALL SAVE EXCEPTIONS is a slightly different technology to DML error logging, but it is functionally equivalent in that it enables us to continue a data load despite the presence of erroneous data. Apart from the fact that DML error logging is SQL and FORALL is PL/SQL, the main ways in which they differ are listed as follows.
- DML error logging is available in both SQL and PL/SQL;
- DML error logging requires very little additional code;
- DML error logging provides additional metadata about errors; and
- DML error logging has transactional autonomy from the "parent" DML.
In the interest of making the performance comparison as fair as possible (given the above differences), we will make the FORALL SAVE EXCEPTIONS example behave as closely to DML error logging as possible. This will require additional coding on our part, but will enable us to compare the additional resource costs that come with the benefits (such as persistent error data and metadata).
forall comparison: sample data
For the following examples, we will manufacture some invalid data in roughly 10% of our source records. The invalid data will fail the check constraints on our target table, with the errors being distributed evenly.
SQL> UPDATE src
  2  SET    n1 = CASE MOD(ROWNUM,3)
  3                 WHEN 0
  4                 THEN NULL               --<-- fail the not null constraint
  5                 ELSE n1
  6              END
  7  ,      d1 = CASE MOD(ROWNUM,3)
  8                 WHEN 1
  9                 THEN DATE '2004-01-01'  --<-- fail the date check
 10                 ELSE d1
 11              END
 12  ,      v1 = CASE MOD(ROWNUM,3)
 13                 WHEN 2
 14                 THEN 'Z'                --<-- fail 'X','Y' valid values
 15                 ELSE v1
 16              END
 17  WHERE  MOD(n1,10) = 0;
5033 rows updated.
SQL> COMMIT;
Commit complete.
forall comparison: dml error logging example
We will start by running an insert with DML error logging enabled.
SQL> exec runstats_pkg.rs_start();
PL/SQL procedure successfully completed.
SQL> INSERT INTO tgt
  2  SELECT *
  3  FROM   src
  4  LOG ERRORS INTO tgt_exceptions ('DML ERROR LOGGING')
  5  REJECT LIMIT UNLIMITED;
46977 rows created.
SQL> COMMIT;
Commit complete.
forall comparison: forall save exceptions example
Now we can run the FORALL SAVE EXCEPTIONS version. Note that to use FORALL in a sensible way with large volumes of data, we need to LIMIT it. This requires additional code elements, such as an explicit cursor, a loop and an exit check. We will use a limit size of 100, as this matches the array size that Oracle itself uses for optimised cursor FOR loops in 10g.
Given this, we can now run through a cycle of fetch->insert->load exceptions in much the same manner as DML error logging, but using FORALL SAVE EXCEPTIONS.
SQL> exec runstats_pkg.rs_middle();
PL/SQL procedure successfully completed.
SQL> DECLARE
  2
  3     /*
  4      * Source data cursor and associative array type. Needed to
  5      * enable LIMIT-based fetching...
  6      */
  7     CURSOR cur IS
  8        SELECT *
  9        FROM   src;
 10
 11     TYPE aat IS TABLE OF cur%ROWTYPE
 12        INDEX BY PLS_INTEGER;
 13     aa aat;
 14
 15     n PLS_INTEGER := 0;
 16
 17     /* "Exceptions encountered in FORALL" exception... */
 18     bulk_exceptions EXCEPTION;
 19     PRAGMA EXCEPTION_INIT ( bulk_exceptions, -24381 );
 20
 21     /* FORALL error-logging... */
 22     PROCEDURE error_logging IS
 23
 24        /* Associative array type of the exceptions table... */
 25        TYPE aat_exception IS TABLE OF tgt_exceptions%ROWTYPE
 26           INDEX BY PLS_INTEGER;
 27        aa_exceptions aat_exception;
 28
 29        v_indx PLS_INTEGER;
 30
 31        /* Emulate DML error logging behaviour... */
 32        PRAGMA AUTONOMOUS_TRANSACTION;
 33
 34     BEGIN
 35
 36        FOR i IN 1 .. SQL%BULK_EXCEPTIONS.COUNT LOOP
 37
 38           v_indx := SQL%BULK_EXCEPTIONS(i).ERROR_INDEX;
 39
 40           /* Populate as many values as available... */
 41           aa_exceptions(i).ora_err_number$ := SQL%BULK_EXCEPTIONS(i).ERROR_CODE;
 42           aa_exceptions(i).ora_err_mesg$   := SQLERRM( SQL%BULK_EXCEPTIONS(i).ERROR_CODE * -1 );
 43           aa_exceptions(i).ora_err_tag$    := 'FORALL ERROR LOGGING';
 44           aa_exceptions(i).ora_err_optyp$  := 'I';
 45           aa_exceptions(i).n1              := aa( v_indx ).n1;
 46           aa_exceptions(i).d1              := aa( v_indx ).d1;
 47           aa_exceptions(i).v1              := aa( v_indx ).v1;
 48           aa_exceptions(i).v2              := aa( v_indx ).v2;
 49
 50        END LOOP;
 51
 52        /* Load the exceptions into the exceptions table... */
 53        FORALL i IN INDICES OF aa_exceptions
 54           INSERT INTO tgt_exceptions
 55           VALUES aa_exceptions(i);
 56
 57        COMMIT;
 58
 59     END error_logging;
 60
 61  BEGIN
 62
 63     OPEN cur;
 64     LOOP
 65
 66        FETCH cur BULK COLLECT INTO aa LIMIT 100;
 67        EXIT WHEN aa.COUNT = 0;
 68
 69        BEGIN
 70           FORALL i IN INDICES OF aa SAVE EXCEPTIONS
 71              INSERT INTO tgt VALUES aa(i);
 72        EXCEPTION
 73           WHEN bulk_exceptions THEN
 74              n := n + SQL%ROWCOUNT;
 75              error_logging();
 76        END;
 77
 78     END LOOP;
 79     CLOSE cur;
 80
 81     DBMS_OUTPUT.PUT_LINE( n || ' rows inserted.' );
 82
 83  END;
 84  /

46977 rows inserted.

PL/SQL procedure successfully completed.
What is immediately apparent from the above example is how much code we need to write. To make this re-usable, we would probably write a code-generator or a central handler for bulk exceptions raised by FORALL. The exception-loading logic in this example is in a local procedure to make the main executable section easier to follow. It is also an autonomous transaction to emulate DML error logging, which commits the invalid data even if the main DML is rolled back.
forall comparison: runstats output
Both examples loaded the same number of records. We can compare the timings and resource costs by completing the snapshots and reporting large differences as follows. Note that the tests were run from a test harness so that the three runstats snapshots were executed without delay. The snapshots do not materially affect the runtimes.
SQL> exec runstats_pkg.rs_stop(5000);
Run1 ran in 759 hsecs
Run2 ran in 1138 hsecs
Run1 ran in 66.7% of the time

Name                                  Run1        Run2        Diff
LATCH.dml lock allocation            6,824       1,079      -5,745
STAT..enqueue requests               6,998       1,186      -5,812
STAT..enqueue releases               6,998       1,184      -5,814
LATCH.undo global data              11,075       3,429      -7,646
LATCH.row cache objects             11,110         636     -10,474
LATCH.enqueue hash chains           14,085       2,570     -11,515
STAT..table scan rows gotten        57,626      77,866      20,240
STAT..physical read bytes           32,768           0     -32,768
STAT..physical read total byte      32,768           0     -32,768
STAT..buffer is pinned count        49,435           0     -49,435
STAT..redo entries                  65,551       7,429     -58,122
STAT..session logical reads        103,906      41,650     -62,256
STAT..db block gets                 84,768      21,171     -63,597
STAT..db block gets from cache      84,768      21,171     -63,597
STAT..db block changes             126,794      15,225    -111,569
LATCH.cache buffers chains         392,366     100,002    -292,364
STAT..session pga memory                 0     393,216     393,216
STAT..IMU undo allocation size          52     734,680     734,628
STAT..IMU Redo allocation size         208     809,632     809,424
STAT..undo change vector size    3,662,408     460,448  -3,201,960
STAT..redo size                 20,762,952   7,936,224 -12,826,728

Run1 latches total versus run2 -- difference and pct
Run1        Run2        Diff       Pct
507,730     179,216    -328,514  283.31%

PL/SQL procedure successfully completed.
We can see from this output that the bulk SQL solution was over 30% faster than the PL/SQL solution (most readers will not be surprised by this), yet it generated nearly three times the volume of redo and used nearly three times the number of latches. Remember that this was at a volume of 50,000 records with 10% "bad data", and that the FORALL example was deliberately coded to emulate DML error logging.
forall comparison: error log data
Finally, we can look at the error log data to show that similar information was recorded by both methods, though some metadata is unavailable in the FORALL implementation. We will use Tom Kyte's print_table procedure for convenience.
SQL> BEGIN
  2     print_table( q'[SELECT * FROM tgt_exceptions WHERE ora_err_tag$ LIKE 'DML%' AND ROWNUM = 1]' );
  3     print_table( q'[SELECT * FROM tgt_exceptions WHERE ora_err_tag$ LIKE 'FOR%' AND ROWNUM = 1]' );
  4  END;
  5  /

ORA_ERR_NUMBER$               : 2290
ORA_ERR_MESG$                 : ORA-02290: check constraint (SCOTT.SYS_C005451) violated
ORA_ERR_ROWID$                :
ORA_ERR_OPTYP$                : I
ORA_ERR_TAG$                  : DML ERROR LOGGING
N1                            : 4050
D1                            : 01-JAN-04
V1                            : X
V2                            : x
-----------------
ORA_ERR_NUMBER$               : 2290
ORA_ERR_MESG$                 : ORA-02290: check constraint (.) violated
ORA_ERR_ROWID$                :
ORA_ERR_OPTYP$                : I
ORA_ERR_TAG$                  : FORALL ERROR LOGGING
N1                            : 10
D1                            : 30-AUG-05
V1                            : Z
V2                            : x
-----------------

PL/SQL procedure successfully completed.
further reading
To read more on the DML error logging clause, including more information on its restrictions, see the Administrator's Guide. For a general overview of the DML error logging feature, read this oracle-developer.net article. A variation on Tom Kyte's RUNSTATS utility used in this article can be downloaded from here.
source code
The source code for the examples in this article can be downloaded from here.
Adrian Billington, August 2005