Wednesday, July 24, 2013

dbms_comparison quicker than SQL MINUS



Overview


The DBMS_COMPARISON package is an Oracle-supplied package that you can use to compare database objects at two databases. This package also enables you converge the database objects so that they are consistent at different databases. Typically, this package is used in environments that share a database object at multiple databases. When copies of the same database object exist at multiple databases, the database object is a shared database object.
Using DBMS_COMPARISON is considered to be faster than traditional comparison  and used internally by database when using streams replication. When the data to be compared is huge, this package could speed up comparison.


Limitation

The index columns in a comparison must uniquely identify every row involved in a comparison. The following constraints satisfy this requirement:
  • A primary key constraint
  • A unique constraint on one or more non-NULL columns

Performance

   Internally the package does  a FAST INDEX SCAN on the primary key or unique key INDEX to find the MIN and MAX value of the table.
     Then, it does a FULL TABLE SCAN to populate the buckets. If there is  miss match in the checksum values of the data in these buckets, additional scans are done to identify the rows..

   The performance of the dbms_compare is equal to the time it takes to do 2 FAST INDEX SCANS and 2 FULL TABLE SCANS. 
    To speed this further, you could 
  • Alter the degree of the underlying tables 
  • enable PARALLEL DEGREE_POLICY to AUTO and PARALLEL_DEGREE_LIMIT=8 to speed it further. and after it completes, set the parameter back to original value  

Advantage over sql minus

To get all rows without dbms_compare will take 4 FTS with could take double the time as dbms_compare.  delta = (A-B) U (B-A)

  select * from table_A 
  minus
  select * from table_B
  union all
  (select * from table_B
  minus
  select * from table_A) ;
    
  • This would take 4 FULL TABLE SCANS compared to 2 in DBMS_COMPARE.
  •  it took a lot more temporary tablespace than dbms_compare as it has to sort the entire table with all the columns.
Also, the rowid of the divergent rows are stored for each comparison. So if we need to converge the data, it can done very quickly. 

   

Perform a Compare with DBMS_COMPARISON

To perform a comparison you will follow these steps:
  1. Create the comparison 
  2. Execute the comparison 
  3. Review the results of the comparison 
  4. Converge the data if desired
  5. Recheck data (the synced data -- optional) 
  6. Purge and Drop

Create Comparison


BEGIN
dbms_comparison.create_comparison(
comparison_name=>'VJ_COMPARE',
schema_name=>'SCOTT',
object_name=>'PO_TXN',
dblink_name=>null,
remote_schema_name=>'SCOTT',
remote_object_name=>'VJ_PO_TXN');
END;
/

Execute Comparison

-- If the Primary key or unique index are already defined on the table, then it is picked up automatically for comparision.

Set serveroutput on
declare
     compare_info dbms_comparison.comparison_type;
     compare_return boolean;
begin
     compare_return :=
     dbms_comparison.compare (
     comparison_name=>'VJ_COMPARE',
     scan_info=>compare_info,
     perform_row_dif=>TRUE);
     if compare_return=TRUE
     then
          dbms_output.put_line('the tables are equivalent.');
     else
          dbms_output.put_line('Bad news... there is data divergence.');
          dbms_output.put_line('Check the  and
               dba_comparison_scan_summary views for locate the differences for
               scan_id:'||compare_info.scan_id);
     end if;
end;
/


Data Converge (optional)

  If you want to converge the data, you can use DBMS_COMPARISON.CONVERGE to sync the tables. You need to specify which table is the master whose data is merged.
  
set serveroutput on
  declare
     compare_info dbms_comparison.comparison_type;
     scan_id_num number;
begin
      select scan_id into scan_id_num from DBA_COMPARISON_SCAN_SUMMARY where comparison_name='VJ_COMPARE' and status='ROW DIF';
     dbms_comparison.converge (
          comparison_name=>'VJ_COMPARE',
          scan_id=>scan_id_num,
          scan_info=>compare_info,
          converge_options=>dbms_comparison.CMP_CONVERGE_LOCAL_WINS);
          -- converge_options=>dbms_comparison.CMP_CONVERGE_REMOTE_WINS is the other option.
          dbms_output.put_line('--- Results ---');
          dbms_output.put_line('Local rows Merged by process:
                               '||compare_info.loc_rows_merged);
          dbms_output.put_line('Remote rows Merged by process:
                               '||compare_info.rmt_rows_merged);
          dbms_output.put_line('Local rows Deleted by process:
                               '||compare_info.loc_rows_deleted);
          dbms_output.put_line('Remote rows Deleted by process:
          '||compare_info.rmt_rows_deleted);
end;
/

Drop Comparison


 BEGIN
  DBMS_COMPARISON.PURGE_COMPARISON(
    comparison_name => 'VJ_COMPARE',
    scan_id         => NULL,
    purge_time      => NULL);
END;
/

BEGIN
  DBMS_COMPARISON.DROP_COMPARISON(
    comparison_name => 'VJ_COMPARE');
END;
/

No comments:

Post a Comment

Feedback welcome