Thursday, April 17, 2008

Materialized view refresh change in 10G and sql tracing

Until 10G, a complete refresh of a single materialized view did a truncate and insert on the mview table. Starting with 10G the refresh does a delete and insert instead. This guarantees that the table is never left empty if an error occurs; the refresh became an atomic operation.
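
If you want the old truncate-based behavior back (for example to avoid the undo and redo cost of a big delete), the refresh can be made non-atomic through the atomic_refresh parameter. A minimal sketch, assuming the MVIEW from the test case below:

-- non-atomic complete refresh: truncate + insert,
-- so other sessions may see the mview empty while it runs
exec dbms_mview.refresh('MVIEW', method => 'C', atomic_refresh => false);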

There is another difference between 9.2 and 10G in the refresh process, which I realized while trying to find out why a DDL trigger prevented a refresh operation. In 9.2 the refresh process runs an ALTER SUMMARY statement against the mview, while in 10G it does not.

We have a DDL trigger in some of the test environments (all 9.2) to prevent certain operations. The trigger checks the type of the DDL and allows or disallows it based on some conditions. A few weeks ago, after this trigger was put in place, some developers started to complain that their periodic mview refresh operations were failing. I knew it was not because of the TRUNCATE, because the trigger allowed truncate operations.

So I enabled sql tracing and found out what the refresh process was doing. Here is a simplified test case for it in 10G.


YAS@10G>create table master as select * from all_objects where rownum<=5;

Table created.

YAS@10G>alter table master add primary key (owner,object_name,object_type);

Table altered.

YAS@10G>create materialized view mview as select * from master;

Materialized view created.

YAS@10G>exec dbms_mview.refresh('MVIEW');

PL/SQL procedure successfully completed.

YAS@10G>create or replace trigger ddl_trigger before ddl on schema
2 begin
3 if ora_sysevent<>'TRUNCATE' then
4 raise_application_error(-20001,'DDL NOT ALLOWED');
5 end if;
6 end;
7 /

Trigger created.

YAS@10G>exec dbms_mview.refresh('MVIEW');

PL/SQL procedure successfully completed.

The refresh was successful in 10G even with the DDL trigger active. In 9.2 it errors out once the DDL trigger is enabled.

SQL> exec dbms_mview.refresh('MVIEW');
BEGIN dbms_mview.refresh('MVIEW'); END;

*
ERROR at line 1:
ORA-04045: errors during recompilation/revalidation of YAS.MVIEW
ORA-20001: DDL NOT ALLOWED
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 820
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 877
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 858
ORA-06512: at line 1

After enabling sql trace and running the refresh again, I could see the problem statement in the trace file.

SQL> alter session set sql_trace=true;

Session altered.

SQL> exec dbms_mview.refresh('MVIEW');
BEGIN dbms_mview.refresh('MVIEW'); END;

*
ERROR at line 1:
ORA-04045: errors during recompilation/revalidation of YAS.MVIEW
ORA-20001: DDL NOT ALLOWED
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 820
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 877
ORA-06512: at "SYS.DBMS_SNAPSHOT", line 858
ORA-06512: at line 1

The last statement in the trace file is this:

ALTER SUMMARY "YAS"."MVIEW" COMPILE

So, in 9.2 the refresh process runs an ALTER SUMMARY command against the mview. After changing the trigger to allow DDL operations on SUMMARY objects, I could refresh the mview without errors.

SQL> r
1 create or replace trigger ddl_trigger before ddl on schema
2 begin
3 if ora_sysevent<>'TRUNCATE' and ora_dict_obj_type<>'SUMMARY' then
4 raise_application_error(-20001,'DDL NOT ALLOWED');
5 end if;
6* end;

Trigger created.

SQL> exec dbms_mview.refresh('MVIEW');

PL/SQL procedure successfully completed.

Tonguc Yilmaz had a recent post about using 10046 for purposes other than performance tuning, like finding out what statements a procedure runs. I find uses for sql tracing nearly every day, and the case above is one of them. When you cannot understand why an error happens, it is sometimes very useful to turn on sql tracing and examine the trace file. Errors related to specific bind values and errors raised from a "when others" block (you know we do not like "when others", right?) are a couple of things you can dig into with sql tracing.
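
As a minimal sketch of how that can be done (the event syntax is the same in 9.2 and 10G, and the tracefile_identifier is optional, it only makes the trace file easier to find), extended sql trace can be enabled before reproducing the error:

-- level 12 writes bind values and wait events to the trace file
alter session set tracefile_identifier = 'refresh_trace';
alter session set events '10046 trace name context forever, level 12';

exec dbms_mview.refresh('MVIEW');

alter session set events '10046 trace name context off';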

Friday, April 11, 2008

Disqus

Last week I read a post on Andy C's blog about a service called Disqus, a service for keeping track of comments on blogs. Later he made another post about it.

I have been looking for a solution to keep track of blog comments, both mine and other people's. Not all blogs offer the option to subscribe to comments, so when you comment on a post you have to check back later to see if anyone has commented further. I want a central place where I can see all my comments on a post, all comments made by others on the same post, and all comments made on my own blog.

I tried Cocomment before but I was not satisfied with it. So I decided to give Disqus a try and enabled it on this blog. Laurent Schneider decided to try it too.

Then Andy made another post about the concerns some people (Tim Hall being one) have about Disqus. Their concerns make sense. Who owns the comments, the commenter or the blog owner? Is it sensible to store the comments on your blog not in your blog's database but elsewhere? What if Disqus is not there next year, or is inaccessible for some time? Is it possible to export the comments from Disqus and import them back into the blog?

Some of these may be irrelevant if you are using a public blogging service, like Blogger, because it means you are already storing your posts and comments somewhere else. The question "What if Blogger is not there next year?" comes to mind for example.

A solution suggested by Tim Hall is to dual-post the comments, so they live both in the blog and on Disqus; but this needs the blogging service to provide a way to do it.

Another solution could be a solid export-import utility in Disqus, so you can export the comments and put them back into the blog whenever you want. Disqus currently has an export utility, but from what I have read it is not yet reliable.

While I agree with these concerns, I like what Disqus provides. The one-stop page for all comments, the threading of comments, and being able to follow other users are the features I like most. So I will stick with it, at least for now.

Collect in 10G

Collections can be a great help in speeding up PL/SQL programs. By using bulk collect operations it is possible to get significant performance improvements.

In 9.2 we needed the bulk collect clause to fetch rows into a collection. 10G brings a new function called COLLECT, which takes a column as its parameter and returns a nested table containing the column values. Using it we can get the data into a collection without bulk collect.
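
For comparison, here is a minimal sketch of the 9.2-style bulk collect approach (it assumes a table T like the one created in the demo below; a local PL/SQL type is enough here because no sql type is involved):

declare
    type name_tab is table of varchar2(30);
    v_names name_tab;
begin
    -- one round trip fetches all object names into the collection
    select object_name bulk collect into v_names from t;
    dbms_output.put_line(v_names.count);
end;
/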

Here is a very simple demo of this new function.


YAS@10G>create table t as select * from all_objects;

Table created.

YAS@10G>create or replace type name_type as table of varchar2(30);
2 /

Type created.

YAS@10G>set serveroutput on

YAS@10G>r
1 declare
2 v_names name_type;
3 begin
4 select cast(collect(object_name) as name_type) into v_names from t;
5 dbms_output.put_line(v_names.count);
6 dbms_output.put_line(v_names(1));
7* end;
42268
ICOL$

PL/SQL procedure successfully completed.


One difference between this and bulk collect is that, because COLLECT is a sql function, it needs a sql type; it cannot be used with local PL/SQL types.

YAS@10G>r
1 declare
2 type name_type is table of varchar2(30);
3 v_names name_type;
4 begin
5 select collect(object_name) into v_names from t;
6* end;
select collect(object_name) into v_names from t;
*
ERROR at line 5:
ORA-06550: line 5, column 35:
PLS-00642: local collection types not allowed in SQL statements
ORA-06550: line 5, column 9:
PL/SQL: ORA-00932: inconsistent datatypes: expected CHAR got -
ORA-06550: line 5, column 2:
PL/SQL: SQL Statement ignored


Another, more important, difference is the performance of the two constructs. Using Tom Kyte's runstats package, I did a test to compare them. I ran the test several times; the results were similar.

YAS@10G>r
1 declare
2 v_names name_type;
3 begin
4 runStats_pkg.rs_start;
5
6 for i in 1..1000 loop
7 select object_name bulk collect into v_names from t;
8 end loop;
9
10 runStats_pkg.rs_middle;
11
12 for i in 1..1000 loop
13 select cast(collect(object_name) as name_type) into v_names from t;
14 end loop;
15
16 runStats_pkg.rs_stop;
17
18* end;
Run1 ran in 1329 hsecs
Run2 ran in 4060 hsecs
run 1 ran in 32.73% of the time

Name Run1 Run2 Diff
LATCH.qmn state object latch 0 1 1
STAT...redo entries 9 10 1
LATCH.JS slv state obj latch 1 0 -1
LATCH.qmn task queue latch 5 6 1
LATCH.transaction branch alloc 0 1 1
LATCH.sort extent pool 0 1 1
LATCH.resmgr:actses change gro 1 0 -1
LATCH.ncodef allocation latch 0 1 1
LATCH.slave class 0 1 1
LATCH.archive control 0 1 1
LATCH.FAL subheap alocation 0 1 1
LATCH.FAL request queue 0 1 1
STAT...heap block compress 6 5 -1
LATCH.session switching 0 1 1
LATCH.ksuosstats global area 1 2 1
LATCH.event group latch 1 0 -1
LATCH.threshold alerts latch 0 1 1
LATCH.list of block allocation 2 0 -2
LATCH.transaction allocation 2 0 -2
LATCH.dummy allocation 3 1 -2
LATCH.user lock 2 0 -2
LATCH.Consistent RBA 4 2 -2
STAT...calls to kcmgcs 4 6 2
STAT...active txn count during 4 6 2
STAT...cleanout - number of kt 4 6 2
STAT...consistent gets 587,009 587,011 2
STAT...consistent gets from ca 587,009 587,011 2
STAT...consistent gets - exami 4 6 2
LATCH.resmgr:free threads list 3 0 -3
LATCH.PL/SQL warning settings 3 0 -3
LATCH.OS process 6 3 -3
LATCH.compile environment latc 3 0 -3
LATCH.slave class create 0 3 3
LATCH.resmgr:actses active lis 3 0 -3
LATCH.cache buffers lru chain 0 3 3
LATCH.OS process allocation 10 14 4
LATCH.session state list latch 4 0 -4
LATCH.redo allocation 50 46 -4
STAT...consistent changes 17 22 5
LATCH.resmgr group change latc 5 0 -5
STAT...db block gets 17 22 5
STAT...db block gets from cach 17 22 5
LATCH.library cache pin alloca 6 0 -6
STAT...db block changes 26 32 6
STAT...session logical reads 587,026 587,033 7
LATCH.In memory undo latch 23 16 -7
LATCH.session idle bit 10 3 -7
LATCH.mostly latch-free SCN 6 14 8
LATCH.KMG MMAN ready and start 5 13 8
LATCH.lgwr LWN SCN 6 14 8
LATCH.simulator hash latch 34,010 34,001 -9
LATCH.simulator lru latch 34,010 34,001 -9
LATCH.session timer 5 14 9
LATCH.dml lock allocation 10 1 -9
LATCH.undo global data 20 10 -10
LATCH.post/wait queue 10 0 -10
LATCH.active checkpoint queue 4 15 11
LATCH.session allocation 13 2 -11
LATCH.archive process latch 4 15 11
LATCH.library cache lock alloc 16 0 -16
LATCH.object queue header oper 8 32 24
LATCH.client/application info 25 0 -25
LATCH.redo writing 24 51 27
LATCH.active service list 37 81 44
STAT...undo change vector size 2,124 2,200 76
LATCH.channel operations paren 60 197 137
LATCH.cache buffers chains 1,174,359 1,174,189 -170
LATCH.JS queue state obj latch 108 288 180
LATCH.messages 108 291 183
STAT...redo size 2,772 2,964 192
LATCH.checkpoint queue latch 80 282 202
LATCH.enqueue hash chains 269 652 383
LATCH.enqueues 248 645 397
LATCH.SQL memory manager worka 276 944 668
LATCH.shared pool 42 1,029 987
LATCH.library cache lock 170 2,018 1,848
LATCH.library cache pin 2,130 4,064 1,934
STAT...Elapsed Time 1,330 4,061 2,731
STAT...CPU used by this sessio 1,334 4,083 2,749
STAT...recursive cpu usage 1,268 4,064 2,796
LATCH.library cache 2,264 5,074 2,810
LATCH.row cache objects 49 9,015 8,966
STAT...session pga memory 2,097,152 1,441,792 -655,360

Run1 latches total versus runs -- difference and pct
Run1 Run2 Diff Pct
1,248,536 1,267,073 18,537 98.54%

PL/SQL procedure successfully completed.


As you can see, the bulk collect method completes in about a third of the time of the collect function method. Using the collect function hits the library cache harder and takes many more latches.

The COLLECT function can also be used in sql statements to compare collections of column values. You can find an example of it in Laurent Schneider's comment on my previous post.
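
As a rough sketch of that idea (an assumed query, not necessarily the one in that comment), nested tables of the same sql type can be compared with = and the comparison does not depend on element order. Using the PARTS and CATALOG tables from that post and an assumed nested table type NUM_TYPE:

create or replace type num_type as table of number;
/

-- suppliers whose collected set of parts equals the full set of parts
-- (assumes each (sid,pid) pair appears only once in catalog)
select s.sid
from (select sid, cast(collect(pid) as num_type) sup_pids
      from catalog
      group by sid) s,
     (select cast(collect(pid) as num_type) all_pids
      from parts) p
where s.sup_pids = p.all_pids;

With the sample data of that post this returns supplier 1, the supplier that has all the parts.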

Wednesday, April 09, 2008

Relational algebra: division in sql

There was a question in one of the Turkish Oracle mailing lists which got me interested. The question simply was:

I have a table holding the parts of a product and another table holding the suppliers of these parts. How can I find the suppliers which supply all the parts?

Someone suggested using the division operation of relational algebra but did not show how to do it in Oracle. So I started from the Wikipedia link he provided and worked out the problem in sql.

Here are the tables:


SQL> create table parts (pid number);

Table created.

SQL> create table catalog (sid number,pid number);

Table created.

SQL> insert into parts select rownum from all_objects where rownum<=5;

5 rows created.

SQL> insert into catalog values (10,1);

1 row created.

SQL> insert into catalog select 1,pid from parts;

5 rows created.

SQL> select * from catalog;

SID PID
---------- ----------
10 1
1 1
1 2
1 3
1 4
1 5


So, the supplier which has all the parts is 1. How do we find that?

The division in relational algebra is simulated by a number of steps, which are explained in the link. Let's follow them:

The simulation of the division with the basic operations is as follows. We assume that a1,...,an are the attribute names unique to R and b1,...,bm are the attribute names of S. In the first step we project R on its unique attribute names and construct all combinations with tuples in S:

T := πa1,...,an(R) × S

In our case the table CATALOG is R and the table PARTS is S. So to express the above relation in sql we need a cartesian join: the projection of CATALOG on its unique attribute, SID, crossed with PARTS.

select sid,pid
from (select sid from catalog) ,parts
;


This gives us all suppliers combined with all parts.

SID PID
---------- ----------
10 1
1 1
1 1
1 1
1 1
1 1
10 2
1 2
1 2
1 2
1 2

SID PID
---------- ----------
1 2
10 3
1 3
1 3
1 3
1 3
1 3
10 4
1 4
1 4
1 4

SID PID
---------- ----------
1 4
1 4
10 5
1 5
1 5
1 5
1 5
1 5

We now have all the possibilities for the supplier-part relation.

The second step is:

In the next step we subtract R from this relation:

U := T - R

To subtract the table CATALOG we need the MINUS operator.

select sid,pid
from (select sid from catalog) ,parts
minus
select sid,pid from catalog;

SID PID
---------- ----------
10 2
10 3
10 4
10 5


After listing all the possibilities, we subtracted the ones that are already in the CATALOG table and got the combinations that are not present in it.

On to the next step:

Note that in U we have the possible combinations that "could have" been in R, but weren't. So if we now take the projection on the attribute names unique to R then we have the restrictions of the tuples in R for which not all combinations with tuples in S were present in R:

V := πa1,...,an(U)

This step just takes the supplier ids from the previous query.

select sid from (
select sid,pid
from (select sid from catalog) ,parts
minus
select sid,pid from catalog
);

SID
----------
10
10
10
10



Now we have the supplier id that does not supply all of the parts. The next step is obvious: find the other suppliers (the remaining ones supply all the parts).

So what remains to be done is take the projection of R on its unique attribute names and subtract those in V:

W := πa1,...,an(R) - V

To do this we subtract the previous query from the supplier ids in the CATALOG table.

select sid from catalog
minus
select sid from (
select sid,pid
from (select sid from catalog) ,parts
minus
select sid,pid from catalog
);

SID
----------
1


On some pages (like this one) the same operation is done with a different sql statement, obtained by using "not exists" instead of "minus".

select distinct sid from catalog c1
where not exists (
select null from parts p
where not exists (select null from catalog where pid=p.pid and c1.sid=sid));


This one is harder for me to understand at first glance (it reads as: find the suppliers for which there is no part they do not supply); I am more comfortable following the steps above.
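
Another common way to express the division, given here only as a sketch, is to count the parts each supplier has and compare that with the total number of parts:

select sid
from catalog
where pid in (select pid from parts)
group by sid
having count(distinct pid) = (select count(*) from parts);

With the sample data above this returns supplier 1, like the other two queries. The where clause guards against parts listed in CATALOG but missing from PARTS, and the distinct guards against duplicate rows.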

It is great to follow the steps of relational algebra to solve a problem in sql. It helps very much in understanding the solution. Relational algebra rocks!