I have been trying to tune a plsql package which runs for more than 1.5 hours. When looking at the raw sql trace file, i saw that a select statement was causing thousands of db file sequential reads for some of the bind variables. The pseudo-code is like this:
Cursor c1 is select <col list> from <master table> order by <col>;
Cursor c2(v1 in number) is select <col list> from <table> where <col>=v_1;
Open c1;
loop
Fetch from c1;
Process c1 row;
Open c2(<value from c1>);
Loop
Fetch from c2;
Process c2 row;
End loop;
End loop;
The problem sql statement was the select in cursor c2. It came out that it was performing well for some parameters and running awful for some parameters. The problem is that for the first time c2 is opened it produces an execution plan for the related parameter value and uses that plan for all the next calls to the same statement. This is called bind variable peeking. So if the next value for the bind variable is not similar to the previous one in terms of statistics (number of matching rows, etc...) the same plan may not be appropriate for it.
If you a have a column that has a skewed data distribution, a histogram on it and if you are using bind variables against that column you may suffer from the same problem.
Here is a very very simplified test case of the problem.
SQL> create table bind_test as select 'MANY' object_name,object_id from all_objects;
Table created.
SQL> insert into bind_test select * from bind_test;
24221 rows created.
SQL> r
1* insert into bind_test select * from bind_test
48442 rows created.
SQL> r
1* insert into bind_test select * from bind_test
96884 rows created.
SQL> commit;
Commit complete.
SQL> create index bind_test_ind on bind_test(object_name);
Index created.
SQL> insert into bind_test select 'FEW',object_id from all_objects where rownum<=10
2 ;
10 rows created.
SQL> commit;
Commit complete.
SQL> select object_name,count(*) from bind_test group by object_name;
OBJE COUNT(*)
---- ----------
FEW 10
MANY 193768
SQL> EXEC dbms_stats.gather_table_stats(ownname=>null,tabname=>'BIND_TEST',cascade=>true,method_opt=>'for all indexed columns size 2');
PL/SQL procedure successfully completed.
Now, i have a table with many rows that have ‘MANY’ as the object_name and very few rows that have ‘FEW’ as the object_name and a histogram on the object_name column.
SQL> alter session set events '10046 trace name context forever, level 12';
Session altered.
SQL> declare
2 b1 varchar2(4);
3 begin
4 b1 := 'FEW';
5 for t in (select * from bind_test where object_name=b1) loop
6 null;
7 end loop;
8 b1 := 'MANY';
9 for t in (select * from bind_test where object_name=b1) loop
10 null;
11 end loop;
12 end;
13 /
I run the same sql first for the value ‘FEW’ and then for the value ‘MANY’. The trace output shows:
SELECT *
FROM
BIND_TEST WHERE OBJECT_NAME=:B1
call count cpu elapsed disk query current rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse 2 0.00 0.00 0 0 0 0
Execute 2 0.01 0.01 0 0 0 0
Fetch 193770 5.33 5.70 398 387539 0 193768
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total 193774 5.34 5.71 398 387539 0 193768
Misses in library cache during parse: 1
Optimizer goal: CHOOSE
Parsing user id: 27 (recursive depth: 1)
Rows Row Source Operation
------- ---------------------------------------------------
0 TABLE ACCESS BY INDEX ROWID BIND_TEST
0 INDEX RANGE SCAN BIND_TEST_IND (object id 82756)
It uses the index on object_name which is good for the value ‘FEW’ but bad for the value ‘MANY’. Because when the statement was parsed the bind variable had the value ‘FEW’ and the execution plan is produced for that value. Subsequent runs of the statement used the same plan even if the bind variable’s value has completely different statistics.
If you have a similar situation, you can use literals instead of bind variables. For this, you have to hard code the values as literals to the statements or you can use dynamic sql and produce the sql with literals at run time. Be aware of the consequences (parsing, many different sqls in the shared pool, etc...).
If the performance of the query for most of the bind values is ok with a specific plan and negligibly worse for the other values with the same plan, you can use hints to force the same plan for all of the bind values (unlike the above simple example). Or you can use stored outlines instead of hints. This was the way i chose. I created a stored outline to force the same plan for the bind values. The execution time dropped to less than 15 minutes from more than 1 hour.
Any other solutions to the bind peeking problem i could not remember?
If the execution plans are very different, you would have to resort to using different versions of c2 for parameters with different statistics - eg list the selected fields in a different order, or choose a different alias for the table.
ReplyDeletehi,
ReplyDeleteI have read your post, it's very interesting.
In my company we have 2 enviroments, development and production both have the same characteristics, parameters etc...
One query needs 0 seconds to be executed in the dev instance and 5 seconds in the production instance. The execution plan is equal in both instances. In theory production machine is powerfull than dev machine.
Do you know what is happening?
Very strange.
Greetings.
Sergi
Sergi,
ReplyDeleteWhen talking about execution plans don't relay on explain plain.
Enable event 10046 and check its dump trace file.