Here’s a little puzzle, highlighting a “bug that’s not a bug” that was “fixed but not fixed” some time in the 10.2 timeline. (If you want specifics about exactly when the fix became available and what patches might be available they’re in MOS – Bug 5099019 : DBMS_STATS DOESN’T COUNT LEAF_BLOCKS CORRECTLY.
Running 19.3.0.0, with the system statistics as shown:
begin dbms_stats.set_system_stats('MBRC',16); dbms_stats.set_system_stats('MREADTIM',10); dbms_stats.set_system_stats('SREADTIM',5); dbms_stats.set_system_stats('CPUSPEED',500); end; begin dbms_stats.gather_table_stats( ownname => user, tabname =>'T1', cascade => true, estimate_percent => 100, method_opt => 'for all columns size 1' ); end; / select index_name, leaf_blocks from user_indexes where table_name = 'T1' ; alter system flush buffer_cache; set autotrace traceonly select count(small_vc) from t1 where small_vc is not null ; set autotrace off
For my test run there are no uncommitted transactions, the gathering of table stats with cascade to indexes means all the blocks are clean so there’s no undo activity needed to produce an answer for the final query, and all that the query is going to do is run an index fast full scan to count the number of rows in the table because there’s an index on just (small_vc).
The system stats tell us that Oracle is going to cost the index fast full scan by taking the leaf block count, dividing by 16 (the MBRC) and multiplying by 2 ( mreadtim / sreadtim) and adding a bit for CPU usage. So here are the results from running the query and generating the plan with autotrace:
PL/SQL procedure successfully completed. INDEX_NAME LEAF_BLOCKS -------------------- ----------- T1_V1 153 1 row selected. System altered. 1 row selected. Execution Plan ---------------------------------------------------------- Plan hash value: 274579903 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 11 | 22 (5)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 11 | | | |* 2 | INDEX FAST FULL SCAN| T1_V1 | 9999 | 107K| 22 (5)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("SMALL_VC" IS NOT NULL) Statistics ---------------------------------------------------------- 0 recursive calls 0 db block gets 1530 consistent gets 1522 physical reads 0 redo size 558 bytes sent via SQL*Net to client 425 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed
Points to notice:
- The cost is (as predicted): ceiling(153 leaf_blocks / 16 MBRC ) * 2 + a bit: = 20 + bit.
- The number of physical blocks read is 1522, not the 153 leaf blocks reported in index stats
- No recursive SQL, and I did say that there were no undo / read consistency issues to worry about
There is clearly an inconsistency between the size of the index and the (100% gathered) leaf_block count that the optimizer is using, and this is a point I made many years ago in “Cost Based Oracle – Fundamentals”. To gather the leaf block count Oracle has looked at the number of leaf blocks that, after cleanout – and here’s how I prepared this demo:
rem rem Script: index_ffs_cost.sql rem Author: Jonathan Lewis rem Dated: July 2012 rem create table t1 as with generator as ( select --+ materialize rownum id from dual connect by level <= 1e4 -- > comment to avoid wordpress format issue ) select rownum id, lpad(rownum,10,'0') small_vc, rpad('x',100) padding from generator v1, generator v2 where rownum <= 1e5 -- > comment to avoid wordpress format issue ; create index t1_v1 on t1(small_vc) pctfree 80; delete from t1 where id between 5000 and 95000 ; commit;
When I gathered stats on the index I had just deleted 90% of the data from the table – so 90% of the leaf blocks in the index were empty but still in the index structure, and only 10% of the leaf blocks held any current index entries. This is why Oracle reported leaf_blocks = 153 while the index fast full scan had to read through nearly 1530 blocks.
This is one of those contradictory problems – for an index fast full scan you need to know the size of the segment that will be scanned because that’s the number of blocks you will have to examine; but in most cases the number of populated index leaf blocks is the number you need to know about when you’re trying to cost an index range scan. Of course in most cases the nature of Oracle’s implementation of B-tree indexes will mean that the two counts will be within one or two percent of each other. But there are a few extreme cases where you could get an index into a state where the segment size is large and the data set is small and you don’t want the optimizer to think that an index fast full scan will be low-cost.
Oracle produced a mechanism for getting the segment block count captured as the leaf_blocks statistic late in 10.2, but it’s not implemented by default, and it’s not something you can tweak into a query with the opt_param() hint. Fix control 509019 has the description: “set leaf blocks to the number of blocks in the index extent map”, but it’s not a run-time fix, it’s a fix that has to be in place when you gather the index stats – thus:
alter session set "_fix_control"='5099019:1'; begin dbms_stats.gather_table_stats( ownname => user, tabname =>'T1', cascade => true, estimate_percent => 100, method_opt => 'for all columns size 1' ); end; / select index_name, leaf_blocks from user_indexes where table_name = 'T1' ; alter system flush buffer_cache; set autotrace traceonly select count(small_vc) from t1 where small_vc is not null ; set autotrace off =================================================== Session altered. PL/SQL procedure successfully completed. INDEX_NAME LEAF_BLOCKS -------------------- ----------- T1_V1 1555 1 row selected. System altered. 1 row selected. Execution Plan ---------------------------------------------------------- Plan hash value: 274579903 ------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 11 | 201 (3)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 11 | | | |* 2 | INDEX FAST FULL SCAN| T1_V1 | 9999 | 107K| 201 (3)| 00:00:01 | ------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter("SMALL_VC" IS NOT NULL) Statistics ---------------------------------------------------------- 1 recursive calls 0 db block gets 1530 consistent gets 1522 physical reads 0 redo size 558 bytes sent via SQL*Net to client 439 bytes received via SQL*Net from client 2 SQL*Net roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 1 rows processed Session altered.
As you can see Oracle has now set leaf_blocks = 1555. This is actually the number of blocks below the highwater mark in the segment. A further check with the dbms_space and dbms_space_usage packages showed that the number of blocks in the index structure was actually 1,523 with a further 32 blocks that we could infer were space management blocks. Of the 1,523 blocks in the index structure 157 were reported as “FULL” while 1364 were reported as FS2 which possibly ought to mean “available for re-use” (though still in the structure), although this didn’t quite seem to be the case a few years ago.
Although Oracle has supplied a fix to a problem I highlighted in CBO-F, I can understand why it’s not enabled by default, and I don’t think I’d want to take advantage of it in a production system given the way it’s a session setting at stats gathering time. The number of times it likely to matter I’d probably add hints to the SQL to stop the optimizer from using the index incorrectly, or do something at a stats-gathering moment to call dbms_stats.set_index_stats().