I’ll probably have to file this one under “Optimizer ignoring hints” – except that it should also go under “bugs”, and that’s one of the get-out clauses I use in my “hints are not hints” argument.
Sometimes an invisible index isn’t completely invisible.
Here’s a demonstration from 11.2.0.3 showing something which, to my mind, is a very annoying problem. The objects are in a tablespace that has been created with uniform extents of 1MB on an 8KB block size, using freelist management. I’ve rigged the Hakan factor to ensure that I get exactly 40 rows per block, and I’ve set the system statistics to ensure that a relatively small swing in cost results in a change in execution plan.
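The post doesn't show how the Hakan factor and system statistics were rigged, so the following is only a minimal sketch of one way it might be done, not the author's script. It assumes the table is created empty, that 40 seed rows all fit in a single block, and that `alter table ... minimize records_per_block` then fixes the row limit at whatever the fullest block currently holds. The system statistics values are illustrative placeholders, not the figures used for the plans in this note.

```sql
-- Hypothetical setup sketch: values are illustrative, not taken from the post.
create table t1 (
        colX            number,
        colY            number,
        padding         varchar2(150)
);

-- Seed a single block with 40 rows, then ask Oracle to limit every
-- block to the maximum number of rows currently found in any block.
insert into t1
select  rownum, rownum, rpad('x',150)
from    dual
connect by
        level <= 40
;
commit;

alter table t1 minimize records_per_block;

-- The limit should survive the truncate that precedes the real data load.
truncate table t1;

-- Assumed system statistics, chosen only so that small cost swings matter.
begin
        dbms_stats.set_system_stats('MBRC',      8);
        dbms_stats.set_system_stats('MREADTIM', 26);
        dbms_stats.set_system_stats('SREADTIM', 12);
        dbms_stats.set_system_stats('CPUSPEED', 800);
end;
/
```

After the real load it is worth checking the rows-per-block count for a few blocks to confirm that the limit really is 40 before trusting any arithmetic that depends on it.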
SQL> desc t1
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 COLX                                   NUMBER
 COLY                                   NUMBER
 PADDING                                VARCHAR2(150)

truncate table t1;

insert /*+ append */ into t1
with generator as (
        select  --+ materialize
                rownum id
        from    dual
        connect by
                level <= 1e4
)
select
        trunc((rownum - 1) / 1000)      colX,
        mod((rownum - 1) , 40)          colY,
        rpad('x',150)                   padding
from
        generator       v1,
        generator       v2
where
        rownum <= 1e6
;

commit;

begin
        dbms_stats.gather_table_stats(
                ownname         => user,
                tabname         =>'T1',
                method_opt      => 'for all columns size 1'
        );
end;
/

create index t1_one_col on t1(colX) nologging;
create index t1_two_col on t1(colX, colY) nologging;

select  *
from    t1
where   colX = 500
;
You won’t be surprised to learn that if I run the query I’ve shown above, Oracle uses the index on (colX) to access the table. The 1,000 rows are all in a single cluster of 25 consecutive blocks in the table (1,000 rows at 40 rows per block), so even though it looks like quite a large number of rows to access by index, the indexed access path is still an efficient one. However, I’d like to drop this index because it has a huge functional overlap with the index (colX, colY), and I’d hope that the optimizer would simply use the larger index when I dropped the smaller. Just to play safe, though, I’ll make t1_one_col invisible and check the execution plan; this is what I got (remember, this depends to some degree on my system stats):
--------------------------------------------------------------------------
| Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------
|   0 | SELECT STATEMENT  |      |  1000 |   154K|   859   (9)| 00:00:05 |
|*  1 |  TABLE ACCESS FULL| T1   |  1000 |   154K|   859   (9)| 00:00:05 |
--------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("COLX"=500)
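For anyone repeating the test, the "make it invisible and check the plan" step might look something like the sketch below. It assumes the objects created earlier in this note and uses `explain plan` with `dbms_xplan.display`; the post doesn't say which method produced the plan above, so treat this as one reasonable option rather than the method actually used.

```sql
-- Hide the single-column index from the optimizer without dropping it.
alter index t1_one_col invisible;

-- Confirm the visibility setting of each index on the table.
select  index_name, visibility
from    user_indexes
where   table_name = 'T1'
order by
        index_name
;

-- Generate and display the plan for the test query.
explain plan for
select  *
from    t1
where   colX = 500
;

select * from table(dbms_xplan.display);
```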
The optimizer has picked a full tablescan because the pattern of the data (combined with the definition of the index) has produced a much larger clustering_factor on the t1_two_col index than on the t1_one_col index. That’s not a big problem, though: for testing purposes I can always put a hint into the SQL. Since the version is newer than 9i I can use the “index description” syntax, which tells the optimizer to use t1_one_col if it’s available, or otherwise the best index that starts with the same columns in the same order:
select /*+ index(t1(colX)) */ * from t1 where colX = 500 ;
This query should use the most cost-effective index on the table that starts with column colX, and since I’ve made t1_one_col invisible the optimizer should use index t1_two_col. Unfortunately the optimizer ignored my hint!
Since I was working with a small, private, data set the obvious thing to do next was to drop t1_one_col to show that the optimizer could be made to use index t1_two_col; and this is the resulting plan for exactly the same (hinted) query:
------------------------------------------------------------------------------------------
| Id  | Operation                   | Name       | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT            |            |  1000 |   154K|  1007   (1)| 00:00:06 |
|   1 |  TABLE ACCESS BY INDEX ROWID| T1         |  1000 |   154K|  1007   (1)| 00:00:06 |
|*  2 |   INDEX RANGE SCAN          | T1_TWO_COL |  1000 |       |     5   (0)| 00:00:01 |
------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------
   2 - access("COLX"=500)
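The jump from a cost of 5 for the index range scan to 1007 for the table access is what a very large clustering_factor would produce: roughly one table block visit for each of the 1,000 rows. A simple way to compare the two indexes, sketched here on the assumption that it is run before t1_one_col is dropped, is to query the index statistics directly. No output is shown because the figures depend on the data load and the statistics gathered.

```sql
-- Compare the optimizer statistics for the indexes on T1.
-- Run this while both indexes still exist to see the difference.
select
        index_name,
        blevel,
        leaf_blocks,
        distinct_keys,
        clustering_factor
from
        user_indexes
where
        table_name = 'T1'
order by
        index_name
;
```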
Normally if there is no exact match for an “index structure” hint the optimizer will associate the hint with any index that starts with the correct set of columns in the right order; if there is an exact match the hint is associated only with that index.
However it looks as if the selection of candidates that match the hint is made before the optimizer checks for index visibility. As a result, if you’ve used the new-style hints in your code and hope to have a period of running on production with invisible indexes as a way of testing a change in your indexing strategy (e.g. adding a column to an index to reduce visits to a table, dropping an index that is a prefix to another index) you may find that after a successful test period you still see plans change when you finally drop the indexes that you had made invisible.
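One possible precaution, offered as an untested suggestion rather than something demonstrated in this note, is to list the indexes that are currently invisible and then check the critical statements with a hint that names the surviving index explicitly, instead of relying on the index description syntax. The sketch below assumes the objects from the example above; whether the named hint sidesteps the problem in a given version is something to verify on the system in question.

```sql
-- Which indexes in this schema are currently invisible?
select  table_name, index_name, visibility
from    user_indexes
where   visibility = 'INVISIBLE'
order by
        table_name, index_name
;

-- Check the plan with the surviving index named explicitly
-- (old-style hint) rather than described by its columns.
explain plan for
select  /*+ index(t1 t1_two_col) */
        *
from    t1
where   colX = 500
;

select * from table(dbms_xplan.display);
```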
Bonus blog note:
There are other cases when an invisible index isn’t quite as invisible as you might hope. Here’s a blog that I noticed a little while ago with an example involving v$object_usage:
http://www.kelloggsdba.blogspot.co.uk/2012/08/vobjectusage-invisible-index-used.html
